Frequently Asked Questions: Assembly Releases and Versions

List of UCSC genome releases

How do UCSC's release numbers correspond to those of other organizations, such as NCBI?

The first release of an assembly is named using the first three characters of the organism's genus and the first three characters of its species, in the format gggSss#. Subsequent assemblies increment the number. Assemblies predating the 2003 introduction of the six-letter naming system were given two-letter names in a similar gs# format, and human assemblies are named hg# for "human genome".
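
The gggSss# convention can be illustrated with a minimal Python sketch. The function names below are hypothetical and are not part of any UCSC tool. The sketch covers only the six-letter scheme, so older two-letter names such as hg#, mm#, or ce# are outside its scope.

```python
def ucsc_prefix(genus: str, species: str) -> str:
    """Build a gggSss-style prefix: three lowercase genus letters
    followed by three species letters with the first capitalized."""
    g = genus.strip().lower()[:3]
    s = species.strip().lower()[:3]
    return g + s.capitalize()


def ucsc_name(genus: str, species: str, version: int) -> str:
    """Append the assembly number to the six-letter prefix."""
    return f"{ucsc_prefix(genus, species)}{version}"


print(ucsc_name("Pan", "troglodytes", 6))  # panTro6
print(ucsc_name("Felis", "catus", 9))      # felCat9
```

Both printed names appear in the release table below. Some entries there, such as hs1 or criGriChoV2, do not follow this pattern, so the sketch should not be treated as a complete rule.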

SPECIES | UCSC VERSION | RELEASE DATE | RELEASE NAME | STATUS
MAMMALS
Human | hs1 | Jan. 2022 | T2T Consortium CHM13v2.0 | Available
Human | hg38 | Dec. 2013 | Genome Reference Consortium GRCh38 | Available
Human | hg19 | Feb. 2009 | Genome Reference Consortium GRCh37 | Available
Human | hg18 | Mar. 2006 | NCBI Build 36.1 | Available
Human | hg17 | May 2004 | NCBI Build 35 | Available
Human | hg16 | Jul. 2003 | NCBI Build 34 | Available
Human | hg15 | Apr. 2003 | NCBI Build 33 | Archived
Human | hg13 | Nov. 2002 | NCBI Build 31 | Archived
Human | hg12 | Jun. 2002 | NCBI Build 30 | Archived
Human | hg11 | Apr. 2002 | NCBI Build 29 | Archived (data only)
Human | hg10 | Dec. 2001 | NCBI Build 28 | Archived (data only)
Human | hg8 | Aug. 2001 | UCSC-assembled | Archived (data only)
Human | hg7 | Apr. 2001 | UCSC-assembled | Archived (data only)
Human | hg6 | Dec. 2000 | UCSC-assembled | Archived (data only)
Human | hg5 | Oct. 2000 | UCSC-assembled | Archived (data only)
Human | hg4 | Sep. 2000 | UCSC-assembled | Archived (data only)
Human | hg3 | Jul. 2000 | UCSC-assembled | Archived (data only)
Human | hg2 | Jun. 2000 | UCSC-assembled | Archived (data only)
Human | hg1 | May 2000 | UCSC-assembled | Archived (data only)
Alpaca | vicPac2 | Mar. 2013 | Broad Institute Vicugna_pacos-2.0.1 | Available
Alpaca | vicPac1 | Jul. 2008 | Broad Institute VicPac1.0 | Available
Armadillo | dasNov3 | Dec. 2011 | Broad Institute DasNov3 | Available
Baboon | papAnu4 | Apr. 2017 | Human Genome Sequencing Center | Available
Baboon | papAnu2 | Mar. 2012 | Baylor College of Medicine Panu_2.0 | Available
Baboon | papHam1 | Nov. 2008 | Baylor College of Medicine HGSC Pham_1.0 | Available
Bison | bisBis1 | Oct. 2014 | Univ. of Maryland Bison_UMD1.0 | Available
Bonobo | panPan3 | May 2020 | University of Washington | Available
Bonobo | panPan2 | Dec. 2015 | Max-Planck Institute for Evolutionary Anthropology panpan1.1 | Available
Bonobo | panPan1 | May 2012 | Max-Planck Institute panpan1 | Available
Brown kiwi | aptMan1 | Jun. 2015 | Max-Planck Institute for Evolutionary Anthropology AptMant0 | Available
Bushbaby | otoGar3 | Mar. 2011 | Broad Institute OtoGar3 | Available
Cat | felCat9 | Nov. 2017 | Genome Sequencing Center (GSC) at Washington University (WashU) School of Medicine Felis_catus_9.0 | Available
Cat | felCat8 | Nov. 2014 | ICGSC Felis_catus_8.0 | Available
Cat | felCat5 | Sep. 2011 | ICGSC Felis_catus-6.2 | Available
Cat | felCat4 | Dec. 2008 | NHGRI catChrV17e | Available
Cat | felCat3 | Mar. 2006 | Broad Institute Release 3 | Available
Chimp | panTro6 | Jan. 2018 | Clint_PTRv2 | Available
Chimp | panTro5 | May 2016 | CGSC Build 3.0 | Available
Chimp | panTro4 | Feb. 2011 | CGSC Build 2.1.4 | Available
Chimp | panTro3 | Oct. 2010 | CGSC Build 2.1.3 | Available
Chimp | panTro2 | Mar. 2006 | CGSC Build 2.1 | Available
Chimp | panTro1 | Nov. 2003 | CGSC Build 1.1 | Available
Chinese hamster | criGri1 | Jul. 2013 | Beijing Genomics Institution-Shenzhen C_griseus_v1.0 | Available
Chinese hamster ovary cell line | criGriChoV2 | Jun. 2017 | Eagle Genomics Ltd CHOK1S_HZDv1 | Available
Chinese hamster ovary cell line | criGriChoV1 | Aug. 2011 | Beijing Genomics Institute CriGri_1.0 | Available
Chinese pangolin | manPen1 | Aug. 2014 | Washington University (WashU) M_pentadactyla-1.1.1 | Available
Cow | bosTau9 | Apr. 2018 | USDA ARS | Available
Cow | bosTau8 | Jun. 2014 | University of Maryland v3.1.1 | Available
Cow | bosTau7 | Oct. 2011 | Baylor College of Medicine HGSC Btau_4.6.1 | Available
Cow | bosTau6 | Nov. 2009 | University of Maryland v3.1 | Available
Cow | bosTau4 | Oct. 2007 | Baylor College of Medicine HGSC Btau_4.0 | Available
Cow | bosTau3 | Aug. 2006 | Baylor College of Medicine HGSC Btau_3.1 | Available
Cow | bosTau2 | Mar. 2005 | Baylor College of Medicine HGSC Btau_2.0 | Available
Cow | bosTau1 | Sep. 2004 | Baylor College of Medicine HGSC Btau_1.0 | Archived
Crab-eating macaque | macFas5 | Jun. 2013 | Washington University Macaca_fascicularis_5.0 | Available
Dog | canFam6 | Oct. 2020 | Dog Genome Sequencing Consortium Dog10K_Boxer_Tasha | Available
Dog | canFam5 | May 2019 | University of Michigan | Available
Dog | canFam4 | Mar. 2020 | Uppsala University | Available
Dog | canFam3 | Sep. 2011 | Broad Institute v3.1 | Available
Dog | canFam2 | May 2005 | Broad Institute v2.0 | Available
Dog | canFam1 | Jul. 2004 | Broad Institute v1.0 | Available
Dolphin | turTru2 | Oct. 2011 | Baylor College of Medicine Ttru_1.4 | Available
Elephant | loxAfr3 | Jul. 2009 | Broad Institute LoxAfr3 | Available
Ferret | musFur1 | Apr. 2011 | Ferret Genome Sequencing Consortium MusPutFur1.0 | Available
Garter snake | thaSir1 | Jun. 2015 | Washington University Thamnophis_sirtalis-6.0 | Available
Gibbon | nomLeu3 | Oct. 2012 | Gibbon Genome Sequencing Consortium Nleu3.0 | Available
Gibbon | nomLeu2 | Jun. 2011 | Gibbon Genome Sequencing Consortium Nleu1.1 | Available
Gibbon | nomLeu1 | Jan. 2010 | Gibbon Genome Sequencing Consortium Nleu1.0 | Available
Golden eagle | aquChr2 | Oct. 2014 | University of Washington aquChr2-1.0.2 | Available
Golden snub-nosed monkey | rhiRox1 | Oct. 2014 | Novogene Rrox_v1 | Available
Gorilla | gorGor6 | Aug. 2019 | University of Washington | Available
Gorilla | gorGor5 | Mar. 2016 | University of Washington GSMRT3 | Available
Gorilla | gorGor4 | Dec. 2014 | Wellcome Trust Sanger Institute gorGor4 | Available
Gorilla | gorGor3 | May 2011 | Wellcome Trust Sanger Institute gorGor3.1 | Available
Green Monkey | chlSab2 | Mar. 2014 | Vervet Genomics Consortium 1.1 | Available
Guinea pig | cavPor3 | Feb. 2008 | Broad Institute cavPor3 | Available
Hawaiian monk seal | neoSch1 | Jun. 2017 | Johns Hopkins University ASM220157v1 | Available
Hedgehog | eriEur2 | May 2012 | Broad Institute EriEur2.0 | Available
Hedgehog | eriEur1 | Jun. 2006 | Broad Institute Draft_v1 | Available
Horse | equCab3 | Jan. 2018 | University of Louisville | Available
Horse | equCab2 | Sep. 2007 | Broad Institute EquCab2 | Available
Horse | equCab1 | Jan. 2007 | Broad Institute EquCab1 | Available
Kangaroo rat | dipOrd1 | Jul. 2008 | Baylor/Broad Institute DipOrd1.0 | Available
Little brown bat | myoLuc2 | Jul. 2010 | Broad Institute MyoLuc2.0 | Available
Malayan flying lemur | galVar1 | Jul. 2014 | WashU G_variegatus-3.0.2 | Available
Manatee | triMan1 | Oct. 2011 | Broad Institute TriManLat1.0 | Available
Marmoset | calJac4 | May 2020 | Washington University Callithrix_jacchus_cj1700_1.1 | Available
Marmoset | calJac3 | Mar. 2009 | WUSTL Callithrix_jacchus-v3.2 | Available
Marmoset | calJac1 | Jun. 2007 | WUSTL Callithrix_jacchus-v2.0.2 | Available
Megabat | pteVam1 | Jul. 2008 | Broad Institute Ptevap1.0 | Available
Minke whale | balAcu1 | Oct. 2013 | KORDI BalAcu1.0 | Available
Mouse | mm39 | Jun. 2020 | Genome Reference Consortium Mouse Build 39 | Available
Mouse | mm10 | Dec. 2011 | Genome Reference Consortium GRCm38 | Available
Mouse | mm9 | Jul. 2007 | NCBI Build 37 | Available
Mouse | mm8 | Feb. 2006 | NCBI Build 36 | Available
Mouse | mm7 | Aug. 2005 | NCBI Build 35 | Available
Mouse | mm6 | Mar. 2005 | NCBI Build 34 | Archived
Mouse | mm5 | May 2004 | NCBI Build 33 | Archived
Mouse | mm4 | Oct. 2003 | NCBI Build 32 | Archived
Mouse | mm3 | Feb. 2003 | NCBI Build 30 | Archived
Mouse | mm2 | Feb. 2002 | MGSCv3 | Archived
Mouse | mm1 | Nov. 2001 | MGSCv2 | Archived (data only)
Mouse lemur | micMur2 | May 2015 | Baylor/Broad Institute Mmur_2.0 | Available
Mouse lemur | micMur1 | Jul. 2007 | Broad Institute MicMur1.0 | Available
Naked mole-rat | hetGla2 | Jan. 2012 | Broad Institute HetGla_female_1.0 | Available
Naked mole-rat | hetGla1 | Jul. 2011 | Beijing Genomics Institute HetGla_1.0 | Available
Opossum | monDom5 | Oct. 2006 | Broad Institute release MonDom5 | Available
Opossum | monDom4 | Jan. 2006 | Broad Institute release MonDom4 | Available
Opossum | monDom1 | Oct. 2004 | Broad Institute release MonDom1 | Available
Orangutan | ponAbe3 | Jan. 2018 | Susie_PABv2/ponAbe3 | Available
Orangutan | ponAbe2 | Jul. 2007 | WUSTL Pongo_albelii-2.0.2 | Available
Panda | ailMel1 | Dec. 2009 | BGI-Shenzhen AilMel 1.0 | Available
Pig | susScr11 | Feb. 2017 | Swine Genome Sequencing Consortium Sscrofa11.1 | Available
Pig | susScr3 | Aug. 2011 | Swine Genome Sequencing Consortium Sscrofa10.2 | Available
Pig | susScr2 | Nov. 2009 | Swine Genome Sequencing Consortium Sscrofa9.2 | Available
Pika | ochPri3 | May 2012 | Broad Institute OchPri3.0 | Available
Pika | ochPri2 | Jul. 2008 | Broad Institute OchPri2 | Available
Platypus | ornAna2 | Feb. 2007 | WUSTL v5.0.1 | Available
Platypus | ornAna1 | Mar. 2007 | WUSTL v5.0.1 | Available
Proboscis Monkey | nasLar1 | Nov. 2014 | Proboscis Monkey Functional Genome Consortium Charlie1.0 | Available
Rabbit | oryCun2 | Apr. 2009 | Broad Institute release OryCun2 | Available
Rat | rn7 | Nov. 2020 | Wellcome Sanger Institute mRatBN7.2 | Available
Rat | rn6 | Jul. 2014 | RGSC Rnor_6.0 | Available
Rat | rn5 | Mar. 2012 | RGSC Rnor_5.0 | Available
Rat | rn4 | Nov. 2004 | Baylor College of Medicine HGSC v3.4 | Available
Rat | rn3 | Jun. 2003 | Baylor College of Medicine HGSC v3.1 | Available
Rat | rn2 | Jan. 2003 | Baylor College of Medicine HGSC v2.1 | Archived
Rat | rn1 | Nov. 2002 | Baylor College of Medicine HGSC v1.0 | Archived
Rhesus | rheMac10 | Feb. 2019 | The Genome Institute at Washington University School of Medicine Mmul_10 | Available
Rhesus | rheMac8 | Nov. 2015 | Baylor College of Medicine HGSC Mmul_8.0.1 | Available
Rhesus | rheMac3 | Oct. 2010 | Beijing Genomics Institute CR_1.0 | Available
Rhesus | rheMac2 | Jan. 2006 | Baylor College of Medicine HGSC v1.0 Mmul_051212 | Available
Rhesus | rheMac1 | Jan. 2005 | Baylor College of Medicine HGSC Mmul_0.1 | Archived
Rock hyrax | proCap1 | Jul. 2008 | Baylor College of Medicine HGSC Procap1.0 | Available
Sheep | oviAri4 | Dec. 2015 | ISGC Oar_v4.0 | Available
Sheep | oviAri3 | Aug. 2012 | ISGC Oar_v3.1 | Available
Sheep | oviAri1 | Feb. 2010 | ISGC Ovis aries 1.0 | Available
Shrew | sorAra2 | Aug. 2008 | Broad Institute SorAra2.0 | Available
Shrew | sorAra1 | Jun. 2006 | Broad Institute SorAra1.0 | Available
Sloth | choHof1 | Jul. 2008 | Broad Institute ChoHof1.0 | Available
Squirrel | speTri2 | Nov. 2011 | Broad Institute SpeTri2.0 | Available
Squirrel monkey | saiBol1 | Oct. 2011 | Broad Institute SaiBol1.0 | Available
Tarsier | tarSyr2 | Sep. 2013 | WashU Tarsius_syrichta-2.0.1 | Available
Tarsier | tarSyr1 | Aug. 2008 | WUSTL/Broad Institute Tarsyr1.0 | Available
Tasmanian devil | sarHar1 | Feb. 2011 | Wellcome Trust Sanger Institute Devil_refv7.0 | Available
Tenrec | echTel2 | Nov. 2012 | Broad Institute EchTel2.0 | Available
Tenrec | echTel1 | Jul. 2005 | Broad Institute echTel1 | Available
Tree shrew | tupBel1 | Dec. 2006 | Broad Institute Tupbel1.0 | Available
Wallaby | macEug2 | Sep. 2009 | Tammar Wallaby Genome Sequencing Consortium Meug_1.1 | Available
White rhinoceros | cerSim1 | May 2012 | Broad Institute CerSimSim1.0 | Available
VERTEBRATES
African clawed frog | xenLae2 | Aug. 2016 | Int. Xenopus Sequencing Consortium | Available
American alligator | allMis1 | Aug. 2012 | Int. Crocodilian Genomes Working Group allMis0.2 | Available
Atlantic cod | gadMor1 | May 2010 | Genofisk GadMor_May2010 | Available
Budgerigar | melUnd1 | Sep. 2011 | WUSTL v6.3 | Available
Chicken | galGal6 | Mar. 2018 | GRCg6 Gallus-gallus-6.0 | Available
Chicken | galGal5 | Dec. 2015 | ICGC Gallus-gallus-5.0 | Available
Chicken | galGal4 | Nov. 2011 | ICGC Gallus-gallus-4.0 | Available
Chicken | galGal3 | May 2006 | WUSTL Gallus-gallus-2.1 | Available
Chicken | galGal2 | Feb. 2004 | WUSTL Gallus-gallus-1.0 | Available
Coelacanth | latCha1 | Aug. 2011 | Broad Institute LatCha1 | Available
Elephant shark | calMil1 | Dec. 2013 | IMCB Callorhinchus_milli_6.1.3 | Available
Fugu | fr3 | Oct. 2011 | JGI v5.0 | Available
Fugu | fr2 | Oct. 2004 | JGI v4.0 | Available
Fugu | fr1 | Aug. 2002 | JGI v3.0 | Available
Lamprey | petMar3 | Dec. 2017 | University of Kentucky Pmar_germline 1.0 | Available
Lamprey | petMar2 | Sep. 2010 | WUGSC 7.0 | Available
Lamprey | petMar1 | Mar. 2007 | WUSTL v3.0 | Available
Lizard | anoCar2 | May 2010 | Broad Institute AnoCar2 | Available
Lizard | anoCar1 | Feb. 2007 | Broad Institute AnoCar1 | Available
Medaka | oryLat2 | Oct. 2005 | NIG v1.0 | Available
Medium ground finch | geoFor1 | Apr. 2012 | BGI GeoFor_1.0 / NCBI 13302 | Available
Nile tilapia | oreNil2 | Jan. 2011 | Broad Institute Release OreNil1.1 | Available
Painted turtle | chrPic1 | Dec. 2011 | IPTGSC Chrysemys_picta_bellii-3.0.1 | Available
Stickleback | gasAcu1 | Feb. 2006 | Broad Institute Release 1.0 | Available
Tetraodon | tetNig2 | Mar. 2007 | Genoscope v7 | Available
Tetraodon | tetNig1 | Feb. 2004 | Genoscope v7 | Available
Tibetan frog | nanPar1 | Mar. 2015 | Beijing Genomics Institute BGI_ZX_20015 | Available
Turkey | melGal5 | Nov. 2014 | Turkey Genome Consortium v5.0 | Available
Turkey | melGal1 | Dec. 2009 | Turkey Genome Consortium v2.01 | Available
X. tropicalis | xenTro10 | Nov. 2019 | University of California, Berkeley UCB_Xtro_10.0 | Available
X. tropicalis | xenTro9 | Jul. 2016 | JGI v.9.1 | Available
X. tropicalis | xenTro7 | Sep. 2012 | JGI v.7.0 | Available
X. tropicalis | xenTro3 | Nov. 2009 | JGI v.4.2 | Available
X. tropicalis | xenTro2 | Aug. 2005 | JGI v.4.1 | Available
X. tropicalis | xenTro1 | Oct. 2004 | JGI v.3.0 | Available
Zebra finch | taeGut2 | Feb. 2013 | WashU taeGut324 | Available
Zebra finch | taeGut1 | Jul. 2008 | WUSTL v3.2.4 | Available
Zebrafish | danRer11 | May 2017 | Genome Reference Consortium GRCz11 | Available
Zebrafish | danRer10 | Sep. 2014 | Genome Reference Consortium GRCz10 | Available
Zebrafish | danRer7 | Jul. 2010 | Sanger Institute Zv9 | Available
Zebrafish | danRer6 | Dec. 2008 | Sanger Institute Zv8 | Available
Zebrafish | danRer5 | Jul. 2007 | Sanger Institute Zv7 | Available
Zebrafish | danRer4 | Mar. 2006 | Sanger Institute Zv6 | Available
Zebrafish | danRer3 | May 2005 | Sanger Institute Zv5 | Available
Zebrafish | danRer2 | Jun. 2004 | Sanger Institute Zv4 | Archived
Zebrafish | danRer1 | Nov. 2003 | Sanger Institute Zv3 | Archived
DEUTEROSTOMES
C. intestinalis | ci3 | Apr. 2011 | Kyoto KH | Available
C. intestinalis | ci2 | Mar. 2005 | JGI v2.0 | Available
C. intestinalis | ci1 | Dec. 2002 | JGI v1.0 | Available
Lancelet | braFlo1 | Mar. 2006 | JGI v1.0 | Available
S. purpuratus | strPur2 | Sep. 2006 | Baylor College of Medicine HGSC v. Spur 2.1 | Available
S. purpuratus | strPur1 | Apr. 2005 | Baylor College of Medicine HGSC v. Spur_0.5 | Available
INSECTS
A. mellifera | apiMel2 | Jan. 2005 | Baylor College of Medicine HGSC v.Amel_2.0 | Available
A. mellifera | apiMel1 | Jul. 2004 | Baylor College of Medicine HGSC v.Amel_1.2 | Available
A. gambiae | anoGam3 | Oct. 2006 | International Consortium for the Sequencing of Anopheles Genome AgamP3 | Available
A. gambiae | anoGam1 | Feb. 2003 | IAGP v.MOZ2 | Available
D. ananassae | droAna2 | Aug. 2005 | Agencourt Arachne release | Available
D. ananassae | droAna1 | Jul. 2004 | TIGR Celera release | Available
D. erecta | droEre1 | Aug. 2005 | Agencourt Arachne release | Available
D. grimshawi | droGri1 | Aug. 2005 | Agencourt Arachne release | Available
D. melanogaster | dm6 | Aug. 2014 | BDGP Release 6 + ISO1 MT | Available
D. melanogaster | dm3 | Apr. 2006 | BDGP Release 5 | Available
D. melanogaster | dm2 | Apr. 2004 | BDGP Release 4 | Available
D. melanogaster | dm1 | Jan. 2003 | BDGP Release 3 | Available
D. mojavensis | droMoj2 | Aug. 2005 | Agencourt Arachne release | Available
D. mojavensis | droMoj1 | Aug. 2004 | Agencourt Arachne release | Available
D. persimilis | droPer1 | Oct. 2005 | Broad Institute release | Available
D. pseudoobscura | dp3 | Nov. 2004 | FlyBase Release 1.0 | Available
D. pseudoobscura | dp2 | Aug. 2003 | Baylor College of Medicine HGSC Freeze 1 | Available
D. sechellia | droSec1 | Oct. 2005 | Broad Institute Release 1.0 | Available
D. simulans | droSim1 | Apr. 2005 | WUSTL Release 1.0 | Available
D. virilis | droVir2 | Aug. 2005 | Agencourt Arachne release | Available
D. virilis | droVir1 | Jul. 2004 | Agencourt Arachne release | Available
D. yakuba | droYak2 | Nov. 2005 | WUSTL Release 2.0 | Available
D. yakuba | droYak1 | Apr. 2004 | WUSTL Release 1.0 | Available
NEMATODES
C. brenneri | caePb2 | Feb. 2008 | WUSTL 6.0.1 | Available
C. brenneri | caePb1 | Jan. 2007 | WUSTL 4.0 | Available
C. briggsae | cb3 | Jan. 2007 | WUSTL Cb3 | Available
C. briggsae | cb1 | Jul. 2002 | WormBase v. cb25.agp8 | Available
C. elegans | ce11 | Feb. 2013 | C. elegans Sequencing Consortium WBcel235 | Available
C. elegans | ce10 | Oct. 2010 | WormBase v. WS220 | Available
C. elegans | ce6 | May 2008 | WormBase v. WS190 | Available
C. elegans | ce4 | Jan. 2007 | WormBase v. WS170 | Available
C. elegans | ce2 | Mar. 2004 | WormBase v. WS120 | Available
C. elegans | ce1 | May 2003 | WormBase v. WS100 | Archived
C. japonica | caeJap1 | Mar. 2008 | WUSTL 3.0.2 | Available
C. remanei | caeRem3 | May 2007 | WUSTL 15.0.1 | Available
C. remanei | caeRem2 | Mar. 2006 | WUSTL 1.0 | Available
P. pacificus | priPac1 | Feb. 2007 | WUSTL 5.0 | Available
OTHER
Sea Hare | aplCal1 | Sep. 2008 | Broad Release Aplcal2.0 | Available
Yeast | sacCer3 | Apr. 2011 | SGD April 2011 sequence | Available
Yeast | sacCer2 | Jun. 2008 | SGD June 2008 sequence | Available
Yeast | sacCer1 | Oct. 2003 | SGD 1 Oct 2003 sequence | Available
VIRUSES
Ebola Virus | eboVir3 | Jun. 2014 | Sierra Leone 2014 (G3683/KM034562.1) | Available
Monkeypox Virus | mpxvRivers | May 2022 | MPXV-M5312_HM12_Rivers (MT903340.1/GCF_014621545.1) | Available
SARS-CoV-2 | wuhCor1 | Jan. 2020 | SARS-CoV-2 ASM985889v3 | Available

Initial assembly release dates

When will the next assembly be out?

UCSC does not produce its own genome assemblies, but instead obtains them from standard sources. Because of this, you can expect us to release a new version of a genome soon after the assembling organization has released the version. A new assembly release initially consists of the genome sequence and a small set of aligned annotation tracks. Additional annotation tracks are added as they are obtained or generated. Bulk downloads of the data are typically available in the first week after the assembly is released in the browser.

Patch sequences for genome assemblies

Why am I seeing chr_alt or chr_fix chromosomes on the Genome Browser?

Since the initial Genome Reference Consortium (GRC) release of the human and mouse genome assemblies, there have been updates to these assemblies known as "patches". Patches are accessioned scaffold sequences that add information to the assembly without disrupting the chromosome coordinates. Patches are given chromosome context by aligning the sequence to the current assembly. Together, the scaffold sequence and the alignment define the patch to the genome assembly. The GRC patch releases do not change any previously existing sequences; they simply add new sequences for fix patches or alternate haplotypes that correspond to specific regions of the main chromosome sequences. For most users, the patches are unlikely to make a difference and may complicate the analysis, as they can introduce duplication.

There are two kinds of GRC patch sequences: chr_fix sequences, which are fix patches, and chr_alt sequences, which are alternate haplotypes.

The Patching up the Genome blog post contains more information on how patch sequences are incorporated into the UCSC Genome Browser. For more information about how patch sequences can affect BLAT and isPCR results, please refer to the following BLAT FAQ.
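
For analyses where the duplication introduced by patches is unwanted, one option is to filter sequences by name. The Python sketch below assumes that patch sequence names end in _alt or _fix, as the chr_alt and chr_fix labels suggest. The function names are hypothetical, and the sequence names in the example are illustrative only.

```python
def patch_kind(seq_name: str) -> str:
    """Classify a sequence name by its suffix: 'fix', 'alt', or 'other'."""
    if seq_name.endswith("_fix"):
        return "fix"
    if seq_name.endswith("_alt"):
        return "alt"
    return "other"


def drop_patches(seq_names):
    """Keep only the names that are neither fix patches nor alternates."""
    return [name for name in seq_names if patch_kind(name) == "other"]


# Illustrative names; check the chromosome list of your assembly.
names = ["chr1", "chr1_KI270762v1_alt", "chr1_KN196472v1_fix"]
print(drop_patches(names))  # ['chr1']
```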

Data sources - UCSC assemblies

Where does UCSC obtain the assembly and annotation data displayed in the Genome Browser?

All the assembly data displayed in the UCSC Genome Browser are obtained from external sequencing centers. To determine the data source and version for a given assembly, see the assembly's description on the Genome Browser Gateway page or the List of UCSC Genome Releases.

The annotations accompanying an assembly are obtained from a variety of sources. The UCSC Genome Bioinformatics Group generates several of the tracks; the remainder are contributed by collaborators at other sites. Each track has an associated description page that credits the authors of the annotation.

For detailed information about the individuals and organizations who contributed to a specific assembly, see the Credits page.

Which UCSC assemblies are equivalent to Ensembl or NCBI assemblies?

The asmEquivalent table in the hgFixed database is available on the public MySQL server. It shows which assembly versions are identical (or almost identical) to each other among UCSC, Ensembl, GenBank, and RefSeq assemblies.

mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -e 'desc asmEquivalent;' hgFixed
+----------------------+-------------------------------------------+
| Field                | Type                                      |
+----------------------+-------------------------------------------+
| source               | varchar(255)                              |
| destination          | varchar(255)                              |
| sourceAuthority      | enum('ensembl','ucsc','genbank','refseq') |
| destinationAuthority | enum('ensembl','ucsc','genbank','refseq') |
| matchCount           | bigint(20)                                |
| sourceCount          | bigint(20)                                |
| destinationCount     | bigint(20)                                |
+----------------------+-------------------------------------------+

The "Count" fields give the number of individual sequences in each assembly. When all three counts are identical (matchCount == sourceCount == destinationCount), the two genome assemblies match perfectly.

Non-perfect matches can be due to a number of factors:

  1. different or not included chrMT genome sequences in an assembly
  2. identical duplicated sequences present or absent from an assembly
  3. some smaller contigs not included in an assembly
  4. slight differences in versions of assemblies where some contain sequences not in the other assembly
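
The perfect-match test on the three count fields can be expressed as a short Python sketch. The field names follow the asmEquivalent schema shown above. The row type, the function name, and the example values are hypothetical and are not real query results.

```python
from typing import NamedTuple


class AsmEquivalentRow(NamedTuple):
    """A subset of the asmEquivalent columns needed for the comparison."""
    source: str
    destination: str
    matchCount: int
    sourceCount: int
    destinationCount: int


def is_perfect_match(row: AsmEquivalentRow) -> bool:
    """True when every sequence in both assemblies has a match."""
    return row.matchCount == row.sourceCount == row.destinationCount


# Invented placeholder rows, for illustration only.
rows = [
    AsmEquivalentRow("asmA", "asmB", 100, 100, 100),
    AsmEquivalentRow("asmA", "asmC", 99, 100, 101),
]
for row in rows:
    label = "perfect" if is_perfect_match(row) else "partial"
    print(row.source, row.destination, label)
```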

Comparison of UCSC and NCBI human assemblies

How do the human assemblies displayed in the UCSC Genome Browser differ from the NCBI human assemblies?

Human assemblies displayed in the Genome Browser (hg10 and higher) are nearly identical to the NCBI assemblies in primary sequence. Minor differences may be present, however. Sources include:

Looking for a genome assembly not shown in the tree?

When looking for a specific assembly, the best place to start is the Gateway page. If you begin to type the common name, species name, or NCBI RefSeq accession number in the search box on the left side of the screen, suggestions will appear if any matches are found. This search will also match any assembly hubs that are listed in UCSC's Public Hubs. Nearly every NCBI RefSeq assembly and Vertebrate Genomes Project assembly is included here within the GenArk hubs. NCBI RefSeq assemblies can be loaded with direct links such as http://genome.ucsc.edu/h/GCF_001984765.1 with the GCF accession. These assembly hubs are automatically updated, but not reviewed by UCSC. The species tree shows all genomes reviewed by UCSC.
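
The direct-link pattern can be sketched in Python. The URL prefix is taken from the example above. The function name is hypothetical, and the accession check (GCF_, nine digits, a dot, and a version number) is an assumption based on the usual form of RefSeq assembly accessions.

```python
import re

# Assumed accession shape: GCF_ + nine digits + "." + version number.
_GCF_ACCESSION = re.compile(r"^GCF_\d{9}\.\d+$")


def genark_link(accession: str) -> str:
    """Build a direct Genome Browser link for a GCF accession."""
    if not _GCF_ACCESSION.match(accession):
        raise ValueError(f"not a GCF accession: {accession!r}")
    return f"http://genome.ucsc.edu/h/{accession}"


print(genark_link("GCF_001984765.1"))
# http://genome.ucsc.edu/h/GCF_001984765.1
```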

If the assembly of interest is not found, please visit our assembly request page. Search that page for your assembly. If there is a "view" link, you can launch the existing genome browser. Otherwise, click the "request" button to fill out a form to add your genome of interest. A GCA_ or GCF_ identifier must already exist, reflecting that the assembly has been deposited into GenBank at NCBI, before we can process it. See the Assembly Submission Guidelines page at NCBI for directions on their submission process if your genome needs to be deposited. Also, review the UCSC GenArk Blog posts for examples of accessing and reviewing technical details about GenArk hubs.

Another option available to all users is to create an assembly hub. These are assemblies created and hosted by users and displayed on the Genome Browser. This requires no intervention by the UCSC Genome Browser and can be done for any assembly. See our Quick Start Guide to Assembly Hubs page for additional information and resources. If you create an assembly hub, consider sharing it with others as a Public Hub.

If you would like information about creating a track hub for an existing assembly hub, please refer to the following FAQ entry.

Differences between UCSC and NCBI mouse assemblies

Is the mouse genome assembly displayed in the UCSC Genome Browser the same as the one on the NCBI website?

The mouse genome assemblies featured in the UCSC Genome Browser are the same as those on the NCBI web site with one difference: the UCSC versions contain only the reference strain data (C57BL/6J). NCBI provides data for several additional strains in their builds.

Accessing older assembly versions

I need to access an older version of a genome assembly that's no longer listed in the Genome Browser menu. What should I do?

In addition to the assembly versions currently available in the Genome Browser, you can access the data for older assemblies of the browser through our Downloads page.

Frequency of GenBank data updates

How frequently does UCSC update its databases with new data from GenBank?

GenBank updates for mRNA, RefSeq, and EST data occur on a semi-quarterly basis, following major NCBI releases. These updates are in place for most Genome Browser assemblies. Assemblies that are not on an incremental update schedule are updated whenever we load a new assembly or make a major revision to a table.

Coordinate changes between assemblies

I noticed that the chromosomal coordinates for a particular gene that I'm looking at have changed since the last time I used your browser. What happened?

A common source of confusion for users arises from mixing up different assemblies. It is very important to be aware of which assembly you are looking at. Within the Genome Browser display, assemblies are labeled by organism and date. To look up the corresponding UCSC database name or NCBI build number, use the release table.

UCSC database labels are of the form hg#, panTro#, etc. The letters designate the organism, e.g. hg for human genome or panTro for Pan troglodytes. The number denotes the UCSC assembly version for that organism. For example, ce1 refers to the first UCSC assembly of the C. elegans genome.
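
A database label of this form can be split into its organism letters and version number. The Python sketch below uses a hypothetical helper name and assumes the label is made of letters followed by digits, which holds for names such as hg38, panTro6, and ce1.

```python
import re

_DB_LABEL = re.compile(r"^([A-Za-z]+)(\d+)$")


def split_db_label(label: str):
    """Return (organism letters, version number) for a label like 'hg38'."""
    match = _DB_LABEL.match(label)
    if match is None:
        raise ValueError(f"unexpected database label: {label!r}")
    return match.group(1), int(match.group(2))


print(split_db_label("hg38"))     # ('hg', 38)
print(split_db_label("panTro6"))  # ('panTro', 6)
print(split_db_label("ce1"))      # ('ce', 1)
```

Labels without a trailing number, such as mpxvRivers, raise a ValueError in this sketch.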

The coordinates of your favorite gene in one assembly may not be the same as those in the next release of the assembly unless the gene happens to lie on a completely sequenced and unrevised chromosome. For information on integrating data from one assembly into another, see the Converting positions between assembly versions section.

Converting positions between assembly versions

I've been researching a specific area of the human genome on the current assembly, and now you've just released a new version. Is there an easy way to locate my area of interest on the new assembly?

See the section on converting coordinates for information on assembly migration tools.

Converting SNPs between assembly versions

How can I convert SNP annotation coordinates between assembly versions?

While the LiftOver tool exists to convert coordinates between assemblies, it is NOT recommended to use LiftOver to convert SNPs between two assembly versions. Using LiftOver to convert a very small region, especially a single base SNP, is not always a trivial task as the alignment may not get a high enough score to be considered a successful conversion.

The mapping from one genome assembly to the next is automatically generated from sequence alignments and is neither complete nor perfect, just a best attempt. The stability of rsIDs is why we recommend mapping across assemblies using rsIDs first and then falling back to using cross-assembly mappings to look up the positions of items that didn't map. When an rsID in hg18 is not found in the hg38 dbSNP data, that rsID has been withdrawn or replaced by a different rsID. In that case, mapping the hg18 position of the disappeared rsID to hg38 and looking for a new rsID at the corresponding position may help to find the new rsID.

dbSNP started assigning rsIDs in the 1990s, before the human genome was assembled. At that point, SNPs were defined by flanking sequences with the varying base(s) in the middle -- local sequence context, with no genomic location because there was no genome. rsIDs are supposed to be stable across genome assemblies despite the changing genomic positions. Regardless of the genome assembly version, "rs429358" refers to the same polymorphic variant in the 4th exon of the ApoE gene with T as the major allele and C as the minor allele. Its genomic position in hg18 is chr19:50103781, and its genomic position in hg38 is chr19:44908684.

Instead, the recommended process to convert a SNP's coordinates between assemblies is to use a SNP track to search for each rsID on the target genome assembly. For example, after creating a list of rsIDs for your conversion, you would then search for each rsID on the target's Genome Browser using the dbSNP track for human assemblies (e.g. hg19 or hg38) or the EVA SNP track on mouse assemblies (e.g. mm10 or mm39) to perform the conversion.

To summarize the steps:

  1. Create a file of all rsIDs
  2. Use the Table Browser to map the file of rsIDs to the other assembly's coordinates
  3. Create another file containing any rsIDs that were not mapped by the Table Browser
  4. Using the file from the previous step, use the Table Browser to create a BED4 file for the rsIDs that were not mapped by the Table Browser
  5. Run LiftOver on the BED4 file to get the new coordinates in the other assembly
  6. Use the Data Integrator to map the LiftOver results to new rsIDs where possible
  7. Combine the Table Browser rsID-mapped BED4 with the LiftOver/Data Integrator-mapped BED4. Beware duplicates that will cause downstream problems. You will need to decide whether to remove duplicates as unreliable or resolve duplicates
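
The bookkeeping in steps 2, 3, and 7 can be sketched in Python once the Table Browser and LiftOver outputs are loaded as BED4 rows (chrom, start, end, name). This is a minimal sketch with hypothetical function names. It does not call the Table Browser, LiftOver, or the Data Integrator. The rs429358 row uses the hg38 position quoted above, written with a 0-based BED start, and the other rsIDs are invented placeholders.

```python
from collections import Counter


def split_mapped(rsids, target_bed4):
    """Steps 2-3: separate rsIDs found on the target assembly
    from those that still need a LiftOver-based lookup."""
    by_name = {}
    for row in target_bed4:
        by_name.setdefault(row[3], []).append(row)
    mapped = [row for rsid in rsids for row in by_name.get(rsid, [])]
    unmapped = [rsid for rsid in rsids if rsid not in by_name]
    return mapped, unmapped


def duplicated_names(*bed4_sets):
    """Step 7: names occurring more than once across the combined sets."""
    counts = Counter(row[3] for rows in bed4_sets for row in rows)
    return sorted(name for name, n in counts.items() if n > 1)


target = [("chr19", 44908683, 44908684, "rs429358")]
mapped, unmapped = split_mapped(["rs429358", "rsPlaceholder1"], target)
print(mapped)    # [('chr19', 44908683, 44908684, 'rs429358')]
print(unmapped)  # ['rsPlaceholder1']

# Hypothetical LiftOver/Data Integrator output that repeats one name.
lifted = [("chr1", 100, 101, "rsPlaceholder2"),
          ("chr19", 44908683, 44908684, "rs429358")]
print(duplicated_names(mapped, lifted))  # ['rs429358']
```

Whether a duplicated name should be dropped or resolved remains a manual decision, as noted in step 7.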

How can I convert a large set of SNP annotations?

For bulk conversions, the Table Browser can be used to extract the coordinates for the rsIDs on the target assembly. More information about performing batch queries on the Table Browser can be found on the following Table Browser help page. An example of using the Table Browser to convert SNP between assemblies can be found on a previously answered question available on the mailing list archive.

If you are using dbSNP version 153 or above, the data are formatted as bigBed files instead of being stored as a MariaDB table. For very large queries, this may cause the Table Browser to time out before the query finishes, as dbSNP has grown to include over 700 million variants. If you find that your Table Browser query times out for your list of rsIDs, you can use the bigBedNamedItems command-line tool to extract the rsID coordinates directly from the bigBed file instead of using the Table Browser. More information and examples using the bigBedNamedItems utility can be found on the following FAQ entry. As a reminder, you can run any Kent command-line tool without arguments to get the usage statement.
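
A bigBedNamedItems call can be scripted from Python. The sketch below only assembles the argument list. It assumes the tool accepts a -nameFile option followed by the bigBed file, a file of names, and an output BED file, so please confirm this against the usage statement printed when the tool is run without arguments. The helper name and all file names are hypothetical placeholders.

```python
import shutil
import subprocess


def big_bed_named_items_cmd(bigbed, names_file, out_bed):
    """Assemble the argument list for a -nameFile lookup (assumed usage)."""
    return ["bigBedNamedItems", "-nameFile", bigbed, names_file, out_bed]


# Placeholder file names; substitute your own bigBed path and rsID list.
cmd = big_bed_named_items_cmd("dbSnp.bb", "rsids.txt", "rsids.bed")
print(" ".join(cmd))

# Run the tool only when it is installed and on the PATH.
if shutil.which("bigBedNamedItems"):
    subprocess.run(cmd, check=False)
```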

Missing annotation tracks

Why is my favorite annotation track missing from your latest release?

The initial release of a new genome assembly typically contains a small subset of core annotation tracks. New tracks are added as they are generated. In many cases, our annotation tracks are contributed by scientists not affiliated with UCSC who must first obtain the sequence, repeatmasked data, etc. before they can produce their tracks. If you have need of an annotation that has not appeared on an assembly within a month or so of its release, feel free to send an inquiry to genome@soe.ucsc.edu. Messages sent to this address will be posted to the moderated genome mailing list, which is archived on a SEARCHABLE, PUBLIC Google Groups forum.

What next with the human genome?

Now that the human genome is "finished", will there be any more releases?

Rest assured that work will continue. There will be updates to the assembly over the next several years. This has been the case for all other finished (i.e. essentially complete) genome assemblies as gaps are closed. For example, the C. elegans genome has been "finished" for several years, but small bits of sequence are still being added and corrections are being made. NCBI will continue to coordinate the human genome assemblies in collaboration with the individual chromosome coordinators, and UCSC will continue to QC the assembly in conjunction with NCBI (and, to a lesser extent, Ensembl). UCSC, NCBI, Ensembl, and others will display the new releases on their sites as they become available.

Mouse strain used for mouse genome sequence

What strain of mouse was used for the Mus musculus genome?

C57BL/6J.