The NCBI RefSeq Genes composite track shows protein-coding and
non-protein-coding genes for the
16 Apr 2019 Clupea harengus/GCF_900700415.2_Ch_v2.0.2 assembly,
taken from the NCBI RNA reference sequences collection (RefSeq).
All subtracks use coordinates provided by RefSeq.
See the Methods section for more details about how
the different tracks were created.
Please visit NCBI's
Feedback for Gene and Reference Sequences (RefSeq)
page to make suggestions, submit additions and corrections, or ask for
help concerning RefSeq records.
For more information on the different gene tracks, see our Genes FAQ.
Display Conventions and Configuration
To show only a selected set of subtracks, uncheck the boxes next to the
tracks that you wish to hide.
The tracks available here can include (not all may be present):
- RefSeq annotations and alignments
- RefSeq All – all curated and predicted annotations
provided by RefSeq.
- RefSeq Curated – subset of RefSeq All that
includes only those annotations whose accessions begin with NM, NR,
NP or YP. (NP and YP are used only for protein-coding genes on
the mitochondrion; YP is used for human only.)
- RefSeq Predicted – subset of RefSeq All that includes
those annotations whose accessions begin with XM or XR.
- RefSeq Other – all other annotations produced by the
RefSeq group that do not fit the requirements for inclusion in
the RefSeq Curated or the RefSeq Predicted tracks.
- RefSeq Alignments – alignments of RefSeq RNAs to
the 16 Apr 2019 Clupea harengus/GCF_900700415.2_Ch_v2.0.2
genome provided by the RefSeq group.
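The accession-prefix rules in the list above can be sketched as a small classifier. This is only an illustration of the stated prefixes, not the code UCSC uses to build the subtracks; the function name and the example accessions are hypothetical, and treating every non-matching accession as RefSeq Other is a simplifying assumption.

```python
# Minimal sketch: map a RefSeq accession to the subtrack described above.
# The function name and example accessions are hypothetical.

CURATED_PREFIXES = {"NM", "NR", "NP", "YP"}
PREDICTED_PREFIXES = {"XM", "XR"}


def classify_refseq_accession(accession: str) -> str:
    """Return the subtrack name implied by the accession prefix."""
    # The prefix is the part before the first underscore, e.g. "XM" in "XM_000000.1".
    prefix = accession.split("_", 1)[0].upper()
    if prefix in CURATED_PREFIXES:
        return "RefSeq Curated"
    if prefix in PREDICTED_PREFIXES:
        return "RefSeq Predicted"
    # Assumption: anything else is treated as RefSeq Other.
    return "RefSeq Other"


if __name__ == "__main__":
    for acc in ("NM_000000.1", "XR_000000.2", "gene-segment-1"):
        print(acc, "->", classify_refseq_accession(acc))
```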
The RefSeq All, RefSeq Curated and RefSeq Predicted
tracks follow the display conventions for
gene prediction tracks.
The color shading indicates the level of review the RefSeq record has undergone:
predicted (light), provisional (medium), or reviewed (dark), as defined by RefSeq.
The levels of review are:
- Reviewed: the RefSeq record has been reviewed by NCBI
staff or by a collaborator. The NCBI review process includes assessing
available sequence data and the literature. Some RefSeq records may
incorporate expanded sequence and annotation information.
- Provisional: the RefSeq record has not yet been subject
to individual review. The initial sequence-to-gene association has been
established by outside collaborators or NCBI staff.
- Predicted: the RefSeq record has not yet been subject to
individual review, and some aspect of the RefSeq record is predicted.
The RefSeq Alignments track follows the display conventions for
PSL tracks.
The item labels and codon display properties for features within this track
can be configured through the controls at the top of the track description
page. To adjust the settings for an individual subtrack, click the wrench
icon next to the track name in the subtrack list.
- Label: By default, items are labeled by gene name. Click
the appropriate Label option to display the accession name or OMIM
identifier instead of the gene name, show all or a subset of these labels
including the gene name, OMIM identifier and accession names, or turn off
the label completely.
- Codon coloring: This track has an optional codon
coloring feature that allows users to quickly validate and compare gene
predictions. To display codon colors, select the genomic codons
option from the Color track by codons pull-down menu. For more
information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page.
Methods
The RefSeq annotation and RefSeq RNA alignment tracks
were created at UCSC using data from the NCBI RefSeq project. The GFF format
data file GCF_900700415.2_Ch_v2.0.2_genomic.gff.gz, delivered with the
NCBI RefSeq genome assemblies on the NCBI FTP site, was downloaded and
converted to the genePred and PSL table formats for display in the
Genome Browser.
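To illustrate the kind of record the conversion starts from, here is a minimal sketch that parses one GFF3 feature line into its fields. It is not the UCSC conversion tooling: the class and function names are hypothetical, the sample line is made up, and only the standard nine-column GFF3 layout is assumed. The start is shifted to zero-based because genePred tables store zero-based, half-open coordinates while GFF3 uses one-based, inclusive ones.

```python
# Minimal sketch: parse one GFF3 feature line (nine tab-separated columns).
# Names and the sample line are hypothetical; this is not the UCSC converter.
from typing import Dict, NamedTuple


class Gff3Feature(NamedTuple):
    seqid: str
    feature_type: str
    start0: int  # zero-based start (GFF3 start minus 1)
    end: int     # end is unchanged for a half-open interval
    strand: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Gff3Feature:
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols
    # Column 9 holds semicolon-separated key=value pairs.
    attributes = dict(
        pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
    )
    return Gff3Feature(seqid, ftype, int(start) - 1, int(end), strand, attributes)


if __name__ == "__main__":
    sample = "chrExample\tGnomon\tmRNA\t101\t500\t.\t+\t.\tID=rna-XM_000000.1;Parent=gene-example"
    print(parse_gff3_line(sample))
```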
Information about the NCBI annotation pipeline is available from NCBI.
Track statistics summary
Total genome size: 786,325,606 bases
Curated and Predicted Gene count: 50,946
Bases in these genes: 453,294,950
Percent genome coverage: 57.647%
Curated gene count: 22
Bases in curated genes: 35,388
Percent genome coverage: 0.005%
Predicted gene count: 50,924
Bases in predicted genes: 453,270,592
Percent genome coverage: 57.644%
Other annotation count: 3,078
Bases in other annotations: 6,760,869
Percent genome coverage: 0.860%
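The coverage percentages in the summary follow from the base counts and the total genome size. The short sketch below recomputes them, assuming coverage is simply the base count divided by the genome size, times 100, rounded to three decimal places; the function and variable names are hypothetical.

```python
# Minimal sketch: recompute the percent genome coverage figures listed above.
# Assumes coverage = 100 * bases / genome size, rounded to three decimals.

GENOME_SIZE = 786_325_606  # total genome size in bases, from the summary

BASES_COVERED = {
    "Curated and Predicted": 453_294_950,
    "Curated": 35_388,
    "Predicted": 453_270_592,
    "Other": 6_760_869,
}


def percent_coverage(bases: int, genome_size: int = GENOME_SIZE) -> float:
    """Return the percentage of the genome covered by `bases`."""
    return round(100.0 * bases / genome_size, 3)


if __name__ == "__main__":
    for name, bases in BASES_COVERED.items():
        print(f"{name}: {percent_coverage(bases):.3f}%")
```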
This track was produced at UCSC from data generated by scientists worldwide
and curated by the NCBI RefSeq project.
Kent WJ. BLAT - the BLAST-like alignment tool.
Genome Res. 2002 Apr;12(4):656-64.
Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A,
Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM et al.
RefSeq: an update on mammalian reference sequences.
Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63.
Pruitt KD, Tatusova T, Maglott DR.
NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database
of genomes, transcripts and proteins.
Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID:
15608248; PMC: PMC539979