Schema for NCBI RefSeq - RefSeq gene predictions from NCBI
Database: hs1
Primary Table: hub_567047_ncbiRefSeqOther
Data last updated: 2023-05-29
Big Bed File Download: /gbdb/hs1/ncbiRefSeq/ncbiRefSeqOther.bb
Item Count: 16,056
The data is stored in the binary BigBed format.

Format description: Additional information for NCBI 'Other' annotation
field | example | description
chrom | chr1 | Chromosome (or contig, scaffold, etc.)
chromStart | 165621497 | Start position in chromosome
chromEnd | 165623680 | End position in chromosome
name | LOC284685 | Name of item
score | 0 | Placeholder for BED format (score 0-1000)
strand | - | + or -
thickStart | 165623680 | CDS start/end in chromosome if coding
thickEnd | 165623680 | CDS start/end in chromosome if coding
reserved | 0 | Placeholder for BED format (itemRgb)
blockCount | 1 | Number of alignment blocks
blockSizes | 2183, | Comma separated list of block sizes
chromStarts | 0, | Block start positions relative to chromStart
gene | LOC284685 | Gene name
GeneID | 284685 | Entrez Gene
MIM | | OMIM
HGNC | | HGNC
miRBase | | miRBase
description | EWS RNA binding protein 1 pseudogene | Description
Note | | Note
exception | | Exceptional properties
product | | Gene product
geneSynonym | | Gene synonyms
modelEvidence | | Supporting evidence for gene model
gbkey | Gene | Feature type
geneBiotype | pseudogene | Gene biotype
pseudo | true | 'true' if pseudogene
partial | | 'true' if partial
anticodon | | Anticodon position within tRNA
startRange | | Start range on genome
endRange | | End range on genome
ID | gene-LOC284685 | Unique ID in RefSeq GFF3
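
As a rough illustration of the field order listed above, the following Python sketch splits one row of this table into a dictionary keyed by the schema's field names. It assumes the row has been exported as tab-delimited text with empty strings for missing values; that export format and the helper name `parse_row` are assumptions made for this example, not part of the track documentation.

```python
# Field names in the order given by the schema listing.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
    "chromStarts", "gene", "GeneID", "MIM", "HGNC", "miRBase",
    "description", "Note", "exception", "product", "geneSynonym",
    "modelEvidence", "gbkey", "geneBiotype", "pseudo", "partial",
    "anticodon", "startRange", "endRange", "ID",
]

# Fields converted to integers; everything else stays a string.
INT_FIELDS = {"chromStart", "chromEnd", "score",
              "thickStart", "thickEnd", "blockCount"}


def parse_row(line):
    """Split a tab-delimited row (an assumed export format) into a dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    row = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


# The example values from the schema listing, joined with tabs.
example = "\t".join([
    "chr1", "165621497", "165623680", "LOC284685", "0", "-",
    "165623680", "165623680", "0", "1", "2183,", "0,",
    "LOC284685", "284685", "", "", "",
    "EWS RNA binding protein 1 pseudogene", "", "", "", "", "",
    "Gene", "pseudogene", "true", "", "", "", "", "gene-LOC284685",
])
row = parse_row(example)
print(row["name"], row["chromEnd"] - row["chromStart"], row["geneBiotype"])
```

Keeping the field list in one place means a column-count mismatch is reported immediately instead of silently shifting values into the wrong fields.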

Sample Rows
 
chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | blockCount | blockSizes | chromStarts | gene | GeneID | MIM | HGNC | miRBase | description | Note | exception | product | geneSynonym | modelEvidence | gbkey | geneBiotype | pseudo | partial | anticodon | startRange | endRange | ID
chr1 | 165621497 | 165623680 | LOC284685 | 0 | - | 165623680 | 165623680 | 0 | 1 | 2183, | 0, | LOC284685 | 284685 | | | | EWS RNA binding protein 1 pseudogene | | | | | | Gene | pseudogene | true | | | | | gene-LOC284685
chr1 | 165820713 | 165827534 | FMO7P | 0 | + | 165827534 | 165827534 | 0 | 1 | 6821, | 0, | FMO7P | 100337589 | | HGNC:32208 | | flavin containing dimethylaniline monoxygenase 7, pseudogene | | | | | | Gene | pseudogene | true | | | | | gene-FMO7P
chr1 | 165912094 | 165926627 | FMO8P | 0 | + | 165926627 | 165926627 | 0 | 1 | 14533, | 0, | FMO8P | 100129007 | | HGNC:32209 | | flavin containing dimethylaniline monoxygenase 8, pseudogene | | | | | | Gene | pseudogene | true | | | | | gene-FMO8P
chr1 | 165994990 | 166045239 | FMO10P | 0 | + | 166045239 | 166045239 | 0 | 1 | 50249, | 0, | FMO10P | 100128181 | | HGNC:32211 | | flavin containing dimethylaniline monoxygenase 10, pseudogene | | | | | | Gene | pseudogene | true | | | | | gene-FMO10P
chr1 | 166093163 | 166094589 | RPL4P2 | 0 | + | 166094589 | 166094589 | 0 | 1 | 1426, | 0, | RPL4P2 | 646688 | | HGNC:36836 | | ribosomal protein L4 pseudogene 2 | | | | RPL4_1_115 | | Gene | pseudogene | true | | | | | gene-RPL4P2
chr1 | 166113396 | 166138590 | FMO11P | 0 | + | 166138590 | 166138590 | 0 | 1 | 25194, | 0, | FMO11P | 100337590 | | HGNC:32212 | | flavin containing dimethylaniline monoxygenase 11, pseudogene | | | | | | Gene | pseudogene | true | | | | | gene-FMO11P
chr1 | 166142067 | 166143161 | CNN2P10 | 0 | + | 166143161 | 166143161 | 0 | 1 | 1094, | 0, | CNN2P10 | 646693 | | HGNC:39535 | | calponin 2 pseudogene 10 | | | | | | Gene | pseudogene | true | | | | | gene-CNN2P10
chr1 | 166214567 | 166215041 | DUTP6 | 0 | + | 166215041 | 166215041 | 0 | 1 | 474, | 0, | DUTP6 | 100873912 | | HGNC:39519 | | deoxyuridine triphosphatase pseudogene 6 | | | | | | Gene | pseudogene | true | | | | | gene-DUTP6
chr1 | 166351794 | 166351913 | RNA5SP65 | 0 | + | 166351913 | 166351913 | 0 | 1 | 119, | 0, | RNA5SP65 | 100873300 | | HGNC:42842 | | RNA, 5S ribosomal pseudogene 65 | | | | RN5S65 | | Gene | pseudogene | true | | | | | gene-RNA5SP65
chr1 | 166509417 | 166509900 | RPS17P6 | 0 | - | 166509900 | 166509900 | 0 | 1 | 483, | 0, | RPS17P6 | 391130 | | HGNC:36945 | | ribosomal protein S17 pseudogene 6 | | | | RPS17_1_116 | | Gene | pseudogene | true | | | | | gene-RPS17P6
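
The relationship between chromStart, blockSizes and chromStarts in the rows above can be checked with a short calculation. The sketch below assumes the usual BED convention of zero-based, half-open coordinates and turns the three fields into genomic block intervals; the function name `block_intervals` is made up for this example.

```python
def block_intervals(chrom_start, block_sizes, chrom_starts):
    """Return (start, end) genomic intervals for each BED block.

    block_sizes and chrom_starts are the comma-separated strings from the
    table (a trailing comma is allowed); chrom_starts holds offsets
    relative to chrom_start.
    """
    sizes = [int(s) for s in block_sizes.split(",") if s]
    offsets = [int(s) for s in chrom_starts.split(",") if s]
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts differ in length")
    return [(chrom_start + off, chrom_start + off + size)
            for off, size in zip(offsets, sizes)]


# Values from the first sample row (LOC284685).
intervals = block_intervals(165621497, "2183,", "0,")
print(intervals)  # [(165621497, 165623680)]
```

For a single-block item such as this one, the block end equals chromEnd, which is a quick consistency check on a parsed row.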

RefSeq Other (hub_567047_ncbiRefSeqOther) Track Description
 

Description

The NCBI RefSeq Genes composite track shows protein-coding and non-protein-coding genes for the Homo sapiens/GCF_009914755.1_T2T-CHM13v2.0 assembly (24 Jan 2022), taken from the NCBI RNA reference sequences collection (RefSeq). All subtracks use coordinates provided by RefSeq. See the Methods section for more details about how the different tracks were created.

Please visit NCBI's Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, submit additions and corrections, or ask for help concerning RefSeq records.

For more information on the different gene tracks, see our Genes FAQ.

Display Conventions and Configuration

To show only a selected set of subtracks, uncheck the boxes next to the tracks that you wish to hide.

The tracks available here can include (not all may be present):
RefSeq annotations and alignments
  • RefSeq All – all curated and predicted annotations provided by RefSeq.
  • RefSeq Curated – subset of RefSeq All that includes only those annotations whose accessions begin with NM, NR, NP or YP. (NP and YP are used only for protein-coding genes on the mitochondrion; YP is used for human only.)
  • RefSeq Predicted – subset of RefSeq All that includes those annotations whose accessions begin with XM or XR.
  • RefSeq Other – all other annotations produced by the RefSeq group that do not fit the requirements for inclusion in the RefSeq Curated or the RefSeq Predicted tracks.
  • RefSeq Alignments – alignments of RefSeq RNAs to the 24 Jan 2022 Homo sapiens/GCF_009914755.1_T2T-CHM13v2.0 genome provided by the RefSeq group.
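
The accession-prefix rules in the list above can be sketched as a small classifier. This is only an illustration of the stated rules, not the code UCSC uses to build the subtracks; the function name `classify_accession` and the example accessions are invented for the sketch.

```python
# Prefixes as stated in the subtrack descriptions.
CURATED_PREFIXES = ("NM", "NR", "NP", "YP")
PREDICTED_PREFIXES = ("XM", "XR")


def classify_accession(accession):
    """Return the subtrack a given accession would fall into."""
    if accession.startswith(CURATED_PREFIXES):
        return "RefSeq Curated"
    if accession.startswith(PREDICTED_PREFIXES):
        return "RefSeq Predicted"
    # Anything that fits neither rule is treated as "Other" in this sketch.
    return "RefSeq Other"


# Hypothetical accessions, used only to exercise the rules.
for acc in ["NM_000001.1", "XR_000002.1", "gene-LOC284685"]:
    print(acc, "->", classify_accession(acc))
```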

The RefSeq All, RefSeq Curated, and RefSeq Predicted tracks follow the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), or reviewed (dark), as defined by RefSeq.

Color | Level of review
dark | Reviewed: the RefSeq record has been reviewed by NCBI staff or by a collaborator. The NCBI review process includes assessing available sequence data and the literature. Some RefSeq records may incorporate expanded sequence and annotation information.
medium | Provisional: the RefSeq record has not yet been subject to individual review. The initial sequence-to-gene association has been established by outside collaborators or NCBI staff.
light | Predicted: the RefSeq record has not yet been subject to individual review, and some aspect of the RefSeq record is predicted.

The RefSeq Alignments track follows the display conventions for PSL tracks.

The item labels and codon display properties for features within this track can be configured through the controls at the top of the track description page. To adjust the settings for an individual subtrack, click the wrench icon next to the track name in the subtrack list.

  • Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name or OMIM identifier instead of the gene name, show all or a subset of these labels including the gene name, OMIM identifier and accession names, or turn off the label completely.
  • Codon coloring: This track has an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. For more information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page.

Methods

The RefSeq annotation and RefSeq RNA alignment tracks were created at UCSC using data from the NCBI RefSeq project. The GFF-format data were taken from the file GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz, delivered with the NCBI RefSeq genome assemblies at the FTP location:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/914/755/GCF_009914755.1_T2T-CHM13v2.0/
The GFF file was converted to the genePred and PSL table formats for display in the Genome Browser. See NCBI's documentation for information about the NCBI annotation pipeline.
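
To give a feel for how GFF3 attributes could relate to the extra fields in the RefSeq Other table (ID, GeneID, HGNC, gbkey and so on), here is a minimal Python sketch that parses a GFF3 attribute column. It is not the UCSC conversion pipeline. The attribute names (`Dbxref`, `gene_biotype`) and the example string are assumptions, assembled from the RPL4P2 sample row and not copied from the GFF file.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Parse 'key=value;key=value' into a dict, percent-decoding values."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attrs[key] = unquote(value)
    return attrs


def split_dbxref(dbxref):
    """Split 'GeneID:646688,HGNC:HGNC:36836' into a database -> id dict."""
    refs = {}
    for item in dbxref.split(","):
        db, _, ident = item.partition(":")
        refs[db] = ident
    return refs


# Hypothetical attribute string built from the RPL4P2 sample row values.
example = ("ID=gene-RPL4P2;Dbxref=GeneID:646688,HGNC:HGNC:36836;"
           "description=ribosomal protein L4 pseudogene 2;gbkey=Gene;"
           "gene=RPL4P2;gene_biotype=pseudogene;pseudo=true")
attrs = parse_gff3_attributes(example)
xrefs = split_dbxref(attrs["Dbxref"])
print(attrs["ID"], xrefs["GeneID"], xrefs["HGNC"])
```

Splitting each cross-reference at the first colon only keeps identifiers that themselves contain a colon, such as HGNC:36836, intact.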

Track statistics summary

Total genome size: 3,117,292,070 bases

Curated and Predicted Gene count: 108,944
Bases in these genes: 1,612,562,606
Percent genome coverage: 51.730%

Curated gene count: 82,572
Bases in curated genes: 1,377,848,543
Percent genome coverage: 44.200%

Predicted gene count: 26,372
Bases in predicted genes: 287,621,756
Percent genome coverage: 9.227%

Other annotation count: 16,326
Bases in other annotations: 32,222,985
Percent genome coverage: 1.034%
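
The coverage percentages above follow directly from the base counts and the total genome size. The short Python sketch below reproduces them; the helper name `percent_coverage` is chosen for this example.

```python
GENOME_SIZE = 3_117_292_070  # total genome size in bases, from the summary

# Base counts as listed in the track statistics summary.
BASES = {
    "Curated and Predicted": 1_612_562_606,
    "Curated": 1_377_848_543,
    "Predicted": 287_621_756,
    "Other": 32_222_985,
}


def percent_coverage(bases, genome_size=GENOME_SIZE):
    """Return the share of the genome covered, as a percentage."""
    return 100.0 * bases / genome_size


for label, bases in BASES.items():
    print(f"{label}: {percent_coverage(bases):.3f}%")
```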

Credits

This track was produced at UCSC from data generated by scientists worldwide and curated by the NCBI RefSeq project.
