Schema for NCBI RefSeq - RefSeq gene predictions from NCBI

Database: hs1
Primary Table: hub_567047_ncbiRefSeqOther
Data last updated: 2023-05-29
Big Bed File Download: /gbdb/hs1/ncbiRefSeq/ncbiRefSeqOther.bb
Item Count: 16,056
The data is stored in the binary BigBed format.

Format description: Additional information for NCBI 'Other' annotation

field	example	description
`chrom`	chr1	Chromosome (or contig, scaffold, etc.)
`chromStart`	165621497	Start position in chromosome
`chromEnd`	165623680	End position in chromosome
`name`	LOC284685	Name of item
`score`	0	Placeholder for BED format (score 0-1000)
`strand`	-	+ or -
`thickStart`	165623680	CDS start/end in chromosome if coding
`thickEnd`	165623680	CDS start/end in chromosome if coding
`reserved`	0	Placeholder for BED format (itemRgb)
`blockCount`	1	Number of alignment blocks
`blockSizes`	2183,	Comma separated list of block sizes
`chromStarts`	0,	Block start positions relative to chromStart
`gene`	LOC284685	Gene name
`GeneID`	284685	Entrez Gene
`MIM`		OMIM
`HGNC`		HGNC
`miRBase`		miRBase
`description`	EWS RNA binding protein 1 pseudogene	Description
`Note`		Note
`exception`		Exceptional properties
`product`		Gene product
`geneSynonym`		Gene synonyms
`modelEvidence`		Supporting evidence for gene model
`gbkey`	Gene	Feature type
`geneBiotype`	pseudogene	Gene biotype
`pseudo`	true	'true' if pseudogene
`partial`		'true' if partial
`anticodon`		Anticodon position within tRNA
`startRange`		Start range on genome
`endRange`		End range on genome
`ID`	gene-LOC284685	Unique ID in RefSeq GFF3
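
The `blockSizes` and `chromStarts` fields are easier to follow with a small worked example. The sketch below is illustrative only: the helper names `parse_int_list`, `block_intervals`, and `has_cds` are hypothetical and not part of any UCSC tool. It assumes the usual BED conventions, namely that coordinates are 0-based and half-open, and that an item with `thickStart` equal to `thickEnd` has no coding region.

```python
def parse_int_list(text):
    """Parse a BED-style comma-separated list such as '2183,' (trailing comma allowed)."""
    return [int(item) for item in text.split(",") if item]


def block_intervals(chrom_start, block_sizes, chrom_starts):
    """Return absolute (start, end) pairs, 0-based and half-open, one per block."""
    sizes = parse_int_list(block_sizes)
    offsets = parse_int_list(chrom_starts)
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts must have the same length")
    return [(chrom_start + off, chrom_start + off + size)
            for off, size in zip(offsets, sizes)]


def has_cds(thick_start, thick_end):
    """Assumed BED convention: an empty thick region means no coding sequence."""
    return thick_start < thick_end


# Values taken from the example column of the table above (LOC284685).
blocks = block_intervals(165621497, "2183,", "0,")
print(blocks)                               # [(165621497, 165623680)]
print(has_cds(165623680, 165623680))        # False
```

For the example item, the single block spans the whole feature, so its end matches `chromEnd`.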

Sample Rows

chrom	chromStart	chromEnd	name	strand	thickStart	thickEnd	blockCount	blockSizes	chromStarts	gene	GeneID	HGNC	description	geneSynonym	gbkey	geneBiotype	pseudo	ID
chr1	165621497	165623680	LOC284685	-	165623680	165623680	1	2183,	0,	LOC284685	284685		EWS RNA binding protein 1 pseudogene		Gene	pseudogene	true	gene-LOC284685
chr1	165820713	165827534	FMO7P	+	165827534	165827534	1	6821,	0,	FMO7P	100337589	HGNC:32208	flavin containing dimethylaniline monoxygenase 7, pseudogene		Gene	pseudogene	true	gene-FMO7P
chr1	165912094	165926627	FMO8P	+	165926627	165926627	1	14533,	0,	FMO8P	100129007	HGNC:32209	flavin containing dimethylaniline monoxygenase 8, pseudogene		Gene	pseudogene	true	gene-FMO8P
chr1	165994990	166045239	FMO10P	+	166045239	166045239	1	50249,	0,	FMO10P	100128181	HGNC:32211	flavin containing dimethylaniline monoxygenase 10, pseudogene		Gene	pseudogene	true	gene-FMO10P
chr1	166093163	166094589	RPL4P2	+	166094589	166094589	1	1426,	0,	RPL4P2	646688	HGNC:36836	ribosomal protein L4 pseudogene 2	RPL4_1_115	Gene	pseudogene	true	gene-RPL4P2
chr1	166113396	166138590	FMO11P	+	166138590	166138590	1	25194,	0,	FMO11P	100337590	HGNC:32212	flavin containing dimethylaniline monoxygenase 11, pseudogene		Gene	pseudogene	true	gene-FMO11P
chr1	166142067	166143161	CNN2P10	+	166143161	166143161	1	1094,	0,	CNN2P10	646693	HGNC:39535	calponin 2 pseudogene 10		Gene	pseudogene	true	gene-CNN2P10
chr1	166214567	166215041	DUTP6	+	166215041	166215041	1	474,	0,	DUTP6	100873912	HGNC:39519	deoxyuridine triphosphatase pseudogene 6		Gene	pseudogene	true	gene-DUTP6
chr1	166351794	166351913	RNA5SP65	+	166351913	166351913	1	119,	0,	RNA5SP65	100873300	HGNC:42842	RNA, 5S ribosomal pseudogene 65	RN5S65	Gene	pseudogene	true	gene-RNA5SP65
chr1	166509417	166509900	RPS17P6	-	166509900	166509900	1	483,	0,	RPS17P6	391130	HGNC:36945	ribosomal protein S17 pseudogene 6	RPS17_1_116	Gene	pseudogene	true	gene-RPS17P6
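
As a rough illustration of how rows like these could be loaded once exported as tab-separated text, the sketch below reads two of the sample rows with Python's standard `csv` module. The function name `read_rows` and the choice of which columns to convert to integers are assumptions made for this example; the sample listing omits some schema columns (for instance `score` and `reserved`), so the header here follows the listing rather than the full schema.

```python
import csv
import io

HEADER = ["chrom", "chromStart", "chromEnd", "name", "strand", "thickStart",
          "thickEnd", "blockCount", "blockSizes", "chromStarts", "gene",
          "GeneID", "HGNC", "description", "geneSynonym", "gbkey",
          "geneBiotype", "pseudo", "ID"]

# Two rows copied from the sample listing above; empty strings are empty cells.
ROWS = [
    ["chr1", "165621497", "165623680", "LOC284685", "-", "165623680",
     "165623680", "1", "2183,", "0,", "LOC284685", "284685", "",
     "EWS RNA binding protein 1 pseudogene", "", "Gene", "pseudogene",
     "true", "gene-LOC284685"],
    ["chr1", "165820713", "165827534", "FMO7P", "+", "165827534",
     "165827534", "1", "6821,", "0,", "FMO7P", "100337589", "HGNC:32208",
     "flavin containing dimethylaniline monoxygenase 7, pseudogene", "",
     "Gene", "pseudogene", "true", "gene-FMO7P"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

INT_FIELDS = ("chromStart", "chromEnd", "thickStart", "thickEnd",
              "blockCount", "GeneID")


def read_rows(text):
    """Parse tab-separated rows into dicts, converting numeric and boolean fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        for field in INT_FIELDS:
            row[field] = int(row[field])
        row["pseudo"] = row["pseudo"] == "true"
        records.append(row)
    return records


for rec in read_rows(SAMPLE):
    length = rec["chromEnd"] - rec["chromStart"]
    print(rec["name"], rec["strand"], length, rec["HGNC"] or "(no HGNC)")
```

Splitting on tabs only (with quoting disabled) keeps descriptions that contain commas, such as the FMO7P entry, in a single field.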

RefSeq Other (hub_567047_ncbiRefSeqOther) Track Description

Description

The NCBI RefSeq Genes composite track shows Homo sapiens protein-coding and non-protein-coding genes (24 Jan 2022, GCF_009914755.1_T2T-CHM13v2.0) taken from the NCBI RNA reference sequences collection (RefSeq). All subtracks use coordinates provided by RefSeq. See the Methods section for more details about how the different tracks were created.

Please visit NCBI's Feedback for Gene and Reference Sequences (RefSeq) page to make suggestions, submit additions and corrections, or ask for help concerning RefSeq records.

For more information on the different gene tracks, see our Genes FAQ.

Display Conventions and Configuration

To show only a selected set of subtracks, uncheck the boxes next to the tracks that you wish to hide.

The tracks available here can include (not all may be present):

RefSeq annotations and alignments

The RefSeq All, RefSeq Curated, and RefSeq Predicted tracks follow the display conventions for gene prediction tracks. The color shading indicates the level of review the RefSeq record has undergone: predicted (light), provisional (medium), or reviewed (dark), as defined by RefSeq.

Color	Level of review
dark	Reviewed: the RefSeq record has been reviewed by NCBI staff or by a collaborator. The NCBI review process includes assessing available sequence data and the literature. Some RefSeq records may incorporate expanded sequence and annotation information.
medium	Provisional: the RefSeq record has not yet been subject to individual review. The initial sequence-to-gene association has been established by outside collaborators or NCBI staff.
light	Predicted: the RefSeq record has not yet been subject to individual review, and some aspect of the RefSeq record is predicted.

The RefSeq Alignments track follows the display conventions for PSL tracks.

The item labels and codon display properties for features within this track can be configured through the controls at the top of the track description page. To adjust the settings for an individual subtrack, click the wrench icon next to the track name in the subtrack list.

Label: By default, items are labeled by gene name. Click the appropriate Label option to display the accession name or OMIM identifier instead of the gene name, show all or a subset of these labels including the gene name, OMIM identifier and accession names, or turn off the label completely.
Codon coloring: This track has an optional codon coloring feature that allows users to quickly validate and compare gene predictions. To display codon colors, select the genomic codons option from the Color track by codons pull-down menu. For more information about this feature, go to the Coloring Gene Predictions and Annotations by Codon page.

Methods

The RefSeq annotation and RefSeq RNA alignment tracks were created at UCSC using data from the NCBI RefSeq project. The GFF-format annotation data were taken from the file GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz, delivered with the NCBI RefSeq genome assemblies at the FTP location:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/914/755/GCF_009914755.1_T2T-CHM13v2.0/

The GFF file was converted to the genePred and PSL table formats for display in the Genome Browser. Information about the NCBI annotation pipeline is available from NCBI.
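
To give a feel for what a GFF-to-BED style conversion involves, the sketch below parses a GFF3 attributes column and shifts coordinates from the 1-based, inclusive GFF3 convention to the 0-based, half-open convention used by BED. This is not the UCSC conversion pipeline. The function names are hypothetical, and the attribute string is an illustrative reconstruction built from the LOC284685 sample row rather than a line copied from the NCBI file, so the actual attribute names and order in that file are an assumption here.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split a GFF3 attributes column ('key=value;key=value') into a dict.

    GFF3 percent-encodes reserved characters, so keys and values are decoded.
    Multi-valued attributes (comma-separated) are left as a single string.
    """
    attrs = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attrs[unquote(key)] = unquote(value)
    return attrs


def gff3_to_bed_interval(start, end):
    """Convert 1-based inclusive GFF3 coordinates to 0-based half-open BED."""
    return start - 1, end


# Hypothetical attribute string, reconstructed from the sample row for illustration.
ATTRIBUTES = ("ID=gene-LOC284685;Dbxref=GeneID:284685;Name=LOC284685;"
              "description=EWS RNA binding protein 1 pseudogene;gbkey=Gene;"
              "gene=LOC284685;gene_biotype=pseudogene;pseudo=true")

attrs = parse_gff3_attributes(ATTRIBUTES)
chrom_start, chrom_end = gff3_to_bed_interval(165621498, 165623680)
print(attrs["ID"], attrs["gene_biotype"], chrom_start, chrom_end)
```

Under these assumptions, a GFF3 feature starting at 165621498 corresponds to the `chromStart` value 165621497 shown in the schema example, while the end coordinate is unchanged.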

Track statistics summary

Total genome size: 3,117,292,070 bases

Curated and Predicted Gene count: 108,944
Bases in these genes: 1,612,562,606
Percent genome coverage: 51.730%

Curated gene count: 82,572
Bases in curated genes: 1,377,848,543
Percent genome coverage: 44.200%

Predicted gene count: 26,372
Bases in predicted genes: 287,621,756
Percent genome coverage: 9.227%

Other annotation count: 16,326
Bases in other annotations: 32,222,985
Percent genome coverage: 1.034%
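
The coverage percentages above can be reproduced from the base counts and the total genome size. The short sketch below recomputes them; the variable and function names are chosen for this example only, and it assumes each percentage is simply bases divided by total genome size.

```python
GENOME_SIZE = 3_117_292_070  # total genome size in bases, from the summary above

BASES = {
    "curated and predicted": 1_612_562_606,
    "curated": 1_377_848_543,
    "predicted": 287_621_756,
    "other": 32_222_985,
}


def percent_coverage(bases, genome_size=GENOME_SIZE):
    """Percentage of the genome covered by the given number of bases."""
    return 100.0 * bases / genome_size


for label, bases in BASES.items():
    print(f"{label}: {percent_coverage(bases):.3f}%")
# curated and predicted: 51.730%
# curated: 44.200%
# predicted: 9.227%
# other: 1.034%
```

The curated and predicted gene counts sum to the combined count (82,572 + 26,372 = 108,944), whereas the combined base count is smaller than the sum of the two base counts, which would be expected if some curated and predicted genes overlap on the genome.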

Credits

This track was produced at UCSC from data generated by scientists worldwide and curated by the NCBI RefSeq project.

References

Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518

Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. PMID: 24259432; PMC: PMC3965018

Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979