Schema for NCBI Proteins - NCBI Proteins: annotated mature peptide products
Database: wuhCor1
Primary Table: ncbiProducts
Data last updated: 2020-05-13
Big Bed File Download: /gbdb/wuhCor1/bbi/ncbi/peptides.bb
Item Count: 16
The data is stored in the binary BigBed format.
Format description: bigGenePred gene models parsed from Genbank files
field | example | description |
chrom | NC_045512v2 | Reference sequence chromosome or scaffold |
chromStart | 8554 | Start position in chromosome |
chromEnd | 10054 | End position in chromosome |
name | nsp4 | Name or ID of item, ideally both human readable and unique |
score | 0 | Score (0-1000) |
strand | + | + or - for strand |
thickStart | 8554 | Start of where display should be thick (start codon) |
thickEnd | 10054 | End of where display should be thick (stop codon) |
reserved | 0 | RGB value (use R,G,B string in input file) |
blockCount | 1 | Number of blocks |
blockSizes | 1500 | Comma separated list of block sizes |
chromStarts | 0 | Start positions relative to chromStart |
name2 | nsp4 | Alternative/human readable name |
cdsStartStat | cmpl | Status of CDS start annotation (none, unknown, incomplete, or complete) |
cdsEndStat | cmpl | Status of CDS end annotation (none, unknown, incomplete, or complete) |
exonFrames | 0 | Exon frame {0,1,2}, or -1 if no frame for exon |
type | N.a. | Transcript type |
geneName | nsp4 | Primary identifier for gene |
geneName2 | nsp4 | Alternative/human readable gene name |
geneType | N.a. | Gene type |
note | nsp4B_TM; contains transmembrane domain 2 (TM2); produced by both pp1a and pp1ab | Notes |
product | YP_009725300.1 | Protein Product |
geneId | | NCBI Gene ID |
_cdnaSeq | AAAATTGTTAATAATTGGTTGAAGCAGTTAATTAAAGTTACACTTGTGTTCCTTTTTGTTGCTGCTATTTTCTATTTAATAACACCTGTTCATGTCATGTCTAAACATACTGACTTTTCAAGTGAAATCATAGGATACAAGGCTATTGATGGTGGTGTCACTCGTGACATAGCATCTACAGATACTTGTTTTGCTAACAAACATGCTGATTTTGACACATGGTTTAGCCAGCGTGGTGGTAGTTATACTAATGACAAAGCTTGCCCATTGATTGCTGCAGTCATAACAAGAGAAGTGGGTTTTGTCGTGCCTGGTTTGCCTGGCACGATATTACGCACAACTAATGGTGACTTTTTGCATTTCTTACCTAGAGTTTTTAGTGCAGTTGGTAACATCTGTTACACACCATCAAAACTTATAGAGTACACTGACTTTGCAACATCAGCTTGTGTTTTGGCTGCTGAATGTACAATTTTTAAAGATGCTTCTGGTAAGCCAGTACCATATTGTTATGATACCAATGTACTAGAAGGTTCTGTTGCTTATGAAAGTTTACGCCCTGACACACGTTATGTGCTCATGGATGGCTCTATTATTCAATTTCCTAACACCTACCTTGAAGGTTCTGTTAGAGTGGTAACAACTTTTGATTCTGAGTACTGTAGGCACGGCACTTGTGAAAGATCAGAAGCTGGTGTTTGTGTATCTACTAGTGGTAGATGGGTACTTAACAATGATTATTACAGATCTTTACCAGGAGTTTTCTGTGGTGTAGATGCTGTAAATTTACTTACTAATATGTTTACACCACTAATTCAACCTATTGGTGCTTTGGACATATCAGCATCTATAGTAGCTGGTGGTATTGTAGCTATCGTAGTAACATGCCTTGCCTACTATTTTATGAGGTTTAGAAGAGCTTTTGGTGAATACAGTCATGTAGTTGCCTTTAATACTTTACTATTCCTTATGTCATTCACTGTACTCTGTTTAACACCAGTTTACTCATTCTTACCTGGTGTTTATTCTGTTATTTACTTGTACTTGACATTTTATCTTACTAATGATGTTTCTTTTTTAGCACATATTCAGTGGATGGTTATGTTCACACCTTTAGTACCTTTCTGGATAACAATTGCTTATATCATTTGTATTTCCACAAAGCATTTCTATTGGTTCTTTAGTAATTACCTAAAGAGACGTGTAGTCTTTAATGGTGTTTCCTTTAGTACTTTTGAAGAAGCTGCGCTGTGCACCTTTTTGTTAAATAAAGAAATGTATCTAAAGTTGCGTAGTGATGTGCTATTACCTCTTACGCAATATAATAGATACTTAGCTCTTTATAATAAGTACAAGTATTTTAGTGGAGCAATGGATACAACTAGCTACAGAGAAGCTGCTTGTTGTCATCTCGCAAAGGCTCTCAATGACTTCAGTAACTCAGGTTCTGATGTTCTTTACCAACCACCACAAACCTCTATCACCTCAGCTGTTTTGCAG | cDNA Sequence |
_cdnaPsl | | cDNA to genome PSL alignment (or empty) |
_protSeq | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp4']), ('note', ['nsp4B_TM; contains transmembrane domain 2 (TM2); produced by both pp1a and pp1ab']), ('protein_id', ['YP_009725300.1'])]) | Protein Sequence |
_protPsl | | protein to cDNA PSL alignment (or empty) |
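The field list above can be turned into a small row parser. The following is a minimal Python sketch, assuming rows arrive as tab-separated text in the field order shown (as bigBedToBed would emit them); the names `FIELDS`, `INT_FIELDS`, `LIST_FIELDS`, and `parse_row` are illustrative and not part of any UCSC tool.

```python
# Field names in the order given by the schema table above.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
    "chromStarts", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
    "type", "geneName", "geneName2", "geneType", "note", "product",
    "geneId", "_cdnaSeq", "_cdnaPsl", "_protSeq", "_protPsl",
]

# Assumption: these columns hold single integers or comma-separated integers.
INT_FIELDS = {"chromStart", "chromEnd", "score", "thickStart", "thickEnd", "blockCount"}
LIST_FIELDS = {"blockSizes", "chromStarts", "exonFrames"}


def parse_row(line):
    """Split one tab-separated row into a dict keyed by schema field name."""
    values = line.rstrip("\n").split("\t")
    # Pad short rows so that missing trailing fields become empty strings.
    values += [""] * (len(FIELDS) - len(values))
    row = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in LIST_FIELDS:
        # Tolerate a trailing comma such as "27,2769,".
        row[key] = [int(x) for x in row[key].split(",") if x]
    return row


# The first 16 columns of the nsp4 example row from the table above.
nsp4_line = "\t".join([
    "NC_045512v2", "8554", "10054", "nsp4", "0", "+", "8554", "10054",
    "0", "1", "1500", "0", "nsp4", "cmpl", "cmpl", "0",
])
nsp4 = parse_row(nsp4_line)
print(nsp4["name"], nsp4["chromEnd"] - nsp4["chromStart"])
```

Converting only the numeric columns keeps free-text fields such as note and _protSeq untouched, so whatever the file stores there passes through unchanged.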
Sample Rows
chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | blockCount | blockSizes | chromStarts | name2 | cdsStartStat | cdsEndStat | exonFrames | type | geneName | geneName2 | geneType | note | product | geneId | _cdnaSeq | _cdnaPsl | _protSeq | _protPsl |
NC_045512v2 | 8554 | 10054 | nsp4 | 0 | + | 8554 | 10054 | 0 | 1 | 1500 | 0 | nsp4 | cmpl | cmpl | 0 | N.a. | nsp4 | nsp4 | N.a. | nsp4B_TM; contains transmembrane domain 2 (TM2); produced by both pp1a and pp1ab | YP_009725300.1 | | AAAATTGTTAATAATTGGTTGAAGCAGTTAATTAAAGTTACACTTGTGTTCCTTTTTGTTGCTGCTATTTTCTATTTAATAACACCTGTTCATGTCATGTCTAAACATACTGACTTTTCAAGTGAAAT ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp4']), ('note', ['nsp4B_TM; contains transmemb ... | |
NC_045512v2 | 10054 | 10972 | 3C-like proteinase | 0 | + | 10054 | 10972 | 0 | 1 | 918 | 0 | 3C-like proteinase | cmpl | cmpl | 0 | N.a. | 3C-like proteinase | 3C-like proteinase | N.a. | nsp5A_3CLpro and nsp5B_3CLpro; main proteinase (Mpro); mediates cleavages downstream of nsp4. 3D structure of the SARSr-CoV homo ... | YP_009725301.1 | | AGTGGTTTTAGAAAAATGGCATTCCCATCTGGTAAAGTTGAGGGTTGTATGGTACAAGTAACTTGTGGTACAACTACACTTAACGGTCTTTGGCTTGATGACGTAGTTTACTGTCCAAGACATGTGAT ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['3C-like proteinase']), ('note', ['nsp5A_3CLpro a ... | |
NC_045512v2 | 10972 | 11842 | nsp6 | 0 | + | 10972 | 11842 | 0 | 1 | 870 | 0 | nsp6 | cmpl | cmpl | 0 | N.a. | nsp6 | nsp6 | N.a. | nsp6_TM; putative transmembrane domain; produced by both pp1a and pp1ab | YP_009725302.1 | | AGTGCAGTGAAAAGAACAATCAAGGGTACACACCACTGGTTGTTACTCACAATTTTGACTTCACTTTTAGTTTTAGTCCAGAGTACTCAATGGTCTTTGTTCTTTTTTTTGTATGAAAATGCCTTTTT ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp6']), ('note', ['nsp6_TM; putative transmembr ... | |
NC_045512v2 | 11842 | 12091 | nsp7 | 0 | + | 11842 | 12091 | 0 | 1 | 249 | 0 | nsp7 | cmpl | cmpl | 0 | N.a. | nsp7 | nsp7 | N.a. | produced by both pp1a and pp1ab | YP_009725303.1 | | TCTAAAATGTCAGATGTAAAGTGCACATCAGTAGTCTTACTCTCAGTTTTGCAACAACTCAGAGTAGAATCATCATCTAAATTGTGGGCTCAATGTGTCCAGTTACACAATGACATTCTCTTAGCTAA ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp7']), ('note', ['produced by both pp1a and pp ... | |
NC_045512v2 | 12091 | 12685 | nsp8 | 0 | + | 12091 | 12685 | 0 | 1 | 594 | 0 | nsp8 | cmpl | cmpl | 0 | N.a. | nsp8 | nsp8 | N.a. | produced by both pp1a and pp1ab | YP_009725304.1 | | GCTATAGCCTCAGAGTTTAGTTCCCTTCCATCATATGCAGCTTTTGCTACTGCTCAAGAAGCTTATGAGCAGGCTGTTGCTAATGGTGATTCTGAAGTTGTTCTTAAAAAGTTGAAGAAGTCTTTGAA ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp8']), ('note', ['produced by both pp1a and pp ... | |
NC_045512v2 | 12685 | 13024 | nsp9 | 0 | + | 12685 | 13024 | 0 | 1 | 339 | 0 | nsp9 | cmpl | cmpl | 0 | N.a. | nsp9 | nsp9 | N.a. | ssRNA-binding protein; produced by both pp1a and pp1ab | YP_009725305.1 | | AATAATGAGCTTAGTCCTGTTGCACTACGACAGATGTCTTGTGCTGCCGGTACTACACAAACTGCTTGCACTGATGACAATGCGTTAGCTTACTACAACACAACAAAGGGAGGTAGGTTTGTACTTGC ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp9']), ('note', ['ssRNA-binding protein; produ ... | |
NC_045512v2 | 13024 | 13441 | nsp10 | 0 | + | 13024 | 13441 | 0 | 1 | 417 | 0 | nsp10 | cmpl | cmpl | 0 | N.a. | nsp10 | nsp10 | N.a. | nsp10_CysHis; formerly known as growth-factor-like protein (GFL); produced by both pp1a and pp1ab | YP_009725306.1 | | GCTGGTAATGCAACAGAAGTGCCTGCCAATTCAACTGTATTATCTTTCTGTGCTTTTGCTGTAGATGCTGCTAAAGCTTACAAAGATTATCTAGCTAGTGGGGGACAACCAATCACTAATTGTGTTAA ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp10']), ('note', ['nsp10_CysHis; formerly know ... | |
NC_045512v2 | 13441 | 13480 | nsp11 | 0 | + | 13441 | 13480 | 0 | 1 | 39 | 0 | nsp11 | cmpl | cmpl | 0 | N.a. | nsp11 | nsp11 | N.a. | produced by pp1a only | YP_009725312.1 | | TCAGCTGATGCACAATCGTTTTTAAACGGGTTTGCGGTG | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['nsp11']), ('note', ['produced by pp1a only']), ( ... | |
NC_045512v2 | 13441 | 16236 | RNA-dependent RNA polymerase | 0 | + | 13441 | 16236 | 0 | 2 | 27,2769 | 0,26 | RNA-dependent RNA polymerase | cmpl | cmpl | 0,0 | N.a. | RNA-dependent RNA polymerase | RNA-dependent RNA polymerase | N.a. | nsp12; NiRAN and RdRp; produced by pp1ab only | YP_009725307.1 | | TCAGCTGATGCACAATCGTTTTTAAACCGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCACAGGCACTAGTACTGATGTCGTATACAGGGCTTTTGACATCTACAATGATAAAGTAGC ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['RNA-dependent RNA polymerase']), ('note', ['nsp1 ... | |
NC_045512v2 | 16236 | 18039 | helicase | 0 | + | 16236 | 18039 | 0 | 1 | 1803 | 0 | helicase | cmpl | cmpl | 0 | N.a. | helicase | helicase | N.a. | nsp13_ZBD, nsp13_TB, and nsp_HEL1core; zinc-binding domain (ZD), NTPase/helicase domain (HEL), RNA 5'-triphosphatase; produced b ... | YP_009725308.1 | | GCTGTTGGGGCTTGTGTTCTTTGCAATTCACAGACTTCATTAAGATGTGGTGCTTGCATACGTAGACCATTCTTATGTTGTAAATGCTGTTACGACCATGTCATATCAACATCACATAAATTAGTCTT ... | | OrderedDict([('gene', ['ORF1ab']), ('locus_tag', ['GU280_gp01']), ('product', ['helicase']), ('note', ["nsp13_ZBD, nsp13_TB, and ... | |
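The blockSizes and chromStarts columns in the sample rows can be combined with chromStart to recover each block's genomic interval. The sketch below is a hedged illustration using values copied from the rows above; the helper names `block_intervals` and `peptide_length` are hypothetical, and reading the one-base overlap in the RNA-dependent RNA polymerase row as the ORF1ab ribosomal frameshift is an interpretation rather than something this page states.

```python
def block_intervals(chrom_start, block_sizes, block_starts):
    """Return (start, end) genomic intervals, one per block.

    block_starts are offsets relative to chrom_start, as in the
    chromStarts column of the schema.
    """
    return [
        (chrom_start + offset, chrom_start + offset + size)
        for size, offset in zip(block_sizes, block_starts)
    ]


def peptide_length(block_sizes):
    """Number of codons covered by the blocks (total bases / 3)."""
    total = sum(block_sizes)
    if total % 3 != 0:
        raise ValueError(f"total block length {total} is not a multiple of 3")
    return total // 3


# RNA-dependent RNA polymerase row: two blocks, 27 and 2769 bases long.
rdrp = block_intervals(13441, [27, 2769], [0, 26])
# The second block starts one base before the first block ends.
overlap = rdrp[0][1] - rdrp[1][0]

print(rdrp, overlap)
print(peptide_length([27, 2769]), peptide_length([39]), peptide_length([1500]))
```

Because the two blocks share one base, the block sizes sum to 2796 while chromEnd minus chromStart is 2795, so peptide lengths are better computed from blockSizes than from the outer coordinates.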
NCBI Proteins (ncbiProducts) Track Description
Description
The NCBI Mature Proteins track for the 13 Jan 2020
SARS-CoV-2 virus/GCF_009858895.2 genome assembly is
constructed from the NCBI nuccore entry for NC_045512.2
(https://www.ncbi.nlm.nih.gov/nuccore/NC_045512.2).
It shows the mature peptides, after cleavage, as annotated on the GenBank record.
Data Access
The raw data can be explored interactively with the Table Browser or the Data Integrator.
For automated analysis, the genome annotation is stored in a bigBed file that can be downloaded from the download server.
Annotations can be converted to ASCII text by our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system.
Instructions for downloading source code and binaries can be found on our utilities page.
The tool can also be used to obtain features within a given range, for example:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/ncbi/peptides.bb -chrom=NC_045512v2 -start=0 -end=29902 stdout
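For scripted use, the same range query can be wrapped in a few lines of Python. This is a minimal sketch, assuming the bigBedToBed binary is installed and on the PATH and that it writes tab-separated rows to stdout; `build_command` and `fetch_features` are hypothetical helper names, and the URL in the usage comment is an assumption formed by joining the download host from the command above with the Big Bed File path listed at the top of this page.

```python
import subprocess


def build_command(url, chrom, start, end, tool="bigBedToBed"):
    """Assemble the argument list for a range query, mirroring the command above."""
    return [
        tool,
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",  # ask the tool to write rows to standard output
    ]


def fetch_features(url, chrom, start, end):
    """Run bigBedToBed and return each output row as a list of column strings.

    Requires the bigBedToBed binary and network access to the given URL.
    """
    result = subprocess.run(
        build_command(url, chrom, start, end),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return [line.split("\t") for line in result.stdout.splitlines() if line]


# Usage (not executed here; the URL is an assumption, see the note above):
# rows = fetch_features(
#     "http://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/ncbi/peptides.bb",
#     "NC_045512v2", 0, 29902,
# )
print(" ".join(build_command("FILE_OR_URL.bb", "NC_045512v2", 0, 29902)))
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a file path or URL contains special characters.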
Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.
Credits
This track was created by Max Haeussler and Brian Raney at UCSC, with help from Daniel Schmelter
and many others. Thanks to NCBI and the US National Institutes of Health
for making all data available for download.