Schema for Human Proteins - Human Proteins Mapped by Chained tBLASTn
Database: ailMel1
Primary Table: blastHg18KG
Row Count: 64,037
Data last updated: 2010-04-20
Format description: Summary info about a patSpace alignment
On download server: MariaDB table dump directory
field | example | SQL type | info | description |
bin | 73 | smallint(5) unsigned | range | Indexing field to speed chromosome range queries. |
matches | 2768 | int(10) unsigned | range | Number of bases that match that aren't repeats |
misMatches | 112 | int(10) unsigned | range | Number of bases that don't match |
repMatches | 0 | int(10) unsigned | range | Number of bases that match but are part of repeats |
nCount | 0 | int(10) unsigned | range | Number of 'N' bases |
qNumInsert | 0 | int(10) unsigned | range | Number of inserts in query |
qBaseInsert | 0 | int(10) unsigned | range | Number of bases inserted in query |
tNumInsert | 3 | int(10) unsigned | range | Number of inserts in target |
tBaseInsert | 1340 | int(10) unsigned | range | Number of bases inserted in target |
strand | ++ | char(2) | values | + or - for strand. First character query, second target (optional) |
qName | AY647157 | varchar(255) | values | Query sequence name |
qSize | 2881 | int(10) unsigned | range | Query sequence size |
qStart | 0 | int(10) unsigned | range | Alignment start position in query |
qEnd | 2881 | int(10) unsigned | range | Alignment end position in query |
tName | GL192338.1 | varchar(255) | values | Target sequence name |
tSize | 6047896 | int(10) unsigned | range | Target sequence size |
tStart | 74837 | int(10) unsigned | range | Alignment start position in target |
tEnd | 213165 | int(10) unsigned | range | Alignment end position in target |
blockCount | 41 | int(10) unsigned | range | Number of blocks in alignment |
blockSizes | 200,284,111,36,50,23,18,39,... | longblob | | Size of each block |
qStarts | 0,200,484,595,631,681,704,7... | longblob | | Start of each block in query. |
tStarts | 74837,75443,118198,129381,1... | longblob | | Start of each block in target. |
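The `bin` field can be recomputed from `tStart` and `tEnd`. The following is a minimal sketch, assuming this table uses the standard UCSC binning scheme (five levels, finest bins of 128 kb, each coarser level 8 times wider). The function name `bin_from_range` is illustrative and not part of the schema.

```python
# Constants of the standard UCSC binning scheme (assumed to apply to this table).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times wider


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard scheme")


# tStart/tEnd pairs taken from the sample rows of this table.
print(bin_from_range(74837, 213165))  # 73
print(bin_from_range(75084, 75435))   # 585
```

Both results agree with the `bin` values stored in the corresponding sample rows, which is what suggests the standard scheme is in use here.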
Sample Rows
bin | matches | misMatches | repMatches | nCount | qNumInsert | qBaseInsert | tNumInsert | tBaseInsert | strand | qName | qSize | qStart | qEnd | tName | tSize | tStart | tEnd | blockCount | blockSizes | qStarts | tStarts |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
73 | 2768 | 112 | 0 | 0 | 0 | 0 | 3 | 1340 | ++ | AY647157 | 2881 | 0 | 2881 | GL192338.1 | 6047896 | 74837 | 213165 | 41 | 200,284,111,36,50,23,18,39,29,46,40,82,59,47,86,70,66,56,60,37,66,66,54,29,35,40,19,14,73,68,242,60,54,75,56,47,74,43,35,222,69, | 0,200,484,595,631,681,704,723,762,791,837,877,959,1018,1065,1151,1221,1287,1343,1403,1440,1506,1572,1626,1655,1690,1730,1749,176 ... | 74837,75443,118198,129381,133170,134214,134368,135081,138896,139187,143451,147068,151156,152453,152714,154100,156372,158995,1598 ... |
585 | 100 | 17 | 0 | 0 | 0 | 0 | 0 | 0 | ++ | BC033770 | 119 | 2 | 119 | GL192338.1 | 6047896 | 75084 | 75435 | 1 | 117, | 2, | 75084, |
585 | 100 | 17 | 0 | 0 | 0 | 0 | 0 | 0 | ++ | NM_025134 | 119 | 2 | 119 | GL192338.1 | 6047896 | 75084 | 75435 | 1 | 117, | 2, | 75084, |
586 | 172 | 54 | 0 | 0 | 0 | 0 | 0 | 0 | ++ | AK000368 | 251 | 12 | 242 | GL192338.1 | 6047896 | 139187 | 151342 | 4 | 43,40,81,62, | 12,58,99,180, | 139187,143451,147071,151156, |
586 | 1944 | 51 | 0 | 0 | 0 | 0 | 2 | 934 | ++ | AB002306 | 1995 | 0 | 1995 | GL192338.1 | 6047896 | 147095 | 213165 | 30 | 73,59,47,86,70,66,56,60,37,66,66,54,29,35,40,19,14,73,68,242,60,54,75,56,47,74,43,35,222,69, | 0,73,132,179,265,335,401,457,517,554,620,686,740,769,804,844,863,877,950,1018,1260,1320,1374,1449,1505,1552,1626,1669,1704,1926, | 147095,151156,152453,152714,154100,156372,158995,159859,166380,170732,171401,175768,176364,181607,182216,182955,184742,185947,18 ... |
586 | 1944 | 51 | 0 | 0 | 0 | 0 | 2 | 934 | ++ | AY243500 | 1995 | 0 | 1995 | GL192338.1 | 6047896 | 147095 | 213165 | 30 | 73,59,47,86,70,66,56,60,37,66,66,54,29,35,40,19,14,73,68,242,60,54,75,56,47,74,43,35,222,69, | 0,73,132,179,265,335,401,457,517,554,620,686,740,769,804,844,863,877,950,1018,1260,1320,1374,1449,1505,1552,1626,1669,1704,1926, | 147095,151156,152453,152714,154100,156372,158995,159859,166380,170732,171401,175768,176364,181607,182216,182955,184742,185947,18 ... |
586 | 1944 | 51 | 0 | 0 | 0 | 0 | 2 | 934 | ++ | DQ059482 | 1995 | 0 | 1995 | GL192338.1 | 6047896 | 147095 | 213165 | 30 | 73,59,47,86,70,66,56,60,37,66,66,54,29,35,40,19,14,73,68,242,60,54,75,56,47,74,43,35,222,69, | 0,73,132,179,265,335,401,457,517,554,620,686,740,769,804,844,863,877,950,1018,1260,1320,1374,1449,1505,1552,1626,1669,1704,1926, | 147095,151156,152453,152714,154100,156372,158995,159859,166380,170732,171401,175768,176364,181607,182216,182955,184742,185947,18 ... |
586 | 335 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ++ | AK022240 | 336 | 0 | 336 | GL192338.1 | 6047896 | 147095 | 154313 | 5 | 73,59,47,86,71, | 0,73,132,179,265, | 147095,151156,152453,152714,154100, |
73 | 1079 | 39 | 0 | 0 | 0 | 0 | 1 | 41 | ++ | NM_005611 | 1139 | 15 | 1139 | GL192338.1 | 6047896 | 239494 | 289328 | 23 | 10,55,43,66,23,43,54,21,62,55,34,35,46,55,38,89,94,59,24,35,67,54,56, | 15,25,80,123,189,212,255,309,331,393,448,485,520,566,621,659,748,842,901,925,961,1029,1083, | 239494,239545,242218,244650,249003,253044,256783,257036,257708,262057,265769,266768,268956,270212,271118,274041,274482,274854,27 ... |
586 | 180 | 69 | 0 | 0 | 0 | 0 | 0 | 0 | +- | NM_000969 | 296 | 1 | 296 | GL192338.1 | 6047896 | 254620 | 255502 | 7 | 22,9,43,53,60,29,33, | 1,40,62,107,174,234,263, | 5792394,5792510,5792575,5792710,5792910,5793090,5793177, |
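In the last sample row (strand `+-`), the `tStarts` values do not fall between `tStart` and `tEnd`, because blocks on the minus target strand are stored in reverse-strand coordinates. The sketch below converts each block to forward-strand target intervals. It assumes that `blockSizes` counts amino acids, so each unit covers 3 target bases; this is an inference from the sample rows (for example, 75084 + 3 × 117 = 75435), not something the schema states. The function name `target_blocks` and its `bases_per_unit` parameter are hypothetical.

```python
def target_blocks(strand, t_size, block_sizes, t_starts, bases_per_unit=3):
    """Return sorted forward-strand (start, end) target intervals, 0-based half-open.

    bases_per_unit=3 assumes block sizes are in amino acids (inferred from the
    sample rows, not documented in the schema).
    """
    # The comma-separated lists end with a trailing comma, so skip empty items.
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in t_starts.split(",") if x]
    if len(sizes) != len(starts):
        raise ValueError("blockSizes and tStarts have different lengths")

    # Second strand character is the target strand; it is optional in the schema.
    target_strand = strand[1] if len(strand) > 1 else "+"

    blocks = []
    for size, start in zip(sizes, starts):
        end = start + size * bases_per_unit
        if target_strand == "-":
            # Flip reverse-strand coordinates back onto the forward strand.
            start, end = t_size - end, t_size - start
        blocks.append((start, end))
    return sorted(blocks)


# Values from the NM_000969 sample row (strand "+-").
blocks = target_blocks(
    "+-",
    6047896,
    "22,9,43,53,60,29,33,",
    "5792394,5792510,5792575,5792710,5792910,5793090,5793177,",
)
print(blocks[0][0], blocks[-1][1])  # 254620 255502
```

The printed values equal the row's `tStart` and `tEnd`, which is consistent with both assumptions for this row.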
Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
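As a small illustration of the note above, a stored range can be turned into a 1-based position string by adding 1 to the start only. This sketch assumes end coordinates are exclusive (a half-open range), which matches the sample rows where `qStart` is 0 and `qEnd` equals `qSize`. The helper name `to_one_based_position` is hypothetical.

```python
def to_one_based_position(name: str, start: int, end: int) -> str:
    """Format a 0-based, half-open range as a 1-based, inclusive position string."""
    return f"{name}:{start + 1}-{end}"


# tName, tStart and tEnd from the first sample row.
print(to_one_based_position("GL192338.1", 74837, 213165))  # GL192338.1:74838-213165
```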
Human Proteins (blastHg18KG) Track Description
Description
This track contains tBLASTn alignments of the peptides from the predicted and
known genes identified in the hg18 UCSC Genes track.
Methods
First, the predicted proteins from the human UCSC Genes track were aligned
with the human genome using the Blat program to discover exon boundaries.
Next, the amino acid sequences that make up each exon were aligned with the
panda sequence using the tBLASTn program.
Finally, the putative panda exons were chained together using an
organism-specific maximum gap size but no gap penalty. The single best exon
chains extending over more than 60% of the query protein were included. Exon
chains that extended over 60% of the query and matched at least 60% of the
protein's amino acids were also included.
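The second inclusion rule can be approximated from the table's own fields. The sketch below is an illustration only: it assumes that "extended over" corresponds to `(qEnd - qStart) / qSize` and that "matched" corresponds to `matches / qSize`. The actual pipeline may have applied the thresholds to the exon chains before they were stored, and the function name `passes_secondary_filter` is hypothetical.

```python
def passes_secondary_filter(matches, q_start, q_end, q_size,
                            min_cover=0.60, min_match=0.60):
    """Check the 60% coverage / 60% match rule using table fields (assumed mapping)."""
    coverage = (q_end - q_start) / q_size  # fraction of the query the chain spans
    match_fraction = matches / q_size      # fraction of the query that matches
    return coverage > min_cover and match_fraction >= min_match


# matches, qStart, qEnd and qSize from the NM_000969 sample row.
print(passes_secondary_filter(180, 1, 296, 296))  # True
```

Under this reading the row spans about 99.7% of the query and matches about 60.8% of it, so it clears both thresholds.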
Credits
tBLASTn is part of the NCBI BLAST tool set. For more information on BLAST, see
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ.
Basic local alignment search tool.
J Mol Biol. 1990 Oct 5;215(3):403-410.
Blat was written by Jim Kent. The remaining utilities
used to produce this track were written by Jim Kent or Brian Raney.