N-SCAN Gene Predictions (scaffold_15.103.1)
Position: scaffold_15:19047390-19048663
Genomic Size: 1274
Strand: -
Gene Symbol: scaffold_15.103
CDS Start: complete
CDS End: complete
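
The genomic size listed above follows from the position string by simple arithmetic. The following is a minimal sketch, assuming the displayed coordinates are 1-based and inclusive (which is consistent with the listed size of 1274); the helper names `parse_position` and `genomic_size` are hypothetical and not part of any UCSC tool.

```python
import re


def parse_position(position):
    """Split a 'name:start-end' string into (name, start, end).

    Assumes 1-based, inclusive coordinates as displayed on the details page.
    """
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", position.replace(",", ""))
    if match is None:
        raise ValueError(f"unrecognized position string: {position!r}")
    name = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return name, start, end


def genomic_size(start, end):
    """Number of bases covered by a 1-based, inclusive interval."""
    return end - start + 1


name, start, end = parse_position("scaffold_15:19047390-19048663")
size = genomic_size(start, end)
print(name, start, end, size)
```

With these inputs, 19048663 - 19047390 + 1 gives 1274, matching the Genomic Size field.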


Data last updated at UCSC: 2008-05-30


This track shows gene predictions using the N-SCAN gene structure prediction software provided by the Computational Genomics Lab at Washington University in St. Louis, MO, USA.



N-SCAN combines biological-signal modeling in the target genome sequence with information from a multiple-genome alignment to generate de novo gene predictions. It extends the TWINSCAN target-informant genome-pair model to allow for an arbitrary number of informant sequences as well as richer models of sequence evolution. N-SCAN models the phylogenetic relationships between the aligned genome sequences, context-dependent substitution rates, insertions, and deletions.

To create predictions for guinea pig, N-SCAN uses human (hg18) as the informant.

N-SCAN EST incorporates EST alignments into N-SCAN. Similar to the conservation sequence models in TWINSCAN, separate probability models are developed for EST alignments to genomic sequence in exons, introns, splice sites and UTRs, reflecting the EST alignment patterns in these regions. N-SCAN PASA-EST is more accurate than N-SCAN while retaining the ability to discover novel genes to which no ESTs align.

There are currently not enough guinea pig ESTs to create an estseq track. Instead, the mm9 EST and mRNA TransMap tracks were used to generate estseq tracks. cDNAs with invalid splice sites were removed.
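
The splice-site filtering step can be illustrated with a small sketch. The track description does not define what counts as an invalid splice site, so this example assumes the simplest criterion: every intron implied by a cDNA alignment must begin with GT and end with AG on the plus strand. The function names, the 0-based half-open coordinates, and the toy sequence are all hypothetical and are not taken from the N-SCAN pipeline.

```python
# Assumption: only canonical GT-AG introns are treated as valid.
# The actual pipeline may accept other pairs (for example GC-AG).
CANONICAL_PAIRS = {("GT", "AG")}


def intron_has_valid_splice_sites(genome, intron_start, intron_end,
                                  allowed=CANONICAL_PAIRS):
    """Check the donor and acceptor dinucleotides of one plus-strand intron.

    Coordinates are 0-based and half-open: the intron is
    genome[intron_start:intron_end].
    """
    if intron_end - intron_start < 4:
        return False  # too short to hold both dinucleotides
    donor = genome[intron_start:intron_start + 2].upper()
    acceptor = genome[intron_end - 2:intron_end].upper()
    return (donor, acceptor) in allowed


def filter_alignments(genome, alignments):
    """Keep only alignments whose introns all pass the splice-site check.

    `alignments` maps a cDNA name to a list of (start, end) intron intervals.
    """
    return {
        name: introns
        for name, introns in alignments.items()
        if all(intron_has_valid_splice_sites(genome, s, e) for s, e in introns)
    }


# Toy sequence: exon (0-6), intron (6-18), exon (18-24).
genome = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
alignments = {
    "cdna_ok": [(6, 18)],   # intron starts with GT and ends with AG
    "cdna_bad": [(7, 18)],  # shifted donor: intron starts with TA
}
kept = filter_alignments(genome, alignments)
print(sorted(kept))
```

Here only `cdna_ok` survives the filter, because the shifted donor boundary in `cdna_bad` no longer falls on a GT dinucleotide.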

No manual annotation was performed to generate any of the gene models. Pseudogenes were not masked.


Thanks to Michael Brent's Computational Genomics Group at Washington University in St. Louis for providing these data.

Special thanks for this implementation of N-SCAN to Aaron Tenney in the Brent lab, and Robert Zimmermann, currently at Max F. Perutz Laboratories in Vienna, Austria.


Gross SS, Brent MR. Using multiple alignments to improve gene prediction. J Comput Biol. 2006 Mar;13(2):379-93. PMID: 16597247

Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003 Oct 1;31(19):5654-66. PMID: 14500829; PMC: PMC206470

Korf I, Flicek P, Duan D, Brent MR. Integrating genomic homology into gene structure prediction. Bioinformatics. 2001;17 Suppl 1:S140-8. PMID: 11473003

van Baren MJ, Brent MR. Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res. 2006 May;16(5):678-85. PMID: 16651666; PMC: PMC1457044