Note: lifted from hg18
The polyA_DB database is a set of human mRNA polyadenylation sites based
on EST/cDNA evidence.
A site is a single base denoting the beginning of a poly(A) tail in a nascent
mRNA transcript and is typically 10-30 nucleotides downstream of a
polyadenylation signal (most commonly AAUAAA).
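As a rough illustration of the spacing described above, the sketch below checks whether the canonical signal (AAUAAA, written AATAAA in DNA) lies 10-30 nucleotides upstream of a given site. The function name, the 0-based coordinates, and the choice to measure distance from the last base of the hexamer to the site are assumptions made for this example; they are not taken from polyA_DB.

```python
def has_upstream_signal(seq, site, signal="AATAAA", min_dist=10, max_dist=30):
    """Return True if `signal` ends min_dist..max_dist bases upstream of `site`.

    seq  -- DNA sequence on the transcript strand (hypothetical input)
    site -- 0-based index of the poly(A) site within seq
    Distance is counted from the last base of the signal to the site,
    which is an assumption of this sketch.
    """
    seq = seq.upper()
    k = len(signal)
    for dist in range(min_dist, max_dist + 1):
        end = site - dist      # index of the last base of a candidate signal
        start = end - k + 1    # index of its first base
        if start >= 0 and seq[start:end + 1] == signal:
            return True
    return False
```

A call such as `has_upstream_signal(transcript_dna, site_index)` returns a simple yes/no; it does not account for the less common signal variants.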
The polyA_DB web server is found at
The Poly(A) composite track consists of two subtracks: a polyA_DB
subtrack that displays reported poly(A) sites, and a poly(A)
prediction subtrack that displays poly(A) sites predicted using a
support vector machine (SVM).
The poly(A) predictions are made using 1500-base DNA sequences centered at
the end of each RefSeq gene. The sequences serve as input into the
SVM described in Cheng et al., 2006. The SVM scores each sequence
using a model derived from 15 different cis-elements and reports an E-value for
a region of DNA between 0 (excellent) and 0.5 (worst). This E-value is then
normalized to an integer value between 0 (worst) and 1000 (excellent).
Predicted regions are highlighted, with the highest-scoring base indicated by a thicker
line. The median length of these regions is 48 bases.
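The windowing and score normalization described above can be sketched as follows. The text does not state the exact normalization formula, so the linear mapping used here is an assumption, as are the function names and the 0-based, half-open coordinates.

```python
def prediction_window(gene_end, size=1500):
    """Return (start, end) of a window of `size` bases centered at gene_end.

    Coordinates are 0-based and half-open; the start is clamped at 0.
    """
    half = size // 2
    return max(0, gene_end - half), gene_end + half


def normalize_e_value(e_value, worst=0.5, max_score=1000):
    """Map an E-value in [0, worst] to an integer in [0, max_score].

    Assumes a linear mapping: 0 (excellent) -> max_score, worst -> 0.
    The actual formula used for the track may differ.
    """
    if not 0.0 <= e_value <= worst:
        raise ValueError("E-value must lie between 0 and %s" % worst)
    return round(max_score * (1 - e_value / worst))
```

Under this assumed mapping, an E-value of 0 gives a score of 1000 and an E-value of 0.5 gives 0, matching the endpoints stated in the text.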
Cheng Y, Miura RM, Tian B.
Prediction of mRNA polyadenylation sites by support vector machine.
Bioinformatics. 2006 Oct 1;22(19):2320-5.
Zhang H, Hu J, Recce M, Tian B.
PolyA_DB: a database for mammalian mRNA polyadenylation.
Nucleic Acids Res. 2005 Jan 1;33(Database issue):D116-20.
PMID: 15608159; PMC: PMC540009