Schema for SwitchGear TSS - SwitchGear Genomics Transcription Start Sites
  Database: hg19    Primary Table: switchDbTss    Row Count: 131,780   Data last updated: 2010-12-20
Format description: Switchgear Genomics TSS DB table
field         example        SQL type              info    description
bin           585            smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom         chr1           varchar(255)          values  Reference sequence chromosome or scaffold
chromStart    10464          int(10) unsigned      range   Start in Chromosome
chromEnd      10465          int(10) unsigned      range   End in Chromosome
name          CHR1_P0001_R1  varchar(255)          values  Name
score         1000           int(10) unsigned      range   Score
strand        +              char(1)               values  Strand
confScore     16             double                range   Confidence score
gmName        CHR1_P0001     varchar(255)          values  Gene model UID/name
gmChromStart  10464          int(10) unsigned      range   Gene model chromStart
gmChromEnd    14409          int(10) unsigned      range   Gene model chromEnd
isPseudo      0              tinyint(3) unsigned   range   0 if not a pseudogene TSS, 1 if it is
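
The bin field can be recomputed from chromStart and chromEnd. The following is a minimal sketch, assuming this table uses the standard UCSC binning scheme (five levels, smallest bins of 128 kb); the function name bin_from_range is our own and the code is an illustration, not part of the table definition.

```python
# Constants of the standard UCSC binning scheme (assumed to apply to this table).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the smallest bins
BIN_NEXT_SHIFT = 3    # each coarser level holds bins 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Move one level up: bins become 8 times larger.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard binning scheme")


# The schema example row: chromStart 10464, chromEnd 10465.
print(bin_from_range(10464, 10465))  # 585
```

Range queries can then restrict the search to the few bins that may overlap a region before comparing coordinates.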

Sample Rows
 
bin  chrom  chromStart  chromEnd  name            score  strand  confScore  gmName      gmChromStart  gmChromEnd  isPseudo
585  chr1   10464       10465     CHR1_P0001_R1   1000   +       16         CHR1_P0001  10464         14409       0
585  chr1   11873       11874     CHR1_P0001_R2   1000   +       5          CHR1_P0001  10464         14409       1
585  chr1   13420       13421     CHR1_P0001_R3   1000   +       5          CHR1_P0001  10464         14409       1
585  chr1   15945       15946     CHR1_M0001_R10  1000   -       5          CHR1_M0001  14359         30000       0
585  chr1   17480       17481     CHR1_M0001_R2   1000   -       41         CHR1_M0001  14359         30000       0
585  chr1   18003       18004     CHR1_M0001_R9   1000   -       5          CHR1_M0001  14359         30000       0
585  chr1   18733       18734     CHR1_M0001_R8   1000   -       5          CHR1_M0001  14359         30000       0
585  chr1   19172       19173     CHR1_M0001_R6   1000   -       10         CHR1_M0001  14359         30000       0
585  chr1   19735       19736     CHR1_M0001_R5   1000   -       26         CHR1_M0001  14359         30000       0
585  chr1   20510       20511     CHR1_M0001_R7   1000   -       5          CHR1_M0001  14359         30000       0
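
To work with rows like these programmatically, each one can be loaded into a typed record. The sketch below assumes the table is available as a tab-separated text dump with the twelve columns in schema order; the class SwitchDbTss, its snake_case field names, and parse_row are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchDbTss:
    """One switchDbTss row; fields follow the schema order, renamed to snake_case."""
    bin: int
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    conf_score: float
    gm_name: str
    gm_chrom_start: int
    gm_chrom_end: int
    is_pseudo: bool


def parse_row(line: str) -> SwitchDbTss:
    """Parse one tab-separated line (assumed dump format) into a record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    return SwitchDbTss(
        bin=int(f[0]), chrom=f[1], chrom_start=int(f[2]), chrom_end=int(f[3]),
        name=f[4], score=int(f[5]), strand=f[6], conf_score=float(f[7]),
        gm_name=f[8], gm_chrom_start=int(f[9]), gm_chrom_end=int(f[10]),
        is_pseudo=int(f[11]) != 0,
    )


# First sample row, written with explicit tab separators.
row = parse_row("585\tchr1\t10464\t10465\tCHR1_P0001_R1\t1000\t+\t16\tCHR1_P0001\t10464\t14409\t0")
print(row.name, row.conf_score, row.is_pseudo)  # CHR1_P0001_R1 16.0 False
```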

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
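
As a worked example of the coordinate convention: UCSC tables pair a 0-based start with an end that is exclusive (equivalently, a 1-based inclusive end), so only the start shifts when converting to the 1-based positions shown in the browser. The helper names below are our own; this is a small sketch, not part of any UCSC tool.

```python
from typing import Tuple


def to_one_based(chrom_start: int, chrom_end: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval to a 1-based, closed interval."""
    if chrom_end <= chrom_start:
        raise ValueError("chromEnd must be greater than chromStart")
    return chrom_start + 1, chrom_end


def to_zero_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a 1-based, closed interval back to 0-based, half-open."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


# The first sample row: chromStart 10464, chromEnd 10465 is the single base 10465.
start, end = to_one_based(10464, 10465)
print(f"chr1:{start}-{end}")  # chr1:10465-10465
```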

SwitchGear TSS (switchDbTss) Track Description
 

Description

This track describes the locations of transcription start sites (TSSs) throughout the human genome along with a confidence measure for each TSS based on experimental evidence. The TSSs of a gene are important landmarks that help define its promoter regions. These TSSs were determined by SwitchGear Genomics by integrating experimental data using an empirically derived scoring function. Each TSS has a unique identifier that associates it with a gene model (see details below), and each TSS is color-coded to reflect its confidence score.

These TSSs are also available in a searchable format at SwitchDB, an open-access online database of human TSSs. Experimental tools are available through SwitchGear to study the function of the promoter regions associated with these TSSs.

Methods

The predicted TSSs are associated with a genome-wide set of gene models. SwitchGear gene models are defined as clusters of cDNA alignments that have overlapping exons on the same strand. Over 250,000 human cDNA alignments were clustered to construct a genome-wide set of ~37,000 gene models. Each gene model is identified by its chromosome, strand, and a unique identifier. For example, ID CHR7_P0362 indicates a cDNA cluster (0362) aligning to the plus strand (P) of chromosome 7 (CHR7). Existing gene annotation is mapped to the gene models through the NCBI annotation associated with RefSeq accession numbers.
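
The identifier layout can be decoded mechanically. The sketch below is an illustration under two assumptions: that M marks the minus strand (inferred from sample rows such as CHR1_M0001 with strand -), and that identifiers contain a simple chromosome label with no underscores. It also accepts the optional _R rank suffix carried by TSS names. TssId and parse_id are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# CHR<chromosome>_<P|M><cluster>, optionally followed by _R<rank>.
_ID_PATTERN = re.compile(
    r"^CHR(?P<chrom>[0-9A-Za-z]+)_(?P<strand>[PM])(?P<cluster>\d+)(?:_R(?P<rank>\d+))?$"
)


class TssId(NamedTuple):
    chrom: str            # e.g. "chr7"
    strand: str           # "+" or "-"
    cluster: str          # zero-padded cluster number, kept as text
    rank: Optional[int]   # None for a bare gene model ID


def parse_id(identifier: str) -> TssId:
    """Split a gene model or TSS identifier into its parts."""
    m = _ID_PATTERN.match(identifier)
    if m is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    strand = "+" if m.group("strand") == "P" else "-"  # assumption: M means minus
    rank = int(m.group("rank")) if m.group("rank") else None
    return TssId("chr" + m.group("chrom"), strand, m.group("cluster"), rank)


print(parse_id("CHR7_P0362"))
print(parse_id("CHR1_M0001_R10"))
```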

The SwitchGear TSS prediction algorithm identifies the most likely sites of transcription initiation for each gene model. The algorithm employs a scoring metric to assign a confidence level to each TSS prediction based on existing experimental evidence. In addition to the ~250,000 human cDNAs listed in GenBank, more than 5 million additional 5' human cDNA sequence tags have been generated using a combination of approaches. While these short sequence reads do not reveal gene structure, they provide a significant amount of experimental evidence for identifying transcript start sites. For each gene model, the algorithm counts the number of TSSs (defined as the 5' end of a cDNA) within 200 bp of one another. The TSS score is based on the total number of TSSs identified within this window, with each TSS weighted according to several discriminating features: cDNA library source, relative location within the gene model, and exon structure of the transcript. Furthermore, the TSSs for each gene model are ranked to identify the TSS representing the most likely transcription initiation site for a gene model. Rankings are indicated in the TSS unique identifier by the addition of a suffix (e.g., CHR7_P0362_R1 or CHR7_P0362_R2).
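
The windowed counting and ranking can be illustrated with a toy example. This is not SwitchGear's implementation: the actual weights and the way windows are anchored are not given here, so the sketch assumes that each 5' end arrives with a precomputed weight, that a cluster spans at most 200 bp from its first position, and that the heaviest member represents the cluster. All function names are hypothetical.

```python
from typing import List, Tuple

WINDOW_BP = 200  # window size stated in the text


def cluster_tss(ends: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Group (position, weight) pairs into clusters; return (position, score) per cluster.

    Assumed rule: a new cluster starts when a position lies more than
    WINDOW_BP from the first position of the current cluster.
    """
    clusters: List[List[Tuple[int, float]]] = []
    current: List[Tuple[int, float]] = []
    for pos, weight in sorted(ends):
        if current and pos - current[0][0] > WINDOW_BP:
            clusters.append(current)
            current = []
        current.append((pos, weight))
    if current:
        clusters.append(current)
    scored = []
    for members in clusters:
        score = sum(w for _, w in members)              # weighted count of 5' ends
        best_pos = max(members, key=lambda m: m[1])[0]  # heaviest member represents the cluster
        scored.append((best_pos, score))
    return scored


def rank_tss(gm_name: str, ends: List[Tuple[int, float]]) -> List[Tuple[str, int, float]]:
    """Order clusters by descending score and attach _R1, _R2, ... suffixes."""
    ordered = sorted(cluster_tss(ends), key=lambda c: -c[1])  # stable sort keeps genomic order on ties
    return [(f"{gm_name}_R{i}", pos, score) for i, (pos, score) in enumerate(ordered, start=1)]


# Made-up 5' ends for the example gene model ID used in the text.
example = [(1000, 1.0), (1050, 2.0), (1190, 1.0), (5000, 1.0)]
for tss in rank_tss("CHR7_P0362", example):
    print(tss)
```

In this toy input the three ends between 1000 and 1190 form one cluster with score 4.0 and receive rank R1, while the isolated end at 5000 receives rank R2.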

Using the Filter

This track has a filter that controls which TSS elements the browser displays, based on the score of each TSS element. The filter is located at the top of the track description page, which is accessed via the small button to the left of the track's graphical display or through the link on the track's control menu. By default, the track displays only those TSSs with a score of 10 or above.

By default, the TSSs for predicted pseudogenes are not displayed. If you would like to display them, check the box next to the Include TSSs for predicted pseudogenes label.

When you have finished configuring the filter, click the Submit button.
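
The same selection can be reproduced offline on downloaded rows. The sketch below assumes that the filter threshold applies to the confScore column (an inference: the score column is 1000 in every sample row, so it cannot be what a threshold of 10 selects on) and that rows are plain dictionaries keyed by the schema field names. filter_tss is a hypothetical helper, not a browser function.

```python
from typing import Dict, Iterable, List

DEFAULT_MIN_SCORE = 10  # default threshold stated in the text


def filter_tss(rows: Iterable[Dict], min_score: float = DEFAULT_MIN_SCORE,
               include_pseudo: bool = False) -> List[Dict]:
    """Keep rows that pass the score threshold and the pseudogene setting."""
    kept = []
    for row in rows:
        if row["confScore"] < min_score:  # assumption: the filter uses confScore
            continue
        if row["isPseudo"] and not include_pseudo:
            continue
        kept.append(row)
    return kept


# A few values taken from the sample rows.
rows = [
    {"name": "CHR1_P0001_R1", "confScore": 16, "isPseudo": 0},
    {"name": "CHR1_P0001_R2", "confScore": 5, "isPseudo": 1},
    {"name": "CHR1_M0001_R2", "confScore": 41, "isPseudo": 0},
    {"name": "CHR1_M0001_R6", "confScore": 10, "isPseudo": 0},
]
print([r["name"] for r in filter_tss(rows)])
print([r["name"] for r in filter_tss(rows, min_score=5, include_pseudo=True)])
```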

Credits

This track was created by Nathan Trinklein and Shelley Force Aldred of SwitchGear Genomics.