Schema for SYDH TFBS - Transcription Factor Binding Sites by ChIP-seq from ENCODE/Stanford/Yale/USC/Harvard


| field | example | SQL type | info | description |
| --- | --- | --- | --- | --- |
| bin | 592 | smallint(5) unsigned | range | Indexing field to speed chromosome range queries. |
| chrom | chr1 | varchar(255) | values | Reference sequence chromosome or scaffold |
| chromStart | 999253 | int(10) unsigned | range | Start position in chromosome |
| chromEnd | 999951 | int(10) unsigned | range | End position in chromosome |
| name | | varchar(255) | values | Name given to a region (preferably unique). Use . if no name is assigned |
| score | 135 | int(10) unsigned | range | Indicates how dark the peak will be displayed in the browser (0-1000) |
| strand | | char(2) | values | + or - or . for unknown |
| signalValue | 3.561 | float | range | Measurement of average enrichment for the region |
| pValue | 14.818 | float | range | Statistical significance of signal value (-log10). Set to -1 if not used. |
| qValue | 0.0000000000000476522 | float | range | Statistical significance with multiple-test correction applied (FDR -log10). Set to -1 if not used. |
| peak | 476 | int(11) | range | Point-source called for this peak; 0-based offset from chromStart. Set to -1 if no point-source called. |
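
The peak and pValue conventions in the schema can be illustrated with a short sketch. It assumes only what the field descriptions state: peak is a 0-based offset from chromStart, pValue is stored as -log10(p), and -1 marks an unused value. The Peak class and the helper names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peak:
    """Hypothetical container for a subset of the schema fields."""
    chrom: str
    chrom_start: int
    chrom_end: int
    signal_value: float
    p_value: float  # stored as -log10(p); -1 if not used
    peak: int       # 0-based offset from chrom_start; -1 if not called


def summit_position(p: Peak) -> Optional[int]:
    """Absolute 0-based coordinate of the point-source, or None if none was called."""
    return None if p.peak < 0 else p.chrom_start + p.peak


def raw_p_value(p: Peak) -> Optional[float]:
    """Undo the -log10 transform, or return None if pValue is unused."""
    return None if p.p_value < 0 else 10.0 ** (-p.p_value)


# Values taken from the example column of the schema.
row = Peak("chr1", 999253, 999951, 3.561, 14.818, 476)
print(summit_position(row))  # 999729
print(raw_p_value(row))
```

The summit for the example row falls at 999253 + 476 = 999729, inside the interval [chromStart, chromEnd).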

| bin | chrom | chromStart | chromEnd | name | score | strand | signalValue | pValue | qValue | peak |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 592 | chr1 | 999253 | 999951 | | 135 | | 3.561 | 14.818 | 0.0000000000000476522 | 476 |
| 593 | chr1 | 1136779 | 1137109 | | 212 | | 7.767 | 11.17 | 0.000000000135765 | 160 |
| 593 | chr1 | 1150418 | 1150925 | | 234 | | 8.986 | 34.301 | 7.36225e-33 | 250 |
| 602 | chr1 | 2231741 | 2233009 | | 117 | | 2.539 | 11.08 | 0.000000000164931 | 239 |
| 603 | chr1 | 2477497 | 2480512 | | 113 | | 2.319 | 19.249 | 2.76624e-18 | 2783 |
| 603 | chr1 | 2487022 | 2488671 | | 133 | | 3.434 | 25.365 | 3.51586e-24 | 509 |
| 604 | chr1 | 2508825 | 2510512 | | 147 | | 4.188 | 41.336 | 1.09191e-39 | 1126 |
| 612 | chr1 | 3589616 | 3591111 | | 169 | | 5.444 | 55.993 | | 766 |
| 612 | chr1 | 3593082 | 3595174 | | 122 | | 2.849 | 22.992 | 7.00667e-22 | 639 |
| 630 | chr1 | 5976027 | 5976486 | | 148 | | 4.268 | 14.561 | 0.0000000000000833679 | 198 |
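
The bin values in the sample rows are consistent with the standard UCSC binning scheme, in which the smallest bins span 128 kb (2^17 bases) and each coarser level is 8 times larger. This page does not state the formula, so the constants below are an assumption based on that scheme, and the function name is chosen for this example only.

```python
# Offsets of the first bin at each level, finest (128 kb) to coarsest (512 Mb).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, size of the finest bins
BIN_NEXT_SHIFT = 3    # each level up is 2**3 = 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Smallest bin that fully contains the half-open interval [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The interval straddles a boundary at this level; try the next coarser one.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"interval {start}-{end} is out of range for this scheme")


print(bin_from_range(999253, 999951))    # 592
print(bin_from_range(1136779, 1137109))  # 593
```

A range query can then restrict itself to the handful of bins that could overlap the region of interest before comparing chromStart and chromEnd, which is what makes the field useful as an index.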

Description

This track shows probable binding sites of the specified transcription factors (TFs) in the given cell types as determined by chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). Included for each cell type is the input signal, which represents the control condition where no antibody targeting was performed. For each experiment (cell type vs. antibody), this track shows a graph of enrichment for TF binding (Signal), along with the sites that have the greatest evidence of transcription factor binding (Peaks).

The sequence reads, quality scores, and alignment coordinates from these experiments are available for download.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. ENCODE tracks typically contain one or more of the following views:

Peaks: Regions of signal enrichment based on processed data (normalized data from pooled replicates). ENCODE Peaks tables contain fields for statistical significance, including the minimum false discovery rate (FDR) threshold at which the test may be called significant (qValue).
Signal: Density graph (wiggle) of signal enrichment based on processed data.

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. Further preparations were similar to those previously published (Euskirchen et al., 2007), with the exceptions that the cells were unstimulated and sodium orthovanadate was omitted from the buffers. For details on the chromatin immunoprecipitation protocol used, see Euskirchen et al. (2007) and Rozowsky et al. (2009).

DNA recovered from the precipitated chromatin was sequenced on the Illumina (Solexa) sequencing platform and mapped to the genome using the Eland alignment program. ChIP-seq data were scored based on sequence reads (length ~30 bp) that align uniquely to the human genome. From the mapped tags, a signal map of ChIP DNA fragments (average fragment length ~200 bp) was constructed, where the signal height is the number of overlapping fragments at each nucleotide position in the genome.
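
The signal map construction can be sketched as follows. This is a minimal illustration, not the pipeline used by the labs: it assumes each uniquely mapped tag is extended in its strand direction to a fixed 200 bp fragment, and that the signal is the count of fragments covering each base. The function name, the (position, strand) input layout, and the fixed fragment length are assumptions made for this example.

```python
from typing import List, Tuple


def signal_map(tags: List[Tuple[int, str]], region_start: int, region_end: int,
               fragment_len: int = 200, read_len: int = 30) -> List[int]:
    """Per-base count of overlapping fragments in [region_start, region_end).

    tags holds (pos, strand) pairs, where pos is the 0-based leftmost aligned base.
    """
    size = region_end - region_start
    diff = [0] * (size + 1)  # difference array: +1 at fragment start, -1 at end
    for pos, strand in tags:
        if strand == "+":
            frag_start = pos
        else:
            # A minus-strand read marks the right end of its fragment.
            frag_start = pos + read_len - fragment_len
        frag_end = frag_start + fragment_len
        lo = max(frag_start, region_start) - region_start
        hi = min(frag_end, region_end) - region_start
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    heights = []
    running = 0
    for delta in diff[:size]:
        running += delta
        heights.append(running)
    return heights


tags = [(100, "+"), (150, "+"), (370, "-")]
heights = signal_map(tags, 0, 500)
print(heights[99], heights[100], heights[150], heights[200], heights[300])  # 0 1 2 3 2
```

The difference array keeps the cost proportional to the number of tags plus the region length, rather than the number of tags times the fragment length.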

For each 1 Mb segment of each chromosome, a peak height threshold was determined by requiring a false discovery rate less than or equal to 0.05 when comparing the number of peaks above threshold to the number obtained from multiple simulations of a random null background with the same number of mapped reads (also accounting for the fraction of mappable bases for sequence tags in that 1 Mb segment). The number of mapped tags in a putative binding region is then compared to the normalized number of mapped tags in the same region from an input DNA control, where normalization is done by correlating tag counts in genomic 10 kb windows. Using a binomial test, only regions with a p-value less than or equal to 0.05 are considered significantly enriched compared to the input DNA control.
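
The comparison against the input control can be sketched with a one-sided binomial test. The text does not give the exact formulation, so this example makes explicit assumptions: under the null hypothesis a tag in the region is equally likely to come from the ChIP sample or from the scaled input, and the scaling factor (here the hypothetical parameter scale) has already been obtained from the 10 kb window normalization. The function names are illustrative only.

```python
from math import comb
from typing import Tuple


def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p), by direct summation (fine for small n)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_enriched(chip_tags: int, input_tags: int, scale: float,
                alpha: float = 0.05) -> Tuple[bool, float]:
    """Test whether the ChIP tag count exceeds the scaled input tag count.

    Assumed null: each of the n pooled tags is a ChIP tag with probability 0.5.
    """
    scaled_input = round(input_tags * scale)
    n = chip_tags + scaled_input
    p_value = binomial_tail(chip_tags, n)
    return p_value <= alpha, p_value


print(is_enriched(30, 10, 1.0))  # enriched: 30 ChIP tags vs. 10 input tags
print(is_enriched(10, 10, 1.0))  # not enriched: equal counts
```

Direct summation is used to keep the sketch dependency-free; for large tag counts a statistics library's survival function would be a more numerically robust choice.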

Release Notes

This is Release 3 (August 2012). This release adds 37 new experiments, including 1 new cell line and 7 new antibodies.

Credits

These data were generated and analyzed by the labs of Michael Snyder at Stanford University; Mark Gerstein and Sherman Weissman at Yale University; Peggy Farnham at University of Southern California; and Kevin Struhl at Harvard.

Contact: Philip Cayting.

References

Cao AR, Rabinovich R, Xu M, Xu X, Jin VX, Farnham PJ. Genome-wide analysis of transcription factor E2F1 mutant proteins reveals that N- and C-terminal protein interaction domains do not participate in targeting E2F1 to the human genome. J Biol Chem. 2011 Apr 8;286(14):11985-96.

Euskirchen G, Royce TE, Bertone P, Martone R, Rinn JL, Nelson FK, Sayward F, Luscombe NM, Miller P, Gerstein M et al. CREB binds to multiple loci on human chromosome 22. Mol Cell Biol. 2004 May;24(9):3804-14.

Euskirchen GM, Rozowsky JS, Wei CL, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein MB et al. Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 2007 Jun;17(6):898-909.

Iyengar S, Ivanov AV, Jin VX, Rauscher FJ 3rd, Farnham PJ. Functional analysis of KAP1 genomic recruitment. Mol Cell Biol. 2011 May;31(9):1833-47.

Martone R, Euskirchen G, Bertone P, Hartman S, Royce TE, Luscombe NM, Rinn JL, Nelson FK, Miller P, Gerstein M et al. Distribution of NF-kappaB-binding sites across human chromosome 22. Proc Natl Acad Sci U S A. 2003 Oct 14;100(21):12247-52.

Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007 Aug;4(8):651-7.

Rozowsky J, Euskirchen G, Auerbach RK, Zhang ZD, Gibson T, Bjornson R, Carriero N, Snyder M, Gerstein MB. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol. 2009 Jan;27(1):66-75.

Publications

Kang YA, Sanalkumar R, O'Geen H, Linnemann AK, Chang CJ, Bouhassira EE, Farnham PJ, Keles S, Bresnick EH. Autophagy driven by a master regulator of hematopoiesis. Mol Cell Biol. 2012 Jan;32(1):226-39.

Krebs AR, Karmodiya K, Lindahl-Allen M, Struhl K, Tora L. SAGA and ATAC histone acetyl transferase complexes regulate distinct sets of genes and ATAC defines a class of p300-independent enhancers. Mol Cell. 2011 Nov 4;44(3):410-23.

Linnemann AK, O'Geen H, Keles S, Farnham PJ, Bresnick EH. Genetic framework for GATA factor function in vascular biology. Proc Natl Acad Sci U S A. 2011 Aug 16;108(33):13641-6.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.