ENC TF Binding UChicago TFBS Track Settings
Transcription Factor Binding Sites by Epitope-Tag from ENCODE/UChicago

Track collection: ENCODE Transcription Factor Binding

Cell Line  Factor  View    Track Name                                           Restricted Until
K562       FOS     Peaks   K562 FOS GFP-tag TFBS Peaks from ENCODE/UChicago     2011-10-27
K562       FOS     Signal  K562 FOS GFP-tag TFBS Signal from ENCODE/UChicago    2011-10-27
K562       GATA2   Peaks   K562 GATA2 GFP-tag TFBS Peaks from ENCODE/UChicago   2011-10-27
K562       GATA2   Signal  K562 GATA2 GFP-tag TFBS Signal from ENCODE/UChicago  2011-10-27
K562       HDAC8   Peaks   K562 HDAC8 GFP-tag TFBS Peaks from ENCODE/UChicago   2011-10-27
K562       HDAC8   Signal  K562 HDAC8 GFP-tag TFBS Signal from ENCODE/UChicago  2011-10-27
K562       JunB    Peaks   K562 JunB GFP-tag TFBS Peaks from ENCODE/UChicago    2011-10-27
K562       JunB    Signal  K562 JunB GFP-tag TFBS Signal from ENCODE/UChicago   2011-10-27
K562       JunD    Peaks   K562 JunD GFP-tag TFBS Peaks from ENCODE/UChicago    2011-10-27
K562       JunD    Signal  K562 JunD GFP-tag TFBS Signal from ENCODE/UChicago   2011-10-27
K562       NR4A1   Peaks   K562 NR4A1 GFP-tag TFBS Peaks from ENCODE/UChicago   2011-10-27
K562       NR4A1   Signal  K562 NR4A1 GFP-tag TFBS Signal from ENCODE/UChicago  2011-10-27


This track maps genome-wide human transcription factor binding sites using second-generation massively parallel sequencing. The transcription factors are expressed as GFP-tagged fusion proteins following bacterial artificial chromosome (BAC) recombineering (recombination-mediated genetic engineering).

The University of Chicago and Max Planck Institute (Dresden) pipeline generates recombineered BACs for the production of cell lines or animals that express fusion proteins from epitope-tagged transgenes.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here.

For each cell type, this track contains the following views:

Peaks: Regions of signal enrichment based on processed data (usually normalized data from pooled replicates).
Signal: Density graph (wiggle) of signal enrichment based on aligned read density.

Peaks and signals displayed in this track are derived from pooled replicate sequencing data. Alignment files for each replicate are available for download.


Cells were grown according to the approved ENCODE cell culture protocols.

Recombineering Strategy

To facilitate high-throughput production of the transgenic constructs, the program BACFinder (Crowe et al., 2002) automatically selects the most suitable BAC clone for any given human gene and generates the sets of PCR primers required for tagging and verification (Poser et al., 2008). Recombineering is used to insert tagging cassettes at either the N or C terminus of the protein. The N-terminal cassette has a dual eukaryotic-prokaryotic promoter (PGK-gb2) driving a neomycin-kanamycin resistance gene within an artificial intron inside the tag coding sequence. The selection cassette is flanked by two loxP sites and can be permanently removed by Cre recombinase-mediated excision. The C-terminal cassette contains the sequence encoding the tag followed by an internal ribosome entry site (IRES) in front of the neomycin resistance gene. In addition, a short bacterial promoter (Gb3) drives the expression of the neomycin-kanamycin resistance gene in E. coli.

The tagging cassettes, containing 50 nucleotides of PCR-introduced homology arms, were inserted into the BAC by recombineering, either behind the start codon (for the N-terminal tag) or in front of the stop codon (for the C-terminal tag) of the gene. E. coli cells that had successfully recombined the cassette were selected for kanamycin resistance in liquid culture. Each saturated culture from a specific recombineering reaction was derived from 10-200 independent recombination events.
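
The geometry of the homology arms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BACFinder implementation: the function names, the 0-based coordinate convention, and the toy sequence are all hypothetical.

```python
ARM_LENGTH = 50  # homology arm length in nucleotides, as stated in the text


def insertion_point(cds_start, cds_end, terminus):
    """Return the 0-based index at which the cassette would be inserted.

    cds_start: index of the first base of the start codon (assumed convention).
    cds_end: index one past the last base of the stop codon (assumed convention).
    """
    if terminus == "N":
        return cds_start + 3  # directly behind the start codon
    if terminus == "C":
        return cds_end - 3  # directly in front of the stop codon
    raise ValueError("terminus must be 'N' or 'C'")


def homology_arms(sequence, index, arm_length=ARM_LENGTH):
    """Return the (left, right) arms flanking the insertion index."""
    if index - arm_length < 0 or index + arm_length > len(sequence):
        raise ValueError("not enough flanking sequence for the requested arm length")
    return sequence[index - arm_length:index], sequence[index:index + arm_length]


# Toy example (hypothetical sequence): 60 nt flank, a minimal ORF, 60 nt flank.
toy = "A" * 60 + "ATGGCCTAA" + "C" * 60
n_left, n_right = homology_arms(toy, insertion_point(60, 69, "N"))
c_left, c_right = homology_arms(toy, insertion_point(60, 69, "C"))
```

For the N-terminal tag the left arm ends with the start codon; for the C-terminal tag the right arm begins with the stop codon.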

Two independent clones from each reaction were checked by PCR through the tag insertion point, and 97% (85/88) yielded a PCR product of the expected size. Most of the clones that failed to grow were missing the targeted genomic region. An estimated 10% of the BACs used were chimeric, rearranged or wrongly mapped. Thus, initial results indicated that the necessary recombineering steps could be carried out with high fidelity.

The White lab produced all epitope-tagged transcription and chromatin factor BACs, as well as the genome-wide ChIP data and analysis. An application of this approach to the analysis of closely related paralogs (RARa and RARg) yielded transcription factors, chromatin factors, cell lines, ChIP-chip data and ChIP-seq data (Hua et al., 2009). Such paralogous transcription factors often cannot otherwise be distinguished by antibodies.

Sample Preparation

ChIP DNA from samples was sheared to approximately 800 bp using a nebulizer. The ends of the DNA were polished and two unique adapters were ligated to the fragments. Ligated fragments of 150-200 bp were isolated by gel extraction and amplified using limited cycles of PCR.

Sequencing System

Illumina GAIIx and HiSeq next-generation sequencers were used to produce all ChIP-seq data.

Processing and Analysis Software

Raw sequencing reads were aligned using Bowtie 0.12.5 (Langmead et al., 2009). The "-m 1" parameter was applied to suppress alignments mapping more than once in the genome. Reads were aligned to the UCSC hg19 assembly. Wiggle format signal files were generated with SPP 2.7.1 (Kharchenko et al., 2008) for R 2.7.1. MACS 1.3.7 was used to call peaks. The MACS parameters used varied by experiment.
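
The effect of the "-m 1" setting can be illustrated with a small Python sketch: a read with more than one reportable alignment is suppressed, so only uniquely placed reads remain. The data structure and function name are hypothetical. The command list at the end is only an assumed shape of a Bowtie invocation with placeholder index and file names; the other options used by the lab are not given in this description.

```python
def suppress_multimappers(alignments_by_read, max_alignments=1):
    """Keep reads whose number of alignments does not exceed max_alignments.

    Mirrors the intent of Bowtie's -m option: a read with more than
    max_alignments reportable alignments contributes nothing. Reads with
    no alignment at all are dropped as well.
    """
    return {
        read_id: hits
        for read_id, hits in alignments_by_read.items()
        if 0 < len(hits) <= max_alignments
    }


# Hypothetical reads mapped to (chromosome, position, strand) tuples.
example = {
    "read1": [("chr1", 1000, "+")],
    "read2": [("chr1", 5000, "+"), ("chr7", 42000, "-")],  # multi-mapper
    "read3": [],  # unaligned
}
unique = suppress_multimappers(example)

# Assumed shape of the aligner call; "hg19_index", "reads.fastq" and
# "aligned.map" are placeholder names, not taken from the source.
bowtie_command = ["bowtie", "-m", "1", "hg19_index", "reads.fastq", "aligned.map"]
```

In this toy input only read1 survives, because read2 aligns to two locations and read3 does not align.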

The White lab used goat anti-GFP antibody to perform ChIP in untagged K562 cells as a background control. The test IP was performed in the same manner as the background control. Results were expressed as values of the test normalized to the background.
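
The description does not state how the test values were normalized to the background. As one possible reading, the sketch below computes a per-bin ratio of test to background read counts. The function name, the binning, and the pseudocount (added to avoid division by zero) are assumptions, not the lab's documented method.

```python
def normalize_to_background(test_counts, background_counts, pseudocount=1.0):
    """Return per-bin ratios of test signal to background signal.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    defined in bins where the background control has no reads.
    """
    if len(test_counts) != len(background_counts):
        raise ValueError("test and background must cover the same bins")
    return [
        (test + pseudocount) / (background + pseudocount)
        for test, background in zip(test_counts, background_counts)
    ]


# Hypothetical read counts in two genomic bins.
ratios = normalize_to_background([9, 1], [1, 1])
```

A ratio well above 1 marks a bin where the tagged-factor IP is enriched over the untagged K562 control.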


These data and annotations were created by a collaboration between the University of Chicago and Argonne National Laboratory.


Crowe ML, Rana D, Fraser F, Bancroft I, Trick M. BACFinder: genomic localisation of large insert genomic clones based on restriction fingerprinting. Nucleic Acids Res. 2002 Nov 1;30(21):e118.

Hua S, Kittler R, White KP. Genomic antagonism between retinoic acid and estrogen signaling in breast cancer. Cell. 2009 Jun 26;137(7):1259-71.

Kharchenko PV, Tolstorukov MY, Park PJ. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol. 2008 Dec;26(12):1351-9.

Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25.

Poser I, Sarov M, Hutchins JR, Hériché JK, Toyoda Y, Pozniakovsky A, Weigl D, Nitzsche A, Hegemann B, Bird AW et al. BAC TransgeneOmics: a high-throughput method for exploration of protein function in mammals. Nat Methods. 2008 May;5(5):409-15.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.
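
The nine-month rule can be expressed as simple date arithmetic. The sketch below is a minimal illustration using only the Python standard library. The function names are hypothetical, the release date in the example is hypothetical (the page lists only the Restricted Until date), and treating the restriction as lifted on the listed date itself is an assumption.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so
    31 May plus nine months gives the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_restricted(today, restricted_until):
    """Assumption: the restriction no longer applies on the listed date."""
    return today < restricted_until


# Hypothetical release date, used only to show the arithmetic.
restricted_until = add_months(date(2011, 1, 27), 9)
```

With the value listed in the table, `is_restricted(date(2011, 10, 26), date(2011, 10, 27))` is true and the same check on 2011-10-27 is false.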