Schema for HAIB Methyl RRBS - DNA Methylation by Reduced Representation Bisulfite Seq from ENCODE/HudsonAlpha
  Database: hg19    Primary Table: wgEncodeHaibMethylRrbsH1hescHaibSitesRep1    Row Count: 1,288,079   Data last updated: 2012-07-26
Format description: BED9+2 Number of reads + percent methylation
On download server: MariaDB table dump directory
field        example      SQL type               description
bin          585          smallint(5) unsigned   Indexing field to speed chromosome range queries.
chrom        chr1         varchar(255)           Reference chromosome or scaffold
chromStart   88704        int(10) unsigned       Start position in chromosome
chromEnd     88705        int(10) unsigned       End position in chromosome
name         SL716_RRBS   varchar(255)           Name of item
score        3            int(10) unsigned       Score from 0-1000. Capped number of reads
strand       +            char(1)                + or - or . for unknown
thickStart   88704        int(10) unsigned       Start of where display should be thick (start codon)
thickEnd     88705        int(10) unsigned       End of where display should be thick (stop codon)
itemRgb      16751360     int(10) unsigned       Color value R,G,B
readCount    3            int(10) unsigned       Number of reads or coverage
percentMeth  67           int(10) unsigned       Percentage of reads that show methylation at this position in the genome
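
The bin field can be illustrated with a short sketch. It assumes the table uses the standard UCSC binning scheme (five levels, smallest bins of 128 kb); the schema itself does not state this, and the function name is illustrative only. The resulting values do agree with the bin values listed for this table.

```python
# Assumed: the standard UCSC binning scheme with 128 kb bins at the finest level.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 bases = 128 kb, the size of the smallest bins
BIN_NEXT_SHIFT = 3    # each coarser level has bins 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that contains the 0-based, half-open range."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Range spans two bins at this level, so try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds for this scheme")


print(bin_from_range(88704, 88705))    # 585
print(bin_from_range(137976, 137977))  # 586
print(bin_from_range(713375, 713376))  # 590
```

Range queries that include a condition on bin can use the index instead of scanning every row on a chromosome.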

Sample Rows
 
bin  chrom  chromStart  chromEnd  name        score  strand  thickStart  thickEnd  itemRgb    readCount  percentMeth
585  chr1   88704       88705     SL716_RRBS  3      +       88704       88705     255,155,0  3          67
586  chr1   137976      137977    SL716_RRBS  2      +       137976      137977    255,0,0    2          100
586  chr1   137985      137986    SL716_RRBS  2      +       137985      137986    255,0,0    2          100
590  chr1   713375      713376    SL716_RRBS  11     +       713375      713376    255,105,0  11         82
590  chr1   713387      713388    SL716_RRBS  11     +       713387      713388    255,55,0   11         91
590  chr1   713399      713400    SL716_RRBS  11     +       713399      713400    255,55,0   11         91
590  chr1   714565      714566    SL716_RRBS  62     -       714565      714566    0,255,0    62         0
590  chr1   714583      714584    SL716_RRBS  62     -       714583      714584    55,255,0   62         10
590  chr1   731153      731154    SL716_RRBS  2      +       731153      731154    255,0,0    2          100
590  chr1   783245      783246    SL716_RRBS  1      -       783245      783246    255,0,0    1          100
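
The schema example gives itemRgb as the integer 16751360, while the first sample row shows 255,155,0. The two agree if the integer packs the three channels as one byte each (red in the highest byte). The sketch below assumes that packing; the helper names are illustrative.

```python
from typing import Tuple


def unpack_rgb(packed: int) -> Tuple[int, int, int]:
    """Split a packed 0xRRGGBB integer into its (R, G, B) channels."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Combine (R, G, B) channels into a single 0xRRGGBB integer."""
    return (red << 16) | (green << 8) | blue


print(unpack_rgb(16751360))    # (255, 155, 0)
print(pack_rgb(255, 155, 0))   # 16751360
```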

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
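
As a minimal sketch of what this means in practice: a BED-style interval has a 0-based start and an exclusive end, so converting it to a 1-based, inclusive position only requires adding 1 to the start. The function name is hypothetical.

```python
def bed_to_one_based(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based, half-open interval as a 1-based, inclusive position."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


# First sample row: chromStart 88704, chromEnd 88705 covers a single base.
print(bed_to_one_based("chr1", 88704, 88705))  # chr1:88705-88705
```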

HAIB Methyl RRBS (wgEncodeHaibMethylRrbs) Track Description
 

Description

This track was produced as part of the ENCODE project. The track reports the percentage of DNA molecules that exhibit cytosine methylation at specific CpG dinucleotides. In general, DNA methylation within a gene's promoter is associated with gene silencing and DNA methylation within the exons and introns of a gene is associated with gene expression. Proper regulation of DNA methylation is essential during development and aberrant DNA methylation is a hallmark of cancer.

DNA methylation status was assayed at more than 500,000 CpG dinucleotides in the genome using Reduced Representation Bisulfite Sequencing (RRBS). Genomic DNA was digested with the methyl-insensitive restriction enzyme MspI and then small genomic DNA fragments were purified by gel electrophoresis and used to construct an Illumina sequencing library. The library fragments were treated with sodium bisulfite and amplified by PCR to convert every unmethylated cytosine to a thymidine while leaving methylated cytosines intact. The sequenced fragments were aligned to a customized reference genome sequence. For each assayed CpG, the number of sequencing reads covering that CpG and the percentage of those reads that were methylated were reported.

Display Conventions and Configuration

Methylation status is represented with an 11-color gradient using the following convention:

  • red = 100% of molecules sequenced are methylated
  • yellow = 50% of molecules sequenced are methylated
  • green = 0% of molecules sequenced are methylated
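
The gradient can be sketched as a lookup table. The colors for 0%, 10%, 70%, 80%, 90% and 100% match the itemRgb values in the sample rows of the schema; the remaining entries and the round-to-nearest-10% rule are assumptions inferred from that pattern, not something the track documentation states.

```python
# Eleven steps from green (0%) through yellow (50%) to red (100%).
# Entries for 20%, 30%, 40% and 60% are assumed by extrapolation.
GRADIENT = [
    (0, 255, 0), (55, 255, 0), (105, 255, 0), (155, 255, 0), (205, 255, 0),
    (255, 255, 0),
    (255, 205, 0), (255, 155, 0), (255, 105, 0), (255, 55, 0), (255, 0, 0),
]


def methylation_color(percent_meth: int) -> tuple:
    """Map a methylation percentage to an (R, G, B) step of the gradient."""
    if not 0 <= percent_meth <= 100:
        raise ValueError("percent_meth must be between 0 and 100")
    step = (percent_meth + 5) // 10  # assumed: round to the nearest 10%
    return GRADIENT[step]


print(methylation_color(67))   # (255, 155, 0)
print(methylation_color(10))   # (55, 255, 0)
print(methylation_color(100))  # (255, 0, 0)
```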

The score in this track reports the number of sequencing reads obtained for each CpG, which is often called 'coverage'. The score is capped at 1000, so any CpG covered by more than 1000 sequencing reads has a score of 1000. The BED files available for download contain two extra columns: one with the uncapped coverage (number of reads at that site) and one with the percentage of those reads that show methylation. Biological replicates were highly reproducible, with correlation coefficients greater than 0.9, when only CpGs represented by at least 10 sequencing reads (10X coverage, a score of 10 or more) were considered. Therefore, the default view for this track shows only CpGs with at least 10X coverage, that is, a score of 10 or more.
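
A minimal sketch of reading such a file and applying the 10X filter follows. It assumes the downloaded BED files are whitespace-delimited with the eleven BED9+2 columns in schema order and without the leading bin column; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class RrbsSite:
    chrom: str
    chrom_start: int
    chrom_end: int
    score: int         # read count capped at 1000
    strand: str
    read_count: int    # uncapped coverage
    percent_meth: int


def parse_bed_line(line: str) -> RrbsSite:
    """Parse one BED9+2 line (assumed column order: BED9, readCount, percentMeth)."""
    f = line.split()
    return RrbsSite(f[0], int(f[1]), int(f[2]), int(f[4]), f[5], int(f[9]), int(f[10]))


def capped_score(read_count: int, cap: int = 1000) -> int:
    """Score as described above: the read count, capped at 1000."""
    return min(read_count, cap)


def filter_by_coverage(sites: Iterable[RrbsSite], min_reads: int = 10) -> List[RrbsSite]:
    """Keep only CpGs covered by at least min_reads sequencing reads."""
    return [s for s in sites if s.read_count >= min_reads]


lines = [
    "chr1\t88704\t88705\tSL716_RRBS\t3\t+\t88704\t88705\t255,155,0\t3\t67",
    "chr1\t713375\t713376\tSL716_RRBS\t11\t+\t713375\t713376\t255,105,0\t11\t82",
    "chr1\t714565\t714566\tSL716_RRBS\t62\t-\t714565\t714566\t0,255,0\t62\t0",
]
kept = filter_by_coverage(parse_bed_line(line) for line in lines)
print([(s.chrom_start, s.read_count, s.percent_meth) for s in kept])
```

With the three sample rows above, only the sites with 11 and 62 reads pass the default threshold.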

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Methods

DNA methylation at CpG sites was assayed with a modified version of Reduced Representation Bisulfite Sequencing (Meissner et al., 2008). RRBS was performed on cell lines grown by many ENCODE production groups. The production group that grew the cells and isolated genomic DNA is indicated in the "obtainedBy" field of the metadata. When a cell type was provided by more than one lab, the data from only one lab are available for immediate display. However, the data for every cell type from every lab are available from the Downloads page. RRBS was also performed on genomic DNA from tissue samples provided by BioChain. The replicates for the BioChain tissues are technical replicates (rather than biological replicates) beginning at the bisulfite treatment step. RRBS was carried out by the Myers production group at the HudsonAlpha Institute for Biotechnology.

Isolation of Genomic DNA

Genomic DNA was isolated from biological replicates of each cell line using the QIAGEN DNeasy Blood & Tissue Kit according to the instructions provided by the manufacturer. DNA concentrations for each genomic DNA preparation were determined using fluorescent DNA binding dye and a fluorometer (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer). Typically, 1 µg of DNA is used to make an RRBS library; however, there has been success in making libraries with 200 ng genomic DNA from rare or precious samples.

RRBS Library Construction and Sequencing

RRBS library construction started with MspI digestion of genomic DNA, which cut at every CCGG regardless of methylation status. Klenow exo- DNA Polymerase was then used to fill in the recessed end of the genomic DNA and add an adenosine as a 3' overhang. Next, a methylated version of the Illumina paired-end adapters was ligated onto the DNA. Adapter-ligated genomic DNA fragments between 105 and 185 base pairs were selected using agarose gel electrophoresis and a Qiagen Qiaquick Gel Extraction Kit. The selected adapter-ligated fragments were treated with sodium bisulfite using the Zymo Research EZ DNA Methylation Gold Kit, which converts unmethylated cytosines to uracils and leaves methylated cytosines unchanged. Bisulfite treated DNA was amplified in a final PCR reaction which was optimized to uniformly amplify diverse fragment sizes and sequence contexts in the same reaction. During this final PCR reaction, uracils were copied as thymines resulting in a thymine in the PCR products wherever an unmethylated cytosine existed in the genomic DNA. The sample was then ready for sequencing on the Illumina sequencing platform. These libraries were sequenced with an Illumina Genome Analyzer IIx according to the manufacturer's recommendations. The full RRBS protocol can be found here.

Data Analysis

To analyze the sequence data, a reference genome was created that contained only the 36 base pairs adjacent to every MspI site and in which every C was changed to a T. A converted sequence read file was then created by changing each C in the original sequence reads to a T. The converted sequence reads were aligned to the converted reference genome and only reads that mapped uniquely to the reference genome were kept. Once the reads were aligned, the percent methylation was calculated for each CpG using the original sequence reads. The percent methylation and number of reads were reported for each CpG.
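
The two computational steps described here, in-silico conversion and the per-CpG methylation call, can be sketched as follows. This is an illustration under stated assumptions rather than the production pipeline: it handles forward-strand reads only, represents each read by the base it shows at the CpG cytosine, and rounds the percentage to the nearest integer. The function names are hypothetical.

```python
def in_silico_convert(seq: str) -> str:
    """Change every C to a T, as done for both the reference and the reads."""
    return seq.upper().replace("C", "T")


def percent_methylation(calls: str) -> int:
    """Percent methylation from the original (unconverted) read bases at one CpG.

    A 'C' means the cytosine resisted bisulfite conversion (methylated);
    a 'T' means it was converted (unmethylated). Other bases are ignored.
    """
    methylated = calls.count("C")
    unmethylated = calls.count("T")
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("no informative reads at this CpG")
    return round(100 * methylated / total)  # rounding rule is an assumption


print(in_silico_convert("ACGTCCGG"))  # ATGTTTGG
print(percent_methylation("CCT"))     # 67
```

Two methylated reads out of three give 67%, which matches the readCount of 3 and percentMeth of 67 in the first sample row of the schema.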

Release Notes

This is Release 3 (July 2012) of this track, which adds the MCF-7 cell line with shRNA knockdowns obtained from the Crawford Lab at Duke University.

Credits

These data were produced by the Dr. Richard Myers Lab at the HudsonAlpha Institute for Biotechnology.

Cells were grown by the Myers Lab and other ENCODE production groups.

Contact: Dr. Florencia Pauli

References

Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, Zhang X, Bernstein BE, Nusbaum C, Jaffe DB et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature. 2008 Aug 7;454(7205):766-70.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.