Schema for CRISPR Targets - CRISPR/Cas9 -NGG Targets, whole genome

Database: mm39
Primary Table: crisprAllTargets
Data last updated: 2020-08-13
Big Bed File Download: /gbdb/mm39/crisprAll/crispr.bb
Item Count: 276,331,386
The data is stored in the binary BigBed format.

Format description: crispr targets

field	example	description
`chrom`	chr1	Reference sequence chromosome or scaffold
`chromStart`	130102843	Start position in chromosome
`chromEnd`	130102866	End position in chromosome
`name`		Name or ID of item, ideally both human readable and unique
`score`	72	Score (0-1000)
`strand`	-	+ or - for strand
`thickStart`	130102846	Start of where display should be thick (start codon)
`thickEnd`	130102866	End of where display should be thick (stop codon)
`reserved`	0,200,0	Doench 2016 / Fusi et al. Score
`_crisprScanColor`	255,100,100	Moreno-Mateos Score
`_specColor`	0,255,0	MIT Specificity Score
`guideSeq`	ATAAAACTAAGAATACGGAA	Guide Sequence
`pam`	GGG	Protospacer Adjacent Motif (PAM)
`scoreDesc`	72	MIT Guide Specificity Score
`fusi`	65	Efficiency: Doench et al. 2016 Score (raw)
`fusiPerc`	92	Efficiency: Doench et al. 2016 Score (percentile)
`crisprScan`	24	Efficiency: Moreno-Mateos T7 Score (raw)
`crisprScanPerc`	12	Efficiency: Moreno-Mateos T7 Score (percentile)
`doench`	80	Efficiency: Doench et al 2014 Score
`oof`	68	Bae et al. Out-of-Frame Score
`_mouseOver`	MIT Spec. Score: 72, Doench 2016: 92%, Moreno-Mateos: 12%	Label for Mouse-over
`_offset`	81079578683	Offset into tab-sep file for details page

Sample Rows

chrom	chromStart	chromEnd	score	strand	thickStart	thickEnd	reserved	_crisprScanColor	_specColor	guideSeq	pam	scoreDesc	fusi	fusiPerc	crisprScan	crisprScanPerc	doench	oof	_mouseOver	_offset
chr1	130102843	130102866	72	-	130102846	130102866	0,200,0	255,100,100	0,255,0	ATAAAACTAAGAATACGGAA	GGG	72	65	92	24	12	80	68	MIT Spec. Score: 72, Doench 2016: 92%, Moreno-Mateos: 12%	81079578683
chr1	130102844	130102867	68	-	130102847	130102867	0,200,0	255,100,100	128,128,0	AATAAAACTAAGAATACGGA	AGG	68	55	66	22	10	14	69	MIT Spec. Score: 68, Doench 2016: 66%, Moreno-Mateos: 10%	63743546505
chr1	130102848	130102871	37	-	130102851	130102871	80,80,80	80,80,80	255,0,0	GTAAAATAAAACTAAGAATA	CGG	37	45	35	40	39	43	63	MIT Spec. Score: 37, Doench 2016: 35%, Moreno-Mateos: 39%	261811081887
chr1	130102856	130102879	27	+	130102856	130102876	80,80,80	80,80,80	255,0,0	TTAGTTTTATTTTACAGAAC	AGG	27	39	21	23	11	17	65	MIT Spec. Score: 27, Doench 2016: 21%, Moreno-Mateos: 11%	375407445104
chr1	130102873	130102896	65	+	130102873	130102893	0,200,0	255,255,0	128,128,0	AACAGGAAATTCATGTTCAG	AGG	65	62	86	41	41	16	63	MIT Spec. Score: 65, Doench 2016: 86%, Moreno-Mateos: 41%	158324190361
chr1	130102942	130102965	69	+	130102942	130102962	0,200,0	255,100,100	128,128,0	TATCTCCAACTACAGATGAC	TGG	69	58	75	18	6	18	74	MIT Spec. Score: 69, Doench 2016: 75%, Moreno-Mateos: 6%	53923644579
chr1	130102947	130102970	76	-	130102950	130102970	0,200,0	0,200,0	0,255,0	TGGCACCAGTCATCTGTAGT	TGG	76	54	63	52	65	14	56	MIT Spec. Score: 76, Doench 2016: 63%, Moreno-Mateos: 65%	143971467036
chr1	130102967	130102990	51	-	130102970	130102990	0,200,0	255,100,100	128,128,0	CAGTGGAAAGAAACTGGGCT	TGG	51	62	86	31	22	14	68	MIT Spec. Score: 51, Doench 2016: 86%, Moreno-Mateos: 22%	245004475569
chr1	130102972	130102995	52	-	130102975	130102995	0,200,0	255,255,0	128,128,0	AGGCACAGTGGAAAGAAACT	GGG	52	63	89	45	50	37	75	MIT Spec. Score: 52, Doench 2016: 89%, Moreno-Mateos: 50%	246834807905
chr1	130102973	130102996	43	-	130102976	130102996	80,80,80	80,80,80	255,0,0	GAGGCACAGTGGAAAGAAAC	TGG	43	34	13	43	46	5	70	MIT Spec. Score: 43, Doench 2016: 13%, Moreno-Mateos: 46%	225135050728
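As an illustration of the field layout above, the following minimal Python sketch turns one tab-separated row into a dictionary keyed by field name. It assumes the column order of the "Sample Rows" header (which omits the empty `name` column); output from other tools may include that column. The names `COLUMNS`, `NUMERIC_COLUMNS` and `parse_row` are hypothetical helpers, not part of any UCSC tool.

```python
# Column order copied from the "Sample Rows" header (assumption: the empty
# `name` column is absent, exactly as in the rows shown above).
COLUMNS = [
    "chrom", "chromStart", "chromEnd", "score", "strand", "thickStart",
    "thickEnd", "reserved", "_crisprScanColor", "_specColor", "guideSeq",
    "pam", "scoreDesc", "fusi", "fusiPerc", "crisprScan", "crisprScanPerc",
    "doench", "oof", "_mouseOver", "_offset",
]

# Fields that hold integers in the sample rows.
NUMERIC_COLUMNS = {
    "chromStart", "chromEnd", "score", "thickStart", "thickEnd", "scoreDesc",
    "fusi", "fusiPerc", "crisprScan", "crisprScanPerc", "doench", "oof",
    "_offset",
}


def _maybe_int(value):
    """Convert to int when possible; otherwise keep the original text."""
    try:
        return int(value)
    except ValueError:
        return value


def parse_row(line):
    """Split one tab-separated row into a dict keyed by field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    row = dict(zip(COLUMNS, values))
    for key in NUMERIC_COLUMNS:
        row[key] = _maybe_int(row[key])
    return row
```

Score fields are converted leniently because the track description notes that some guides were not scored; such rows may carry non-numeric values, which this sketch leaves as text.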

CRISPR Targets (crisprAllTargets) Track Description

Description

This track shows the DNA sequences targetable by CRISPR RNA guides using the Cas9 enzyme from S. pyogenes (PAM: NGG) over the entire mouse (mm39) genome. CRISPR target sites were annotated with predicted specificity (off-target effects) and predicted efficiency (on-target cleavage) by various algorithms through the tool CRISPOR. Sp-Cas9 usually cuts double-stranded DNA three or four base pairs 5' of the PAM site.

Display Conventions and Configuration

The track "CRISPR Targets" shows all potential -NGG target sites across the genome. The target sequence of the guide is shown with a thick (exon) bar. The PAM motif match (NGG) is shown with a thinner bar. Guides are colored to reflect both predicted specificity and efficiency. Specificity reflects the "uniqueness" of a 20mer sequence in the genome; the less unique a sequence is, the more likely it is to cleave other locations of the genome (off-target effects). Efficiency is the frequency of cleavage at the target site (on-target efficiency).
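The thick/thin convention can be used to recover the guide interval, the PAM interval and an approximate cut position from a row. The sketch below is only illustrative: it assumes BED-style 0-based, half-open coordinates, takes the thick region as the 20mer guide and the remaining thin region as the PAM, and places the cut exactly 3 bp 5' of the PAM (the text above says three or four). The function name `guide_pam_cut` is hypothetical.

```python
def guide_pam_cut(chrom_start, chrom_end, thick_start, thick_end, strand):
    """Return (guide_interval, pam_interval, cut_position).

    Intervals are 0-based, half-open (start, end) pairs on the forward
    strand. cut_position is the forward-strand coordinate of the first base
    to the right of the assumed cut, 3 bp away from the PAM inside the guide.
    """
    guide = (thick_start, thick_end)
    if strand == "+":
        # PAM follows the guide on the forward strand.
        pam = (thick_end, chrom_end)
        cut = thick_end - 3
    elif strand == "-":
        # On the reverse strand the PAM lies at the lower coordinates.
        pam = (chrom_start, thick_start)
        cut = thick_start + 3
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    return guide, pam, cut


# Coordinates taken from two of the sample rows above.
plus = guide_pam_cut(130102856, 130102879, 130102856, 130102876, "+")
minus = guide_pam_cut(130102843, 130102866, 130102846, 130102866, "-")
```

For both sample rows the guide interval is 20 bp long and the PAM interval is 3 bp long, which matches the 23 bp feature length in the table.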

Shades of gray stand for sites that are hard to target specifically, as the 20mer is not very unique in the genome:

	impossible to target: target site has at least one identical copy in the genome and was not scored
	hard to target: so many similar sequences in the genome that the alignment was stopped (possibly a repeat)
	hard to target: target site was aligned but results in a low specificity score <= 50 (see below)

Colors highlight targets that are specific in the genome (MIT specificity > 50) but have different predicted efficiencies:

	unable to calculate Doench/Fusi 2016 efficiency score
	low predicted cleavage: Doench/Fusi 2016 Efficiency percentile <= 30
	medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 30 and < 55
	high predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 55
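The two lists above amount to a small decision rule, sketched here in Python. This is not the browser's actual implementation. The function name `classify_guide` and the returned labels are hypothetical, `None` is used to represent a missing score, and because the text leaves a percentile of exactly 55 undefined, this sketch assigns it to the medium class as an assumption.

```python
def classify_guide(mit_specificity, doench2016_percentile):
    """Map scores to the display categories described in the text.

    mit_specificity: MIT specificity score (0-100), or None if not scored.
    doench2016_percentile: Doench/Fusi 2016 percentile, or None if it could
    not be calculated.
    """
    if mit_specificity is None:
        return "gray: not scored"
    if mit_specificity <= 50:
        return "gray: low specificity"
    if doench2016_percentile is None:
        return "specific: efficiency unavailable"
    if doench2016_percentile <= 30:
        return "specific: low predicted cleavage"
    # Assumption: the text defines medium as > 30 and < 55 and high as > 55,
    # so exactly 55 is unspecified; it is grouped with medium here.
    if doench2016_percentile <= 55:
        return "specific: medium predicted cleavage"
    return "specific: high predicted cleavage"
```

Applied to the sample rows, a guide with specificity 72 and percentile 92 falls into the high class, while a guide with specificity 43 is gray regardless of its efficiency.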

Mouse-over a target site to show predicted specificity and efficiency scores:

The MIT Specificity score summarizes all off-targets into a single number from 0-100. The higher the number, the fewer off-target effects are expected. We recommend guides with an MIT specificity > 50.
The efficiency score tries to predict whether a guide leads to strong or weak cleavage. According to Haeussler et al. 2016, the Doench 2016 Efficiency score should be used to select the guide with the highest cleavage efficiency when expressing guides from RNA Pol III promoters such as U6. Scores are given as percentiles, e.g. "70%" means that 70% of mammalian guides have a score equal to or lower than this guide. The raw score number is also shown in parentheses after the percentile.
The Moreno-Mateos 2015 Efficiency score should be used instead of the Doench 2016 score when transcribing the guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or Xenopus embryos. Like the Doench 2016 score, the Moreno-Mateos score is given as a percentile with the raw value in parentheses.

Click on a feature to show all scores and predicted off-targets with up to four mismatches. The Out-of-Frame score by Bae et al. 2014 is correlated with the probability that mutations induced by the guide RNA will disrupt the open reading frame. The authors recommend out-of-frame scores > 66 to create knock-outs efficiently with a single guide.
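The recommendations above (MIT specificity > 50, out-of-frame score > 66 for single-guide knock-outs, then the highest Doench 2016 percentile for U6-expressed guides) can be combined into a simple filter. The sketch below is one possible reading, not a prescribed workflow. The function name `select_guides` is hypothetical; the dictionary keys follow the schema fields `scoreDesc`, `fusiPerc` and `oof`, and the values are copied from four of the sample rows.

```python
def select_guides(guides, min_specificity=50, min_oof=66):
    """Keep guides above both thresholds, best Doench 2016 percentile first."""
    kept = [
        g for g in guides
        if g["scoreDesc"] > min_specificity and g["oof"] > min_oof
    ]
    return sorted(kept, key=lambda g: g["fusiPerc"], reverse=True)


# Values taken from the sample rows above.
SAMPLE_GUIDES = [
    {"guideSeq": "ATAAAACTAAGAATACGGAA", "scoreDesc": 72, "fusiPerc": 92, "oof": 68},
    {"guideSeq": "TGGCACCAGTCATCTGTAGT", "scoreDesc": 76, "fusiPerc": 63, "oof": 56},
    {"guideSeq": "AGGCACAGTGGAAAGAAACT", "scoreDesc": 52, "fusiPerc": 89, "oof": 75},
    {"guideSeq": "GAGGCACAGTGGAAAGAAAC", "scoreDesc": 43, "fusiPerc": 13, "oof": 70},
]

ranked = select_guides(SAMPLE_GUIDES)
```

For guides transcribed in vitro with a T7 promoter, the sort key would be `crisprScanPerc` instead of `fusiPerc`, in line with the note on the Moreno-Mateos score.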

Off-target sites are sorted by the CFD (Cutting Frequency Determination) score (Doench et al. 2016). The higher the CFD score, the more likely there is off-target cleavage at that site. Off-targets with a CFD score < 0.023 are not shown on this page, but are available when following the link to the external CRISPOR tool. When compared against experimentally validated off-targets by Haeussler et al. 2016, the large majority of predicted off-targets with CFD scores < 0.023 were false-positives. For storage and performance reasons, on the level of individual off-targets, only CFD scores are available.
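The display rule for off-targets described above (drop CFD scores below 0.023, sort the rest from highest to lowest) is sketched below. The input format, a list of `(location, cfd_score)` pairs, and the function name `visible_off_targets` are assumptions made for illustration; they do not describe the layout of crisprDetails.tab.

```python
CFD_CUTOFF = 0.023  # threshold stated in the text


def visible_off_targets(off_targets, cutoff=CFD_CUTOFF):
    """Return off-targets with CFD >= cutoff, highest CFD score first.

    off_targets: iterable of (location, cfd_score) pairs (hypothetical format).
    """
    shown = [ot for ot in off_targets if ot[1] >= cutoff]
    return sorted(shown, key=lambda ot: ot[1], reverse=True)
```

Off-targets removed by this filter would still be available through the external CRISPOR tool, as noted above.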

Methods

Relationship between predictions and experimental data

Like most algorithms, the MIT specificity score is not always a perfect predictor of off-target effects. Despite low scores, many tested guides caused few and/or weak off-target cleavages when tested with whole-genome assays (Figure 2 from Haeussler et al. 2016), as shown below, and the published data contains few data points with high specificity scores. Overall though, the assays showed that the higher the specificity score, the lower the off-target effects.

Similarly, efficiency scoring is not very accurate: guides with low scores can be efficient and vice versa. As a general rule, however, the higher the score, the less likely that a guide is very inefficient. The following histograms illustrate, for each type of score, how the share of inefficient guides drops with increasing efficiency scores:

When reading this plot, keep in mind that both scores were evaluated on their own training data. Especially for the Moreno-Mateos score, the results are too optimistic, due to overfitting. When evaluated on independent datasets, the correlation of the prediction with other assays was around 25% lower, see Haeussler et al. 2016. At the time of writing, there is no independent dataset available yet to determine the Moreno-Mateos accuracy for each score percentile range.

Track methods

The entire mouse (mm39) genome was scanned for the -NGG motif. Flanking 20mer guide sequences were aligned to the genome with BWA and scored with MIT Specificity scores using the command-line version of crispor.org. Non-unique guide sequences were skipped. Flanking sequences were extracted from the genome and used as input for CRISPOR efficiency scoring, available from the CRISPOR downloads page, which includes the Doench 2016, Moreno-Mateos 2015 and Bae 2014 algorithms, among others.

Note that the Doench 2016 scores were updated by the Broad Institute in 2017 ("Azimuth" update). As a result, earlier versions of the track show the old Doench 2016 scores and this version of the track shows the new Doench 2016 scores. Old and new scores are almost identical: their correlation is 0.99, and for more than 80% of the guides the difference is below 0.02. However, for very few guides, the difference can be bigger. In case of doubt, we recommend the new scores. Crispor.org can display both scores and many more with the "Show all scores" link.

Data Access

Positional data can be explored interactively with the Table Browser or the Data Integrator. For small programmatic positional queries, the track can be accessed using our REST API. For genome-wide data or automated analysis, CRISPR genome annotations can be downloaded from our download server as a bigBed file.

The files for this track are called crispr.bb, which lists positions and scores, and crisprDetails.tab, which has information about off-target matches. Individual regions or whole genome annotations can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a pre-compiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain only features within a given range, e.g.

bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/mm39/crisprAllTargets/crispr.bb -chrom=chr1 -start=0 -end=1000000 stdout
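For automated analysis, the same command can be driven from a script. The Python sketch below builds the argument list shown above and, optionally, runs it; it assumes the `bigBedToBed` binary is installed and on the PATH. The function names `bigbed_region_command` and `fetch_region` are hypothetical, and only the `-chrom`, `-start` and `-end` options from the example above are used.

```python
import subprocess

# URL copied from the example command above.
CRISPR_BB_URL = (
    "http://hgdownload.soe.ucsc.edu/gbdb/mm39/crisprAllTargets/crispr.bb"
)


def bigbed_region_command(url, chrom, start, end):
    """Build the bigBedToBed argument list for one genomic range."""
    return [
        "bigBedToBed",
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",
    ]


def fetch_region(url, chrom, start, end):
    """Run bigBedToBed and return its output as a list of text lines.

    Requires the bigBedToBed binary and network access; raises
    subprocess.CalledProcessError if the tool exits with an error.
    """
    cmd = bigbed_region_command(url, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.splitlines()
```

Passing the arguments as a list avoids shell quoting issues. Each returned line is a tab-separated record that can be split into the fields listed in the schema.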

Credits

Track created by Maximilian Haeussler, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).

References

Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud JB, Schneider-Maunoury S, Shkumatava A, Teboul L, Kent J et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 2016 Jul 5;17(1):148. PMID: 27380939; PMC: PMC4934014

Bae S, Kweon J, Kim HS, Kim JS. Microhomology-based choice of Cas9 nuclease target sites. Nat Methods. 2014 Jul;11(7):705-6. PMID: 24972169

Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016 Feb;34(2):184-91. PMID: 26780180; PMC: PMC4744125

Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013 Sep;31(9):827-32. PMID: 23873081; PMC: PMC3969858

Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ. CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nat Methods. 2015 Oct;12(10):982-8. PMID: 26322839; PMC: PMC4589495