Schema for Cell expression - Single Cell RNA-Seq Gene Expression from Tabula Muris
Database: mm10 Primary Table: tabulamurisBarChart Data last updated: 2018-09-12
Big Bed File Download: /gbdb/mm10/tabulamuris/barChart.bb
Item Count: 20,999
The data is stored in the binary BigBed format.
Format description: BED6+5 with additional fields for category count and median values, and sample matrix fields
|chrom||chr1||Reference sequence chromosome or scaffold|
|chromStart||130439026||Start position in chromosome|
|chromEnd||130462744||End position in chromosome|
|score||14||Score from 0-1000, typically derived from total of median value from all categories|
|strand||-||+ or - for strand. Use . if not applicable|
|expCount||81||Number of categories|
|expScores||34,0,0,0,0,0,0,0,0,0,0,0,12,302,0,0,0,503,0,0,0,0,498,0,0,79,0,0,0,0,0,7,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,21,67,0,0,0,0,196,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,0,0||Comma separated list of category values|
|_dataOffset||438672278||Offset of sample data in data matrix file, for boxplot on details page|
|_dataLen||105656||Length of sample data row in data matrix file|
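To illustrate how the expCount and expScores fields relate, here is a minimal Python sketch that splits one row's comma-separated category values and checks them against the category count. The function names and the shortened three-category row are illustrative assumptions, not part of the track's own tooling.

```python
def parse_exp_scores(exp_count, exp_scores):
    """Split the comma-separated expScores string into floats.

    Raises ValueError if the number of values differs from expCount.
    """
    # Tolerate a trailing comma, which comma-separated list fields may carry.
    values = [float(v) for v in exp_scores.strip(",").split(",")]
    if len(values) != exp_count:
        raise ValueError(
            f"expected {exp_count} category values, found {len(values)}"
        )
    return values


def top_categories(values, n=3):
    """Return (category_index, value) pairs for the n highest values."""
    ranked = sorted(enumerate(values), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Hypothetical, shortened row with three categories instead of 81.
scores = parse_exp_scores(3, "34,0,12")
best = top_categories(scores, n=2)
```

Category indices are zero-based positions in the expScores list; mapping them to cluster names would require the track's category labels, which are not part of this schema listing.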
Cell expression (tabulamurisBarChart) Track Description
Tabula Muris is a compendium of single cell transcriptome data from the model organism Mus
musculus, containing nearly 100,000 cells from 20 organs and tissues. The data
allow for direct and controlled comparison of gene expression in cell types
shared between tissues, such as immune cells from distinct anatomical locations.
This track shows the results from FACS-sorted cells sequenced with the
SmartSeq2 protocol, as this protocol has much higher transcript coverage. The sequencing
data comprise more than 2 TB and were summarized into a track at UCSC.
Display Conventions and Configuration
As indicated by the "..." after its name, this is a 'super track', a container for
subtracks. There are three different subtracks:
- Cell type expression:
A rectangle on the genome, at the location of a gene, filled with a bar graph that indicates the
gene's expression by single cell cluster. The term "cluster" refers to a cluster of
single cells, which usually represents a cell or tissue type. The height of the bar graph on the
genome is the median expression level and a click-through on the bar chart displays a boxplot of
expression level quartiles with outliers, per cluster. On the boxplot, the number of cells from
each experiment is shown.
- Coverage:
Bar graphs indicate the number of reads at each base pair. You may want to switch on
auto-scaling of the y-axis. For configuration options, see the graph tracks
configuration help page. These tracks are shown in "dense" by default; set any of
the tracks to "full" to see the detailed coverage plot.
- Splice Junctions:
Thick rectangles show exons around a splice site, connected by a line that indicates the
intron. Each intron is annotated with the number of supporting reads in the 'score' field.
You can use the 'score' filter on the track configuration page to show only introns with a
certain number of supporting reads. The maximum score shown is 1,000, even if
more reads support an intron. These tracks are shown in "dense" by default; set a track to
"pack" to see the individual junctions, then click a splice junction to see its score.
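The interaction between the score cap and the score filter can be sketched in a few lines of Python. This is only an illustration: the junction names and read counts are made up, and treating the filter as a minimum threshold is an assumption.

```python
SCORE_CAP = 1000  # maximum score shown in the track, as stated above


def display_score(supporting_reads):
    """Cap the read count at the maximum score shown in the track."""
    return min(supporting_reads, SCORE_CAP)


def filter_junctions(junctions, min_score):
    """Keep junctions whose displayed score is at least min_score.

    `junctions` is a list of (name, supporting_reads) pairs; this
    structure and the names used below are hypothetical.
    """
    return [
        (name, display_score(reads))
        for name, reads in junctions
        if display_score(reads) >= min_score
    ]


junctions = [("jxn_a", 5), ("jxn_b", 250), ("jxn_c", 4200)]
kept = filter_junctions(junctions, min_score=100)
```

Because of the cap, a junction supported by 4,200 reads and one supported by exactly 1,000 reads are indistinguishable in the track.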
BAM files were provided by the data submitters, one (single end) or two files (paired end) per
cell. The BAM alignments were used as submitted. They were merged with "samtools merge"
into a single BAM file per cluster. The readgroup (RG) BAM tag indicates the original cell.
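As a rough sketch of the per-cluster merge step, the Python below groups per-cell BAM files by cluster and builds one "samtools merge" command line per cluster. The file names, cluster names and output naming are hypothetical, and the use of the -r option (which attaches an RG tag inferred from each input file name) is an assumption about how the read group could be set, not a record of the command actually run.

```python
from collections import defaultdict


def group_by_cluster(cell_to_cluster):
    """Map each cluster name to the sorted list of its per-cell BAM paths."""
    clusters = defaultdict(list)
    for bam_path, cluster in cell_to_cluster.items():
        clusters[cluster].append(bam_path)
    return {cluster: sorted(paths) for cluster, paths in clusters.items()}


def merge_command(cluster, bam_paths):
    """Build the argv list for merging one cluster's BAM files.

    -r attaches an RG tag inferred from each input file name; whether
    the track authors used this option is an assumption.
    """
    return ["samtools", "merge", "-r", f"{cluster}.bam", *bam_paths]


# Hypothetical file and cluster names.
cells = {
    "cell_002.bam": "B_cell",
    "cell_001.bam": "B_cell",
    "cell_003.bam": "T_cell",
}
grouped = group_by_cluster(cells)
commands = [merge_command(name, paths) for name, paths in grouped.items()]
```

Each command list could then be passed to subprocess.run(cmd, check=True) on a machine where samtools is installed.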
From the resulting merged BAM file, coverage was obtained using "wiggletools coverage", a
tool written by Daniel Zerbino, and the result was converted to bigWig format with a UCSC tool.
The software IntronProspector was also run on the merged BAM file, with default settings. It
retains reads with a gap longer than 70 bp and shorter than 500 kbp and merges them into annotated
splice junctions.
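The gap-length criterion can be illustrated with a small Python sketch that scans a read's CIGAR string for N (skipped region) operations and keeps those within the stated size limits. This is not IntronProspector's implementation; the function name and the example reads are assumptions made for illustration.

```python
import re

MIN_GAP = 70        # bp; gaps must be longer than this
MAX_GAP = 500_000   # bp; gaps must be shorter than this

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def intron_gaps(ref_start, cigar):
    """Return (start, end) reference intervals of N gaps within the limits.

    ref_start is the 0-based leftmost aligned position of the read.
    """
    gaps = []
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N" and MIN_GAP < length < MAX_GAP:
            gaps.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return gaps


# Hypothetical reads: a 100 bp gap is kept, a 60 bp gap is dropped.
kept = intron_gaps(1000, "50M100N50M")
dropped = intron_gaps(1000, "50M60N50M")
```

Merging the per-read gaps into junctions would then amount to counting how many reads support each (start, end) interval.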
The merged BAM files, coverage bigWig files and splice junctions in bigBed format can be
downloaded from the /gbdb fileserver.
Since the splice junction .bigBed files have their scores capped at 1000, the original
IntronProspector .bed files are available in the same
track hub directory. There you can
also find *.calls.tsv files with more details about each junction, e.g. the number of
uniquely mapping reads.
WiggleTools was written by Daniel Zerbino, IntronProspector was written by Mark Diekhans, and track
hubs were written to a large extent by Brian Raney and colleagues at the UCSC Genome Browser. Track
creation was done by Max Haeussler and tested by Jairo Navarro.
Zerbino DR, Johnson N, Juettemann T, Wilder SP, Flicek P.
WiggleTools: parallel processing of large collections of genome-wide datasets for
visualization and statistical analysis. Bioinformatics. 2014 Apr 1;30(7):1008-9.
PMID: 24363377; PMC: PMC3967112
Mark Diekhans, IntronProspector GitHub Repository. GitHub 2018
The Tabula Muris Consortium, Stephen R. Quake, Tony Wyss-Coray, Spyros Darmanis.
Transcriptomic characterization of 20 organs and tissues from individual mice creates a Tabula
Muris. bioRxiv preprint March 2018, accepted paper in
Nature 2018 (562) p.367-372