Schema for Cell expression - Single Cell RNA-Seq Gene Expression from Tabula Muris
Database: mm10
Primary Table: tabulamurisBarChart
Data last updated: 2018-09-12
Big Bed File Download: /gbdb/mm10/tabulamuris/barChart.bb
Item Count: 20,999
The data is stored in the binary BigBed format.

Format description: BED6+5 with additional fields for category count and median values, and sample matrix fields
field        example                 description
chrom        chr1                    Reference sequence chromosome or scaffold
chromStart   130439026               Start position in chromosome
chromEnd     130462744               End position in chromosome
name         ENSMUSG00000026399.12   Gene identifier
score        14                      Score from 0-1000, typically derived from total of median value from all categories
strand       -                       + or - for strand. Use . if not applicable
name2        Cd55                    Gene name
expCount     81                      Number of categories
expScores    34,0,0,0,0,0,0,0,0,0,0,0,12,302,0,0,0,503,0,0,0,0,498,0,0,79,0,0,0,0,0,7,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,21,67,0,0,0,0,196,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,0,0   Comma separated list of category values
_dataOffset  438672278               Offset of sample data in data matrix file, for boxplot on details page
_dataLen     105656                  Length of sample data row in data matrix file
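
The field list above maps directly onto a small record type. The following is a minimal sketch, assuming the items have been exported as tab-separated text (for example with the UCSC bigBedToBed utility); the class name, function name, and the demonstration line are illustrative and not part of the track.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BarChartItem:
    """One BED6+5 item, with fields in the order listed in the schema."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    name2: str
    exp_count: int
    exp_scores: List[float]
    data_offset: int
    data_len: int


def parse_bar_chart_line(line: str) -> BarChartItem:
    """Parse one tab-separated BED6+5 line (assumed export format)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    # Tolerate a trailing comma at the end of the comma-separated list.
    exp_scores = [float(v) for v in fields[8].split(",") if v]
    item = BarChartItem(
        chrom=fields[0],
        chrom_start=int(fields[1]),
        chrom_end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        name2=fields[6],
        exp_count=int(fields[7]),
        exp_scores=exp_scores,
        data_offset=int(fields[9]),
        data_len=int(fields[10]),
    )
    if item.exp_count != len(item.exp_scores):
        raise ValueError("expCount does not match the number of expScores")
    return item


# Made-up line with three categories, only to show the call.
demo = "chr1\t100\t200\tgeneX.1\t5\t-\tGeneX\t3\t1,0,2.5\t10\t20"
print(parse_bar_chart_line(demo))
```

Checking expCount against the parsed list catches truncated or mis-split lines early.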

Sample Rows
 
chrom  chromStart  chromEnd   name                   score  strand  name2     expCount  expScores  _dataOffset  _dataLen
chr1   130439026   130462744  ENSMUSG00000026399.12  14     -       Cd55      81        34,0,0,0,0,0,0,0,0,0,0,0,12,302,0,0,0,503,0,0,0,0,498,0,0,79,0,0,0,0,0,7,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,21,67,0,0,0 ...  438672278  105656
chr1   130576712   130629621  ENSMUSG00000042554.10  0      -       Zp3r      81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  2257464422  89589
chr1   130634772   130661632  ENSMUSG00000026405.14  1      -       C4bp      81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,523,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  384985664  90667
chr1   130670260   130684071  ENSMUSG00000100257.1   14     -       C4bp-ps1  81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  385076331  89624
chr1   130690277   130729253  ENSMUSG00000026409.14  396    -       Pfkfb2    81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ...  1563987878  94319
chr1   130717326   130724358  ENSMUSG00000046404.5   65     +       Yod1      81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  2200686174  91891
chr1   130731975   130744622  ENSMUSG00000042510.7   10     +       AA986860  81        0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,7,56,0,0,0,0,0,0,0,0,0,0,0,0,74,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  190313101  94521
chr1   130800901   130814740  ENSMUSG00000026415.11  0      +       Fcamr     81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  745810814  89857
chr1   130826683   130852249  ENSMUSG00000026417.13  43     +       Pigr      81        0,0,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,89,19,0,0,993,0,0,0,0,0,0,0,0,252,0,0,0,0,0,0,0,1345,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ...  1577304821  99032
chr1   130882073   130887454  ENSMUSG00000026420.16  0      -       Il24      81        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, ...  1038660642  89953
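
The _dataOffset and _dataLen columns locate one gene's per-sample values inside the data matrix file. The sketch below shows how such a row could be fetched, assuming the offset is a byte offset from the start of the file and the length is a byte count. The layout of the row itself is not described on this page, so the function returns raw bytes, and the in-memory "matrix" is made-up stand-in content.

```python
import io
from typing import BinaryIO


def read_matrix_row(matrix: BinaryIO, data_offset: int, data_len: int) -> bytes:
    """Return the data_len bytes that start at byte data_offset."""
    matrix.seek(data_offset)
    row = matrix.read(data_len)
    if len(row) != data_len:
        raise EOFError("matrix file ended before the full row was read")
    return row


# In-memory stand-in for the matrix file: two 10-byte rows.
demo_matrix = io.BytesIO(b"geneA\t1\t2\ngeneB\t3\t4\n")
print(read_matrix_row(demo_matrix, 10, 10))
```

Seeking directly to the row avoids scanning a matrix that holds one row for each of the 20,999 items.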

Cell expression (tabulamurisBarChart) Track Description
 

Description

Tabula Muris is a compendium of single cell transcriptome data from the model organism Mus musculus, containing nearly 100,000 cells from 20 organs and tissues. The data allow for direct and controlled comparison of gene expression in cell types shared between tissues, such as immune cells from distinct anatomical locations.

This track shows the results from FACS-sorted cells sequenced with the SmartSeq2 protocol, which gives much higher transcript coverage than the droplet-based data also produced by the consortium. The sequencing data comprise more than 2 TB and were summarized into a track at UCSC.

Display Conventions and Configuration

As indicated by the "..." after its name, this is a 'super track', a container for subtracks. There are three different subtracks:

Cell type expression:
A rectangle on the genome, at the location of a gene, filled with a bar graph that shows the gene's expression in each single-cell cluster. The term "cluster" refers to a cluster of single cells, which usually represents a cell or tissue type. The height of each bar is the median expression level in that cluster. Clicking the bar chart displays a boxplot of expression level quartiles with outliers, per cluster. The boxplot also shows the number of cells from each experiment.
Coverage:
Bar graphs indicate the number of reads at each base pair. You may want to switch on auto-scaling of the y-axis. For configuration options, see the graph tracks configuration help page. These tracks are shown in "dense" by default; set any of the tracks to "full" to see the detailed coverage plot.
Splice Junctions:
Thick rectangles show the exons flanking a splice site, connected by a line that indicates the intron. Each junction is annotated with the number of supporting reads in the 'score' field. You can use the 'score' filter on the track configuration page to show only introns with at least a certain number of supporting reads. The score is capped at 1,000, even if more reads support an intron. These tracks are shown in "dense" by default; set a track to "pack" to see the individual junctions, then click a splice junction to see its score.
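
For the cell type expression subtrack described above, the bar height is a per-cluster median and the boxplot shows quartiles. The following sketch computes those summaries from per-cell values. It is illustrative only: the quartile method the browser uses is not stated here, so the "inclusive" method of Python's statistics module is an assumption, and the cluster names and values are made up.

```python
import statistics
from typing import Dict, List, NamedTuple


class ClusterSummary(NamedTuple):
    n_cells: int
    q1: float
    median: float
    q3: float


def summarize_clusters(expr_by_cluster: Dict[str, List[float]]) -> Dict[str, ClusterSummary]:
    """Summarize one gene's per-cell expression values for each cluster."""
    summary = {}
    for cluster, values in expr_by_cluster.items():
        if len(values) < 2:
            raise ValueError(f"cluster {cluster!r} needs at least two cells")
        # Quartile method is an assumption; the browser may use another.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[cluster] = ClusterSummary(
            n_cells=len(values),
            q1=q1,
            median=statistics.median(values),
            q3=q3,
        )
    return summary


# Hypothetical clusters and values for a single gene.
demo = {"cluster_a": [0, 1, 2, 3, 4], "cluster_b": [0, 0, 0, 10]}
for name, stats in summarize_clusters(demo).items():
    print(name, stats)
```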

Methods

BAM files were provided by the data submitters, one (single end) or two files (paired end) per cell. The BAM alignments were used as submitted. They were merged with "samtools merge" into a single BAM file per cluster. The readgroup (RG) BAM tag indicates the original cell.
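
The merge step can be sketched as a small command builder. The exact options used for this track are not given above. The `-r` option of `samtools merge`, which attaches an RG tag inferred from each input file name, is shown as one plausible way to obtain a per-cell read group. The file names are hypothetical, and the command is only printed, not executed.

```python
import shlex
from typing import List, Sequence


def samtools_merge_command(cluster_bam: str, cell_bams: Sequence[str]) -> List[str]:
    """Build a 'samtools merge' command for one cluster (not executed here)."""
    if not cell_bams:
        raise ValueError("at least one per-cell BAM file is required")
    # -r: attach an RG tag inferred from each input file name (assumed option).
    return ["samtools", "merge", "-r", cluster_bam, *cell_bams]


# Hypothetical file names, one BAM per cell.
cmd = samtools_merge_command("cluster_a.bam", ["cell_001.bam", "cell_002.bam"])
print(shlex.join(cmd))
```

To run the command, the list could be passed to `subprocess.run(cmd, check=True)` on a system where samtools is installed.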

From the resulting merged BAM file, coverage was obtained using "wiggletools coverage", a tool written by Daniel Zerbino, and the result was converted to bigWig with the UCSC tool "wigToBigWig".

IntronProspector was also run on the merged BAM file, with default settings. It retains reads with a gap longer than 70 bp and shorter than 500 kbp and merges them into annotated splice junctions.
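
The gap-length rule can be illustrated with a short sketch that scans a read's CIGAR string for reference skips (the N operation) and keeps those inside the stated range. This is not IntronProspector's implementation: treating only N operations as gaps, the strict bounds, and the function name are assumptions made for illustration.

```python
import re
from typing import List, Tuple

MIN_GAP = 70        # bp; gaps must be longer than this
MAX_GAP = 500_000   # bp; gaps must be shorter than this

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def candidate_introns(ref_start: int, cigar: str) -> List[Tuple[int, int]]:
    """Return (start, end) reference intervals of gaps within the size range.

    ref_start is the 0-based reference position of the first aligned base.
    """
    pos = ref_start
    introns = []
    for length_text, op in _CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            if MIN_GAP < length < MAX_GAP:
                introns.append((pos, pos + length))
            pos += length
        elif op in "MD=X":
            # These operations consume reference bases.
            pos += length
        # I, S, H and P do not advance the reference position.
    return introns


print(candidate_introns(100, "50M200N50M"))  # gap of 200 bp is kept
print(candidate_introns(100, "50M30N50M"))   # gap of 30 bp is dropped
```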

Data Access

The merged BAM files, coverage bigWig files and splice junctions in bigBed format can be downloaded from the /gbdb fileserver.

Since the splice junction .bigBed files have their scores capped at 1000, the original IntronProspector .bed files are available in the same track hub directory. You can also find there *.calls.tsv files with more details about each junction, e.g. the number of uniquely mapping reads.
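
The effect of the cap can be seen in a few lines. This sketch assumes the bigBed score is simply the read count limited to 1000; the function names and read counts are illustrative.

```python
SCORE_CAP = 1000


def capped_score(read_count: int) -> int:
    """Score as stored in the bigBed: read count limited to SCORE_CAP."""
    return min(read_count, SCORE_CAP)


def passes_score_filter(read_count: int, min_score: int) -> bool:
    """Mimic a 'score' filter applied to the capped value."""
    return capped_score(read_count) >= min_score


# Two made-up junctions with different support become indistinguishable.
print(capped_score(1500), capped_score(5000))
print(passes_score_filter(40, 50), passes_score_filter(1500, 50))
```

Because junctions with 1,500 and 5,000 supporting reads receive the same capped score, the uncapped counts have to come from the original .bed or *.calls.tsv files.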

Credits

WiggleTools was written by Daniel Zerbino and IntronProspector by Mark Diekhans. Track hubs were written to a large extent by Brian Raney and colleagues at the UCSC Genome Browser. Track creation was done by Max Haeussler and tested by Jairo Navarro.

References

Zerbino DR, Johnson N, Juettemann T, Wilder SP, Flicek P. WiggleTools: parallel processing of large collections of genome-wide datasets for visualization and statistical analysis. Bioinformatics. 2014 Apr 1;30(7):1008-9. PMID: 24363377; PMC: PMC3967112

Diekhans M. IntronProspector GitHub repository. GitHub, 2018.

The Tabula Muris Consortium, Quake SR, Wyss-Coray T, Darmanis S. Single-cell transcriptomic characterization of 20 organs and tissues from individual mice creates a Tabula Muris. bioRxiv preprint, March 2018; accepted paper in Nature. 2018;562:367-372.