Schema for ORF predictions - Weizman ORF predictions
Database: wuhCor1    Primary Table: ORFs    Data last updated: 2020-09-28
Big Bed File Download: /gbdb/wuhCor1/bbi/weizmanOrfs/ORFs.bb
Item Count: 23
The data is stored in the binary BigBed format.

Format description: Browser Extensible Data
field         example       description
chrom         NC_045512v2   Reference sequence chromosome or scaffold
chromStart    21743         Start position in chromosome
chromEnd      21863         End position in chromosome
name          S.iORF1       Name of item.
score         0             Score (0-1000)
strand        +             + or - for strand
thickStart    21743         Start of where display should be thick (start codon)
thickEnd      21863         End of where display should be thick (stop codon)
reserved      0,204,0       Used as itemRgb as of 2004-11-22
blockCount    1             Number of blocks
blockSizes    120,          Comma separated list of block sizes
chromStarts   0,            Start positions relative to chromStart
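
The twelve fields above can be read into a typed record with a few lines of Python. This is a minimal sketch, not part of the track documentation: the `Bed12` class and the `parse_bed12` helper are hypothetical names, and the code assumes the row is available as tab-separated text (for example, after exporting the table) and that the `reserved` field holds an R,G,B triple as in the example column.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bed12:
    """One row of the table, using the field names from the schema."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: Tuple[int, int, int]
    block_count: int
    block_sizes: List[int]
    chrom_starts: List[int]


def _int_list(text: str) -> List[int]:
    # List fields carry a trailing comma ("120,"), so empty pieces are skipped.
    return [int(piece) for piece in text.split(",") if piece]


def parse_bed12(line: str) -> Bed12:
    # Assumption: fields are tab-separated, which lets names contain spaces.
    f = line.rstrip("\n").split("\t")
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    r, g, b = _int_list(f[8])
    return Bed12(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
                 int(f[6]), int(f[7]), (r, g, b), int(f[9]),
                 _int_list(f[10]), _int_list(f[11]))


# The example values from the schema table, joined with tabs.
record = parse_bed12(
    "NC_045512v2\t21743\t21863\tS.iORF1\t0\t+\t21743\t21863\t0,204,0\t1\t120,\t0,"
)
print(record.name, record.chrom_end - record.chrom_start)
```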

Sample Rows
 
chrom        chromStart  chromEnd  name              score  strand  thickStart  thickEnd  reserved   blockCount  blockSizes  chromStarts
NC_045512v2  21743       21863     S.iORF1           0      +       21743       21863     0,204,0    1           120,        0,
NC_045512v2  21767       21863     S.iORF2           0      +       21767       21863     0,204,0    1           96,         0,
NC_045512v2  25456       25582     3a.iORF1 (ORF3c)  0      +       25456       25582     0,204,0    1           126,        0,
NC_045512v2  25595       25697     3a.iORF2          0      +       25595       25697     0,204,0    1           102,        0,
NC_045512v2  26436       26472     E.iORF            0      +       26436       26472     255,128,0  1           36,         0,
NC_045512v2  26483       27191     M.ext             0      +       26483       27191     255,128,0  1           708,        0,
NC_045512v2  27150       27195     M.iORF            0      +       27150       27195     255,128,0  1           45,         0,
NC_045512v2  27255       27387     6.iORF            0      +       27255       27387     0,204,0    1           132,        0,
NC_045512v2  27399       27759     7a.iORF1          0      +       27399       27759     255,128,0  1           360,        0,
NC_045512v2  27580       27607     7a.iORF2          0      +       27580       27607     255,128,0  1           27,         0,
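
The sample rows can be checked for internal consistency. The sketch below relies on the BED convention that coordinates are 0-based and half-open, so an item's length is chromEnd minus chromStart and its first base in 1-based coordinates is chromStart + 1. It verifies that each single-block row has a block size equal to that length and that the length is a multiple of three. The `SAMPLE_ROWS` list and the `summarize` function are illustrative names. The triplet count is plain arithmetic on the interval, and whether the interval includes the stop codon is not stated here.

```python
# (name, chromStart, chromEnd, blockSize) copied from the sample rows.
SAMPLE_ROWS = [
    ("S.iORF1", 21743, 21863, 120),
    ("S.iORF2", 21767, 21863, 96),
    ("3a.iORF1 (ORF3c)", 25456, 25582, 126),
    ("3a.iORF2", 25595, 25697, 102),
    ("E.iORF", 26436, 26472, 36),
    ("M.ext", 26483, 27191, 708),
    ("M.iORF", 27150, 27195, 45),
    ("6.iORF", 27255, 27387, 132),
    ("7a.iORF1", 27399, 27759, 360),
    ("7a.iORF2", 27580, 27607, 27),
]


def summarize(rows):
    """Return (name, first_base_1based, last_base_1based, triplets) per row."""
    summary = []
    for name, start, end, block_size in rows:
        length = end - start  # half-open interval, so no +1
        if length != block_size:
            raise ValueError(f"{name}: block size {block_size} != length {length}")
        if length % 3 != 0:
            raise ValueError(f"{name}: length {length} is not a multiple of 3")
        summary.append((name, start + 1, end, length // 3))
    return summary


summary = summarize(SAMPLE_ROWS)
for name, first_base, last_base, triplets in summary:
    print(f"{name}\t{first_base}-{last_base}\t{triplets} triplets")
```

All ten sample rows pass both checks, which is what one would expect of single-exon ORF models.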

ORF predictions (ORFs) Track Description
 

Description

The Weizman ORFs (Open Reading Frames) track shows previously unannotated ORF predictions based on Ribo-seq and RNA-seq data. It is a collection of tracks (super track) that contains not only the predicted gene models, but also the data supporting them.

Display Conventions and Configuration

The Predicted ORFs track shows the predicted exons. All other tracks show the signal as an x-y plot with bars.
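
In the BED format, the exons drawn for an item come from its blockSizes and chromStarts fields, where each block start is an offset from chromStart. The following sketch shows that arithmetic. `block_intervals` is a hypothetical helper name, and the two-block call at the end uses made-up numbers purely for illustration, since every sample row in this table has a single block.

```python
from typing import List, Tuple


def block_intervals(chrom_start: int, block_sizes: str,
                    chrom_starts: str) -> List[Tuple[int, int]]:
    """Convert BED block fields to absolute, half-open (start, end) intervals."""
    # Both fields are comma-separated lists with a trailing comma.
    sizes = [int(x) for x in block_sizes.split(",") if x]
    offsets = [int(x) for x in chrom_starts.split(",") if x]
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts differ in length")
    return [(chrom_start + off, chrom_start + off + size)
            for off, size in zip(offsets, sizes)]


# M.ext from the sample rows: one block covering the whole item.
print(block_intervals(26483, "708,", "0,"))

# Hypothetical two-block item (not from this track), to show the offsets.
print(block_intervals(100, "10,20,", "0,50,"))
```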

Methods

Methods from Finkel et al:

To capture the full SARS-CoV-2 coding capacity, we applied a suite of ribosome profiling approaches to Vero cells infected with SARS-CoV-2 for 5 and 24 hours, and Calu3 cells infected for 7 hours. For each time point we prepared three different ribosome-profiling libraries, each one in two biological replicates. Two Ribo-seq libraries facilitate mapping of translation initiation sites, by treating cells with lactimidomycin (LTM) or harringtonine (Harr), two drugs with distinct mechanisms that prevent 80S ribosomes at translation initiation sites from elongating. The third Ribo-seq library was prepared from cells treated with the translation elongation inhibitor cycloheximide (CHX), and gives a snap-shot of actively translating ribosomes across the body of the translated ORF. In parallel, RNA-sequencing was applied to map viral transcripts.

The ORF prediction was done by using two computational tools, PRICE and ORF-RATER, that rely on different features of ribosome profiling data, and by manual inspection of the data. The predictions are based on Ribo-seq libraries from two time points (5 and 7 hpi) of two different cell lines (Vero E6 and Calu3 cells), infected with separate virus isolates.

The Ribo-seq data of the 24-hour samples do not show the expected profile of read distribution on viral genes and were therefore not used for the ORF predictions.

For more details see the paper in the References section below.

Data Access

The raw data can be explored interactively with the Table Browser, or combined with other datasets in the Data Integrator tool.

Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.

References

Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, Tamir H, Achdout H, Stein D, Israeli O et al. The coding capacity of SARS-CoV-2. Nature. 2020 Sep 9. PMID: 32906143