Schema for CHM13 unique - CHM13 unique in comparison to GRCh38/hg38 and GRCh37/hg19
  Database: hub_567047_hs1
  Primary Table: hub_567047_hgUniquehg19
  Data last updated: 2022-04-09
Big Bed File Download: /gbdb/hs1/hgUnique/hgUnique.hg19.bb
Item Count: 546
The data is stored in the binary BigBed format.

Format description: Browser Extensible Data
field        example      description
chrom        chr1         Reference sequence chromosome or scaffold
chromStart   103168801    Start position in chromosome
chromEnd     103220196    End position in chromosome

Sample Rows
 
chrom   chromStart   chromEnd
chr1    103168801    103220196
chr1    103546781    103735057
chr1    108417231    108482805
chr1    108482833    108516786
chr1    120167856    120400804
chr1    120401129    120452282
chr1    120457037    120461485
chr1    120464920    120858555
chr1    120878232    120965232
chr1    120972310    121095976
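
BED intervals are zero-based and half-open, so the length of each region is chromEnd minus chromStart. The sketch below appends that length to each row using awk. It assumes the rows are available as tab-separated text; the function name `region_lengths` is illustrative and not part of the track's tooling.

```shell
# region_lengths: read 3-column, tab-separated BED on stdin and
# print each row with its length (chromEnd - chromStart) appended.
region_lengths() {
  awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, $3 - $2 }'
}

# First two sample rows from the table above.
printf 'chr1\t103168801\t103220196\nchr1\t103546781\t103735057\n' | region_lengths
# prints:
# chr1	103168801	103220196	51395
# chr1	103546781	103735057	188276
```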

CHM13 unique for hg19 (hub_567047_hgUniquehg19) Track Description
 

Description

These tracks show the regions unique to the T2T-CHM13 v2.0 assembly compared to the GRCh38/hg38 and GRCh37/hg19 reference assemblies.

Methods

    Converting a chain file to the PAF format

    We used the `to_paf.py` script from chaintools (https://doi.org/10.5281/zenodo.6342391, v0.1) to convert the v1_nfLO chains to the PAF format.
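
For orientation, PAF is tab-delimited text whose first four columns are query name, query length, query start, and query end. The sketch below uses a made-up record and a hypothetical helper name, `paf_query_bed`, to show how columns 1, 3, and 4 yield a BED-style interval. It does not reproduce `to_paf.py` itself.

```shell
# paf_query_bed: print query name, query start, query end
# (PAF columns 1, 3, 4) as a 3-column BED-style line.
paf_query_bed() {
  awk 'BEGIN { FS = OFS = "\t" } { print $1, $3, $4 }'
}

# Toy PAF record with the 12 mandatory columns (values are invented).
printf 'chr1\t1000\t100\t400\t+\tchr1\t1200\t150\t450\t290\t300\t60\n' | paf_query_bed
# prints: chr1	100	400
```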

    Obtaining unique regions

    We used the following commands to obtain, in BED format, the regions of T2T-CHM13 v2.0 that are unique relative to GRCh38/hg38 and GRCh37/hg19.

    
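    # PAF columns 1, 3, 4 (query name, start, end) give the CHM13 intervals covered by
    # GRCh38 chains; the complement against the CHM13 index is what remains uncovered.
    cut -f 1,3,4 grch38-chm13v2.paf \
      | bedtools sort -i - -g chm13v2.0.fasta.fai \
      | bedtools merge \
      | bedtools complement -g chm13v2.0.fasta.fai -i - \
      | bedtools merge \
      > T2T-CHM13v2.0_unique_regions_hg38.bed

    # Same steps for the GRCh37/hg19 chains.
    cut -f 1,3,4 hg19-chm13v2.paf \
      | bedtools sort -i - -g chm13v2.0.fasta.fai \
      | bedtools merge \
      | bedtools complement -g chm13v2.0.fasta.fai -i - \
      | bedtools merge \
      > T2T-CHM13v2.0_unique_regions_hg19.bed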
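
To illustrate what the complement step computes, here is a toy stand-in written in awk rather than bedtools. It assumes a two-column genome file (name and length, as in the first two columns of a `.fai` index) and a BED file that is already sorted and merged. The function name `complement_bed` and the chrA/chrB data are hypothetical.

```shell
# complement_bed GENOME BED
#   GENOME: tab-separated "name<TAB>length" lines, in the desired output order.
#   BED:    sorted, non-overlapping intervals (as produced by a merge step).
# Prints every stretch of each sequence that no interval covers.
complement_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       # First file: remember sequence lengths and their order.
       NR == FNR { len[$1] = $2 + 0; order[++n] = $1; next }
       # Second file: store the intervals per sequence.
       { cnt[$1]++; s[$1, cnt[$1]] = $2 + 0; e[$1, cnt[$1]] = $3 + 0 }
       END {
         for (i = 1; i <= n; i++) {
           c = order[i]; pos = 0
           for (j = 1; j <= cnt[c]; j++) {
             if (s[c, j] > pos) print c, pos, s[c, j]   # gap before this interval
             if (e[c, j] > pos) pos = e[c, j]
           }
           if (pos < len[c]) print c, pos, len[c]       # gap after the last interval
         }
       }' "$1" "$2"
}

# Toy input: chrA is partly covered, chrB has no alignments at all.
tmp=$(mktemp -d)
printf 'chrA\t100\nchrB\t50\n' > "$tmp/genome.txt"
printf 'chrA\t10\t30\nchrA\t60\t100\n' > "$tmp/aligned.bed"
complement_bed "$tmp/genome.txt" "$tmp/aligned.bed"
# prints:
# chrA	0	10
# chrA	30	60
# chrB	0	50
rm -r "$tmp"
```

In this toy case, a sequence with no aligned intervals (chrB) is reported in full, which is how entirely uncovered sequences end up in the output of a complement.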
    

Credits

The unique region annotations were generated by Nae-Chyun Chen <naechyun.chen@gmail.com> and Mitchell Vollger <mvollger@uw.edu>.

References

Nurk S, Koren S, Rhie A, Rautiainen M, et al. The complete sequence of a human genome. bioRxiv, 2021.