Principles and applications of three-dimensional genomics (Hi-C)

Hi-C is derived from chromosome conformation capture (3C). It combines high-throughput sequencing with bioinformatic analysis to study the spatial relationships between chromatin DNA across the whole genome and to obtain high-resolution information about the three-dimensional structure of chromatin. Beyond studying interactions between chromosome fragments and building genome folding models, Hi-C can be applied to genome assembly, haplotype map construction, and assisting metagenome assembly. Hi-C data can also be analyzed jointly with RNA-Seq, ChIP-Seq, and other data to explain the mechanisms behind the formation of biological traits at the level of gene regulatory networks and epigenetic networks.

The following is compiled from an explanatory video by Fraser Gene. Original video: https://www.bilibili.com/video/BV1f7411n7zU?p=23

Features of Hi-C and other 3D genome technologies

Experimental principle of Hi-C

Chromatin is cross-linked and fixed with formaldehyde to preserve its three-dimensional conformation, then digested with a restriction endonuclease. After digestion, the fragment ends are repaired with biotin-labeled nucleotides and then ligated. After ligation, the proteins are removed and the DNA is broken into small fragments. The biotin-labeled fragments are captured with magnetic beads and sequenced.
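The digestion and ligation steps can be illustrated in silico. The sketch below is only a toy model: the text does not name the enzyme, so HindIII (A^AGCTT) and MboI (^GATC) are assumed here as examples, and `digest` and `ligation_junction` are hypothetical helper names. It cuts a sequence at each recognition site and shows the junction sequence that appears when two filled-in ends are ligated, which is the sequence a read crosses when it spans a ligation junction.

```python
import re

def digest(sequence, site, cut_offset):
    """Split `sequence` at every occurrence of the recognition `site`.

    `cut_offset` is where the top strand is cut inside the site
    (1 for A^AGCTT, 0 for ^GATC).
    """
    # A lookahead lets overlapping occurrences of the site be found too.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]

def ligation_junction(site, cut_offset):
    """Sequence formed when two filled-in ends of a 5'-overhang cut are
    blunt-ligated: the overhang bases appear twice."""
    return site[:len(site) - cut_offset] + site[cut_offset:]

# Toy sequence with two HindIII sites (assumed enzyme, for illustration only).
toy = "TTAAGCTTGGGGAAGCTTCC"
fragments = digest(toy, "AAGCTT", 1)
print(fragments)                       # ['TTA', 'AGCTTGGGGA', 'AGCTTCC']
print(ligation_junction("AAGCTT", 1))  # AAGCTAGCTT
print(ligation_junction("GATC", 0))    # GATCGATC
```

The junction formula only holds for palindromic sites that leave a 5' overhang; other enzymes would need their own rule.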

Hi-C analysis process

(a) Quality control. Filtering yields high-quality FASTQ data (paired-end, 150 bp). If the alignment software does not support split mapping, iterative alignment is generally used, because a read that spans the ligation junction joins sequences that are not adjacent in the genome and may therefore align poorly over its full length. Alignment starts with the first 25 bp from the left end of the read. If the alignment is unique, it stops; if there are multiple alignment positions, the read is extended by 5 bp at a time until a unique alignment appears. Alternatively, choose software that supports split mapping, which aligns the read in segments.

(b) Select the high-quality alignments.

(c) Apply Hi-C-specific criteria to the aligned pairs.

(d) Correct the valid pairs. After correction, the interaction matrix is obtained.
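The iterative alignment described in step (a) can be sketched as follows. This is a simplified illustration, not how any particular tool implements it: exact string matching against a toy genome stands in for a real short-read aligner, and `find_all`, `iterative_map`, and the toy sequences are all hypothetical. The 25 bp starting length and 5 bp step are taken from the text.

```python
def find_all(genome, query):
    """Return every start position where `query` occurs exactly in `genome`."""
    hits, start = [], genome.find(query)
    while start != -1:
        hits.append(start)
        start = genome.find(query, start + 1)
    return hits

def iterative_map(read, genome, start_len=25, step=5):
    """Map a growing prefix of `read` until it has exactly one hit.

    Returns (position, prefix_length), or None if no prefix maps uniquely.
    """
    length = min(start_len, len(read))
    while True:
        hits = find_all(genome, read[:length])
        if len(hits) == 1:
            return hits[0], length
        # Give up if the prefix no longer matches or the read is exhausted.
        if not hits or length == len(read):
            return None
        length = min(length + step, len(read))

# Toy genome containing the same 25 bp repeat twice, with different flanks.
repeat = "ACGTTGCAAGGCTTAACCGGTATCG"
genome = "TTTT" + repeat + "GGGGGCCCCC" + "AAAA" + repeat + "CATCATCATC" + "TTTT"

# Chimeric read: 30 bp from the second repeat copy, then unrelated sequence
# standing in for the other side of a ligation junction.
read = repeat + "CATCA" + "TGGGGGCCCC"

print(find_all(genome, read))      # [] : the full read does not map
print(iterative_map(read, genome)) # (43, 30)
```

The 25 bp prefix matches both repeat copies, so the prefix is extended to 30 bp, where only the second copy matches. A real aligner would also tolerate mismatches and report a mapping quality rather than requiring an exact unique hit.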

Ferhat Ay et al., 2015

Bryan R. Lajoie et al., 2014

Common analysis software

Software tools for Hi-C data analysis

| Tool | Short-read aligner(s) | Mapping improvement | Read filtering | Read-pair filtering | Normalization | Visualization | Confidence estimation | Implementation language(s) |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| HiCUP [46] | Bowtie/Bowtie2 | Pre-truncation | | | | | | Perl, R |
| Hiclib [47] | Bowtie2 | Iterative | ✓ a | | Matrix balancing | | | Python |
| HiC-inspector [131] | Bowtie | | | | | | | Perl, R |
| HIPPIE [132] | STAR | ✓ b | | | | | | Python, Perl, R |
| HiC-Box [133] | Bowtie2 | | | | Matrix balancing | | | Python |
| HiCdat [122] | Subread | − c | | | Three options d | | | C++, R |
| HiC-Pro [134] | Bowtie2 | Trimming | | | Matrix balancing | | | Python, R |
| TADbit [120] | GEM | Iterative | | | Matrix balancing | | | Python |
| HOMER [62] | | | | | Two options e | | | Perl, R, Java |
| Hicpipe [54] | | | | | Explicit-factor | | | Perl, R, C++ |
| HiBrowse [69] | | | | | | | | Web-based |
| Hi-Corrector [57] | | | | | Matrix balancing | | | ANSI C |
| GOTHiC [135] | | | | | | | | R |
| HiTC [121] | | | | | Two options f | | | R |
| chromoR [59] | | | | | Variance stabilization | | | R |
| HiFive [136] | | | | | Three options g | | | Python |
| Fit-Hi-C [20] | | | | | | | | Python |

Hi-C visualization

Data analysis

Sequence filtering

Filtering principle
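As a rough illustration of read-pair filtering, the sketch below classifies a mapped pair by the restriction fragments its two reads fall in. The categories (valid pair, dangling end, self-circle) and the orientation rule follow common practice in Hi-C pipelines and are an assumption here, not something stated in this article. `fragment_index`, `classify_pair`, and the toy cut sites are hypothetical, and each read is represented by the position and strand of its 5' end.

```python
from bisect import bisect_right

def fragment_index(cut_sites, position):
    """Index of the restriction fragment containing `position`,
    given the sorted cut coordinates of one chromosome."""
    return bisect_right(cut_sites, position)

def classify_pair(read1, read2, cut_sites):
    """Classify a mapped read pair; each read is (chrom, position, strand)."""
    chrom1, pos1, strand1 = read1
    chrom2, pos2, strand2 = read2
    if chrom1 != chrom2:
        return "valid"
    sites = cut_sites[chrom1]
    if fragment_index(sites, pos1) != fragment_index(sites, pos2):
        return "valid"
    # Both reads lie on one fragment, so no informative ligation is observed.
    if strand1 == strand2:
        return "same_fragment_other"
    left_strand = strand1 if pos1 <= pos2 else strand2
    # Reads facing each other: unligated fragment. Facing outward: the
    # fragment ligated to itself.
    return "dangling_end" if left_strand == "+" else "self_circle"

cut_sites = {"chr1": [100, 200, 300]}  # toy cut coordinates
print(classify_pair(("chr1", 120, "+"), ("chr1", 180, "-"), cut_sites))  # dangling_end
print(classify_pair(("chr1", 120, "-"), ("chr1", 180, "+"), cut_sites))  # self_circle
print(classify_pair(("chr1", 120, "+"), ("chr1", 250, "-"), cut_sites))  # valid
```

Real pipelines apply further filters, such as removing PCR duplicates and pairs whose implied insert size is implausible, which are left out here.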

Data correction

Why correct the data?

Eitan Yaffe & Amos Tanay, 2011

Effect of data correction
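One common form of correction is matrix balancing, the normalization listed for several tools in the software table. The sketch below is a minimal pure-Python version of an iterative balancing scheme, assuming a small symmetric contact matrix; `ice_balance` and the toy bias values are hypothetical, and production tools add steps such as filtering sparse bins that are omitted here.

```python
def ice_balance(matrix, iterations=50):
    """Iteratively rescale a symmetric contact matrix so that every
    non-empty row (and column) ends up with the same total.

    Returns the balanced matrix and the accumulated per-bin bias.
    """
    n = len(matrix)
    balanced = [row[:] for row in matrix]
    bias = [1.0] * n
    for _ in range(iterations):
        sums = [sum(row) for row in balanced]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break
        mean = sum(nonzero) / len(nonzero)
        # Bins with more coverage than average get a factor above 1.
        factors = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
    return balanced, bias

# Toy example: an underlying contact map distorted by per-bin biases.
true_contacts = [[10, 4, 2], [4, 10, 4], [2, 4, 10]]
toy_bias = [1.0, 2.0, 0.5]
raw = [[true_contacts[i][j] * toy_bias[i] * toy_bias[j] for j in range(3)]
       for i in range(3)]

balanced, bias = ice_balance(raw)
print([round(sum(row), 3) for row in raw])       # unequal coverage per bin
print([round(sum(row), 3) for row in balanced])  # equal coverage per bin
```

Before balancing, the row totals differ widely because of the per-bin bias; afterwards every bin has the same total coverage, which is the visible effect of correction on a heat map.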

Analyses that can be performed

1. cis/trans interaction ratio

2. Relationship between interaction frequency and distance

3. Compartment analysis

4. TAD analysis

5. Significant interaction analysis
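The first two analyses in the list can be computed directly from a table of valid pairs. The sketch below assumes each pair is a tuple `(chrom1, pos1, chrom2, pos2)`; this layout, the function names, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def cis_trans_ratio(pairs):
    """Ratio of intra-chromosomal (cis) to inter-chromosomal (trans) pairs."""
    cis = sum(1 for c1, _, c2, _ in pairs if c1 == c2)
    trans = len(pairs) - cis
    return cis / trans if trans else float("inf")

def distance_decay(pairs, bin_size):
    """Number of cis contacts in each genomic-distance bin."""
    counts = Counter()
    for c1, p1, c2, p2 in pairs:
        if c1 == c2:
            counts[abs(p2 - p1) // bin_size] += 1
    return dict(sorted(counts.items()))

# Toy valid pairs: four cis contacts and two trans contacts.
pairs = [
    ("chr1", 1000, "chr1", 6000),
    ("chr1", 2000, "chr1", 15000),
    ("chr1", 50000, "chr1", 52000),
    ("chr2", 3000, "chr2", 40000),
    ("chr1", 1000, "chr2", 9000),
    ("chr2", 500, "chr3", 700),
]

print(cis_trans_ratio(pairs))        # 2.0
print(distance_decay(pairs, 10000))  # {0: 2, 1: 1, 3: 1}
```

With real data the distance bins are usually spaced logarithmically and the counts are normalized by the number of possible bin pairs at each distance, so that the decay curve can be compared between samples.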

Hi-C applications

1. Analyze genome-wide interaction patterns

2. Assist in improving genome assembly

3. Construct a genome haplotype map

Reference:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556012/

https://www.ncbi.nlm.nih.gov/pubmed/?term=Bryan+R.+Lajoie+%3B+2014

https://www.ncbi.nlm.nih.gov/pubmed/?term=Eitan+Yaffe%3B2011

http://yulijia.net/cn/%E7%94%9F%E7%89%A9%E4%BF%A1%E6%81%AF/2016/04/15/3C-4C-5C-HiC-ChIAPET-and-ChIPloop.html

https://www.bilibili.com/video/BV1f7411n7zU?p=23

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347522/

https://www.nature.com/articles/ng.947.pdf

Reference: Principle and Application of Three-dimensional Genome (Hi-C), Tencent Cloud Community, https://cloud.tencent.com/developer/article/1625366