Quantifying the Relationship between DNA Methylation and Gene Expression in Human Colon Cancer
Matthias Lienhard
Max Planck Institute for Molecular Genetics, Berlin
IMPRS Colloquium
July 9th 2014
Epigenetics
CH Waddington, 1942: “epigenetic landscape”
Epigenetics
Arthur Riggs: "the study of mitotically and/or meiotically heritable changes in gene function that cannot be explained by changes in DNA sequence."
Riggs AD, Russo VEA, Martienssen RA (1996, Cold Spring Harbor Monograph): Epigenetic mechanisms of gene regulation.
DNA Methylation
● Methyl modification at cytosine
● In CpG context
● Represses gene activity
Gene silencing
● Promoter methylation → gene repressed
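To make the CpG context concrete, here is a minimal sketch that counts CpG dinucleotides (a cytosine directly followed by a guanine) in a DNA sequence. The function name `count_cpg` and the toy sequence are illustrative assumptions and not part of the talk or of any package named in it.

```python
def count_cpg(sequence):
    """Count CpG dinucleotides (C directly followed by G) on the given strand."""
    seq = sequence.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


# Toy sequence (hypothetical): CG occurs at positions 1, 3 and 7.
print(count_cpg("ACGCGTTCG"))
```

The same count, taken per genomic window, is the kind of quantity a CpG-content normalization would work with.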
Colon Cancer
● 3rd most common cancer
● 5-year survival ~40% - 60%
● Risk factors: age, lifestyle
● Epigenetic alteration
– Relatively frequent
– Epigenetic markers
Human Colon Cancer Samples
● 14 Colon Cancer Patients
● Normal Colon and Tumor Tissue
● MeDIP-Seq
● Low coverage input sequencing (0.2 x)
● RNA-Seq (12 Patients)
● High coverage WGS (4 Patients, 80 x, CG)
MEDIPS package
[Figure: MEDIPS workflow with panels
A: IP-seq (DNA fragments, immunoprecipitation of the epigenetic mark, sequencing of enriched fragments, short reads)
B: Alignment
C: Quality control
D: Normalization
E: Validation
F: Statistical test for differential coverage
G: Annotation and functional interpretation
Recoverable plot labels: rpkm; Pearson correlation against number of reads (saturation, estimated saturation); MeDIP reads in genomic window; MeDIP read density; input read density; estimated linear fit for MeDIP; CpG coupling factor (lower quantile, 25%-75% CF, upper quantile); MeDIP-seq rms value; bisulfite [% methylated]; r = 0.74; LogFC; Log( Ad x No ); fraction of differentially methylated regions [%], hypomethylated and hypermethylated, for all regions, introns, exons, promoters, CpG islands, and CGI promoters]
http://www.bioconductor.org/MEDIPS
M. Lienhard et al. (Bioinformatics, 2014)
MEDIPS: genome wide differential coverage analysis of
sequencing data derived from DNA enrichment experiments
MeDIP-seq
[Figure: DNA fragments → immunoprecipitation of fragments with methylated cytosine → enriched fragments → sequencing → short reads → alignment]
MEDIPS: Preprocessing
Alignment
Counting (reads per window of 250 bases)
→ Count matrix: 11,524,123 windows x 28 samples
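The counting step can be illustrated with a minimal sketch that bins aligned reads into fixed 250-base windows. It is a simplification under stated assumptions: each read is assigned to a window by its start position only, and the function name, input format, and toy positions are hypothetical. It does not reproduce how the MEDIPS package itself counts reads.

```python
from collections import Counter

WINDOW_SIZE = 250  # window width in bases, as stated on the slide


def count_reads_per_window(read_starts, chrom_length, window_size=WINDOW_SIZE):
    """Count reads per fixed-size window along one chromosome.

    read_starts: 0-based start positions of aligned reads (hypothetical input).
    Returns a list with one count per window.
    """
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = Counter(
        pos // window_size for pos in read_starts if 0 <= pos < chrom_length
    )
    return [counts.get(i, 0) for i in range(n_windows)]


# Toy example: a 1,000-base chromosome gives four 250-base windows.
toy_starts = [10, 240, 250, 260, 700, 999]
window_counts = count_reads_per_window(toy_starts, chrom_length=1000)
print(window_counts)
```

Repeating this for every sample and stacking the per-sample vectors as columns gives a windows x samples count matrix of the shape described above.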
MEDIPS: Normalization
● Library size
[Plot: coverage logFC against avg. log coverage]
M Robinson et al., Genome Biology (2010): A scaling normalization method for differential expression analysis of RNA-seq data
● CpG content
[Plot: x-axis number of CpG]
L Chavez et al., Genome Research (2010): Computational analysis of genome-wide DNA methylation during the differentiation of human embryonic stem cells along the endodermal lineage.
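The idea behind a scaling normalization for library size can be sketched as a trimmed mean of log ratios between a sample and a reference. The version below is deliberately simplified and is an assumption for illustration: it trims only on the log ratios and uses no precision weights, so it is not the published TMM procedure. The function name and toy counts are hypothetical.

```python
import math


def simple_tmm_factor(sample, reference, trim_m=0.3):
    """Unweighted trimmed mean of log2 ratios between two count vectors.

    Returns a scaling factor; multiplying the sample's library size by it
    gives an effective library size that is robust to a few extreme windows.
    """
    n_s, n_r = sum(sample), sum(reference)
    # Log2 ratios of library-size-scaled counts; windows with a zero are skipped.
    m_values = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )
    if not m_values:
        raise ValueError("no window has counts in both samples")
    k = int(len(m_values) * trim_m)  # number of values trimmed from each end
    kept = m_values[k:len(m_values) - k] if len(m_values) > 2 * k else m_values
    return 2 ** (sum(kept) / len(kept))


# Toy counts (hypothetical): one extreme window inflates the raw library size.
reference = [10] * 10
sample = [10] * 9 + [1000]
factor = simple_tmm_factor(sample, reference)
print(sum(sample) * factor)  # effective library size, close to 100
```

In the toy case the single high-coverage window makes the raw library size roughly eleven times larger than the reference, while the trimmed estimate brings the effective library size back to the level shared by the other nine windows.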
MEDIPS: Differential coverage analysis
edgeR:
● Developed for RNA-seq
● Negative binomial distribution
● Over-dispersion estimation
[Plot: LogFC against Log( Ad x No )]
M. D. Robinson et al., Bioinformatics (2010): edgeR: a Bioconductor package for differential expression analysis of digital gene expression data
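Over-dispersion in a negative binomial model means the variance exceeds the mean: var = mu + phi * mu^2, where phi is the dispersion. The sketch below estimates phi for one window with a plain method-of-moments formula. This is an illustrative assumption only; edgeR uses more elaborate likelihood-based estimators with information sharing across features, which are not reproduced here. The function name and toy counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments estimate of the NB dispersion phi.

    Solves var = mu + phi * mu**2 for phi and truncates at zero, because
    counts with variance <= mean show no over-dispersion.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max(0.0, (var - mu) / mu ** 2)


# Toy replicate counts for one window (hypothetical values).
print(moment_dispersion([0, 10, 20, 30, 40]))  # variance 250 vs. mean 20
print(moment_dispersion([4, 6, 5, 5]))         # variance below the mean
```

A dispersion of zero corresponds to the Poisson case; larger values indicate more variability between replicates than sampling noise alone would explain.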
CNV
● Present in most tumors
● Accumulation at genomic locations
● Profiles define clusters
[Figure: copy number profiles (deletion, amplification, CNA free) of normal and tumor samples along chromosomes 1 to 22]
CNV and MeDIP signal
[Plots: methylation logFC against avg. log methylation level for CNA-free regions, amplifications, and deletions; mean CNV logFC of CNV state]
M. Robinson et al. (Genome Research, 2012): Copy-number-aware differential analysis of quantitative DNA sequencing data.
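One crude way to picture the copy number effect on the MeDIP signal is to centre the methylation logFC within each CNV state, so that the shift shared by all amplified (or deleted) windows is removed. The sketch below does exactly that and nothing more. It is an assumption made for illustration and is not the copy-number-aware method of the cited paper; the function name, state labels, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cnv_adjusted_logfc(meth_logfc, cnv_state):
    """Subtract the mean methylation logFC of each CNV state from its windows."""
    by_state = defaultdict(list)
    for lfc, state in zip(meth_logfc, cnv_state):
        by_state[state].append(lfc)
    state_mean = {state: mean(values) for state, values in by_state.items()}
    return [lfc - state_mean[state] for lfc, state in zip(meth_logfc, cnv_state)]


# Toy windows (hypothetical): amplified windows are shifted up, deleted ones down.
logfc = [0.5, 0.7, -0.6, -0.4, 0.1, -0.1]
states = ["amp", "amp", "del", "del", "free", "free"]
adjusted = cnv_adjusted_logfc(logfc, states)
print([round(value, 2) for value in adjusted])
```

After centring, the remaining logFC reflects deviation from the typical signal of windows in the same copy number state rather than the copy number change itself.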
Explorative analysis
PCA
[Plot: PC 1 against PC 2]
Differentially methylated regions
● 27,711 windows with gain of methylation (0.22%)
● 16,193 windows with loss of methylation (0.13%)
[Plot: log ratio T/N against avg. log abundance]
DNA Methylation
[Bar chart: fraction of differentially methylated regions [%] (axis 0 to 7), gain of methylation and loss of methylation, for all regions, gene body, TFBS, promoter, and TFBS at promoters]
MAL
WNT2
[Pie charts, legend: No DMR in, TFBS, Promoter. Titles: Hypermethylated promoters, Hypomethylated promoters. Recoverable percentage groups: 5.5% / 88.5% / 6%; 13.4% / 81.4% / 5.3%; 4.4% / 15.7% / 79.9%]
Enrichment of TFBS in DMRs
[Bar charts: fraction with loss of methylation; fraction with gain of methylation]
EZH2 binding sites
[Bar chart: percentage (axis 0% to 40%) for genome wide, promoter, gene body, and intergenic]
Relationship of methylation and gene expression
[Scatter plot: promoter methylation logFC against expression logFC]
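One simple way to put a number on the relationship between promoter methylation logFC and expression logFC is the Pearson correlation coefficient. The sketch below is an assumption for illustration: the talk does not state which statistic was used for this comparison, and the values are hypothetical toy data, not results from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy values, not data from the study.
promoter_meth_logfc = [1.5, 0.8, 0.1, -0.4, -1.2]
expression_logfc = [-2.0, -0.9, 0.2, 0.3, 1.4]
r = pearson_r(promoter_meth_logfc, expression_logfc)
print(round(r, 3))
```

A negative coefficient would be in line with the repressive role of promoter methylation; the toy values were chosen to show that pattern.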
CHD1 binding sites
Acknowledgement
Max Planck Institute for Molecular
Genetics
Bernd Timmermann
Hans Lehrach
Christina Grimm
Michal Schweiger
Lukas Chavez
Ralf Herwig
Medical University Graz
Kurt Zatloukal
Universität des Saarlandes
Jörn Walter