cancer ngs - research / clinic bioinformatics challenges

CANCER NGS - RESEARCH / CLINIC
BIOINFORMATICS CHALLENGES
Anthony Ferrari
Plateforme Bioinformatique
Synergie Lyon Cancer
« Grant INCa-4664 »
www.cancer-lyric.com
&#+$'+'"+
/#""&# #",&'(&
NGS FAST EVOLUTION
Sequencing throughput, 2005 to 2014:
  13 Mb/hour (2005)
  200 Mb/hour
  0.5 Gb/hour
  2.5 Gb/hour
  25 Gb/hour (2014, HiSeq X Ten)
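To put the throughput figures in perspective, the short Python sketch below computes the fold increase between the slowest and fastest rates, and the instrument time needed for one genome. The genome size (3.1 Gb) and the 30x depth are assumptions chosen for illustration; they are not stated on the slide.

```python
MB = 1e6  # one megabase
GB = 1e9  # one gigabase

# Throughput values quoted on the slide, in bases per hour.
SLIDE_RATES = [13 * MB, 200 * MB, 0.5 * GB, 2.5 * GB, 25 * GB]

# Assumptions for illustration only (not taken from the slide).
GENOME_SIZE = 3.1e9  # approximate human genome size, in bases
COVERAGE = 30        # target sequencing depth


def fold_increase(old_rate, new_rate):
    """Ratio between two throughput values."""
    return new_rate / old_rate


def hours_for_genome(rate, coverage=COVERAGE, genome_size=GENOME_SIZE):
    """Instrument hours needed to produce coverage x genome_size bases."""
    return coverage * genome_size / rate


if __name__ == "__main__":
    print(f"Fold increase: {fold_increase(SLIDE_RATES[0], SLIDE_RATES[-1]):.0f}x")
    for rate in SLIDE_RATES:
        print(f"{rate / GB:g} Gb/hour: {hours_for_genome(rate):,.1f} hours per genome")
```

Under these assumptions the gain between the first and last figures is roughly 1,900-fold.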
HIERARCHY OF NEEDS
Analysis
•  Somatic variants
•  Structural rearrangements
•  Data integration
•  Data presentation
•  Distributed data sources
•  Heterogeneous data
•  Data exchange
Infrastructures
•  Storage
•  Computation
Main goal:
  50 pathologies
  Whole genome
  500 tumours
  500 normals
PROGRAMS
  HER2+ Amplified Breast Cancer (G. Thomas, A. Viari)
    75 genome pairs, Tumour 45x, Normal 30x
    RNA-seq
    Infinium 450k
  Prostate Cancer (O. Cussenot)
    25 genome pairs, Tumour 45x, Normal 30x
    RNA-seq
    Exomes
  Gynecological Carcinosarcoma (A. Puisieux)
    20 genome pairs, Tumour 75x, Normal 50x
    RNA-seq
    Infinium 450k
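The whole-genome figures above translate directly into a raw sequencing volume. The sketch below multiplies the number of pairs by the combined tumour and normal depth and by a genome size. The 3.1 Gb genome size is an assumption, and RNA-seq, exome and methylation data are left out.

```python
from dataclasses import dataclass

GENOME_SIZE_GB = 3.1  # assumption: approximate human genome size in gigabases


@dataclass
class Programme:
    name: str
    pairs: int      # number of tumour/normal genome pairs
    tumour_x: int   # tumour sequencing depth
    normal_x: int   # normal sequencing depth

    def total_gigabases(self) -> float:
        """Raw whole-genome bases for all pairs, in gigabases."""
        return self.pairs * (self.tumour_x + self.normal_x) * GENOME_SIZE_GB


PROGRAMMES = [
    Programme("HER2+ amplified breast cancer", 75, 45, 30),
    Programme("Prostate cancer", 25, 45, 30),
    Programme("Gynecological carcinosarcoma", 20, 75, 50),
]

if __name__ == "__main__":
    for p in PROGRAMMES:
        print(f"{p.name}: {p.total_gigabases() / 1000:.1f} Tb")
    total = sum(p.total_gigabases() for p in PROGRAMMES)
    print(f"Total: {total / 1000:.1f} Tb")
```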
INFRASTRUCTURES
WHOLE GENOME – TUMOUR 50X, NORMAL 30X
  Alignment, …, mutations, indels, …
  600 h CPU
  Cumulative CPU time: […] 2000
  Cumulative storage: 1 […] ([…] 1000)
INFRASTRUCTURES
  INCa – Synergie cluster
    64 computing nodes [8 CPUs // 64 GB RAM]
    400 TB effective storage
  LYRIC cluster
    20 computing nodes [12 CPUs // 128 GB RAM]
    100 TB effective storage + 60 TB backup storage
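The node counts above give the aggregate capacity of each cluster. The sketch below adds them up and converts a CPU-hour budget into wall-clock time. It assumes ideal parallel scaling, which real pipelines rarely reach, and the `wall_clock_hours` helper and its example budget are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    nodes: int
    cpus_per_node: int
    ram_gb_per_node: int
    storage_tb: int  # effective storage only, backup excluded

    @property
    def total_cpus(self) -> int:
        return self.nodes * self.cpus_per_node

    @property
    def total_ram_gb(self) -> int:
        return self.nodes * self.ram_gb_per_node


CLUSTERS = [
    Cluster("INCa - Synergie", nodes=64, cpus_per_node=8, ram_gb_per_node=64, storage_tb=400),
    Cluster("LYRIC", nodes=20, cpus_per_node=12, ram_gb_per_node=128, storage_tb=100),
]


def wall_clock_hours(cpu_hours: float, cluster: Cluster) -> float:
    """Hypothetical helper: wall-clock time if the work spreads perfectly over all CPUs."""
    return cpu_hours / cluster.total_cpus


if __name__ == "__main__":
    for c in CLUSTERS:
        print(f"{c.name}: {c.total_cpus} CPUs, {c.total_ram_gb} GB RAM, {c.storage_tb} TB")
    # Example budget of 1024 CPU-hours, an arbitrary illustrative value.
    print(f"{wall_clock_hours(1024, CLUSTERS[0]):.1f} h on {CLUSTERS[0].name}")
```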
INFORMATION SYSTEMS
[Diagram labels: Molecular Analysis; Clinical Data; Sequencing Data; Genomic Alterations; Annotations; Images]
INFORMATION SYSTEMS
[Diagram: information systems and their contents]
  Clinical DB: patient, family history, survival, lifestyle (smoking, …), sample analyses
  GenericTracker: patient, images, genotyping, sequencing, quality control
  Genomic DB: patient, germline variations, somatic variations
  Workflow management system: raw data files, result files, software versions, annotations, query system
  ICGC information system
QUALITY CONTROLS
Pre-sequencing characterisation
  DNA/RNA quantity & quality
  Pathology (anapath) purity estimation
  Tumour subtype
  Genomic alterations
  Purity & ploidy
Post-sequencing quality control
  Phred score quality
  Uniquely mapped reads
  Coverage vs. bias
  …
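A common post-sequencing check is base quality on the Phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below decodes a FASTQ quality string and reports the fraction of bases at or above a threshold. The Phred+33 offset and the Q30 threshold are assumptions; the slides do not say which encoding or cut-off was used.

```python
def phred_to_error(q: float) -> float:
    """Error probability for a Phred quality score."""
    return 10 ** (-q / 10)


def decode_quality(qual: str, offset: int = 33) -> list[int]:
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in qual]


def fraction_at_least(scores: list[int], threshold: int = 30) -> float:
    """Fraction of bases whose score is at or above the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


if __name__ == "__main__":
    scores = decode_quality("IIIIII5555!!")  # made-up quality string
    print(scores)
    print(f"Fraction at Q30 or better: {fraction_at_least(scores):.2f}")
    print(f"Error probability at Q30: {phred_to_error(30):.4f}")
```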
GENOMIC ALTERATIONS
[Figure after Meyerson et al., 2010. Labels: Tumour; Normal; Germline variants; Somatic variants]
SOMATIC POINT MUTATIONS
[Figure: numbers of somatic point mutations, all versus filtered; axis ticks from 2k to 15k]
ICGC BENCHMARK
[Figure: two panels labelled by specificity, 75% and 30%]
STRUCTURAL VARIANTS
Overall principle:
•  Split reads
   1.  Breakpoint detection
   2.  Genome reconstruction
•  Read depth
•  Allelic ratio
•  Abnormal mate pairs
   (orientation, insert size)
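The mate-pair signal can be illustrated with a small classifier that looks only at orientation and insert size. This is a simplified sketch, not the pipeline used in the talk: the `MatePair` record, the expected forward/reverse orientation, the insert-size mean and standard deviation, and the category names are all assumptions, and the span is measured between leftmost alignment positions.

```python
from dataclasses import dataclass


@dataclass
class MatePair:
    chrom1: str
    pos1: int      # leftmost aligned position of the first mate
    strand1: str   # "+" or "-"
    chrom2: str
    pos2: int      # leftmost aligned position of the second mate
    strand2: str


def classify(pair: MatePair, mean_insert: int = 400, sd_insert: int = 50, n_sd: int = 3) -> str:
    """Label a mate pair as concordant or as one kind of abnormal pair."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Order the two mates by position so orientation can be read left to right.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "same-strand"          # abnormal orientation
    if left_strand == "-":
        return "everted"              # mates point away from each other
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "insert too large"
    if span < mean_insert - n_sd * sd_insert:
        return "insert too small"
    return "concordant"


if __name__ == "__main__":
    print(classify(MatePair("chr11", 1000, "+", "chr13", 5000, "-")))
    print(classify(MatePair("chr11", 1000, "+", "chr11", 9000, "-")))
```

In practice, a structural variant caller would cluster many abnormal pairs supporting the same event before reporting it; a single pair is weak evidence.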
STRUCTURAL VARIANTS
[Figure: chr11 and chr13]
SUMMARY
  Importance of infrastructure & data management
  SNV
    Good concordance
    No best pipeline though
  Indels
    Very weak concordance
    Improvements required
  Structural variants
    Systematic detection is difficult
    False positives
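One simple way to quantify concordance between pipelines is the Jaccard index of their call sets: shared calls divided by all calls made by either pipeline. The sketch below is an illustration under assumptions; the pipeline names and variants are made up, and the talk does not state which concordance measure was used.

```python
from itertools import combinations

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def jaccard(a: set, b: set) -> float:
    """Shared calls divided by the union of calls; 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(calls: dict[str, set]) -> dict[tuple[str, str], float]:
    """Jaccard index for every pair of pipelines, keyed by the pair of names."""
    return {
        (p, q): jaccard(calls[p], calls[q])
        for p, q in combinations(sorted(calls), 2)
    }


if __name__ == "__main__":
    # Hypothetical pipelines and calls, for illustration only.
    calls = {
        "pipeline_A": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "A")},
        "pipeline_B": {("chr2", 200, "G", "C"), ("chr3", 300, "C", "A"), ("chr4", 400, "T", "G")},
    }
    for pair, score in pairwise_concordance(calls).items():
        print(pair, f"{score:.2f}")
```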
MORE CHALLENGES
  Tumour heterogeneity, subclones
  Role of non-coding DNA in tumorigenesis
  Integration of data analyses
ACKNOWLEDGEMENTS
Gilles
Alain
Sandrine
Jonathan
[…]nie
Vincent
Thibaut
Deborah
Maguelonne
Anne-Sophie
Emilie
Emilie
Laurie
Jean-Philippe
Assistance Publique - Hôpitaux de Paris (AP-HP)
Centre de Recherche sur les Pathologies Prostatiques (CeRePP)
Centre Hospitalier de Poitiers
Centre Hospitalier Pointe-à-Pitre/Abymes
Centre Hospitalier Régional et Universitaire de Brest
Centre Hospitalier Régional Universitaire de Lille
[…] 625 - […]
Centre Antoine-Lacassagne
Centre Georges François Leclerc (CGFL)
Centre Hospitalier Universitaire de Besançon
Centre Hospitalier Universitaire de Saint-Etienne
Centre Jean Perrin
Centre Léon-Bérard
Centre Val d'Aurelle
Clinique Mutualiste
Institut Bergonié
Institut Curie
Institut de Cancérologie de Lorraine (ICL) - Alexis Vautrin
Institut Gustave Roussy
Institut Jean-Godinot
Institut Paoli-Calmettes
OUR PARTNERS
ICGC BENCHMARK
[Figure: validation recall and validation precision]
[…]iSeq + validation study
4000 SNVs + 700 indels
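Recall and precision can be computed by comparing a pipeline's calls with a set of experimentally validated variants. The sketch below is a minimal illustration, assuming variants are compared as exact matches and that every validated variant is a true positive; the example data are made up.

```python
def precision_recall(called: set, validated: set) -> tuple[float, float]:
    """Return (precision, recall) of a call set against validated variants.

    precision = true positives / all calls
    recall    = true positives / all validated variants
    """
    true_positives = len(called & validated)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up variants, identified by (chromosome, position, ref, alt).
    validated = {("chr1", 10, "A", "T"), ("chr1", 20, "G", "C"), ("chr2", 30, "C", "A")}
    called = {("chr1", 10, "A", "T"), ("chr2", 30, "C", "A"), ("chr3", 40, "T", "G")}
    precision, recall = precision_recall(called, validated)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Computing the two measures separately for SNVs and indels makes differences between variant classes visible.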