A machine learning approach to pharmacogenomics: application to therapy of Myasthenia Gravis

Dimos Kapetis
Bioinformatics, Scientific Direction
Neurological Institute “Carlo Besta”, Italy
Email: [email protected]

PGx of Myasthenia Gravis
• Myasthenia Gravis (MG) is a rare antibody-mediated autoimmune disease that leads to fluctuating muscle weakness and fatigability.
• A single-gene approach based on thiopurine methyltransferase (TPMT) has uncovered a genotype-phenotype association.
• Azathioprine (AZA) is a purine antagonist used as an immunosuppressant to block T- and B-cell proliferation.
• Intolerance to AZA can occur in the absence of intolerance-associated TPMT alleles [Colleoni et al 2012 J Clin Pharmacol].

Aim of Study
• Establish and refine a machine learning-based pipeline, applied to pathway-based microarray data, to analyze combinations of SNPs that affect metabolic pathways in the context of drug response.

Hypothesis
• Machine learning methods can model the genotype-phenotype relationship and determine SNP interactions.
• SNP combinations can identify minor associations that would not have been detected with a single-SNP approach.
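The hypothesis that SNP combinations can reveal associations missed by a single-SNP analysis can be illustrated with a toy example. The sketch below uses invented data, not the study cohort: the names `snp_a`, `snp_b`, and `best_accuracy` are hypothetical, and response is assumed to occur only when exactly one of the two variants is carried. Under that assumption each SNP alone predicts no better than chance, while the pair predicts perfectly.

```python
from itertools import product

# Hypothetical toy cohort (NOT study data): two binary carrier flags
# (0 = non-carrier, 1 = carrier) and a response label per patient.
# Assumption: a patient responds only when exactly one variant is carried.
patients = [
    {"snp_a": a, "snp_b": b, "responder": int(a != b)}
    for a, b in product([0, 1], repeat=2)
    for _ in range(25)  # 25 patients per genotype combination
]


def best_accuracy(patients, snps):
    """Accuracy of a majority-vote rule over the genotype cells defined
    by the given SNPs, evaluated on the same data it was built from."""
    cells = {}
    for p in patients:
        key = tuple(p[s] for s in snps)
        counts = cells.setdefault(key, [0, 0])  # [non-responders, responders]
        counts[p["responder"]] += 1
    correct = sum(max(counts) for counts in cells.values())
    return correct / len(patients)


single_a = best_accuracy(patients, ["snp_a"])
single_b = best_accuracy(patients, ["snp_b"])
pair_ab = best_accuracy(patients, ["snp_a", "snp_b"])
print(single_a, single_b, pair_ab)  # 0.5 0.5 1.0
```

Real genotype data are rarely this clean; the example only shows why a purely marginal, one-SNP-at-a-time test can miss an effect that a multi-locus model detects.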
Design/Methods to study combinations of SNPs that affect metabolic pathways in the context of drug response

Study Population
• Responders (control group): n=60
• Non-Responders: n=40
• Intolerant: n=39
• AZA dose: 100-200 mg per day

AZA Response
• Responders: showed a beneficial effect after 1 year of treatment
• Intolerant: experienced persistent side effects upon treatment
• Non-responders: no pharmacological effect

Pathway-Based PGx
Design
• Genomic DNA extracted from peripheral blood
• 1,936 drug metabolism markers in ~230 pharmacogenes

Data Mining Pipeline
• Feature Selection: InfoGain, Relief, Chi-squared, wrappers
• Model Building: Multifactor Dimensionality Reduction, BayesNet, Logistic function, Random Forest
• The true-positive metric was used to compare classification accuracy

Methods: Data Mining Pipeline
• Pre-processing: 1,936 SNPs (235 genes)
• Feature Selection (FS): wrapper algorithms, InfoGain, Relief, ChiSquared
• Model Building (MB): Random Forest, BayesNet, Logistic, Multifactor Dimensionality Reduction (MDR)
• Performance evaluation in cross-validation
MDR: Jason H. Moore et al 2006. FS and MB are performed in WEKA (Mark Hall et al 2009).

Responders vs Non-Responders MG patients: classification performance comparison (accuracy, %)

Method                              3-Folds   5-Folds   10-Folds   Average Accuracy
GS wrapper+BayesNet (7 SNPs)        88.8      87.9      88.8       88.5
GS wrapper+RandomForest (8 SNPs)    87.9      89.4      89.4       88.9
Relief+MDR (4 SNPs)                 99.3      98.3      99.3       98.9
GS wrapper+Logistic (7 SNPs)        90.5      92.48     90.5       91.16
GS = GreedyStepwise

Responders vs Intolerant MG patients: classification performance comparison (accuracy, %)

Method                              3-Folds   5-Folds   10-Folds   Average Accuracy
GS wrapper+BayesNet (8 SNPs)        87.9      89.47     90.2       89
GS wrapper+RandomForest (5 SNPs)    86.26     87.96     84.9       86.37
Relief+MDR (4 SNPs)                 95.4      94.5      93.5       94.4
GS wrapper+Logistic (8 SNPs)        87.2      87.2      87.2       87.2
GS = GreedyStepwise

Two fourth-order SNP combinations predict AZA response in MG patients

A) Responders vs Non-Responders (overall accuracy = 98.9%)
• SLCO1B1* (rs2291075), MAF: 0.45
• SLC22A2 (rs624249), MAF: 0.29
• ABCB1** (rs2032582), MAF: 0.34
• UGT2B4 (rs1131878), MAF: 0.27
Highlighted pairs: SLCO1B1 (rs2291075) + ABCB1 (rs2032582); ABCB1 (rs2032582) + UGT2B4 (rs1131878)
Graph values (node and edge assignment not preserved): 4.84%, 0.51%, 4.17%, -1.33%, 6.46%
* SLCO1B1 (rs11045879) influences AZA response efficacy in acute lymphoblastic leukemia [Stocco G et al 2012]
** ABCB1 (rs2032582): the missense mutation 2677TT was found in non-responders with Crohn’s disease [Mendoza JL et al 2007]

B) Responders vs Intolerant (overall accuracy = 94%)
• ABCC6 (rs8058694), MAF: 0.35
• SLC7A8 (rs2268873), MAF: 0.27
• CBR3 (rs8133052), MAF: 0.4
• CHST7*** (rs735716), MAF: 0.30
Highlighted pairs: SLC7A8 (rs2268873) + CBR3 (rs8133052); CHST7 (rs735716) + CBR3 (rs8133052)
Graph values (node and edge assignment not preserved): -4.56%, -2.23%, 1.44%, 1.55%, -1.52%, -3.27%

Graph legend: Redundancy, Interaction

Conclusions
• We introduce a two-step data mining analysis (feature selection followed by classification) that distinguishes our control group (responders) from the non-responder and intolerant groups.
• The MDR method outperformed the other ML methods and predicted response to AZA in MG patients with high accuracy (98.9% for responders vs non-responders, 94% for responders vs intolerant).
• The two four-SNP models made it possible to identify synergistic SNP interactions in relatively small samples.
• In conclusion, this study shows that combining multiple loci (in contrast to a single-locus approach) may amplify the effects of single SNPs and is a powerful approach for pharmacogenomics studies.

Next Steps
• Recruit more MG patients to validate the SNP interaction model
• Use this approach to identify the inherited basis for inter-individual differences in response to AZA in patients with other immune-mediated diseases such as Multiple Sclerosis
• Develop an R package implementing the ML pipeline

ACKNOWLEDGEMENTS
Neurologia IV Unit
Dr. Renato Mantegazza
Dr. Carlo Antozzi
Dr. Pia Bernasconi
Dr. Lara Colleoni
Dr. Lorenzo Maggi
Dr. Fulvio Baggi
Bioinformatics
Dr. Barbara Galbardi
Genopolis Consortium
Dr. Maria Foti

Wrappers
• “Wrap around” the learning algorithm
• Must therefore always evaluate subsets
• Return the best subset of attributes
• Apply for each learning algorithm
• Use same search methods as before
Wrapper loop: select a subset of attributes; induce the learning algorithm on this subset; evaluate the resulting model (e.g., accuracy); if the stopping criterion is not met (Stop? No), select another subset; otherwise (Stop? Yes), return the best subset.

Method
Multifactor dimensionality reduction (MDR): a data mining approach to detect and characterize combinations of SNPs that interact to influence a dependent or class variable. MDR identifies interactions among discrete variables that influence a binary outcome (e.g., treated vs control) and is considered a nonparametric alternative to logistic regression analysis. MDR is a good predictive machine learning method for analyzing such data. The Naive Bayes classifier is used.
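The core idea of MDR described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the MDR software used in the study: it omits cross-validation and the Naive Bayes step, scores models on the training data only, and all names (`mdr_fit`, `mdr_predict`, `mdr_search`, `snp1` to `snp3`) and genotypes are hypothetical. Each multi-locus genotype cell is labelled high or low risk by comparing its case:control ratio with the overall ratio, which reduces several SNPs to one binary attribute.

```python
from collections import defaultdict
from itertools import combinations, product


def mdr_fit(rows, labels, snps):
    """Label each multi-locus genotype cell high risk (1) or low risk (0)."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(rows, labels):
        counts[tuple(row[s] for s in snps)][y] += 1
    n_cases = sum(labels)
    threshold = n_cases / (len(labels) - n_cases)  # overall case:control ratio
    # High risk when cases/controls >= threshold; written without a per-cell
    # division so that cells with zero controls are handled safely.
    return {cell: int(c[1] >= threshold * c[0]) for cell, c in counts.items()}


def mdr_predict(model, row, snps):
    # Genotype cells never seen in training default to low risk here.
    return model.get(tuple(row[s] for s in snps), 0)


def balanced_accuracy(rows, labels, model, snps):
    preds = [mdr_predict(model, r, snps) for r in rows]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    n_cases = sum(labels)
    return 0.5 * (tp / n_cases + tn / (len(labels) - n_cases))


def mdr_search(rows, labels, snp_names, order):
    """Exhaustively score every SNP combination of the given order."""
    scored = []
    for snps in combinations(snp_names, order):
        model = mdr_fit(rows, labels, snps)
        scored.append((balanced_accuracy(rows, labels, model, snps), snps))
    return max(scored)


# Hypothetical toy data (NOT study data): genotypes coded 0/1/2 copies of
# the minor allele; label 1 = case, 0 = control. The outcome is assumed to
# depend on an interaction between snp1 and snp2; snp3 is irrelevant.
names = ["snp1", "snp2", "snp3"]
rows = [dict(zip(names, g)) for g in product([0, 1, 2], repeat=3)]
labels = [int((r["snp1"] > 0) != (r["snp2"] > 0)) for r in rows]

best_score, best_snps = mdr_search(rows, labels, names, order=2)
print(best_snps, best_score)  # ('snp1', 'snp2') 1.0
```

In practice the best model would be chosen by cross-validated accuracy rather than training accuracy, as in the performance evaluation step of the pipeline.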
Figure: Metabolism of 6-MP. L Wang and R Weinshilboum, Oncogene 25, 1629-1638 (2006)
Figure: Azathioprine pathway metabolism