Deep SNP: An End-to-end Deep Neural Network with Attention-based Localization for Break-point Detection in SNP Array Genomic data

Authors Hamid Eghbal-zadeh
Lukas Fischer
Niko Popitsch
Florian Kromp
Sabine Taschner-Mandl
Khaled Koutini
Teresa Gerber
Eva Bozsaky
Peter F. Ambros
Inge M. Ambros
Gerhard Widmer
Bernhard A. Moser
Editors
Title Deep SNP: An End-to-end Deep Neural Network with Attention-based Localization for Break-point Detection in SNP Array Genomic data
Type other
Location WCB 2018 @ FAIM, Joint ICML (35th International Conference on Machine Learning) and IJCAI (27th International Joint Conference on Artificial Intelligence) Workshop on Computational Biology, Stockholm, Sweden, July 10-15, 2018
Month July
Year 2018
Pages Poster
SCCH ID# 18057
Abstract

Diagnosis and risk stratification of cancer and many other diseases require the detection of genomic breakpoints as a prerequisite for calling copy number alterations (CNA). This, however, is still challenging and requires time-consuming manual curation. Because deep learning methods have outperformed classical state-of-the-art algorithms in various domains and have also been applied successfully to life science problems in medicine and biology, we propose Deep SNP, a novel deep neural network that learns from genomic data. Specifically, we used a manually curated dataset of 12 genomic single nucleotide polymorphism array (SNPa) profiles as a truth set and aimed to predict the presence or absence of genomic breakpoints, an indicator of structural chromosomal variations, in windows of 40,000 probes. We compare our results with well-known neural network models as well as with Rawcopy, although that tool is designed to predict breakpoints and, in addition, genomic segments with high sensitivity. We show that Deep SNP is capable of predicting the presence or absence of a breakpoint in large genomic windows and outperforms state-of-the-art neural network models. Qualitative examples suggest that integrating a localization unit may enable breakpoint detection and prediction of genomic segments, even if the breakpoint coordinates were not provided for network training. These results warrant further evaluation of Deep SNP for breakpoint localization and subsequent calling of genomic segments.
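
The window-level task described in the abstract (presence or absence of a breakpoint per window of 40,000 probes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `label_windows`, the use of non-overlapping windows, probe-index breakpoint positions, and the dropping of a trailing partial window are all assumptions made for illustration.

```python
import numpy as np


def label_windows(n_probes, breakpoint_indices, window_size=40_000):
    """Return one 0/1 label per window: 1 if the window contains a breakpoint.

    Assumptions (not taken from the abstract): windows do not overlap,
    breakpoints are given as 0-based probe indices, and a trailing
    partial window is discarded.
    """
    n_windows = n_probes // window_size
    labels = np.zeros(n_windows, dtype=np.int64)
    for bp in breakpoint_indices:
        w = bp // window_size  # index of the window holding this probe
        if 0 <= w < n_windows:
            labels[w] = 1
    return labels


# Hypothetical profile with 120,000 probes and two curated breakpoints.
labels = label_windows(n_probes=120_000, breakpoint_indices=[10, 85_000])
print(labels.tolist())  # [1, 0, 1]
```

Labels of this kind carry no breakpoint coordinates, which matches the weakly supervised setting the abstract describes for the localization unit.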