Deep SNP: An end-to-end deep neural network with attention-based localization for break-point detection in SNP array genomic data

Authors Hamid Eghbal-zadeh
Lukas Fischer
Niko Popitsch
Florian Kromp
Sabine Taschner-Mandl
Teresa Gerber
Eva Bozsaky
Peter F. Ambros
Inge M. Ambros
Gerhard Widmer
Bernhard A. Moser
Editors
Title Deep SNP: An end-to-end deep neural network with attention-based localization for break-point detection in SNP array genomic data
Type Article
Journal Journal of Computational Biology
DOI 10.1089/cmb.2018.0172
Month December
Year 2018
SCCH ID# 18101
Abstract

Clinical decision-making in cancer and other diseases relies on timely and cost-effective genome-wide testing. Classical bioinformatic algorithms, such as Rawcopy, can support genomic analysis by calling genomic breakpoints and copy-number variations (CNVs), but often require manual data curation, which is error-prone and time-consuming, thus substantially increasing the costs of genomic testing and hampering timely delivery of test results to the treating physician. We aimed to investigate whether deep learning algorithms can be used to learn from genome-wide single-nucleotide polymorphism array (SNPa) data and improve on state-of-the-art algorithms. We developed, applied, and validated a novel deep neural network (DNN), DeepSNP. A manually curated data set of 50 SNPa analyses was used as the truth set. We show that DeepSNP can learn from SNPa data and classify the presence or absence of genomic breakpoints within large genomic windows with high precision and recall. DeepSNP was compared with well-known neural network models as well as with Rawcopy. Moreover, the use of a localization unit indicates the ability to pinpoint genomic breakpoints even though their exact locations were not provided during training. DeepSNP results demonstrate the potential of DNN architectures to learn from genomic SNPa data and encourage further adaptation for CNV detection in SNPa and other genomic data types.