Teaching Computers to Read Genetic Blueprints
The enormous task of sequencing one's "book of life," reading the order in which billions of the chemical building blocks of DNA are arranged, is only part of what's needed to examine the genetic blueprint for clues about one's health. Knowing the chemical text of DNA is not the same as knowing which parts of it to read, or understanding the diverse rules that govern the complex molecular circuitry inside living cells when genetic instructions are expressed as observable traits.
The present state of the art in genetic analysis uses computers to spot meaningful connections between glitches in one's genetic makeup and the onset of disease. But the logic behind such analysis must be carefully crafted, because the typical methods for training machine learning algorithms to harness enormous volumes of data do not readily extend into the genomics domain.
Machine learning techniques work well when a system can scan many tidy examples of what it is supposed to do, and when deficiencies in its existing logic can be corrected through expert analysis and subsequent feedback.
But DNA sequences compiled from multiple sources are often "messy" datasets that tend to obscure subtle but important genetic variations. Expert supervision of training parameters is also limited, since geneticists still don't know enough about the complex molecular interactions inside living cells to offer meaningful guidance.
Search Engines for Genomes
Genotype Diagnostics uses powerful search engines to comb genetic blueprints for patterns that disrupt normal cell behaviors.
This approach builds upon recent technical advances in high-throughput DNA sequencing and molecular expression profiling. It applies new forms of data analysis, called "deep learning," to infer how genetic instructions inside living cells are normally read and, ultimately, how things can go awry from the cumulative impact of many small glitches in one's DNA.
Traditional hunts for genetic markers simply point to parts of the genome that, in some individuals, associate with harmful traits. Our process looks more deeply at the functional impact of genetic profiles, a more rational approach for exploring the many intermediate molecular profiles observed in complex spectrum disorders. It spots patterns in DNA that significantly contribute to the onset of disease ("pathogenic risks"). It also separates loosely associated genetic variants that lead to small but harmless differences in human traits ("benign mutations") from those that alter the intensity with which one's disease symptoms are expressed ("disease modifiers").