A team of computer scientists at New York University (NYU) has developed a neural network that can explain how it reaches its predictions. The work sheds light on the inner workings of neural networks, a cornerstone of artificial intelligence (AI) and machine learning, and exposes a process that has largely been hidden from users.
The work applies neural networks to complex biological questions, a use that has grown in prominence in recent years. The researchers focused on RNA splicing, a biological process that plays a key role in transferring genomic information from DNA to functional RNA and protein products.
“Many neural networks operate as black boxes, concealing the mechanisms behind their decision-making. This opacity has raised concerns about their reliability and has hindered progress in understanding the intricate biological processes, such as genome encoding,” explains Oded Regev, a computer science professor at NYU’s Courant Institute of Mathematical Sciences and the senior author of the paper published in the Proceedings of the National Academy of Sciences. “Through a novel approach that enhances both the quantity and quality of data used for machine learning training, we have crafted an interpretable neural network capable of making accurate predictions while also providing explanations for its decisions.”
Regev and co-authors Susan Liao, a faculty fellow at the Courant Institute, and Mukund Sudarshan, a Courant doctoral student at the time of the study, built a neural network grounded in existing knowledge about RNA splicing.
Their model works like a data-driven microscope: it lets scientists trace and quantify the RNA splicing process step by step, from the input sequence to the predicted splicing outcome.
Regev elaborates, “By adopting an ‘interpretable-by-design’ approach, we’ve engineered a neural network model that offers profound insights into RNA splicing, a fundamental process in the transfer of genomic information. Our model unveiled a crucial discovery: the presence of a small, hairpin-like structure within RNA that inhibits splicing.” The researchers confirmed the model’s finding in a series of experiments: when the RNA molecule folds into a hairpin configuration, splicing is impeded, and when the hairpin is disrupted, splicing resumes.
The research makes the model’s decision-making traceable and could yield deeper insight into the biological underpinnings of genome encoding. As AI plays a growing role in scientific inquiry, the work marks a step toward transparency and interpretability in machine learning.