Seventy countries, and thousands of researchers and citizen scientists: that is how far a publicly available biomedical artificial intelligence (AI) platform developed by Virginia Tech computer science researcher Debswapna Bhattacharya has spread.
From well-funded labs in the U.S. to undergraduate students in developing countries, anyone with an Internet connection can, and has, run sophisticated molecular analyses using a simple, web-based platform hosted in the Department of Computer Science.
As an example of the platform’s uptake and benefits, Bhattacharya recalls one inquiry he received from a student in Africa: “He actually started using this web server that we developed when he was an undergraduate student, and he carried out a project all by himself, came up with a paper as a single author, submitted it to a preprint server, and then sent me that paper, saying, ‘Using your server, I actually carried out this work.’”
That student has gone on to graduate studies in the U.S.
Bhattacharya, associate professor of computer science, has received a five-year, $2.1 million National Institutes of Health (NIH) Outstanding Investigator Award to build on this work to develop innovative AI approaches to decode disease and find treatments.
The grant program supports basic research related to disease diagnosis, treatment, and prevention, providing funding stability to push scientific discovery forward, faster.
Bhattacharya says: “We are fortunate to have a lot of resources, like internet connectivity and so on … The important thing is touching people’s lives in places that are not blessed to have these resources.”
Predicting proteins
Bhattacharya’s team focuses on proteins and RNA — the biological machines of human and animal life — and uses deep learning, a form of AI, to predict how these molecules are structured and how they function at the atomic level.
These molecules are complex, yet if scientists succeed in mapping their 3D shapes accurately, they can spot places to target treatments and begin developing new drugs for disease.
Unlike the image or text datasets used to train AI systems, biological datasets are often scarce, which can cause deep learning models trained on them to make unreliable predictions. To address that, Bhattacharya is building “biology-guided” and “biophysics-informed” AI systems that incorporate established scientific principles from chemistry and physics, making the models both more accurate and more interpretable.
“We’re training on structural data that experimentalists have painstakingly built over 70 or 75 years. We’re incredibly lucky to have it,” Bhattacharya clarifies. “Now our job is to use deep neural networks to fill in the gaps.”
The long-term goal is to better understand how biomolecules interact, particularly RNA and protein-RNA systems, which remain harder to model than proteins alone.
Explaining biomolecules
RNA and protein molecules present a daunting challenge because they don’t stand still. Because the shape of a molecule affects its function, decoding how it shifts and changes is crucial. But it’s very hard to do, even in labs. A single molecule can have thousands of atoms, and predicting the exact position of each one is extremely difficult. However, if scientists can accurately predict these 3D structures, they can find druggable pockets — places where you can target treatments.
“It’s like you’re riding a bicycle in a windstorm,” Bhattacharya states. “You’re constantly being pushed away from your path.”
