At Oak Ridge National Laboratory (ORNL), researchers have combined quantum biology, artificial intelligence, and bioengineering to improve CRISPR Cas9 genome editing tools. This multidisciplinary approach, detailed in the journal Nucleic Acids Research, promises to make genetic modifications more precise and efficient in organisms, particularly microbes, paving the way for enhanced production of renewable fuels and chemicals.
CRISPR is adept at modifying genetic code to enhance an organism's performance or correct mutations. CRISPR Cas9 requires a guide RNA (gRNA) to direct the enzyme to its target site to perform these modifications. However, existing computational models for predicting effective guide RNAs in CRISPR tools have shown limited efficiency when applied to microbes. ORNL's Synthetic Biology group, led by Carrie Eckert, observed these disparities and set out to bridge the gap.
"A lot of the CRISPR tools have been developed for mammalian cells, fruit flies, or other model species. Few have been geared towards microbes where the chromosomal structures and sizes are very different," explained Eckert.
Electronic structure can have great effects on the chemical properties and interactions of nucleotides, the building blocks of DNA and RNA. Erica Prates, computational systems biologist at ORNL, explained that the distribution of electrons within these molecules influences reactivity and conformational stability, including the effective binding of the Cas9-gRNA complex to cellular DNA.
To design effective gRNA for these smaller organisms, these effects must be accounted for. To address the shortcomings in current gRNA design tools, ORNL scientists turned to quantum biology, a convergence of molecular biology and quantum chemistry, in a bid to understand the effects of electronic structure within cell nuclei, where genetic material is stored.
The scientists built an explainable artificial intelligence model using a method called iterative random forest, and trained it on a dataset of around 50,000 guide RNAs targeting the genome of E. coli bacteria. The model took quantum chemical properties into account, offering insight into the nucleotide features that matter most when selecting guide RNAs. It was validated in laboratory experiments in which E. coli DNA was cut using guides the model selected.
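For readers curious about what a model of this kind takes as input, the short Python sketch below shows one way a guide sequence might be turned into numeric features. It is an illustration only, not the ORNL team's pipeline: the function name `encode_guide`, the choice of features, and the descriptor table are all assumptions made for this example, and the descriptor values are dummy numbers standing in for a real quantum chemical property.

```python
from typing import Dict, List

# Guides are written here in the DNA alphabet of the target site.
NUCLEOTIDES = "ACGT"


def encode_guide(guide: str, descriptor: Dict[str, float]) -> List[float]:
    """Turn one guide sequence into a flat numeric feature vector.

    Features (illustrative, not the published feature set):
      - one-hot encoding of the nucleotide at each position
      - overall GC fraction of the guide
      - mean of a per-nucleotide descriptor supplied by the caller
    """
    guide = guide.upper()
    if not guide or any(base not in NUCLEOTIDES for base in guide):
        raise ValueError(f"guide must be a non-empty string over {NUCLEOTIDES}")

    features: List[float] = []
    # Four values per position, exactly one of which is 1.0.
    for base in guide:
        features.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)

    # Fraction of positions that are G or C.
    gc_fraction = sum(base in "GC" for base in guide) / len(guide)
    features.append(gc_fraction)

    # Average of the caller-supplied per-nucleotide descriptor.
    mean_descriptor = sum(descriptor[base] for base in guide) / len(guide)
    features.append(mean_descriptor)
    return features


# Dummy values only: placeholders, not measured or computed properties.
DUMMY_DESCRIPTOR = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}

vector = encode_guide("ACGT", DUMMY_DESCRIPTOR)
print(len(vector))
```

Vectors like this, one per guide and paired with a measured cutting efficiency, are the sort of table a tree-based model can be trained on. Tree-based models can also report which features influenced their predictions, which is what makes them easier to interpret than many deep learning approaches.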
"The model helped us identify clues about the molecular mechanisms that underpin the efficiency of our guide RNAs, giving us a rich library of molecular information that can help us improve CRISPR technology," said Prates.
Jaclyn Noshay, a former ORNL computational systems biologist and first author of the paper, emphasized the value of the explainable AI model, which provides an in-depth understanding of the biological mechanisms driving results, as opposed to less interpretable deep learning models rooted in “black box” algorithms.
The quantum-informed CRISPR Cas9 model enhances microbial genome editing and has broader implications. According to Eckert, it opens avenues for improving guide RNA design across various species, including humans, with potential applications in drug development. “If you’re looking at any sort of drug development, for instance, where you’re using CRISPR to target a specific region of the genome, you must have the most accurate model to predict those guides,” Eckert noted.
Refining CRISPR Cas9 models not only accelerates research but also enhances the ability to predictively modify the DNA of diverse organisms. This matters for functional genomics, the field that links genotype to phenotype, and aligns with the goals of the ORNL-led Center for Bioenergy Innovation (CBI), which aims to improve bioenergy feedstock plants and bacterial fermentation of biomass.
“A major goal of our research is to improve the ability to predictively modify the DNA of more organisms using CRISPR tools. This study represents an exciting advancement toward understanding how we can avoid making costly ‘typos’ in an organism’s genetic code,” said ORNL’s Paul Abraham, a bioanalytical chemist who leads the DOE Genomic Science Program’s Secure Ecosystem Engineering and Design Science Focus Area, or SEED SFA, that supported the CRISPR research. “I am eager to learn how much more these predictions can improve as we generate additional training data and continue to leverage explainable AI modeling.”
Moving forward, ORNL’s synthetic biology team plans to collaborate with computational science colleagues to further refine the microbial CRISPR Cas9 model using experimental data from various microbial species. “We’re greatly improving our predictions of guide RNA with this research,” Eckert said. “The better we understand the biological processes at play and the more data we can feed into our predictions, the better our targets will be, improving the precision and speed of our research.” As the tool becomes more accurate and covers more of the microbial domain, its potential applications in medicine, industry, and research are expected to grow.