ChatGPT Takes on CRISPR


In gene editing, the discovery and use of CRISPR systems have typically involved scouring natural environments, from hot springs to yogurt cultures, to identify novel mechanisms employed by microbes. That is beginning to change with the advent of generative artificial intelligence (AI), which enables researchers to design new CRISPR systems effectively “with the push of a button.”

Two recent studies describe the use of AI to create CRISPR gene-editing proteins. One showed that a generative AI tool, a protein language model trained on millions of protein sequences, designed functional CRISPR systems that were then validated in the laboratory. In the other, a team reported designing CRISPR systems with an AI model trained on microbial genomes. These systems consist of a DNA- or RNA-cutting enzyme paired with RNA molecules that direct where it cuts. Together, the studies highlight the potential of AI to extend gene editing beyond the limitations of natural CRISPR systems.

Ali Madani, a machine-learning scientist and CEO of the biotech firm Profluent, emphasized the significance of these developments. His team reached a milestone by successfully editing the human genome using proteins designed entirely through machine learning, a significant step forward in applying AI to the design of complex biological systems.

These tools are language models of the same general kind as ChatGPT, but they differ in what they learn from. Instead of text, the CRISPR-designing models were trained on extensive biological data, including protein and genome sequences. The aim of this training is for the models to learn the patterns found in natural genetic sequences well enough to design new gene-editing tools.

One such tool developed by Madani’s team is OpenCRISPR-1, a Cas9 protein variant engineered to cut DNA with high precision. Its editing efficiency is comparable to that of existing CRISPR systems, and it produces fewer off-target effects. The design came from an updated version of the team’s ProGen protein language model, retrained on data from diverse CRISPR systems.

Another significant development comes from a team led by computational biologist Brian Hie and bioengineer Patrick Hsu. They employed an AI model capable of generating both protein and RNA sequences, trained on vast genomic data from bacteria and archaea. Although their designs have not yet been tested in the laboratory, the predicted structures of these AI-designed CRISPR systems closely resemble those of natural proteins.

The potential applications of AI-designed CRISPR tools are vast, particularly in precision medicine. Unlike some patented gene-editing tools, molecules like OpenCRISPR-1 can be used without restriction. That freedom, together with the availability of resources like the ProGen2 model, empowers researchers and could accelerate the development of gene therapies.

Madani envisions these AI-generated CRISPR tools as critical to advancing gene therapy, which he says demands a level of precision and bespoke design that merely copying naturally occurring systems cannot provide. His team is collaborating with companies to test the new designs, an early indication of how AI could be put to practical use in gene editing for therapeutic purposes.