Colby Cosh: The biggest news in biology had nothing to do with the COVID-19 vaccine

A Markov state model illustrating 15 of the highest-flux folding pathways between the unfolded and native states of ACBP, an 86-residue helix-bundle protein.

Regulators in the United Kingdom have approved the Pfizer SARS-CoV-2 vaccine for immediate use in that country, providing the world with its greatest jolt of optimism about the rampaging pandemic. But what if I told you that for virologists and molecular biologists, it wasn’t the biggest news of the week?

This might be a slight exaggeration. Scientists are trapped in their homes ordering takeout like the rest of us, and have just the same aversion to dying of pneumonia. But the results of a competition called CASP14 have produced disbelief and jubilation on social media; they extend far beyond COVID-19 in their implications. It appears that Google’s DeepMind artificial intelligence unit has made an astounding leap in the realm of automated protein structure prediction, offering the chance of filling a frustrating gap between gene sequencing and practical medicine.

Protein structure prediction is the art of taking the genetic sequence that encodes a protein (its construction code, which is now easy to read) and guessing the final three-dimensional structure of the protein from its molecular components alone. Proteins “fold” into complex, ugly shapes that are defined by the hundreds of attractions and repulsions between their atoms. These shapes then determine the protein’s behaviour in a biological system, which is why there is so much discussion, for example, of the nasty “spike” glycoprotein on the surface of SARS-CoV-2.

It sounds like a simple business, but this is one of those situations in which the statistician’s “curse of dimensionality” thwarts scientific ambitions. Predicting how a protein will fold involves keeping track of the energy states of hundreds or thousands of atoms in space. Move one atom and it influences the forces acting on all its neighbours.
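For a rough feel of the scale involved, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every residue in a chain can independently take one of three conformations and that a computer could check a trillion shapes per second. Neither figure comes from the CASP results, real proteins are not this simple, and the function names are hypothetical.

```python
# Illustrative assumptions only; none of these numbers are measured values.
STATES_PER_RESIDUE = 3            # assumed conformations available to each residue
CHECKS_PER_SECOND = 10**12        # assumed brute-force search speed
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def count_conformations(residues, states=STATES_PER_RESIDUE):
    """Number of distinct shapes if every residue varies independently."""
    return states ** residues


def years_to_enumerate(residues, states=STATES_PER_RESIDUE, rate=CHECKS_PER_SECOND):
    """Whole years needed to visit every shape once at the assumed rate."""
    return count_conformations(residues, states) // (rate * SECONDS_PER_YEAR)


if __name__ == "__main__":
    # 86 matches the ACBP chain length mentioned in the figure caption.
    for n in (10, 50, 86, 300):
        print(f"{n:>4} residues: {count_conformations(n):.3e} shapes, "
              f"about {years_to_enumerate(n):.3e} years to enumerate")
```

The count grows exponentially with chain length, so even under these toy assumptions simply trying every shape is hopeless for a protein of ordinary size.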

So understanding an actual protein still involves lots of old-fashioned imaging of the finished object, some of it using the same crystallography techniques that led to the discovery of DNA’s helical structure, and a certain amount of artisanal guesswork involving rules of thumb. This is extremely laborious; we have sequences for millions of important proteins, but thorough structural knowledge of far fewer.

This problem — referred to as “de novo” protein structure prediction — has been an important target for artificial intelligence research. In 1994, scientists established CASP, an open competition format that would allow neutral assessment of AI protein-solving models. When a new protein is cracked by the old human methods, its structure is withheld from CASP competitors and examiners and the sequence is published. After a while the curtain is raised to see which of the AI programs hit the ball closest to the pin, so to speak.
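The blind-test idea can be sketched in a few lines of Python. This is a simplified illustration, not CASP's actual scoring procedure: it assumes each entry is a list of (x, y, z) atom positions already superimposed on the withheld experimental structure, and it ranks entries by root-mean-square deviation (RMSD), a standard distance measure swapped in here for the more elaborate measures CASP's assessors use. The names `rmsd` and `rank_entries` are hypothetical.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already aligned in the same coordinate frame.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and the same length")
    total = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(total / len(reference))


def rank_entries(entries, reference):
    """Return entry names ordered from closest to farthest from the reference."""
    return sorted(entries, key=lambda name: rmsd(entries[name], reference))


if __name__ == "__main__":
    # Made-up coordinates for illustration; not real protein data.
    withheld = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
    entries = {
        "team_a": [(0.1, 0.0, 0.0), (1.4, 0.1, 0.0), (3.1, 0.4, 0.0)],
        "team_b": [(1.0, 1.0, 0.0), (2.5, 1.0, 1.0), (4.0, 2.0, 1.0)],
    }
    for name in rank_entries(entries, withheld):
        print(name, round(rmsd(entries[name], withheld), 3))
```

The point of the design is that the reference structure stays hidden until every entry is in, so no model can be tuned to the answer.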

Until this year, the AIs often provided clever guesses and new insights, much as AI chess software used to do before it attained human strength. But the robots were never fully competitive with classical imaging techniques. In the November CASP, according to DeepMind, the latest version of its AlphaFold software has obliterated the competition. AlphaFold predicted the structures of the candidate proteins to nearly the same level of accuracy as imaging methods can achieve given the physical protein itself. The implication is that the era of fully automated protein prediction has arrived — a prospect that has molecular biologists salivating.

It might, just for starters, help avert a pandemic when the next animal virus makes the zoonotic crossover to humans. We’re all grateful that vaccines for SARS-CoV-2 could be designed so quickly — it turns out to be the work of a few hours, and only the need for testing slows things down — but at some point in the future vaccines themselves will seem like clumsy, barbaric answers to new disease. We’ll just make pills that can knock out key proteins in a virus. You know, cures. AlphaFold’s CASP triumph brings that era within sight.

CASP was created with the idea that progress in AI de novo structure inference was bound to happen a good deal faster than it did. As with the history of AI in games, the pace of the dance turns out to be slow-slow-slow-slow-slow-KABLAMMO. It is rarely easy to be sure, before the fact, whether the exponential takeoff is hundreds of years away, or three weeks.

The part played by CASP itself teaches a lesson; it reminds us that in a scientific setting, rigorous open competitions can serve the same role, on an overarching level, that controlled trials do in evaluating hypotheses or medical treatments. DeepMind’s announcement doesn’t yet have the form of a proper peer-reviewed monograph, and even the designers may not understand very well how their software behaves (this being the nature of AI; if it works, that implies almost by definition that it’s smarter than humans in its domain). But the likelihood of Google cheating, or even torquing the CASP results very much, is so small that the announcement qualifies as reason to celebrate.

National Post
Twitter.com/colbycosh

Copyright Postmedia Network Inc., 2020
