CASP: The Olympic Arena of Double-Blind Structural Biology
Nov 10, 2018
Series: protein-structure-prediction
The mood in our lab this week is electric, mixed with a healthy dose of anxiety. In less than a month, the structural biology community will gather in Riviera Maya, Mexico, for the thirteenth Critical Assessment of Structure Prediction (CASP13) conference. For the uninitiated, CASP is the ultimate double-blind Olympics of our field.
As a researcher, I find CASP the most fascinating scientific benchmark I have ever encountered. Today, I want to write about how CASP works, why it is the gold standard for avoiding machine learning "self-deception," and the mathematical metrics we use to evaluate whether a predicted protein matches reality.
The Ultimate Truth: How CASP Prevents Cheating
In machine learning, it is incredibly easy to lie to yourself. You design a model, train it on a dataset, test it on a held-out split, and celebrate a high accuracy score. But in biology, hidden biases, data leakage (such as training on proteins structurally similar to your test set), and subtle overfitting are constant hazards.
CASP was founded in 1994 by John Moult and Krzysztof Fidelis to solve this exact problem. It is designed as a true double-blind experiment:
- The Targets: Organizers identify experimental structural biologists who have mapped out a protein's 3D structure but have not yet published it.
- The Challenge: Organizers release the 1D amino acid sequence to the computational community.
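- The Blind Prediction: Human-expert groups typically have about three weeks per target to run their algorithms and submit predicted 3D coordinates back to the organizers; automated servers get roughly three days.
- The Evaluation: Independent assessors compare the submitted predictions against the secretly held experimental structures. The assessors see only anonymized group numbers, which is what makes the experiment double-blind.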
There is zero opportunity to overfit, cheat, or tweak parameters post-hoc. Your model either understands the biophysical principles of folding, or it crashes and burns on the grand stage.
Measuring Similarity: GDT_TS and TM-Score
How do you mathematically compare two folded proteins? You can't just overlay them and calculate the standard Root-Mean-Square Deviation (RMSD) of all atoms. Because RMSD squares every deviation, a few badly placed residues dominate the average: if a protein has a flexible loop that is bent out of place, a simple RMSD calculation will report a large error, even if the rest of the protein is aligned almost perfectly.
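To make the loop problem concrete, here is a minimal sketch with made-up coordinates. Everything in it is an assumption for illustration: the 200-residue "protein" is random points, the two structures are treated as already superposed, and the names `rmsd`, `pred`, and `pred_loop` are mine, not from any CASP tool.

```python
import numpy as np

def rmsd(pred, ref):
    """RMSD between two (N, 3) coordinate arrays that are already superposed."""
    diff = pred - ref
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

rng = np.random.default_rng(0)

# Toy "experimental" structure: 200 points standing in for C-alpha atoms (in Å).
ref = rng.normal(scale=10.0, size=(200, 3))

# A near-perfect model: every residue is off by a fraction of an ångström.
pred = ref + rng.normal(scale=0.5, size=ref.shape)

# The same model, but with a 10-residue "loop" swung 25 Å out of place.
pred_loop = pred.copy()
pred_loop[:10] += np.array([25.0, 0.0, 0.0])

print(f"RMSD, good model:          {rmsd(pred, ref):.2f} Å")
print(f"RMSD, same model + loop:   {rmsd(pred_loop, ref):.2f} Å")
```

Only 5% of the residues moved, yet the RMSD jumps from under 1 Å to several ångströms, which would make a largely correct model look mediocre.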
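To resolve this, the community relies on two more robust metrics. GDT_TS (Global Distance Test, Total Score) averages the percentage of Cα atoms that land within 1, 2, 4, and 8 Å of their experimental positions after superposition, giving a score from 0 to 100. TM-score weights each residue by 1 / (1 + (d/d0)^2), where d is that residue's deviation and d0 is a length-dependent distance scale, giving a score between 0 and 1; values above roughly 0.5 generally indicate the same overall fold. Both reward the fraction of the structure that is right, so a single misplaced loop costs only a little.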
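A simplified sketch of both metrics follows. It makes some loud assumptions: the two structures are already superposed and have one-to-one residue correspondence, whereas the real evaluation tools search over many superpositions to maximize each score. The function names `gdt_ts` and `tm_score`, the toy coordinates, and the 0.5 Å floor on `d0` for very short chains are illustrative choices of mine.

```python
import numpy as np

def gdt_ts(pred, ref, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) on pre-superposed (N, 3) C-alpha arrays."""
    dist = np.linalg.norm(pred - ref, axis=1)
    # Fraction of residues within each cutoff, averaged over the four cutoffs.
    fractions = [(dist <= c).mean() for c in cutoffs]
    return float(100.0 * np.mean(fractions))

def tm_score(pred, ref):
    """Simplified TM-score (0-1) on pre-superposed (N, 3) C-alpha arrays."""
    n = len(ref)
    # Length-dependent scale d0 = 1.24 * (N - 15)^(1/3) - 1.8, floored at 0.5 Å
    # (the floor is an assumption here, to keep short chains well-behaved).
    d0 = max(1.24 * np.cbrt(n - 15) - 1.8, 0.5)
    dist = np.linalg.norm(pred - ref, axis=1)
    return float(np.mean(1.0 / (1.0 + (dist / d0) ** 2)))

rng = np.random.default_rng(0)
ref = rng.normal(scale=10.0, size=(200, 3))          # toy "experimental" structure
pred = ref + rng.normal(scale=0.5, size=ref.shape)   # accurate model
pred[:10] += np.array([25.0, 0.0, 0.0])              # one loop displaced by 25 Å

print(f"GDT_TS:   {gdt_ts(pred, ref):.1f}")
print(f"TM-score: {tm_score(pred, ref):.3f}")
```

Under these assumptions, the displaced loop only removes its own 5% of residues from the tallies, so both scores stay high for a model that is otherwise accurate.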
The Historical 30-40 GDT Plateau
To understand why everyone in our lab is pacing around their desks, we have to look at the history of CASP on Free Modeling (FM) targets. Free Modeling targets are the hardest class of proteins: they have no detectable structural templates in the public databases, so computational models must predict their shape ab initio (from scratch).
For over twenty years, from CASP1 in 1994 to CASP12 in 2016, the median score on Free Modeling targets was stuck on a depressing plateau of 30 to 40 GDT_TS. Every two years, groups would present minor, incremental improvements, but the physical forces were simply too complex, and the search space too vast, for classical energy minimization to crack.
The CASP13 Rumor Mill
This year, however, things feel different. DeepMind, the Alphabet-owned AI lab that conquered Go with AlphaGo and chess with AlphaZero, entered CASP13 under the team name A7D.
Whispers have been circulating through the computational biology departments. Word on the street is that A7D's predictions have completely shattered the historical Free Modeling plateau. The rumors suggest they are hitting median GDT_TS scores well past 50, pushing toward 60, using deep neural networks that predict continuous inter-residue distance maps rather than binary contact matrices.
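For readers who have not worked with these representations, here is a minimal sketch of the difference, computed from known coordinates rather than predicted by a network. It says nothing about how A7D's model works. The 8 Å threshold is a common convention for defining a contact, and the straight six-residue toy chain and the function names are assumptions for illustration.

```python
import numpy as np

def distance_map(coords):
    """Pairwise Euclidean distances (Å) between (N, 3) coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def contact_map(dmap, threshold=8.0):
    """Binary contact matrix: True where two residues are closer than threshold."""
    return dmap < threshold

# Toy chain: six residues on a straight line, 3.8 Å apart
# (roughly the spacing of consecutive C-alpha atoms).
coords = np.zeros((6, 3))
coords[:, 0] = 3.8 * np.arange(6)

dmap = distance_map(coords)
cmap = contact_map(dmap)

print(np.round(dmap, 1))
print(cmap.astype(int))
```

The contact matrix collapses every pair into a yes/no answer, while the distance map keeps how far apart the residues are, which is strictly more geometric information for a structure-building step to work with.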
Lab Whispers (November 2018)
If these rumors are true, we are about to witness the first major tectonic shift in structural biology in decades, proof that deep learning can extract spatial geometry directly from the evolutionary record stored in sequence databases. I'm packing my bags for Mexico, and my next post will be a deep-dive analysis of the CASP13 results once the embargo is lifted.
Next Log Entry: AlphaFold 1: The Distogram Revolution at CASP13.