AI/ML Binding Affinity Prediction

Machine learning approaches that predict protein-protein binding affinity from sequence—enabling rapid scoring of thousands of candidates without experimental measurement.

From Sequence to Binding Affinity

Experimental measurement of binding affinity—by surface plasmon resonance (SPR), bio-layer interferometry (BLI), or isothermal titration calorimetry (ITC)—is the gold standard for quantifying protein-protein interactions. But these methods require purified protein, specialized instruments, and hours per measurement. When you have thousands of computationally designed candidates, experimental measurement of every variant is impractical. Machine learning models that predict binding affinity directly from sequence bridge this gap, providing rapid approximate scoring that identifies the most promising candidates for experimental validation.

The key insight enabling modern binding affinity prediction is that protein language models—large neural networks trained on billions of natural protein sequences—learn rich representations of sequence-function relationships. These representations encode information about protein structure, stability, evolutionary conservation, and interaction propensity, even though the models were never explicitly trained on any of these properties. Transfer learning extracts these representations and fine-tunes them on experimental binding data to create specialized affinity predictors.

ESM2 Embeddings and Transfer Learning

ESM2, developed by Meta's FAIR lab, is a protein language model with 650 million parameters trained on 250 million protein sequences. For each input sequence, ESM2 produces per-residue embeddings that are typically mean-pooled into a single 1,280-dimensional vector capturing the sequence's position in protein function space. Two sequences with similar embeddings tend to have similar structures, similar stability profiles, and often similar binding properties, even if their primary sequences diverge significantly.

To build a binding affinity predictor, we concatenate the ESM2 embeddings of the binder and target sequences, then train a lightweight neural network head (typically 2–3 fully connected layers with layer normalization) on experimental KD data from sources like SAbDab, the Structural Antibody Database. The model learns to map the combined embedding space to predicted log(KD) values. Because the heavy lifting of understanding protein sequence is handled by the pre-trained ESM2 backbone, the supervised head requires relatively little training data—thousands of binding measurements rather than millions—to achieve useful accuracy.
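The pipeline above can be sketched in PyTorch. This is a minimal illustration, not the production model: the layer sizes are assumptions, and the random tensors stand in for what would, in practice, be mean-pooled ESM2 embeddings of the binder and target sequences.

```python
import torch
import torch.nn as nn

EMB_DIM = 1280  # per-sequence embedding size from the 650M ESM2 variant

class AffinityHead(nn.Module):
    """Lightweight supervised head mapping concatenated binder/target
    embeddings to a predicted log(KD). Hidden sizes are illustrative."""
    def __init__(self, emb_dim: int = EMB_DIM, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden // 2),
            nn.LayerNorm(hidden // 2),
            nn.ReLU(),
            nn.Linear(hidden // 2, 1),  # scalar predicted log(KD)
        )

    def forward(self, binder_emb: torch.Tensor, target_emb: torch.Tensor) -> torch.Tensor:
        x = torch.cat([binder_emb, target_emb], dim=-1)
        return self.net(x).squeeze(-1)

head = AffinityHead()
# Stand-ins for pooled ESM2 embeddings of 8 binder/target pairs
binder = torch.randn(8, EMB_DIM)
target = torch.randn(8, EMB_DIM)
pred_log_kd = head(binder, target)
print(pred_log_kd.shape)  # torch.Size([8])
```

During fine-tuning, only this head (and optionally the top ESM2 layers) is trained against experimental log(KD) values, which is why thousands rather than millions of measurements suffice.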

Validation and Real-World Performance

The critical question for any ML model is whether its predictions generalize to new, unseen proteins. Our ESM2-based KD predictor achieves a Pearson correlation of r = 0.913 against held-out experimental binding data—meaning it explains over 83% of the variance in measured binding affinity. This level of accuracy is sufficient for ranking and triaging computational candidates, though it is not a replacement for experimental measurement of final lead candidates.
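The relationship between the correlation coefficient and explained variance is simply r squared. A small NumPy sketch with made-up held-out values shows how such a validation metric is computed:

```python
import numpy as np

# Toy held-out set: measured vs. predicted log(KD) values (made up
# for illustration; these are not the model's actual validation data).
measured  = np.array([-9.1, -8.4, -7.9, -7.2, -6.8, -6.1, -5.5])
predicted = np.array([-8.9, -8.6, -7.5, -7.4, -6.5, -6.3, -5.8])

r = np.corrcoef(measured, predicted)[0, 1]   # Pearson correlation
explained_variance = r ** 2                  # fraction of variance explained

# At the reported r = 0.913, r^2 is about 0.834, i.e. over 83% of variance.
print(round(0.913 ** 2, 3))  # 0.834
```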

The model is most reliable for ranking candidates within a campaign—comparing variants of the same binder against the same target—where systematic biases cancel out and relative rankings are highly predictive of experimental outcomes. It is less reliable for absolute KD prediction across unrelated targets, where the training data distribution may not cover the query space. For this reason, we use ML affinity prediction as a screening tool to select top candidates for experimental validation, not as a substitute for SPR or BLI measurement of the final shortlist. The combination of computational ranking followed by experimental confirmation consistently outperforms either approach alone.
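Within-campaign reliability is a statement about rank agreement rather than absolute values, which a rank correlation captures directly. The sketch below implements Spearman's rho from scratch (Pearson correlation on ranks) and checks whether the top predicted binders match the top measured ones; the campaign data are invented for illustration.

```python
import numpy as np

def spearman_rho(a: np.ndarray, b: np.ndarray) -> float:
    """Spearman rank correlation: Pearson r computed on ranks.
    Measures agreement of relative orderings, ignoring absolute scale."""
    ra = a.argsort().argsort().astype(float)  # rank of each element
    rb = b.argsort().argsort().astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

# Made-up campaign: predicted vs. measured log(KD) for five variants
# of the same binder against the same target (lower = tighter binding).
predicted = np.array([-8.2, -7.9, -7.1, -6.5, -6.0])
measured  = np.array([-8.5, -7.6, -7.3, -6.2, -6.4])

rho = spearman_rho(predicted, measured)
top2_pred = set(np.argsort(predicted)[:2])  # two tightest predicted binders
top2_meas = set(np.argsort(measured)[:2])
print(round(rho, 2), top2_pred == top2_meas)  # 0.9 True
```

Even though the absolute values disagree, the top-ranked candidates coincide, which is exactly the property a screening tool needs before handing a shortlist to SPR or BLI.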

Why It Matters

Binding affinity prediction changes the economics of protein design. Instead of synthesizing and testing hundreds of candidates to find a handful of binders, you computationally score thousands and only test the top-ranked designs. This reduces cost per hit by 10–50x and compresses timelines from months of iterative screening to weeks of targeted validation. As ML models continue to improve with more training data and better architectures, the accuracy gap between predicted and measured affinity will narrow further—making computational screening an increasingly essential component of modern protein engineering.

Want ML-Powered Affinity Scoring for Your Candidates?

Book a free 30-minute discovery call. I'll show you how AI binding prediction fits into your workflow.