In Silico Screening & Ranking
Virtual screening and multi-objective ranking of protein candidates to reduce wet lab burden and focus experimental resources on the highest-value designs.
The Case for Virtual Screening
Computational protein design tools can generate thousands of candidate sequences in hours. The bottleneck is no longer design generation but design evaluation—deciding which of those thousands of candidates are worth synthesizing, expressing, and testing. In silico screening applies computational scoring functions to rank candidates before any wet lab work begins, transforming the problem from “screen everything and hope” to “synthesize the top 50 and expect hits.”
The economics are compelling. Gene synthesis costs $0.05–0.15 per base pair, expression and purification costs $500–2,000 per candidate, and binding assays add $50–200 per measurement. Screening 2,000 candidates experimentally costs $1–4 million. Screening them computationally and synthesizing only the top 50 costs under $100,000. The question is whether the computational scoring is accurate enough to capture the best candidates—and with modern tools, it increasingly is.
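The arithmetic above can be sketched in a few lines. This is a hypothetical cost model using the midpoints of the per-unit figures quoted in the text; the function name, the 1,000 bp gene length, and the default rates are illustrative assumptions, not figures from any real pipeline.

```python
def campaign_cost(n_candidates, gene_bp=1000,
                  synth_per_bp=0.10, expression=1250, assay=125):
    """Rough wet lab cost in dollars, at midpoint per-unit rates:
    $0.10/bp synthesis, $1,250 expression/purification, $125 per assay."""
    return n_candidates * (gene_bp * synth_per_bp + expression + assay)

screen_all = campaign_cost(2000)  # test every design experimentally
top_50 = campaign_cost(50)        # synthesize only the computational shortlist
```

At these rates, screening all 2,000 candidates lands near $3 million, while the top-50 shortlist stays under $100,000, consistent with the ranges quoted above.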
Scoring Functions and Metrics
No single scoring function captures all aspects of protein quality. Effective in silico screening uses multiple complementary metrics. Binding affinity predictions—from physics-based methods (Rosetta interface energy, FoldX binding energy) or ML models (ESM2-based KD predictors)—estimate how tightly a candidate will bind its target. Structural confidence scores (pLDDT from AlphaFold/ESMFold, ipTM from Boltz-2 co-folding) indicate whether the predicted structure is reliable and whether the complex is likely to form as modeled.
Developability metrics add a manufacturing lens: predicted aggregation propensity, surface hydrophobicity patches, charge asymmetry, and sequence liability flags (deamidation sites, unpaired cysteines, glycosylation motifs in CDRs). Sequence diversity metrics ensure that the selected candidates span different sequence clusters rather than converging on a single solution, preserving backup options if the top candidates fail experimentally.
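A liability scan of the kind described can be sketched with simple motif rules. This assumes the standard motif definitions: NG dipeptides as deamidation hotspots, N-X-S/T sequons (X not proline) for N-glycosylation, and an odd cysteine count as a crude proxy for an unpaired cysteine. The function name and flag strings are illustrative; a production scan would use position-aware, structure-informed rules.

```python
import re

def liability_flags(seq):
    """Flag common sequence liabilities in an amino acid string."""
    flags = []
    if "NG" in seq:                      # deamidation-prone Asn-Gly motif
        flags.append("deamidation (NG)")
    if re.search(r"N[^P][ST]", seq):     # N-glycosylation sequon N-X-S/T, X != P
        flags.append("N-glycosylation sequon (N-X-S/T)")
    if seq.count("C") % 2 == 1:          # odd Cys count suggests an unpaired cysteine
        flags.append("unpaired cysteine")
    return flags
```

For example, `liability_flags("ACDNGSC")` flags both the deamidation motif and the glycosylation sequon, while a sequence with no motifs returns an empty list.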
Multi-Objective Ranking
The challenge in multi-metric screening is that candidates rarely rank first on every dimension simultaneously. A design with the highest predicted binding affinity may have poor structural confidence, while the most stable candidate may bind only weakly. Multi-objective ranking addresses this through hard gates and weighted scoring. Hard gates eliminate candidates that fail minimum thresholds—for example, pLDDT below 70 or predicted aggregation above 30%—regardless of their other scores. Among candidates that pass all gates, weighted composite scores or Pareto-front analysis identifies the best overall designs.
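The gate-then-rank scheme can be sketched as follows. The gates mirror the thresholds named above (pLDDT ≥ 70, predicted aggregation ≤ 30%); the metric names, weights, and the toy candidate records are illustrative assumptions, and affinity here is assumed to be expressed so that higher is better (e.g. −log10 KD).

```python
def passes_gates(c):
    """Hard gates: reject regardless of other scores."""
    return c["plddt"] >= 70 and c["agg"] <= 0.30

def composite(c, w_aff=0.5, w_conf=0.3, w_dev=0.2):
    """Weighted score on roughly [0, 1]-scaled metrics; higher is better."""
    return w_aff * c["affinity"] / 10 + w_conf * c["plddt"] / 100 + w_dev * (1 - c["agg"])

def pareto_front(cands, keys=("affinity", "plddt")):
    """Candidates not dominated on any key (all keys maximized)."""
    return [a for a in cands
            if not any(all(b[k] >= a[k] for k in keys)
                       and any(b[k] > a[k] for k in keys)
                       for b in cands if b is not a)]

candidates = [
    {"id": "d1", "affinity": 9.2, "plddt": 85, "agg": 0.10},
    {"id": "d2", "affinity": 8.1, "plddt": 92, "agg": 0.05},
    {"id": "d3", "affinity": 9.5, "plddt": 65, "agg": 0.20},  # fails pLDDT gate
]
gated = [c for c in candidates if passes_gates(c)]
ranked = sorted(gated, key=composite, reverse=True)
```

Note that `d3` has the best predicted affinity but is eliminated by the pLDDT gate before scoring, which is exactly the point: gates encode non-negotiable minima, while the composite score and Pareto front trade off among the survivors.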
The practical output of in silico screening is a ranked shortlist of 20–100 candidates with full scoring data: predicted binding affinity, structural models, confidence metrics, developability flags, and sequence diversity analysis. This shortlist, along with the scoring rationale, gives experimental teams the information they need to make informed decisions about which candidates to advance and in what order.
Why It Matters
In silico screening is not about replacing experimental validation—it is about making experimental validation more efficient. By filtering thousands of computational designs down to a focused shortlist before any protein is synthesized, you reduce cost by an order of magnitude, accelerate timelines from months to weeks, and increase the probability that your experimental hits will actually progress through development. The better your computational screen, the fewer dead ends in the lab.
Have Candidates That Need Ranking?
Book a free 30-minute discovery call. I'll help you prioritize your designs with multi-objective scoring.