
Homology Modeling

Predicting protein structure from sequence using experimentally solved templates—the workhorse of structural biology for three decades.

How It Works

Homology modeling, also called comparative modeling, exploits a fundamental principle of structural biology: proteins with similar sequences generally adopt similar three-dimensional folds. If your target protein shares significant sequence identity with a protein whose structure has already been solved experimentally (by X-ray crystallography, cryo-EM, or NMR), you can use that solved structure as a template to build a reliable model of your target.

The workflow follows four stages. First, template identification: searching the PDB for structures with high sequence similarity to your query, typically using BLAST or HHpred. Second, sequence alignment: aligning your target sequence to the template, paying careful attention to gaps in loop regions and insertions that don't map cleanly to the template backbone. Third, model building: copying backbone coordinates from aligned regions, rebuilding side chains, and modeling loops that lack template coverage. Tools like SWISS-MODEL, MODELLER, and RosettaCM handle this step with varying degrees of automation. Fourth, refinement and validation: energy minimization to resolve steric clashes, followed by quality assessment using metrics like DOPE scores, QMEANDisCo, or MolProbity.
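As a small illustration of what the alignment stage produces, the sketch below computes percent sequence identity from a target-template alignment that has already been generated. It is a minimal example, not part of any tool named above: the function name `percent_identity`, the toy sequences, and the choice to count identity only over columns where both sequences have a residue are all assumptions made here. Other definitions of identity (for example, dividing by the full target length) give different numbers.

```python
def percent_identity(target_aln: str, template_aln: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Both inputs are rows of the same pairwise alignment, with '-' for gaps.
    Gap columns are ignored (an assumption; other conventions exist).
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")

    aligned = 0    # columns with a residue in both rows
    identical = 0  # aligned columns where the residues match
    for t, s in zip(target_aln.upper(), template_aln.upper()):
        if t == "-" or s == "-":
            continue  # skip columns containing a gap
        aligned += 1
        if t == s:
            identical += 1

    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Toy alignment with made-up sequences, for illustration only.
target = "MKT-AYIAKQRQISFVK"
template = "MKTLAYLAKQ--ISFIK"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity over aligned columns")
```

In a real project the aligned rows would come from the alignment tool's output rather than being typed by hand.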

At sequence identities above 50%, homology models are typically accurate enough for drug design and binding site analysis. Between 30% and 50%, the overall fold is usually correct, but loop conformations and side-chain packing become less reliable. Below roughly 30%, in the so-called twilight zone, template detection itself becomes uncertain and model quality drops sharply.
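The identity ranges above can be written down as a simple lookup. This is only a sketch of the rule of thumb in the text: the function name `identity_regime`, the labels it returns, and the handling of the exact boundary values (30% and 50% are both placed in the middle band) are assumptions made for illustration. The thresholds are guidelines, not hard cutoffs.

```python
def identity_regime(percent_identity: float) -> str:
    """Map target-template sequence identity (in percent) to a rough regime.

    Boundary handling is an assumption: exactly 30% and exactly 50%
    both fall into the intermediate band.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity > 50.0:
        return "high: typically usable for binding site analysis"
    if percent_identity >= 30.0:
        return "intermediate: fold usually correct, loops and side chains less reliable"
    return "twilight zone: template detection uncertain"


for value in (72.0, 41.5, 22.0):
    print(value, "->", identity_regime(value))
```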

When to Use Homology Modeling vs. AlphaFold

AlphaFold2 and its successors have fundamentally changed structural biology. For single-domain, monomeric proteins, AlphaFold often produces models that approach experimental accuracy, even without a close template. So when does homology modeling still matter?

Homology modeling retains advantages in several practical scenarios. When you need to model a specific conformational state—an open vs. closed form of an enzyme, a ligand-bound conformation, or a particular oligomeric assembly—template selection lets you choose which state to model. AlphaFold tends to predict a single dominant conformation and can struggle with multi-state proteins. Homology modeling also gives you explicit control over which template features to preserve, which matters when you're engineering a protein and need the model to reflect a specific structural context rather than a consensus prediction.

In practice, most computational protein engineering workflows now start with AlphaFold for initial structure prediction and fall back to template-based modeling when conformational context, oligomeric state, or ligand binding geometry matters. The two approaches are complementary, not competing.

Limitations to Keep in Mind

No model is better than its template. Regions without template coverage (long loops, disordered tails, insertions unique to your target) have to be modeled ab initio and carry higher uncertainty. Glycosylation and other post-translational modifications, as well as crystal-packing artifacts in the template, can all introduce errors that propagate into your model. Always validate critical structural features against experimental data when available, and treat loop conformations in low-identity regions as hypotheses rather than facts.
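One practical way to act on this is to list the target residues that have no template coverage before trusting a model. The sketch below does that from a target-template alignment. It is a minimal illustration under stated assumptions: the function name `uncovered_segments`, the `min_len` parameter, and the toy sequences are hypothetical, and residue numbers are 1-based positions in the ungapped target sequence.

```python
from typing import List, Tuple


def uncovered_segments(
    target_aln: str, template_aln: str, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return target residue ranges (1-based, inclusive) aligned to template gaps.

    Columns where the target itself has a gap do not advance the target
    numbering and do not split a run of uncovered residues.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    pos = 0       # current residue number in the ungapped target
    start = None  # first residue of the uncovered run in progress, if any
    for t, s in zip(target_aln, template_aln):
        if t == "-":
            continue  # no target residue in this column
        pos += 1
        if s == "-":
            if start is None:
                start = pos  # a new uncovered run begins here
        elif start is not None:
            segments.append((start, pos - 1))  # run ended at previous residue
            start = None
    if start is not None:
        segments.append((start, pos))  # run extends to the C-terminus

    return [(a, b) for a, b in segments if b - a + 1 >= min_len]


# Toy alignment with made-up sequences: a 2-residue insertion and a 4-residue tail.
target_row = "MKTAYRQISFVKGGSG"
template_row = "MKTAY--ISFVK----"
all_gaps = uncovered_segments(target_row, template_row)
long_gaps = uncovered_segments(target_row, template_row, min_len=3)
print(all_gaps, long_gaps)
```

Segments flagged this way are the ones to treat as hypotheses, in line with the caution above about loops and tails.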

Why It Matters

Structure drives function. Whether you're identifying druggable pockets, designing mutations to improve stability, or engineering binding interfaces, a reliable structural model is the foundation of every rational design decision. Homology modeling gives you that foundation quickly and cheaply—often in minutes rather than the months required for experimental structure determination.
