// SYSTEM_ONLINE

Aegis for Biosecurity Screening

Detecting AI-generated malicious proteins and synthetic toxins beyond simple sequence homology.

Team Cholix — Rick Lee, Zhiyee Liang

Imperial College London · AminoAnalytica

Cholix protein interacting with its substrate NAD⁺

// IDENTIFYING_THE_GAP

Structural security at sequence speed.

  • Current sequence-level screening is easily evaded
  • Expensive structure prediction tools can't scale to high-throughput screening
  • Aegis infers structural context from sequence — bridging that gap
  • Detects toxin-specific interface fingerprints without folding a structure

// SEQUENCE_ECOSYSTEM

A world within your sequence.

A reference fingerprint is extracted from a known toxin-substrate interface. Aegis then searches any query sequence for a match.

Residue Embeddings

Each residue gets a high-dimensional vector encoding its biochemical context.
e.g. a catalytic serine reads differently from any other serine.

ESM2 · 1280-DIM VECTORS
+

Contact Maps

Pairwise contact probabilities for all residue pairs capture spatial proximity without ever predicting a 3D structure.

P(contact)ᵢⱼ · L×L MATRIX
yields

Interface Fingerprint

Residues with the right embeddings within spatial proximity form a unique fingerprint for a specific toxin-substrate interface.

e.g. Cholix · NAD⁺ interface
🔍

Aegis scores any query sequence against the reference fingerprint, flagging matches regardless of sequence identity to the original toxin.
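
The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the Aegis implementation: the function names (`cosine`, `fingerprint_score`), the best-match rule, and the way embedding similarity and contact probability are combined are all assumptions, and the toy 3-dimensional vectors stand in for the 1280-dimensional ESM2 embeddings.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fingerprint_score(ref_embeddings, query_embeddings, query_contacts):
    """Hypothetical score: embedding match weighted by spatial proximity.

    ref_embeddings   -- embeddings of the reference interface residues
    query_embeddings -- one embedding per query residue
    query_contacts   -- L x L matrix of contact probabilities for the query
    """
    matched, sims = [], []
    for ref in ref_embeddings:
        # Assumption: each reference residue maps to its most similar query residue.
        sim, idx = max((cosine(ref, q), i) for i, q in enumerate(query_embeddings))
        matched.append(idx)
        sims.append(sim)
    mean_sim = sum(sims) / len(sims)
    # Assumption: the matched residues should be predicted to lie close together.
    pairs = [(i, j) for i, j in combinations(matched, 2) if i != j]
    if not pairs:
        return 0.0
    mean_contact = sum(query_contacts[i][j] for i, j in pairs) / len(pairs)
    return mean_sim * mean_contact


# Toy data: two reference interface residues, three query residues.
ref = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
near = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.9], [0.1, 0.9, 1.0]]
far = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.05], [0.1, 0.05, 1.0]]

score_near = fingerprint_score(ref, query, near)
score_far = fingerprint_score(ref, query, far)
```

In this toy case the same residues match in both queries, but the score drops when the matched residues are not predicted to be in contact. That is the intended behaviour of combining the two signals: residue identity alone is not enough without spatial proximity.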

// CASE_STUDY

Case Study: Cholix Toxin

METHODS
  • Benchmarking Aegis against Commec (IBBIS), a sequence-based biosecurity screening tool
  • Cholix toxin sequences redesigned using ProteinMPNN at two sampling temperatures (T = 0.3 and T = 1.5) to span a spectrum of mutational loads
  • Expected benign — mutations at conserved catalytic residues
  • Expected toxic — mutations outside the binding site
  • 30 sequences per class per temperature (120 total) put through DNA obfuscation to diverge from wild-type
  • Screened with Commec v1.0.0 (biorisk step only); sequences matching known virulence factors are flagged as Warning

T = 0.3: all sequences flagged

Group     n    Biorisk
Benign    30   Warning
Toxic     30   Warning

T = 1.5: toxic sequences slipped through

Group     n    Biorisk
Benign    30   Pass
Toxic     15   Warning
Toxic     15   Pass (missed)
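
The two tables above can be reduced to per-group flag rates. The sketch below only tallies the counts reported in the tables; the dictionary layout and the helper name `flag_rate` are illustrative choices, not part of Commec.

```python
# Commec biorisk outcomes copied from the tables above:
# counts[temperature][group][outcome] = number of sequences
counts = {
    0.3: {"Benign": {"Warning": 30, "Pass": 0}, "Toxic": {"Warning": 30, "Pass": 0}},
    1.5: {"Benign": {"Warning": 0, "Pass": 30}, "Toxic": {"Warning": 15, "Pass": 15}},
}


def flag_rate(outcomes):
    """Fraction of sequences in a group that Commec flagged as Warning."""
    total = outcomes["Warning"] + outcomes["Pass"]
    return outcomes["Warning"] / total


for temperature, groups in counts.items():
    for group, outcomes in groups.items():
        print(f"T={temperature} {group}: {flag_rate(outcomes):.0%} flagged")

# Toxic sequences that passed the biorisk step across both temperatures.
missed_toxic = sum(groups["Toxic"]["Pass"] for groups in counts.values())
total_sequences = sum(
    sum(outcomes.values()) for groups in counts.values() for outcomes in groups.values()
)
```

At T = 0.3 every sequence is flagged, including those expected to be benign; at T = 1.5 half of the toxic set passes.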

// CASE_STUDY

Case Study: Cholix Toxin

120 synthetic homologues of Cholix toxin were screened with Commec and Aegis. 15 toxic sequences evaded Commec entirely — Aegis flagged every one.

[Figure: Aegis score histogram. Blue sequences passed Commec; many still score high on Aegis.]

[Figure: Confusion matrix.]

15 toxic sequences missed by Commec, all caught by Aegis
0 toxic sequences missed by Aegis

// STRUCTURAL_VERIFICATION

Structural Homology: TM-score of targeted Cholix domains

3 domains: catalytic · translocation · receptor-binding

Sequence            Max TM-score₂   Best target        Detection
Toxic sequence 17   0.859           catalytic domain   FLAGGED
Toxic sequence 16   0.780           catalytic domain   FLAGGED
Toxic sequence 4    0.762           catalytic domain   FLAGGED
Toxic sequence 30   0.761           catalytic domain   FLAGGED
Toxic sequence 19   0.749           catalytic domain   FLAGGED
Toxic sequence 20   0.746           catalytic domain   FLAGGED
Toxic sequence 29   0.734           catalytic domain   FLAGGED
Toxic sequence 24   0.726           catalytic domain   FLAGGED
Toxic sequence 21   0.724           catalytic domain   FLAGGED
Toxic sequence 6    0.720           catalytic domain   FLAGGED
Toxic sequence 9    0.720           catalytic domain   FLAGGED
Toxic sequence 26   0.707           catalytic domain   FLAGGED
Toxic sequence 13   0.699           catalytic domain   BORDERLINE
Toxic sequence 25   0.669           catalytic domain   BORDERLINE
Toxic sequence 5    0.652           catalytic domain   BORDERLINE

TM-score₂ ≥ 0.7 — same fold (flagged)
TM-score₂ 0.5–0.7 — borderline
TM-score₂ < 0.5 — dissimilar

TM-score₂: normalised by target domain length
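
For reference, the standard TM-score definition (Zhang & Skolnick, 2004) and the thresholds above can be written out as follows. This is a sketch under assumptions: it scores a single, already superimposed alignment, whereas a real TM-score is the maximum over superpositions as computed by tools such as TM-align. The function names and the `DISSIMILAR` label are illustrative. Normalising by the target domain length follows the TM-score₂ convention stated above.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances     -- distances in angstroms between aligned residue pairs
    target_length -- length of the target domain used for normalisation
    """
    if target_length <= 21:
        # The d0 formula below is not meaningful for very short targets.
        raise ValueError("sketch assumes target domains longer than 21 residues")
    # Standard length-dependent distance scale d0.
    d0 = 1.24 * (target_length - 15) ** (1 / 3) - 1.8
    return sum(1 / (1 + (d / d0) ** 2) for d in distances) / target_length


def classify(score):
    """Map a TM-score onto the detection labels used in the table."""
    if score >= 0.7:
        return "FLAGGED"
    if score >= 0.5:
        return "BORDERLINE"
    return "DISSIMILAR"


# A perfect superposition covering the whole target scores 1.0;
# covering only half of the target caps the score at 0.5.
full_match = tm_score([0.0] * 100, 100)
half_match = tm_score([0.0] * 50, 100)
```

Because the sum runs over aligned pairs but is divided by the full target length, a query that matches only part of the domain cannot reach a high score, which is why the normalisation choice matters when comparing against individual domains.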

CONCLUSION

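Of the 15 sequences flagged by Aegis, 12 score TM-score₂ ≥ 0.7 against the catalytic domain of Cholix toxin (same fold) and the remaining 3 are borderline (0.65 to 0.70). All 15 therefore share a similar fold with the catalytic domain and are likely to be toxic.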

// CONCLUSIONS

Conclusions

  • Sequence-based screening alone can be evaded by sequence obfuscation
  • Structure-based screening catches what sequence screening misses
  • Combining interface-signature detection with structural homology, Aegis flagged all 15 toxic Cholix sequences that Commec missed