We design more complex systems than anyone else

We are designing genomes and complex genomic systems with far greater novelty, accuracy and end performance than the previous state of the art. Our advances come from our hard work, our data and the diligent care we take to teach AI as much as possible about the underlying biology.

We are designing genomes and proteins with far greater novelty, controllability, and performance than the previous state of the art. Our advances come from our unique, proprietary dataset of real organisms and the diligent care we take to teach AI as much as possible about the underlying biology.

We're delivering step changes across R&D, from personalised medicine to plastic degradation

The deep learning models we build can, for the first time ever, understand complete biological context. This allows our platform to transform biological R&D outcomes in fields as varied and important as genome editing, biocatalysis, and agriculture.

More accurate structure predictions

than Google DeepMind's AlphaFold2, unlocking more reliable small molecule docking for larger and more complex proteins than ever before

40% more proteins annotated

than all other state-of-the-art algorithms, including CLEAN and Google's ProteInfer, allowing us to discover and classify the most difficult dark matter sequences

Controllable sequence generation

that leverages our dataset's superior diversity, context, and quality to design proteins and genomic systems to best match our partners' desired function and performance

Drawing from the best possible foundational data

AI performance improves dramatically with more diversity and more context

The dataset that a model is trained on defines the AI’s ‘imagination’: its ability to think creatively to solve problems.

Put simply, the AI will, by definition, never be able to ‘think outside this box’, as it can only ever recapitulate and reorder patterns that it has already been exposed to. Therefore, by expanding the training dataset, you quite literally allow it to think outside the box.

Out of necessity, all foundational models in biology today are trained on the same public datasets. Lacking diversity, consistency, context, quality, and clear commercialisation rights, these public datasets are seriously unfit for the AI era.

Two-thirds of all public sequencing data come from just 12 species, while there are at least a trillion species on our planet. 10% of all enzyme classes have only a single sequence representative in public datasets.

For generative AI applications, these are probably the worst class imbalance problems ever encountered. That's why we've created a categorically superior dataset that gives us categorically superior performance.

Technology highlights

Basecamp Research's platform applied across the bioeconomy

Protein Evolution, Inc. and Basecamp Research Aim to Make Polyurethane and Nylon Infinitely Recyclable with Expanded Strategic Collaboration

The companies will develop novel enzymes that break down difficult-to-recycle plastic waste to solve a major bottleneck in the fashion industry.

JM announces partnership with Basecamp Research to accelerate the adoption of biocatalysis solutions

This partnership combines Johnson Matthey’s catalysis expertise with Basecamp Research’s AI-enabled biodiversity genetic mapping to meet the growing demand of the pharmaceutical and chemicals industry.

Expanding the repertoire of recombinases that can integrate large DNA fragments into the human genome by over 30X

CRISPR for gene editing applications is beyond a doubt a milestone in modern biotechnology. Moving beyond edits, recent discoveries have enabled ‘gene writing’ technology, that is, the insertion of large DNA ‘cargoes’ into host genomes.

Deep evolutionary context

Our knowledge graph has 4X more sequence diversity and 20X more genomic context than public resources

Continuous growth

Our unique Nagoya-compliant data supply chain covers 5 continents and over 60% of global biomes

Case studies

[ High-impact outcomes for protein design ]

[ Selecting 3-4 examples from this notion page: https://www.notion.so/Basecamp-Research-Protein-Design-GenAI-Overview-7097d8e7c2134dc799fdb7fc751ede52?pvs=4:

- Reducing development time from 2 years to 1 month
- Zero-shot multi-domain design, beating best-in-class & freedom-to-operate for therapeutics customers
- Zero-shot design of ultra-rare chemistries
- Gene-writing therapeutic systems with safe human integration sites ]

Recent publications & views

Improving AlphaFold2 Performance with a Global Metagenomic & Biological Data Supply Chain

With higher protein sequence diversity captured in this dataset compared to existing public data, we apply this data advantage to the protein folding problem by MSA supplementation during inference of AlphaFold2.

HiFi-NN annotates the microbial dark matter with Enzyme Commission numbers

Here, we present HiFi-NN (Hierarchically-Finetuned Nearest Neighbor search) which annotates protein sequences with greater precision and recall than all existing deep learning methods.

ZymCTRL: a conditional language model for the controllable generation of artificial enzymes

Pre-print from our first collaboration with Dr Noelia Ferruz on an enzyme-specific language model to provide new opportunities to design purpose-built artificial enzymes.

The world's first foundational dataset purpose-built for biological AI

We target our expeditions based on your requirements

Contact us

Solve your product development bottlenecks