The Hardest Step in Materials Discovery? Making the Material

Matthew McDermott
May 15, 2025

AI is going to revolutionize the materials industry.

With today’s machine learning tools, a single researcher can go from a list of desired material properties to thousands of predicted candidate compounds in just hours. AI is poised to drive the discovery of the next generation of materials.

The bad news? Most of these predicted materials will never be successfully made in the lab.

While AI has supercharged materials discovery, the field itself has a long and often challenging history. Despite decades of modeling, there are still very few success stories where a material was computationally designed and later validated in the lab with the predicted properties. More often, modeling has been used retroactively to understand why something works after it’s already been discovered, not to predict it in advance.

Most new materials have historically been predicted using substitution and screening, which involves tweaking known structures and filtering them based on calculated properties. The latest generative models, such as Microsoft’s MatterGen (which was released as open-source earlier this year), are breaking that mold. These models creatively generate new structures, fine-tuned to user-specified target properties. Where MatterGen excels is in producing candidates that are predicted to be thermodynamically stable.

But here’s the problem: thermodynamically stable ≠ synthesizable.

Synthesis is a pathway problem

Synthesizing a chemical compound is like crossing a mountain range; you can’t simply go straight over the top. You need a viable path. Sometimes there’s a mountain pass that cuts through. But more often, you’re forced to take the long way around. In synthesis, a material becomes especially difficult to make when all the obvious pathways try to go straight over the mountain.

Some of the most exciting emerging materials are notoriously challenging to synthesize. Consider bismuth ferrite (BiFeO₃), a promising multiferroic. Nearly every synthesis attempt produces unwanted impurities, such as Bi₂Fe₄O₉ or Bi₂₅FeO₃₉. Why? For many reasons, including:

  • BiFeO₃ is only thermodynamically stable over a narrow window of conditions
  • Competing phases are kinetically favorable to form
  • The conventional recipe is especially sensitive to precursor quality and defects

Or consider LLZO (Li₇La₃Zr₂O₁₂), a leading solid-state battery electrolyte currently being commercialized. Making it involves high temperatures (~1000 °C), which volatilize lithium and promote the formation of the impurity La₂Zr₂O₇. Attempting to solve one of these issues can exacerbate the others.

These aren’t isolated cases. They’re representative of a broader challenge in materials chemistry, which is that synthesis is difficult, complex, and highly path-dependent. And this slows down our ability to discover and commercialize new materials.

Like crossing a mountain range, synthesis rarely follows a straight line. The easiest path may be the long way around, if you can find it.

Why AI hasn’t solved synthesis (yet)

Despite impressive progress in AI for materials, the scientific community is still far from developing an AI that can reliably predict how to make any new material.

Why? It’s a data problem. Simulating synthesis is fundamentally more complicated than simulating an atomic structure. Reaction pathways involve various factors, including time, temperature, atmosphere, pressure, defects, and grain boundaries, all of which operate across vast spatial and temporal scales. A single grain of sand has 10^20 atoms. Our best supercomputers today can only simulate 10^8 atoms simultaneously over a few picoseconds.
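To make that gap concrete, here is a minimal back-of-the-envelope sketch in Python. The atom counts are the ones quoted above. The time figures are assumptions chosen only for illustration: a 1-picosecond simulation window (standing in for "a few picoseconds") and a 1-hour heating step.

```python
# Atom counts quoted in the paragraph above.
atoms_in_sand_grain = 10**20
atoms_simulated = 10**8

# How many times larger a real grain is than the largest simulation.
size_gap = atoms_in_sand_grain // atoms_simulated

# Assumed values, for illustration only: a 1 ps simulation window
# compared with a 1-hour heating step in a furnace.
PICOSECONDS_PER_SECOND = 10**12
simulated_ps = 1
reaction_ps = 3600 * PICOSECONDS_PER_SECOND
time_gap = reaction_ps // simulated_ps

print(f"size gap: {size_gap:.0e}x")
print(f"time gap: {time_gap:.1e}x")
```

Under these assumptions the simulation falls short by about twelve orders of magnitude in size and more than fifteen in time, which is why brute-force simulation of a full synthesis is out of reach.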

We have been able to train AI models on atomic structures thanks to years of effort spent building well-curated materials datasets. These efforts date back to at least 2011, when the Materials Genome Initiative helped spur the development of the first computational materials databases using quantum mechanical modeling approaches, such as density functional theory (DFT), to calculate material properties. The Materials Project, the pioneer of such databases, currently has about 200,000 entries. Follow-up work has expanded these datasets to millions of entries for key properties, such as formation energy. But there is no equivalent database for synthesis.

Why not? Because building one would require experimentally testing millions of reaction combinations, including ones that fail, under every possible set of conditions. Even testing just binary reactions (A+B →) between 1000 compounds would require roughly 500,000 experiments (499,500 unique pairs, each tested under only a single set of conditions), well beyond what we can expect from most high-throughput materials chemistry labs, even those that run autonomously, 24 hours a day.
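The pair count is simple combinatorics, and a short Python sketch shows how quickly it outruns lab capacity. The throughput of 100 experiments per day is a hypothetical figure chosen for illustration, not a measured number for any real lab.

```python
import math

n_compounds = 1000

# Unique unordered pairs A+B drawn from 1000 compounds: C(1000, 2).
n_pairs = math.comb(n_compounds, 2)  # 499500

# Hypothetical throughput, assumed for illustration only.
experiments_per_day = 100

# One experiment per pair, i.e. a single temperature, time, and atmosphere.
days = n_pairs / experiments_per_day
years = days / 365

print(f"{n_pairs:,} pairs -> {days:,.0f} days (~{years:.1f} years)")
```

Even at that assumed pace, one pass through the pairs takes well over a decade, and that is before varying temperature, time, or atmosphere, or considering reactions with three or more precursors.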

Building a comprehensive synthesis dataset is not just expensive. It’s intractable.

Can we mine the scientific literature instead?

This seems like a great idea. After all, scientists have been publishing results from their experiments since the 17th century.

One notable effort by Kononova et al. (2019) involved scraping 32,000 synthesis recipes from the materials science literature to form an experimental dataset for synthesis. But there are significant limitations to this approach:

  1. Failed synthesis attempts (“negative results”) are almost never published.
  2. The scope of all chemical reactions tested is surprisingly narrow.

Researchers seem to avoid testing unconventional, “wacky” synthesis routes. And if they do, they often don’t spend the time to characterize the results and publish them.

For example, in the case of the ferroelectric barium titanate (BaTiO₃), 144 out of 164 recipe entries in the dataset use the same precursors, BaCO₃ + TiO₂. Only a few use less common options, such as BaO or Ba(OH)₂. While the most common route is often assumed to be best, that is not always the case; the BaCO₃/TiO₂ reaction is well known to proceed indirectly through intermediates (Ba₂TiO₄) and typically requires high temperatures (1000-1100 °C) and long heating times (4-8 hours). Its widespread use is driven more by convenience and convention than optimal performance.

Once a convenient route is found that is “good enough”, it tends to become the go-to approach. This isn’t good for science. Human bias in chemical experiment planning has even been shown to lead to less successful outcomes than randomly selected experiments. That means that, in the worst cases, centuries of scientific intuition can do more harm than good. For BaTiO₃, alternative recipes can easily be designed that outperform the conventional route; more on this case study in future blog posts.

Synthesis is often treated as a means to an end: noticed only when it fails, and dismissed as too complex to understand mechanistically. But that’s a problem we can solve.

We need better recipes

The list of theoretical materials grows daily, but unlike computational hours, lab experiments are expensive. So, how do you prioritize? Not by asking whether a material is theoretically synthesizable, but by asking: can I identify a viable, scalable recipe to make it?

That means a synthesis pathway that:

  • Produces the desired phase directly
  • Avoids problematic byproducts
  • Isn’t too sensitive to minor conditions
  • Is scalable to industrial production

Without that, a “breakthrough material” stays stuck on paper.

Truly novel materials are often difficult to make. If they were easy to synthesize, they would likely have been discovered already, often by accident while someone was trying to make something else. While we can mitigate this by searching for new materials in less-explored chemical spaces (e.g., tellurides, arsenides), these spaces tend to be less explored for good reason: the elements are difficult or unsafe to work with.

Our approach at Newfound Materials

At Newfound Materials, we believe synthesis is the missing link. Our AI-assisted platform takes a reaction network-based approach, generating hundreds of thousands of reaction pathways for any inorganic compound of interest.

Some of these start from common precursors. Others begin with intermediate phases rarely tested in the lab. These alternatives often reveal low-barrier synthesis routes, like finding a shortcut around the mountain rather than going over it.
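As a toy illustration of what enumerating routes can look like (this is not the Newfound Materials platform, and the function name `candidate_precursor_pairs` is hypothetical), the Python sketch below lists every pair of precursors that together supply all the elements of a target. It uses the BaTiO₃ precursors mentioned earlier, with element sets written out by hand. It does no thermodynamics or stoichiometric balancing; it only shows how the candidate list grows once less common precursors are allowed.

```python
from itertools import combinations

# Element sets written out by hand for compounds named in this post.
TARGET_ELEMENTS = {"Ba", "Ti", "O"}  # BaTiO3
PRECURSORS = {
    "BaCO3": {"Ba", "C", "O"},
    "TiO2": {"Ti", "O"},
    "BaO": {"Ba", "O"},
    "Ba(OH)2": {"Ba", "O", "H"},
}


def candidate_precursor_pairs(target_elements, precursors):
    """Return precursor pairs that together contain every target element.

    Extra elements (here C or H) are assumed to leave as byproducts.
    """
    pairs = []
    for (name_a, els_a), (name_b, els_b) in combinations(precursors.items(), 2):
        if target_elements <= (els_a | els_b):
            pairs.append((name_a, name_b))
    return pairs


pairs = candidate_precursor_pairs(TARGET_ELEMENTS, PRECURSORS)
for a, b in pairs:
    print(f"{a} + {b} -> BaTiO3 (+ byproducts)")
```

With these four precursors, three pairs pass the filter, only one of which is the conventional BaCO₃ + TiO₂ route. A real reaction network would also balance each reaction, include intermediate phases, and score every step, for example by reaction energy.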

We model these routes with thermodynamic principles, simulate phase evolution in a virtual reactor, and use machine-learned predictors to filter promising candidates. Unlike end-to-end black-box AI models, our system is grounded in chemistry and designed for real-world lab translation.

Once you generate enough potential routes, something remarkable happens: you can start to assess just how difficult (or easy!) a material will be to make.

Synthesis is not only the primary bottleneck in materials discovery; it’s also our most significant untapped lever. By searching for better recipes, we can discover more effective materials faster, smarter, and at scale.

By mapping the reaction space of inorganic materials, we discover synthesis pathways that balance thermodynamics, selectivity, and synthetic accessibility.