Accelerating the discovery of general reaction conditions via machine learning

By Zhuoying Lin

This article was originally published by UCLA Chemistry & Biochemistry

A machine learning optimization approach developed by Professor Abigail Doyle’s lab, led by graduate student Jason Wang, in collaboration with scientists at Bristol Myers Squibb (BMS), significantly accelerates advances in synthetic chemistry by helping chemists identify the most generally applicable conditions with minimal experiments.

“How do you find the best coffee in town? Do you simply stick with the cafe that has your all-time favorite, or would you explore other places at the expense of time and the risk of getting a lesser brew?”

This dilemma, the tradeoff between exploiting the current best option and exploring potentially better ones, is the classic multi-armed bandit problem. It also mirrors the challenge chemists face when optimizing a new reaction.

Traditionally, chemists identify working reaction conditions for target substrates by running exhaustive experiments on a model substrate and then applying the best conditions found, hoping they will work similarly well despite potential differences in reactivity between substrates. The optimization process becomes intensely laborious when the list of conditions, varying in solvents, catalysts, temperatures and more, amounts to hundreds or even thousands of combinations to test by hand. Consequently, advances in synthetic chemistry are constrained by the time and resources available for experimentation.

Recognizing the need to identify the most generally applicable conditions efficiently, the Doyle lab and scientists at BMS combined their expertise to develop a universal reaction optimization model inspired by the multi-armed bandit problem. In contrast to prior approaches that assign equal importance to every individual substrate when evaluating conditions, bandit optimization algorithms prioritize conditions that are likely to maximize reaction yields across all substrates under consideration. The efficient sampling strategies these algorithms employ can also identify optimal conditions with only a fraction of the experiments chemists would typically need. Their work was recently published in the journal Nature.

“Generally applicable reaction conditions are highly sought because chemists want to apply previously identified reaction conditions to different, but related molecules than those initially tested and have the conditions work on the first try without additional optimization required to isolate the reaction product,” said Abby Doyle. “As an example, generally applicable reaction conditions can facilitate medicinal chemists’ tasks of making libraries of molecules to test for biological activity as quickly and efficiently as possible. However, synthetic chemists do not have a deliberate and data-efficient approach to identifying these general conditions during reaction optimization, which is what we sought in this study.”

How it works

The optimization model begins by proposing initial conditions for chemists to test. After receiving the experimental results as inputs, the algorithm updates its beliefs and suggests a new set of conditions for the next round. As the process iterates, the model increasingly favors the conditions it believes give better average yields across all substrates, sampling them more often until the experimental budget is reached.

“Because inferior conditions are sampled less over time, this approach can be highly data efficient, in some cases achieving over 90% accuracy after sampling only 2% of all possible reactions,” said Jason Wang, the lead author and a fourth-year Ph.D. student in the Doyle Lab. “Bandit optimization algorithms are also well suited for nonstationary problems, where the substrate search space changes over time. This is especially beneficial in practice, as the target substrates in consideration can often change during optimization.”
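The propose, test, update loop described above can be illustrated with a minimal sketch. It uses UCB1, a standard textbook bandit algorithm, purely for illustration; this article does not say which algorithms the published software implements. The condition labels, the substrate names, and the simulated_experiment function with its made-up yields are all hypothetical stand-ins for real bench experiments.

```python
import math
import random


def run_bandit(conditions, substrates, run_experiment, budget):
    """UCB1 over reaction conditions; reward is a yield scaled to [0, 1]."""
    counts = {c: 0 for c in conditions}    # experiments run per condition
    totals = {c: 0.0 for c in conditions}  # summed yields per condition

    for t in range(1, budget + 1):
        untried = [c for c in conditions if counts[c] == 0]
        if untried:
            # Try every condition once before comparing them.
            choice = untried[0]
        else:
            # Mean observed yield plus a bonus that shrinks with sampling.
            choice = max(
                conditions,
                key=lambda c: totals[c] / counts[c]
                + math.sqrt(2 * math.log(t) / counts[c]),
            )
        # Assumption: each round tests the condition on a random substrate.
        substrate = random.choice(substrates)
        counts[choice] += 1
        totals[choice] += run_experiment(choice, substrate)

    means = {c: totals[c] / counts[c] for c in conditions if counts[c]}
    return counts, means


def simulated_experiment(condition, substrate):
    """Hypothetical stand-in for the lab: noisy yield around a made-up mean."""
    base = {"A": 0.30, "B": 0.50, "C": 0.80}[condition]
    return min(1.0, max(0.0, random.gauss(base, 0.10)))


random.seed(0)
counts, means = run_bandit(
    ["A", "B", "C"], ["s1", "s2", "s3", "s4"], simulated_experiment, budget=150
)
best = max(means, key=means.get)
print(best, counts)
```

Because the exploration bonus shrinks as a condition accumulates samples, conditions with lower observed yields are revisited less and less often, which is the behavior the quote above describes.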

Figure 1. Model architecture and workflow of bandit algorithms during reaction optimization. (Image courtesy: UCLA Chemistry & Biochemistry)

Model validation

To assess its accuracy and effectiveness, the team applied the algorithm to three reactions commonly used in pharmaceutical chemistry. The first study, on palladium-catalyzed imidazole direct C5-arylation, covered a total of 1536 reactions from 24 ligands and 64 different substrate pairings of imidazoles and aryl bromides. Using the 1536 experimentally conducted reactions as ground truth, the bandit model correctly identified the top five conditions with an average accuracy of 85% after running only 200 experiments.
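One way such a benchmark against a complete ground-truth table could be set up is sketched below. Everything in it is an assumption made for illustration: the yields are synthetic random numbers rather than the published data, epsilon-greedy stands in for whatever algorithms the team used, and "accuracy" is taken to mean the fraction of repeated simulated runs whose top-ranked ligand falls within the true top five by average yield, which may differ from the paper's definition. Only the dimensions (24 ligands, 64 substrate pairings, a budget of 200 experiments) come from the article.

```python
import random

N_LIGANDS, N_PAIRS, BUDGET, TOP_K = 24, 64, 200, 5


def make_synthetic_table(rng):
    """Made-up yields in [0, 1]: a per-ligand level plus per-reaction noise."""
    table = []
    for _ in range(N_LIGANDS):
        level = rng.uniform(0.1, 0.9)
        table.append(
            [min(1.0, max(0.0, rng.gauss(level, 0.15))) for _ in range(N_PAIRS)]
        )
    return table


def pick_best_ligand(table, rng, epsilon=0.2):
    """Epsilon-greedy run over BUDGET lookups; returns the top-ranked ligand."""
    counts = [0] * N_LIGANDS
    totals = [0.0] * N_LIGANDS

    def mean(i):
        return totals[i] / counts[i] if counts[i] else 0.0

    for step in range(BUDGET):
        if step < N_LIGANDS:
            ligand = step  # sample every ligand once first
        elif rng.random() < epsilon:
            ligand = rng.randrange(N_LIGANDS)  # explore
        else:
            ligand = max(range(N_LIGANDS), key=mean)  # exploit
        # Substrate pairings are drawn with replacement, so a lookup may repeat.
        pair = rng.randrange(N_PAIRS)
        counts[ligand] += 1
        totals[ligand] += table[ligand][pair]
    return max(range(N_LIGANDS), key=mean)


def top_k_accuracy(table, n_runs=100, seed=0):
    """Fraction of runs whose chosen ligand is in the true top TOP_K."""
    rng = random.Random(seed)
    true_means = [sum(row) / N_PAIRS for row in table]
    ranked = sorted(range(N_LIGANDS), key=true_means.__getitem__, reverse=True)
    true_top = set(ranked[:TOP_K])
    hits = sum(pick_best_ligand(table, rng) in true_top for _ in range(n_runs))
    return hits / n_runs


table = make_synthetic_table(random.Random(42))
accuracy = top_k_accuracy(table)
n_reactions = N_LIGANDS * N_PAIRS        # 24 * 64 = 1536
fraction_sampled = BUDGET / n_reactions  # about 0.13
print(f"accuracy={accuracy:.2f}, fraction sampled={fraction_sampled:.2f}")
```

The accuracy this prints reflects only the synthetic table and should not be compared with the 85% figure reported for the real dataset.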

In the second study, the team applied the bandit model to amide coupling, the most commonly practiced reaction in medicinal chemistry. Following the model’s suggestions, the researchers identified and ranked the top conditions after running only 12% of the entire reaction scope. This ranking was confirmed by collecting the experimental data for the remaining 88% of the reactions. In the third study, on base-promoted phenol alkylation with alkyl mesylates, the bandit model’s top result was experimentally verified to be more generally applicable than the benchmark conditions.

Nevertheless, as useful as the tool can be, machine learning doesn’t solve everything.

“Given the typical experimental budget, this approach is not suitable for the evaluation of thousands of possible conditions. Expert chemists are still needed to pre-filter conditions to ensure some initial reactivity,” said Wang. “Because the model has no sharing of chemical information between conditions, a more direct approach with knowledge transfer from existing conditions to new conditions using human expertise is still desired.”

The open-source software package is available online. The paper “Identifying general reaction conditions via bandit optimization” was authored by Prof. Abigail Doyle, Jason Wang and Mai-Jan Tom of UCLA; Jason Stevens, Dung Golden, Jun Li, Jose Tabora, David Primer, Bo Hao, David Del Valle, Stacey DiSomma, Ariel Furman, G. Greg Zipp and James Paulson of Bristol Myers Squibb; Stavros Kariofillis, Marvin Parasram and Benjamin Shields of Princeton University; Sergey Melnikov of Spectrix Analytical Services; and Bao Hao of Janssen.

The research was supported by grants from Bristol Myers Squibb (BMS), the Princeton Catalysis Initiative, the NSF under the CCI Center for Computer Assisted Synthesis, and the Dreyfus Program for Machine Learning in the Chemical Sciences and Engineering.

About the lead authors

Professor Abigail G. Doyle (pictured above left) received her A.B. and A.M. summa cum laude in Chemistry and Chemical Biology from Harvard University in 2002 and her Ph.D. from the same department in 2008. Professor Doyle began her independent academic career in the Department of Chemistry at Princeton University in 2008. In 2021, she moved to UCLA as the Saul Winstein Chair in Organic Chemistry.

Jason Wang (pictured above right) received his B.A. in biochemistry and B.A. in computer science from Columbia University in 2020. After a year at Princeton University, he moved with Professor Abigail Doyle to UCLA in 2021 to continue his Ph.D. research on integrating computer science tools into synthetic organic chemistry.

Article by Zhuoying Lin, UCLA Department of Chemistry & Biochemistry, zylin@g.ucla.edu.