The use of nuclear magnetic resonance (NMR) to obtain rich structural information about molecules has led to significant scientific breakthroughs since its discovery in the mid-20th century. Advances in NMR have benefited from innovations in technology, which have brought new capabilities as well as enhanced workflows.
But it’s the rapid expansion of computing capabilities since the turn of the century that has set the stage for the latest evolution of NMR, particularly with the utilization of artificial intelligence (AI), machine learning (ML) and deep learning.
The advanced analytical power of these techniques is particularly useful in fields with complex computing and data needs, such as NMR. It is allowing scientists to automate complex and time-consuming tasks, making NMR more efficient.
AI and ML are rapidly evolving in multiple disciplines, made possible by the exponential expansion of computing technology. These advances are transforming the way scientists can access and analyze data.
Many ML approaches are part of the present and future landscape of NMR research, including statistical boosting, simulated annealing, and principal component analysis. Researchers are currently exploring AI and ML for signal processing, baseline and phase correction, peak detection, assignment, analysis of complex mixtures, recapitulating quantum-mechanical density-functional theory calculations, and structure determination and prediction. It’s a complicated process with misleadingly simple steps.
“All the superhuman performance that you’ve read about has been delivered by a deceptively simple recipe,” Professor David Donoho, Ph.D., Stanford University, explained. “Step 1 is you get access to massive data, which of course should be relevant. Step 2 is you formulate a challenge task, which, if it’s solved, is going to be valuable for your application. Step 3 is you're going to go out and find out about someone else’s deep learning model. And then Step 4 is you tinker like mad. What enables all of this is that you can now use highly advanced computing technology to achieve results.”
Phase and baseline corrections are important processing steps in the analysis of NMR spectra, and many different approaches have been developed to perform these tasks automatically. While these methods generally perform well, many struggle when applied to spectra with high signal densities, such as proton spectra. Deep learning has been shown to achieve excellent results in recognition and segmentation tasks. In NMR spectroscopy, it has been used to support users with spectra processing and interpretation.
A team of Bruker scientists recently introduced a deep learning-based method for phase and baseline correction of 1D 1H NMR spectra. Currently available in Bruker TopSpin software 4.1.3, the algorithm provides consistently better correction of phase and baseline for both low- and high-field spectra, even reaching human-level accuracy in phase correction. The new method marks a further step towards the fully automated analysis of NMR spectra, providing a more robust phase and baseline correction method that is suitable for the new Bruker Fourier 80 high-performance benchtop NMR system, as well as high-field Bruker spectrometers.
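To make the task concrete, here is a minimal sketch of a classical, non-learning approach to zero-order phase correction. It is not Bruker's deep learning method, whose details the article does not describe. It assumes a noise-free synthetic spectrum of Lorentzian lines and picks the phase that minimizes the amount of negative intensity in the real part. The function names, the penalty, and the 1-degree search grid are illustrative choices.

```python
import numpy as np

def lorentzian_spectrum(freqs, centers, widths, amplitudes):
    """Sum of complex Lorentzian lines (real part absorptive, imaginary part dispersive)."""
    spec = np.zeros(freqs.shape, dtype=complex)
    for c, w, a in zip(centers, widths, amplitudes):
        x = (freqs - c) / w
        spec += a / (1.0 + 1j * x)
    return spec

def negative_penalty(real_part):
    """Sum of squared negative intensities; zero for a purely absorptive spectrum."""
    neg = np.minimum(real_part, 0.0)
    return float(np.sum(neg ** 2))

def estimate_zero_order_phase(spec, step_deg=1.0):
    """Grid-search the zero-order phase (in degrees) that minimizes the negative penalty."""
    candidates = np.arange(-180.0, 180.0, step_deg)
    scores = [
        negative_penalty((spec * np.exp(-1j * np.deg2rad(p))).real)
        for p in candidates
    ]
    return float(candidates[int(np.argmin(scores))])

# Synthetic, noise-free test spectrum (all values are illustrative assumptions).
freqs = np.linspace(-500.0, 500.0, 4096)
clean = lorentzian_spectrum(
    freqs,
    centers=[-200.0, 50.0, 300.0],
    widths=[2.0, 3.0, 2.5],
    amplitudes=[1.0, 0.6, 0.8],
)

# Apply a known zero-order phase error, then try to recover it.
true_phase_deg = 40.0
distorted = clean * np.exp(1j * np.deg2rad(true_phase_deg))
estimated_phase_deg = estimate_zero_order_phase(distorted)
corrected = distorted * np.exp(-1j * np.deg2rad(estimated_phase_deg))
```

Real spectra also carry noise, first-order (frequency-dependent) phase errors, and baseline distortions, and in crowded proton spectra overlapping lines make simple criteria like this one less reliable. That is the regime where the article says classical methods struggle.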
Signal region detection is a routine task in NMR spectroscopy, but it is one of the few processing steps that is traditionally performed manually. Bruker scientists have now developed a deep learning algorithm for signal region detection in 1D 1H NMR spectra. The method has achieved excellent accuracy for spectra acquired across a wide range of base frequencies without requiring any input from the user, and it can reach and surpass human-level performance in a fraction of the time. These techniques could lead to fully automatic extraction and analysis of the information contained in NMR spectra.
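For comparison, a simple rule-based baseline for signal region detection can be sketched as follows. This is not the deep learning algorithm described above. It assumes a flat baseline and Gaussian noise, estimates the noise level robustly from the median absolute deviation, and flags contiguous runs of points above a threshold. The parameter names and defaults (`k`, `max_gap`, `min_points`) are hypothetical.

```python
import numpy as np

def detect_signal_regions(intensity, k=5.0, max_gap=5, min_points=3):
    """Return (start, stop) index pairs (stop exclusive) of regions above k * noise."""
    median = np.median(intensity)
    # Robust noise estimate: scaled median absolute deviation (MAD).
    sigma = 1.4826 * np.median(np.abs(intensity - median))
    mask = intensity > median + k * sigma

    # Locate rising and falling edges of the boolean mask.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)

    # Merge runs separated by only a few sub-threshold points.
    merged = []
    for s, e in zip(starts, stops):
        if merged and s - merged[-1][1] <= max_gap:
            merged[-1][1] = e
        else:
            merged.append([s, e])
    return [(int(s), int(e)) for s, e in merged if e - s >= min_points]

# Synthetic example: two Lorentzian lines plus Gaussian noise (illustrative values).
rng = np.random.default_rng(0)
ppm = np.linspace(0.0, 10.0, 8192)
peak_centers_ppm = [2.0, 7.2]
spectrum = sum(1.0 / (1.0 + ((ppm - c) / 0.01) ** 2) for c in peak_centers_ppm)
spectrum = spectrum + rng.normal(0.0, 0.01, size=ppm.size)

regions = detect_signal_regions(spectrum)
regions_ppm = [(float(ppm[s]), float(ppm[e - 1])) for s, e in regions]
```

A fixed threshold like this depends on a clean baseline and a sensible choice of `k`, which is one reason the step has traditionally been left to a human operator.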
But with these research possibilities comes a problem – data. “Progress in biological NMR is hindered by data availability and processing time,” says Prof. Dr. Roland Riek, Professor of Physical Chemistry and head of the Bio-NMR group at the Department of Chemistry and Applied Biosciences at the Swiss Federal Institute of Technology (ETH) Zürich. “Considering data availability first, it’s notable that out of the many thousands of protein structures studied by NMR, only a tiny fraction have had the original datasets made available to other researchers. That is currently a huge unresolved problem in NMR.”
But the second difficulty, the time taken to run NMR experiments and analyze the results, is one he thinks he can tackle using the rapidly developing field of AI.
“Currently,” he says, “it takes anything from six months to a few years to fully characterize a protein structure: taking all the measurements and analyzing all the data is very time-consuming and requires constant expert judgement.” This bottleneck holds back progress in the field, he thinks – and finding a solution to this is difficult. “It isn’t easy to speed up the process of acquiring NMR data,” he points out. “We’re limited by the need to culture biological media and prepare the samples for analysis, and then by the defined amount of time it takes to run NMR pulse sequences. But what we can do is use the time on the NMR instrument more effectively and streamline data analysis.”
Prof. Riek suggests that AI offers a route to achieving these two goals: “By training algorithms to assess the results as they’re generated, and then make automated changes to the experiments on-the-fly, we can save a lot of time, by only running those pulse sequences that are necessary for solving the structure. And because instrument time is expensive, we’d also save money in the process.” Prof. Riek’s work in this area is still at an early stage, but he’s confident that once initial results have been published, the benefits of this approach will be wide-reaching.
Of course, all this capability is also driven by the increasing availability of high-resolution instrumentation. Bruker BioSpin has been developing technologies that support scientific innovation in the NMR community, combining high-performance instruments with new workflow software and service offerings.
For example, Bruker’s incorporation of deep learning capabilities into its TopSpin NMR software has improved signal detection in proton NMR spectra. A deep neural network was trained on two million NMR spectra that were simulated from publicly available compounds, with artificial noise and other artifacts added. The resulting, more accurate detection of NMR signals enables easier and faster automatic spectral analysis.
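The idea of training on simulated spectra can be illustrated with a small sketch that generates labelled examples. It makes loud assumptions: peaks are placed at random rather than derived from real compounds, the only artifacts are Gaussian noise and a small polynomial baseline, and the label is a per-point mask marking points within three linewidths of a peak. None of this reflects the actual training pipeline, which the article does not describe.

```python
import numpy as np

def simulate_labelled_spectrum(rng, n_points=2048, max_peaks=6, noise_sigma=0.02):
    """Return (noisy_spectrum, signal_mask) for one random synthetic 1D spectrum."""
    ppm = np.linspace(0.0, 10.0, n_points)
    n_peaks = int(rng.integers(1, max_peaks + 1))
    centers = rng.uniform(0.5, 9.5, size=n_peaks)
    widths = rng.uniform(0.005, 0.03, size=n_peaks)   # half-widths in ppm
    amplitudes = rng.uniform(0.2, 1.0, size=n_peaks)

    clean = np.zeros(n_points)
    labels = np.zeros(n_points, dtype=bool)
    for c, w, a in zip(centers, widths, amplitudes):
        clean += a / (1.0 + ((ppm - c) / w) ** 2)     # absorptive Lorentzian line
        labels |= np.abs(ppm - c) <= 3.0 * w          # mark the signal region

    # Artifacts: a small random quadratic baseline plus Gaussian noise.
    baseline = np.polyval(rng.normal(0.0, 0.02, size=3), (ppm - 5.0) / 5.0)
    noisy = clean + baseline + rng.normal(0.0, noise_sigma, size=n_points)
    return noisy, labels

def simulate_dataset(n_spectra, seed=0):
    """Stack n_spectra simulated spectra and masks into arrays X (float) and y (bool)."""
    rng = np.random.default_rng(seed)
    pairs = [simulate_labelled_spectrum(rng) for _ in range(n_spectra)]
    X = np.stack([spectrum for spectrum, _ in pairs])
    y = np.stack([mask for _, mask in pairs])
    return X, y

X, y = simulate_dataset(n_spectra=8, seed=0)
```

Because the labels come from the simulation itself, a dataset of any size can be produced without manual annotation, which is what makes training on millions of spectra practical.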
Bruker’s line of GHz-class NMR instruments, together with its unique probes and software, also allows scientists to advance their research in functional structural biology. NMR is valuable here for determining the structures of proteins that are neither crystallizable nor soluble, e.g., membrane proteins embedded in lipid bilayers, or protein aggregates.
Many scientific breakthroughs based on the combination of AI and high-resolution instrumentation come directly from the NMR community. For example, Professor Peter Güntert, PhD, Goethe University Frankfurt, is working with an ML approach to enable fully automated protein structure determination directly from NMR in a collaborative project called NMRtist. Combining NMR with software developments for BioNMR pushes the boundaries of the technique and enables discoveries that would not be accessible with NMR alone.
“The NMRtist workflow goes from spectra to assignments and structure in a fully automated process,” he explained. “I think it really is possible now to do end-to-end fully automated NMR structure determination. That means when you have obtained the measurements for a well-behaved protein, the assignments and structure are there within a few hours.”
Together with advanced analytical instrumentation, AI and ML have the potential to spark new scientific discoveries across disciplines, including a range of compelling NMR applications. Dr. Falko Busse, President of the Bruker BioSpin Group, commented, “We are proud to support functional structural biology research with our GHz-class NMR solutions plus our recent advancements in solid-state NMR. We appreciate the support of the MR community as we continue to innovate high-value chemistry and biology research methods.”