Chemical derivatization is widely used in analytical chemistry to improve the separation and detection of target compounds in chromatographic and mass spectrometric analyses. Combined with liquid chromatography-mass spectrometry (LC-MS), it has been applied extensively in biomarker discovery, non-targeted metabolomics, and environmental studies. However, standard mass spectra of chemically derivatized molecules (CDMs) are lacking for database matching, and acquiring experimental mass spectra of CDMs at library scale is difficult with currently available techniques. High-throughput identification of CDMs therefore remains a pressing challenge in data mining and compound identification.
Recently, Prof. Feng Li's team developed a deep learning method, DeepCDM, that delivers high-quality prediction of electrospray ionization tandem mass spectra (ESI-MS/MS) of CDMs, addressing the poor performance of traditional quantum chemical simulation and machine learning methods on these molecules. Rather than building a new prediction tool from scratch, DeepCDM fine-tunes existing models by transfer learning, using a small set of experimentally acquired mass spectra of CDMs. Owing to its good accuracy and scalability, DeepCDM can be extended to predict MS/MS spectra for a wide range of CDMs by swapping in small training sets for different classes of CDMs. With this approach, the team built Dns-MS, a dedicated model for dansylated molecules, and then established DnsBank, a dedicated mass spectral database containing 294,647 MS/MS spectra of dansylated molecules. The database supports comprehensive monitoring of chemical processes in non-targeted analyses and the high-throughput discovery of new contaminants.
Figure 1. Non-targeted analysis of chemically derivatized molecules (CDMs) based on liquid chromatography-mass spectrometry (LC-MS)
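The fine-tuning idea described above can be illustrated with a deliberately small sketch. It is not DeepCDM's architecture or training code: it assumes a toy linear model that maps a feature vector to intensities in a few m/z bins, and every name in it (predict, fine_tune, pretrained, small_set) as well as the synthetic data is hypothetical. It only shows the general pattern of starting from pretrained weights and continuing training on a small set of spectra from a new compound class.

```python
import random


def predict(weights, features):
    """Toy linear model: one predicted intensity per m/z bin."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def mse(weights, data):
    """Mean squared error over (features, spectrum) pairs."""
    total, count = 0.0, 0
    for features, spectrum in data:
        for p, y in zip(predict(weights, features), spectrum):
            total += (p - y) ** 2
            count += 1
    return total / count


def fine_tune(pretrained_weights, data, lr=0.05, epochs=200):
    """Continue gradient descent from pretrained weights on a small dataset."""
    # Copy the weights so the base model stays untouched and reusable.
    weights = [row[:] for row in pretrained_weights]
    for _ in range(epochs):
        for features, spectrum in data:
            preds = predict(weights, features)
            for i, (p, y) in enumerate(zip(preds, spectrum)):
                err = p - y
                for j, x in enumerate(features):
                    weights[i][j] -= lr * 2 * err * x
    return weights


rng = random.Random(0)
n_features, n_bins = 4, 3

# Assumption: stands in for a model pretrained on underivatized molecules.
pretrained = [[rng.uniform(-1, 1) for _ in range(n_features)]
              for _ in range(n_bins)]

# Assumption: the derivatized class behaves like the base model plus a shift.
target = [[w + 0.3 * rng.uniform(-1, 1) for w in row] for row in pretrained]

# Small synthetic "experimental" training set for the new class.
small_set = []
for _ in range(20):
    x = [rng.uniform(0, 1) for _ in range(n_features)]
    small_set.append((x, predict(target, x)))

tuned = fine_tune(pretrained, small_set)
loss_before = mse(pretrained, small_set)
loss_after = mse(tuned, small_set)
print(f"MSE before fine-tuning: {loss_before:.6f}, after: {loss_after:.6f}")
```

Because fine_tune copies the pretrained weights, the same base model can be reused with a different small training set for each class of derivatized molecules, which mirrors the set-swapping idea described in the text.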
This work, titled “Deep learning prediction of electrospray ionization tandem mass spectra of chemically derived molecules”, was published in Nature Communications. Prof. Feng Li (College of Chemistry, Sichuan University) and Prof. Yannan Tang (Analytical & Testing Center, Sichuan University) are the corresponding authors, and Bin Chen and Hailiang Li are co-first authors.