Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data.
Peets P, Litos A, Dührkop K, Garza DR, van der Hooft JJJ, Böcker S, Dutilh BE 2025 Chemical characteristics vectors map the chemical space of natural biomes from untargeted mass spectrometry data. J Cheminform 17, 82.
Abstract
Untargeted metabolomics can comprehensively map the chemical space of a biome, but is limited by low annotation rates (< 10%). We used chemical characteristics vectors (CCVs), consisting of molecular fingerprints or chemical compound classes predicted from mass spectrometry data, to characterize compounds and samples. CCVs estimate the fraction of compounds with specific chemical properties in a sample. Unlike aligned MS1 data with intensity information, CCVs incorporate the chemical properties of compounds, allowing chemical annotation to be used for sample comparison. Using CCVs, we identified compound classes that differentiate biomes, such as ethers, which are enriched in environmental biomes, and steroids, which are enriched in animal host-related biomes. In biomes with greater variability, CCVs revealed the key compound classes underlying clustering, such as organonitrogen compounds in animal distal gut and lipids in animal secretions. CCVs thus enhance the interpretation of untargeted metabolomic data, providing a quantifiable and generalizable understanding of the chemical space of natural biomes.
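To illustrate the idea of a sample-level vector that estimates the fraction of compounds with a given chemical property, the following minimal sketch averages per-compound predicted property probabilities over all compounds in a sample. This is an assumption made for illustration, not necessarily the aggregation used in the paper. The function name `sample_ccv`, the property names, and all numbers are hypothetical toy values.

```python
from typing import List


def sample_ccv(compound_probs: List[List[float]]) -> List[float]:
    """Average per-compound property probabilities into one sample-level vector.

    Each inner list holds the predicted probabilities (0..1) that one compound
    has each chemical property. The mean over compounds is used here as an
    estimate of the fraction of compounds in the sample with that property
    (an illustrative assumption).
    """
    if not compound_probs:
        raise ValueError("sample has no compounds")
    n_props = len(compound_probs[0])
    if any(len(row) != n_props for row in compound_probs):
        raise ValueError("all compounds must have the same number of properties")
    n_compounds = len(compound_probs)
    return [
        sum(row[j] for row in compound_probs) / n_compounds
        for j in range(n_props)
    ]


# Hypothetical property names and invented probabilities (not data from the paper).
PROPERTIES = ["ether", "steroid", "organonitrogen"]
sample_a = [[0.9, 0.1, 0.2], [0.7, 0.0, 0.4], [0.8, 0.2, 0.0]]
sample_b = [[0.1, 0.6, 0.9], [0.0, 0.8, 0.7]]

ccv_a = sample_ccv(sample_a)
ccv_b = sample_ccv(sample_b)

for name, a, b in zip(PROPERTIES, ccv_a, ccv_b):
    print(f"{name}: sample_a={a:.2f} sample_b={b:.2f}")
```

Because each sample is reduced to a vector of the same length regardless of how many compounds it contains, vectors built this way can be compared directly across samples without aligning individual MS1 features.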