
Chemical similarity methods for analyzing secondary metabolite structures

  • Lena Y. E. Ekaney, Donatus B. Eni and Fidele Ntie-Kang
Published/Copyright: June 19, 2021

Abstract

The relationship between the structure of a compound and its function is an integral part of chemoinformatics. The similarity principle states that “structurally similar molecules tend to have similar properties and similar molecules exert similar biological activities”. Molecular similarity can be studied either at the structure level or at the descriptor (property) level. Generally, the objective of chemical similarity measures is to improve the prediction of the biological activities of molecules. This article provides an overview of the methods used to compare metabolite structures, including two-dimensional (2D) and three-dimensional (3D) approaches. The focus is on describing the methods, e.g. fingerprint-based similarity, in which the molecules under study are first fragmented and their fingerprints computed; 2D structural similarity, assessed by comparing Tanimoto coefficients and Euclidean distances; and similarity methods based on physicochemical property descriptors. The similarity between molecules can also be measured using data mining (clustering) techniques, e.g. virtual screening (VS)-based similarity methods, in which molecules with the desired descriptors and/or structures are screened from large databases. Lastly, SMILES-based chemical similarity searching is an important method for exact structure search, substructure search and descriptor similarity. The choice of method depends on the requirements of the researcher.
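The two measures named above, the Tanimoto coefficient and the Euclidean distance, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a fingerprint is represented as the set of indices of its "on" bits and that descriptors are plain lists of numbers; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from math import sqrt


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    # c / (a + b - c), where c is the number of bits set in both fingerprints
    return shared / (len(fp_a) + len(fp_b) - shared)


def euclidean(desc_a, desc_b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptor vectors must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(desc_a, desc_b)))


# Hypothetical toy data, not taken from the article.
fp_1 = {1, 4, 7, 9}
fp_2 = {1, 4, 8, 9, 12}
print(tanimoto(fp_1, fp_2))                          # 3 shared bits / 6 distinct bits = 0.5
print(euclidean([1.0, 2.0, 2.0], [1.0, 5.0, 6.0]))   # sqrt(0 + 9 + 16) = 5.0
```

The Tanimoto coefficient is a similarity, so higher values (up to 1) mean more similar fingerprints, whereas the Euclidean distance is a dissimilarity, so lower values (down to 0) mean more similar descriptor vectors. In practice the descriptors would normally be scaled before distances are compared, since properties on larger numeric ranges otherwise dominate the result.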

Acknowledgements

FNK acknowledges funding from the European Structural and Investment Funds, through the OP RDE-funded project “ChemJets” (Award No. CZ.02.2.69/0.0/0.0/16_027/0008351). FNK also received an equipment donation from the Alexander von Humboldt Foundation, Germany. The technical support of Mme. Bokeng and Mr. Eseme is acknowledged. The reviewers are thanked for their constructive comments, which improved the final manuscript.

Published Online: 2021-06-19

© 2020 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 24.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/psr-2018-0129/html