Biosynthetic gene clusters (BGCs) are sets of genes that encode the biosynthesis of specialized metabolites in an organism. By exploring BGCs, one can identify the ecological and evolutionary factors that shape the biosynthetic potential of microbial communities and discover entirely new bioactive compounds that play significant roles in biological processes and may be of interest to the pharmaceutical and biotechnological industries.
“Since the BGC Atlas also works with metadata, one can explore the environmental distribution of BGCs worldwide; thus, we can study if some compounds are produced everywhere or are specific to certain ecosystems. This will help us understand the ecological role of specific metabolites,” explains one of the coauthors, Luděk Sehnal from the RECETOX center.
The researchers also verified the functionality of this tool by analyzing over 35,000 metagenomic datasets, resulting in the identification of 1.8 million BGCs.
“The analysis showed that ribosomally synthesized and post-translationally modified peptides (RiPPs) are the most abundant compound class in host-associated metagenomes, while terpenes are the most abundant compound class in environmental samples, specifically in terrestrial ecosystems. This points to the specificity of certain metabolites in certain environments. The environmental specificity of metabolites is also a question that can be answered using BGC Atlas in the future,” says Sehnal.
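To make the comparison concrete, finding the most abundant compound class per environment is a per-group tally. The following is only a minimal sketch under stated assumptions: the records are made-up toy data, and the function name and record layout are hypothetical. It does not reflect how BGC Atlas itself stores or processes its results.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (environment category, BGC compound class).
# These values are illustrative only and are NOT taken from BGC Atlas.
records = [
    ("host-associated", "RiPP"),
    ("host-associated", "RiPP"),
    ("host-associated", "NRPS"),
    ("terrestrial", "terpene"),
    ("terrestrial", "terpene"),
    ("terrestrial", "RiPP"),
]


def most_abundant_class(records):
    """Return the most frequent compound class for each environment."""
    counts = defaultdict(Counter)
    for environment, compound_class in records:
        counts[environment][compound_class] += 1
    # most_common(1) yields a one-element list of (class, count) pairs.
    return {env: c.most_common(1)[0][0] for env, c in counts.items()}


top = most_abundant_class(records)
print(top)
```

With real data, the same tally would run over the environment metadata and predicted compound class of each detected BGC.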
The BGC Atlas will be important for investigating many phenomena, such as the ecological role of genome-encoded compounds, the evolution of their biosynthetic pathways, the environmental distribution of specific compounds, and the discovery of novel molecules targeting specific problems (e.g., resistant microbes, neurodegenerative diseases, and cancer).
“If we understand the evolution of specific biosynthetic pathways responsible for producing compounds that are interesting in the context of human well-being, we can use this information to engineer the pathway to produce compounds that are more efficient and more specific to the target of interest, in better yields than the native pathway. In practical terms, this can result in more efficient drugs or other products,” describes Sehnal.
It’s also important to consider why we need to conduct this research at this moment. “First, the vast amount of publicly available data highlights the importance of mining these data. If you want to mine the data, you need tools to do so, and BGC Atlas is exactly one of these tools for the efficient investigation of de novo sequenced metagenomes or diverse metagenomes stored in public databases. Second, the use of BGC Atlas can inform the discovery effort for novel interesting compounds, including antibiotics. Currently, we are losing our protection against infections due to the widespread development of antimicrobial resistance worldwide, and any tools that help fight infections are very much appreciated,” adds Sehnal.
Most of the work on this research was done by Caner Bagci, a postdoc in the group of Prof. Nadine Ziemert, with the assistance of bachelor student Casimir Ladanyi. However, the tool was developed with the participation of other researchers across the world, specifically Matin Nuhamunada, Kai Blin, and Tilmann Weber from the Technical University of Denmark; Hemant Goyat and Shrikant Mantri from the National Agri-Food Biotechnology Institute (NABI) in India; Azat Tagirdzhanov and Alexey Gurevich from the Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) in Germany; Christian von Mering from the University of Zurich in Switzerland; Daniel Udwary from the DOE Joint Genome Institute in the USA; and Marnix H. Medema from Wageningen University in the Netherlands.
Luděk Sehnal’s participation in this paper was funded by the European Union through Horizon Europe, the Research and Innovation Framework Programme [Project NAfrAM 101064285]. However, the main funding of the BGC Atlas was provided by the German Federal Ministry of Education and Research (BMBF) [161L0284C] and the German Centre for Infection Research (DZIF) [TTU09.716], which support the research of Prof. Nadine Ziemert, head of the project and principal investigator. Since the paper was a collaborative effort, researchers from different institutes were supported by their own funding sources: the Novo Nordisk Foundation [NNF20CC0035580 to K.B. and T.W., NNF20SA0035588 to M.N.]; Saarland University (the NextAID project, to support the position of A.T.); and the National Agri-Food Biotechnology Institute (to S.M.). The work conducted by D.U. and S.K. at the U.S. Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy operated under Contract No. DE-AC02-05CH11231.
https://doi.org/10.1093/nar/gkae953