GenVS-TBDB: An Open AI-Generated and Virtual Screened Small-Molecule Database for Tuberculosis Drug Discovery
Lv, X.; Guo, H.; Lu, X.; Tang, Z.; Yu, K.; Xu, G.; Chen, S.; Zhang, R.; Guo, J.
Abstract
Tuberculosis (TB) remains a leading global health threat, with over 10 million new cases and 1.6 million deaths reported in 2021. Current TB therapies rely on a limited drug repertoire and prolonged treatment courses that have changed little in four decades. Here, we present GenVS-TBDB, an open AI-generated and virtually screened small-molecule database that expands the chemical space explored against Mycobacterium tuberculosis (Mtb) essential proteins. We first identified 458 probable small-molecule binding pockets across 376 essential Mtb proteins by integrating crystallographic ligand binding sites, homology to ligand-bound templates, literature review, and computational pocket predictions. Then, using the target-aware generative model TamGen, we generated over 1.2 million novel small molecules tailored to these pockets. All compounds were docked with AutoDock Vina, yielding a binding affinity distribution for each target. An anti-TB-specific graph neural network, Ligandformer, further enriched the compounds for whole-cell activity. We also computed key physicochemical properties (i.e., logP, molecular weight, topological polar surface area) and flagged any Pan-Assay Interference Compounds (PAINS) to ensure medicinal chemistry tractability, yielding an enriched set of promising, drug-like molecules targeting diverse Mtb proteins. GenVS-TBDB, accessible at http://datascience.ghddi.org/tuberculosis/view, provides downloadable docking scores, cellular activity probabilities, and physicochemical annotations, thereby offering an open resource for rapid hit generation and lead optimization. Our work demonstrates the power of generative AI and virtual screening in expanding the chemical search space for tuberculosis therapeutics and provides a valuable resource for accelerating early drug discovery.
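The annotations described above (docking score, whole-cell activity probability, physicochemical properties, PAINS flag) lend themselves to simple hit triage once downloaded. The following is a minimal sketch of such a filter, not the authors' pipeline. It assumes a CSV export with hypothetical column names (`vina_score`, `activity_prob`, `logp`, `mol_weight`, `tpsa`, `pains_flag`); the actual GenVS-TBDB download format may differ. The cutoffs are illustrative rule-of-thumb values chosen for this example, and the sample rows are invented placeholders, not database records.

```python
import csv
import io

# Placeholder rows with made-up values, used only to exercise the filter.
# Column names are hypothetical; the real GenVS-TBDB export may differ.
SAMPLE_CSV = """smiles,target,vina_score,activity_prob,logp,mol_weight,tpsa,pains_flag
CCO,TargetA,-4.1,0.10,-0.1,46.1,20.2,0
c1ccccc1C(=O)NC2CCNCC2,TargetA,-8.6,0.72,1.3,204.3,41.1,0
O=C1C=CC(=O)C=C1,TargetB,-9.0,0.80,0.2,108.1,34.1,1
CC(C)Cc1ccc(cc1)C(C)C(=O)O,TargetB,-8.2,0.35,3.5,206.3,37.3,0
"""


def passes_filters(row, max_vina=-8.0, min_activity=0.5,
                   max_logp=5.0, max_mw=500.0, max_tpsa=140.0):
    """Return True if a compound record clears every illustrative cutoff.

    More negative Vina scores indicate stronger predicted binding, so the
    docking cutoff is an upper bound. All thresholds are assumptions made
    for this sketch, not values taken from the database.
    """
    return (
        float(row["vina_score"]) <= max_vina
        and float(row["activity_prob"]) >= min_activity
        and float(row["logp"]) <= max_logp
        and float(row["mol_weight"]) <= max_mw
        and float(row["tpsa"]) <= max_tpsa
        and row["pains_flag"] == "0"  # keep only compounds not flagged as PAINS
    )


def select_hits(csv_text, **cutoffs):
    """Parse CSV text, keep rows passing the filters, sort best docking first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = [row for row in reader if passes_filters(row, **cutoffs)]
    hits.sort(key=lambda r: float(r["vina_score"]))  # most negative first
    return hits


hits = select_hits(SAMPLE_CSV)
for hit in hits:
    print(hit["target"], hit["smiles"], hit["vina_score"])
```

Combining the criteria with a logical AND keeps the sketch conservative; relaxing a single cutoff, for example `select_hits(SAMPLE_CSV, min_activity=0.3)`, shows how sensitive the hit list is to each threshold.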