Figure A4. Distributions of the docking scores of the generated molecules for S1PR1.

Source publication
Article
Recent years have seen tremendous success in the design of novel drug molecules through deep generative models. Nevertheless, existing methods only generate drug-like molecules, which require additional structural optimization to be developed into actual drugs. In this study, a deep learning method for generating target-specific ligands was propose...

Citations

... These models are typically based on the Simplified Molecular Input Line Entry System (SMILES) [18]. SMILES makes it easy for computers to encode molecular structures as text, which is in line with chemists' cognition and suitable for training deep learning models [19]. However, it has some limitations, such as the inability to represent some structural features and the need for explicit hydrogen atoms. ...
Article
Generative molecular models generate novel molecules with desired properties by searching chemical space. Traditional combinatorial optimization methods, such as genetic algorithms, have demonstrated superior performance in various molecular optimization tasks. However, these methods do not use docking simulation to inform the design process, depend heavily on the quality and quantity of available data, and produce molecules that require additional structural optimization to become candidate drugs. To address these limitations, we propose a novel model named DockingGA that combines Transformer neural networks and genetic algorithms to generate molecules with better binding affinity for specific targets. To generate high-quality molecules, we chose the Self-referencing Chemical Structure Strings to represent the molecules and optimized their binding affinity to different targets. Compared to other baseline models, DockingGA achieves the best docking results for the top 1, 10 and 100 molecules, while maintaining 100% novelty. Furthermore, the distribution of physicochemical properties demonstrates the ability of DockingGA to generate molecules with favorable and appropriate properties. This innovation creates new opportunities for the application of generative models in practical drug discovery.
... For example, Pham et al. employed conditional variational autoencoder frameworks to efficiently generate novel molecules with enhanced biological activity [20]. Wang et al. utilized generative pre-training techniques to extract contextual information from molecules, facilitating the generation of molecules with improved binding affinity to target proteins [21]. ...
Article
Drug discovery involves a crucial step of optimizing molecules with the desired structural groups. In the domain of computer-aided drug discovery, deep learning has emerged as a prominent technique in molecular modeling. Deep generative models, based on deep learning, play a crucial role in generating novel molecules during molecular optimization. However, many existing molecular generative models have limitations as they solely process input information in a forward way. To overcome this limitation, we propose an improved generative model called BD-CycleGAN, which incorporates BiLSTM (bidirectional long short-term memory) and Mol-CycleGAN (molecular cycle generative adversarial network) to preserve the information of molecular input. To evaluate the proposed model, we assess its performance by analyzing the structural distribution and evaluation metrics of generated molecules in the process of structural transformation. The results demonstrate that the BD-CycleGAN model achieves a higher success rate and exhibits increased diversity in molecular generation. Furthermore, we demonstrate its application in molecular docking, where it successfully increases the docking score for the generated molecules. The proposed BD-CycleGAN architecture harnesses the power of deep learning to facilitate the generation of molecules with desired structural features, thus offering promising advancements in the field of drug discovery processes.
... 26,27 One such promising adaptation is PETrans, a method that fine-tunes GPT for de novo drug design. 28 PETrans leverages transfer learning and protein-specific encoding to generate target-specific ligands, offering a fresh perspective on the challenge of producing drug-like molecules optimized for binding to target proteins. ...
Article
Drug discovery is a process that finds new potential drug candidates for curing diseases and is also vital to improving the wellness of people. Enhancing deep learning approaches, e.g., molecular generation models, increases the efficiency of the drug discovery process. However, creating drug candidates with desired properties such as the quantitative estimate of drug-likeness (QED), synthetic accessibility (SA), and binding affinity (BA) remains a problem, and training a generative model for specific protein targets that have little pharmaceutical data is a challenge. In this research, we present Mol-Zero-GAN, a framework that aims to solve these problems by using Bayesian optimization (BO) to find optimal singular values of the model's weights, factorized by singular value decomposition, and to generate drug candidates with desired properties using no additional data. The proposed framework can produce drugs with the desired properties on protein targets of interest by optimizing the model's weights. Our framework outperforms the state-of-the-art methods sharing the same objectives. Mol-Zero-GAN is publicly available at https://github.com/cucpbioinfo/Mol-Zero-GAN.
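The abstract above mentions factorizing model weights by singular value decomposition and tuning their singular values. As a rough illustration only, the sketch below shows that single step with NumPy: a weight matrix is factorized, its singular values are multiplied by a vector of scale factors, and the matrix is rebuilt. The function name `rescale_singular_values`, the random matrix `W`, and the scale factors are hypothetical and are not taken from the Mol-Zero-GAN code. The Bayesian optimization loop that would choose the scale factors is omitted.

```python
import numpy as np


def rescale_singular_values(weight, scale):
    """Rebuild `weight` after multiplying each singular value by `scale`.

    Hypothetical helper for illustration; not the Mol-Zero-GAN implementation.
    """
    # Thin SVD: weight = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    scale = np.asarray(scale, dtype=float)
    if scale.shape != s.shape:
        raise ValueError("scale must have one entry per singular value")
    # Multiplying the columns of u by (s * scale) is equivalent to
    # u @ diag(s * scale), without building the diagonal matrix.
    return (u * (s * scale)) @ vt


# Toy weight matrix standing in for one layer of a generator (assumption).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))

# Scale factors of 1 leave the matrix unchanged.
W_same = rescale_singular_values(W, np.ones(3))

# Candidate scale factors, e.g. as an optimizer might propose (assumption).
W_new = rescale_singular_values(W, [1.5, 1.0, 0.5])
```

In a setup like the one the abstract describes, an outer optimizer would score molecules generated with `W_new` and propose the next set of scale factors. That loop depends on the model and the scoring function, so it is left out here.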
... Another example is the CNN-based deep transfer learning approach that can be utilized to detect COVID-19 [26]. In addition, transfer-learning-based methods can also be applied to the design of new drug development [27]. Therefore, in this study, we use the phosphorylation dataset as the source domain. ...
Article
Protein dephosphorylation is the process of removing phosphate groups from protein molecules, which plays a vital role in regulating various cellular processes and intricate protein signaling networks. The identification and prediction of dephosphorylation sites are crucial for this process. Previously, there was a lack of effective deep learning models for predicting these sites, often resulting in suboptimal outcomes. In this study, we introduce a deep learning framework known as “DephosNet”, which leverages transfer learning to enhance dephosphorylation site prediction. DephosNet employs dual-window sequential inputs that are embedded and subsequently processed through a series of network architectures, including ResBlock, Multi-Head Attention, and BiGRU layers. It generates predictions for both dephosphorylation and phosphorylation site probabilities. DephosNet is pre-trained on a phosphorylation dataset and then fine-tuned on the parameters with a dephosphorylation dataset. Notably, transfer learning significantly enhances DephosNet’s performance on the same dataset. Experimental results demonstrate that, when compared with other state-of-the-art models, DephosNet outperforms them on both the independent test sets for phosphorylation and dephosphorylation.
... This training set consists of input examples along with their corresponding output labels [25]. Consequently, various techniques have been developed in order to enhance FSL, including transfer learning [26,27], neural network, and meta-learning [28,29]. These methods aim to address the limitations associated with learning from limited examples and improve the performance of models in FSL scenarios. ...
Article
Computational approaches have revolutionized the field of drug discovery, collectively known as Computer-Assisted Drug Design (CADD). Advancements in computing power, data generation, digitalization, and artificial intelligence (AI) techniques have played a crucial role in the rise of CADD. These approaches offer numerous benefits, enabling the analysis and interpretation of vast amounts of data from diverse sources, such as genomics, structural information, and clinical trials data. By integrating and analyzing these multiple data sources, researchers can efficiently identify potential drug targets and develop new drug candidates. Among the AI techniques, machine learning (ML) and deep learning (DL) have shown tremendous promise in drug discovery. ML and DL models can effectively utilize experimental data to accurately predict the efficacy and safety of drug candidates. However, despite these advancements, certain areas in drug discovery face data scarcity, particularly in neglected, rare, and emerging viral diseases. Few-shot learning (FSL) is an emerging approach that addresses the challenge of limited data in drug discovery. FSL enables ML models to learn from a small number of examples of a new task, achieving commendable performance by leveraging knowledge learned from related datasets or prior information. It often involves meta-learning, which trains a model to learn how to learn from few data. This ability to quickly adapt to new tasks with low data circumvents the need for extensive training on large datasets. By enabling efficient learning from a small amount of data, few-shot learning has the potential to accelerate the drug discovery process and enhance the success rate of drug development. In this review, we introduce the concept of few-shot learning and its application in drug discovery. 
Furthermore, we demonstrate the valuable application of few-shot learning in the identification of new drug targets, accurate prediction of drug efficacy, and the design of novel compounds possessing desired biological properties. This comprehensive review draws upon numerous papers from the literature to provide extensive insights into the effectiveness and potential of few-shot learning in these critical areas of drug discovery and development.
... In many cases, this was done to condition the generated molecules to have drug-like properties. VAEs [14,15,16], GANs [17,12,18] and sequence-based (language) models [19,20,21,22] have been used for molecule generation tasks in this regard. Reinforcement learning (RL) has also been used for this purpose, with reward-penalty functions guiding models towards desired molecular characteristics in the respective latent space. ...
Preprint
Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. Generative deep learning models, which create synthetic data given a probability distribution, have been developed with the purpose of picking completely new samples from a partially known space. Generative models offer high potential for designing de novo molecules; however, in order for them to be useful in real-life drug development pipelines, these models should be able to design target-specific molecules, which is the next step in this field. In this study, we propose DrugGEN, for the de novo design of drug candidate molecules that interact with selected target proteins. The proposed system represents compounds and protein structures as graphs and processes them via two serially connected generative adversarial networks comprising graph transformers. DrugGEN is trained using a large dataset of compounds from ChEMBL and target-specific bioactive molecules, to design effective and specific inhibitory molecules against the AKT1 protein, which has critical importance for developing treatments against various types of cancer. On fundamental benchmarks, DrugGEN models have either competitive or better performance against other methods. To assess the target-specific generation performance, we conducted further in silico analysis with molecular docking and deep learning-based bioactivity prediction. Results indicate that de novo molecules have high potential for interacting with the AKT1 protein structure at the level of its native ligand. DrugGEN can be used to design completely novel and effective target-specific drug candidate molecules for any druggable protein, given target features and a dataset of experimental bioactivities. Code base, datasets, results and trained models of DrugGEN are available at https://github.com/HUBioDataLab/DrugGEN
Article
Understanding protein sequence and structure is essential for understanding protein–protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure–activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein–protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are deliberated.
Article
Drug discovery plays a critical role in advancing human health by developing new medications and treatments to combat diseases. How to accelerate the pace and reduce the costs of new drug discovery has long been a key concern for the pharmaceutical industry. Fortunately, by leveraging advanced algorithms, computational power and biological big data, artificial intelligence (AI) technology, especially machine learning (ML), holds the promise of making the hunt for new drugs more efficient. Recently, the Transformer-based models that have achieved revolutionary breakthroughs in natural language processing have sparked a new era of their applications in drug discovery. Herein, we introduce the latest applications of ML in drug discovery, highlight the potential of advanced Transformer-based ML models, and discuss the future prospects and challenges in the field.