Yan Zhong’s scientific contributions


Publications (12)


[Figure 4.1: Explainability Techniques in Large Language Models]
Ethics and Social Implications of Large Models
Preprint · File available · April 2025 · 12 Reads
Silin Chen · Tianyang Wang · [...] · Yan Zhong

Large Language Models (LLMs) have become a cornerstone of modern artificial intelligence (AI), finding applications across various domains such as healthcare, finance, entertainment, and customer service. To understand their ethical and social implications, it is essential to first grasp what these models are, how they function, and why they carry significant impact. This introduction aims to provide a comprehensive and beginner-friendly overview of LLMs, introducing their basic structure, training process, and the types of tasks they are commonly employed for. We will also include simple analogies and examples to ease understanding.
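
As a concrete companion to the training-process overview above (not code from the preprint), here is a minimal sketch of the next-token-prediction objective at the core of LLM training; the vocabulary size, model dimensions, and random token data are illustrative stand-ins.

```python
# Minimal sketch of the next-token-prediction objective used to train LLMs.
# Toy vocabulary and model sizes are illustrative, not from the preprint.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32          # toy sizes
embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)  # maps hidden states to vocabulary logits

tokens = torch.randint(0, vocab_size, (1, 16))   # a fake 16-token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict each next token

hidden = embed(inputs)                 # a real LLM applies transformer blocks here
logits = head(hidden)
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size),
                                   targets.reshape(-1))
loss.backward()                        # gradients drive the weight updates
print(f"next-token loss: {loss.item():.3f}")
```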


Deep Learning and Machine Learning: Contrastive Learning, from scratch to application
April 2025 · 7 Reads

Contrastive learning is a powerful technique in the field of machine learning, specifically in representation learning. The central idea of contrastive learning is to learn a model by distinguishing between similar and dissimilar data points. This involves pulling similar data points closer in the learned representation space while pushing dissimilar points farther apart. Imagine you have a collection of images, some of which are different views of the same object and some of which are completely unrelated. Contrastive learning would aim to generate embeddings (i.e., numerical representations) for these images such that the images showing the same object (similar images) are mapped close together, while images of different objects (dissimilar images) are mapped far apart. For example, consider a scenario where we have two images of cats and one image of a dog. The model would be trained to pull the representations of the two cat images closer together and push the dog image further away from the cat images in the representation space. This way, when the model encounters new images, it can easily identify whether two images represent the same object or not. Contrastive learning can be applied to a variety of tasks, such as image classification, natural language processing, and even reinforcement learning. The core principle, though, remains the same: learn a representation space that reflects the similarities and dissimilarities between the data points.
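
To make the pull-together/push-apart idea concrete, here is a minimal sketch of an InfoNCE-style contrastive loss; this is an illustrative example, not code from the publication, and the batch size, embedding dimension, and temperature are assumed values.

```python
# A minimal sketch of a contrastive (InfoNCE-style) loss, assuming paired
# "views" of the same items; sizes and temperature are illustrative only.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1[i] and z2[i] are embeddings of two views of the same item."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature        # cosine similarities, scaled
    labels = torch.arange(z1.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)  # pull positives close, push the rest apart

# Toy usage: 8 items, 64-dim embeddings from some encoder.
z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
print(info_nce(z1, z2))
```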


Advanced Deep Learning Methods for Protein Structure Prediction and Design
March 2025 · 184 Reads

After AlphaFold won the Nobel Prize, protein structure prediction with deep learning once again became a hot topic. This work comprehensively explores advanced deep learning methods applied to protein structure prediction and design. It begins by examining recent innovations in prediction architectures, with detailed discussions of improvements such as diffusion-based frameworks and novel pairwise attention modules. The text analyses key components, including structure generation, evaluation metrics, multiple sequence alignment processing, and network architecture, thereby illustrating the current state of the art in computational protein modelling. Subsequent chapters focus on practical applications, presenting case studies that range from individual protein predictions to complex biomolecular interactions. Strategies for enhancing prediction accuracy and integrating deep learning techniques with experimental validation are thoroughly explored. The later sections review the industry landscape of protein design, highlighting the transformative role of artificial intelligence in biotechnology and discussing emerging market trends and future challenges. Supplementary appendices provide essential resources such as databases and open-source tools, making this volume a valuable reference for researchers and students.
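
The abstract mentions evaluation metrics; as one hedged illustration (not necessarily a metric covered in this volume), the sketch below computes RMSD between predicted and reference Cα coordinates, assuming the two structures are already superposed. The random coordinates are stand-ins for real structures.

```python
# One common structure-evaluation metric in this space: RMSD between predicted
# and reference C-alpha coordinates. Assumes the structures are already
# superposed; the coordinates here are random stand-ins.
import numpy as np

def rmsd(pred: np.ndarray, ref: np.ndarray) -> float:
    """Root-mean-square deviation over N x 3 coordinate arrays."""
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

pred = np.random.rand(50, 3) * 10  # hypothetical predicted structure (50 residues)
ref = pred + np.random.normal(scale=0.5, size=pred.shape)  # perturbed reference
print(f"RMSD: {rmsd(pred, ref):.2f} Å")
```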


Mastering Reinforcement Learning: Foundations, Algorithms, and Real-World Applications
February 2025 · 14 Reads

Reinforcement Learning (RL) is a distinct branch of machine learning focused on how agents should take actions in an environment to maximize cumulative rewards. Unlike supervised learning, which relies on labeled datasets, RL is driven by the agent's interactions with its environment, learning optimal behaviors through trial and error. The agent learns to make decisions by performing certain actions and receiving rewards or penalties in return. The goal is to learn a policy that maximizes the cumulative reward over time.
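
The trial-and-error loop described here can be made concrete with tabular Q-learning; the sketch below is illustrative rather than drawn from the book, and the five-state chain environment and hyperparameters are assumptions.

```python
# A minimal tabular Q-learning sketch: the agent acts, receives rewards, and
# updates its action values toward higher cumulative reward. The 5-state chain
# environment and hyperparameters are toy assumptions, not from the book.
import random

n_states, n_actions = 5, 2             # states 0..4; action 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate
Q = [[0.0] * n_actions for _ in range(n_states)]

for _ in range(2000):                  # episodes of trial and error
    s = 0
    while s != n_states - 1:           # episode ends at the rightmost (goal) state
        if random.random() < epsilon:
            a = random.randrange(n_actions)                   # explore
        else:
            a = max(range(n_actions), key=lambda i: Q[s][i])  # exploit
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0                # reward only at goal
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a]) # Q-learning update
        s = s2

print([round(max(q), 2) for q in Q])   # learned state values rise toward the goal
```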


Deep Learning and Machine Learning: Contrastive Learning, from Scratch to Application

Contrastive learning is a powerful technique in the field of machine learning, specifically in representation learning. The central idea of contrastive learning is to learn a model by distinguishing between similar and dissimilar data points. This involves pulling similar data points closer in the learned representation space while pushing dissimilar points farther apart. Imagine you have a collection of images, some of which are different views of the same object and some of which are completely unrelated. Contrastive learning would aim to generate embeddings (i.e., numerical representations) for these images such that the images showing the same object (similar images) are mapped close together, while images of different objects (dissimilar images) are mapped far apart. For example, consider a scenario where we have two images of cats and one image of a dog. The model would be trained to pull the representations of the two cat images closer together and push the dog image further away from the cat images in the representation space. This way, when the model encounters new images, it can easily identify whether two images represent the same object or not. Contrastive learning can be applied to a variety of tasks, such as image classification, natural language processing, and even reinforcement learning. The core principle, though, remains the same: learn a representation space that reflects the similarities and dissimilarities between the data points.


[Figure 4.1: Explainability Techniques in Large Language Models]
Ethics and Social Implications of Large Language Models
January 2025 · 13 Reads

Large Language Models (LLMs) have become a cornerstone of modern artificial intelligence (AI), finding applications across various domains such as healthcare, finance, entertainment, and customer service. To understand their ethical and social implications, it is essential to first grasp what these models are, how they function, and why they carry significant impact. This introduction aims to provide a comprehensive and beginner-friendly overview of LLMs, introducing their basic structure, training process, and the types of tasks they are commonly employed for. We will also include simple analogies and examples to ease understanding.


Exploring Multimodal Embeddings for Text and Impact on Language Processing

The breakthrough achievements of multimodal large language models in recent years have captured researchers' imagination. This paper provides an in-depth survey of text embeddings, exploring the evolution of word representations from fundamental continuous and similarity-preserving techniques to advanced contextual models. Beginning with the basics of word embeddings, it outlines the transition from one-hot encoding to dense representations and discusses the inherent advantages of such transformations. The text then systematically reviews classical models such as Word2Vec, GloVe, and FastText, detailing their architectures, training methodologies, and practical implementations. Building on this foundation, the survey delves into the realm of contextual embeddings and large language models, examining seminal frameworks like BERT, ELMo, and transformer-based architectures, and illustrating their role in dynamically generating context-aware representations. The final section highlights diverse applications of text embeddings, including text classification, clustering, information retrieval, and natural language generation for dialogue systems, supported by practical examples and implementation insights. This comprehensive survey aims to serve as a valuable reference for researchers and practitioners seeking to understand and leverage text embeddings in natural language processing.
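
To illustrate the one-hot-to-dense transition the survey describes (with a toy vocabulary and hand-picked vectors, not data from the paper): one-hot vectors make every pair of words equally dissimilar, while dense embeddings can encode semantic closeness.

```python
# Illustrates the one-hot -> dense transition: one-hot vectors carry no
# similarity signal, while dense embeddings can. The tiny vocabulary and
# hand-picked vectors below are illustrative, not from the paper.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vocab = ["cat", "dog", "car"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
print(cosine(one_hot["cat"], one_hot["dog"]))  # 0.0: all words equally unrelated

# Dense embeddings (as Word2Vec/GloVe would learn) place related words nearby.
dense = {"cat": np.array([0.9, 0.8, 0.1]),
         "dog": np.array([0.85, 0.75, 0.2]),
         "car": np.array([0.1, 0.2, 0.9])}
print(cosine(dense["cat"], dense["dog"]))      # high: semantically close
print(cosine(dense["cat"], dense["car"]))      # lower: unrelated
```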


Multimodal Embeddings for Representation Learning
January 2025 · 9 Reads

This comprehensive survey explores the theoretical foundations and practical applications of multimodal embedding representations for text, image, audio, and video data. The work begins with an introduction to the fundamental concepts of embeddings, detailing the transformation of high-dimensional discrete data into low-dimensional continuous vectors. It addresses the challenges associated with high-dimensional data, such as the curse of dimensionality, and discusses the necessity of efficient representation and dimensionality reduction in modern machine learning. The survey then delves into the mathematical underpinnings of embeddings, including vector space theory, inner products, and similarity measures. Detailed discussions of dimensionality reduction techniques, such as Principal Component Analysis (PCA), t-Distributed Stochastic Neighbour Embedding (t-SNE), and Linear Discriminant Analysis (LDA), provide a rigorous framework for understanding how low-dimensional embeddings can be effectively derived from complex data sets. Further, the survey categorises various embedding types, including text, image, video, audio, and cross-modal embeddings, and reviews their diverse applications across fields such as natural language processing, computer vision, recommendation systems, and network analysis. Concluding with a discussion on current challenges and future prospects, this work aims to serve as a valuable reference for researchers and practitioners seeking to harness the potential of multimodal embeddings in advancing machine learning methodologies.
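
As a minimal illustration of one of the dimensionality-reduction techniques surveyed, the sketch below performs PCA via the singular value decomposition; the random data matrix and target dimension are stand-ins for real high-dimensional data.

```python
# A minimal PCA sketch (one of the dimensionality-reduction techniques the
# survey covers): project centred data onto its top principal components.
# The random 100 x 50 matrix is a stand-in for real high-dimensional data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))            # 100 samples, 50 dimensions

Xc = X - X.mean(axis=0)                   # centre each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Z = Xc @ Vt[:k].T                         # low-dimensional embedding (100 x 2)

explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(Z.shape, f"variance explained: {explained:.2%}")
```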


Mastering Reinforcement Learning: Foundations, Algorithms, and Real-World Applications
January 2025 · 6 Reads

Reinforcement Learning (RL) is a distinct branch of machine learning focused on how agents should take actions in an environment to maximize cumulative rewards. Unlike supervised learning, which relies on labeled datasets, RL is driven by the agent's interactions with its environment, learning optimal behaviors through trial and error. The agent learns to make decisions by performing certain actions and receiving rewards or penalties in return. The goal is to learn a policy that maximizes the cumulative reward over time.


Deep Learning and Machine Learning: Contrastive Learning, from scratch to application
December 2024 · 4 Reads

Contrastive learning is a powerful technique in the field of machine learning, specifically in representation learning. The central idea of contrastive learning is to learn a model by distinguishing between similar and dissimilar data points. This involves pulling similar data points closer in the learned representation space while pushing dissimilar points farther apart. Imagine you have a collection of images, some of which are different views of the same object and some of which are completely unrelated. Contrastive learning would aim to generate embeddings (i.e., numerical representations) for these images such that the images showing the same object (similar images) are mapped close together, while images of different objects (dissimilar images) are mapped far apart. For example, consider a scenario where we have two images of cats and one image of a dog. The model would be trained to pull the representations of the two cat images closer together and push the dog image further away from the cat images in the representation space. This way, when the model encounters new images, it can easily identify whether two images represent the same object or not. Contrastive learning can be applied to a variety of tasks, such as image classification, natural language processing, and even reinforcement learning. The core principle, though, remains the same: learn a representation space that reflects the similarities and dissimilarities between the data points.