Xinyuan Song

  • PhD Student at Emory University

About

Publications: 155
Reads: 13,536
Citations: 983
Current institution
Emory University
Current position
  • PhD Student


Publications (155)
Preprint
Full-text available
Early identification of high-risk ICU patients is crucial for directing limited medical resources. We introduce ALFIA (Adaptive Layer Fusion with Intelligent Attention), a modular, attention-based architecture that jointly trains LoRA (Low-Rank Adaptation) adapters and an adaptive layer-weighting mechanism to fuse multi-layer semantic features from...
Preprint
Full-text available
This paper provides a comprehensive overview of fine-tuning techniques for Large Language Models (LLMs), a critical component in advancing natural language processing. It synthesizes recent progress in instruction fine-tuning, multitask learning, federated learning, and pedagogical alignment, highlighting their effectiveness, challenges, and potent...
Preprint
Full-text available
This paper examines emerging solutions for efficient medical Artificial Intelligence (AI) computing, exploring the convergence of AI with innovative technologies like silicon photonics, edge AI, distributed machine learning, and parallelism techniques to revolutionize healthcare. Covering applications in natural language processing, computer vision...
Preprint
Full-text available
This survey systematically reviews Low-Rank Adaptation (LoRA) methods and their variants for Large Language Models (LLMs), emphasizing their efficiency in adapting models without compromising performance. We analyze recent advancements, technical foundations, and practical applications of these methods. By comparing LoRA with full fine-tuning, we h...
Preprint
Full-text available
Knowledge distillation (KD) is a technique for transferring knowledge from complex teacher models to simpler student models, significantly enhancing model efficiency and accuracy. It has demonstrated substantial advancements in various applications including image classification, object detection, language modeling, text classification, and sentime...
Preprint
Full-text available
Large Language Models (LLMs) have become a cornerstone of modern artificial intelligence (AI), finding applications across various domains such as healthcare, finance, entertainment, and customer service. To understand their ethical and social implications, it is essential to first grasp what these models are, how they function, and why they carry...
Preprint
Contrastive learning is a powerful technique in the field of machine learning, specifically in representation learning. The central idea of contrastive learning is to learn a model by distinguishing between similar and dissimilar data points. This involves pulling similar data points closer in the learned representation space while pushing dissimil...
Preprint
Full-text available
Large language models (LLMs) have transformed natural language processing, achieving remarkable results across various tasks. However, their computational demands during inference (referred to as test-time compute) present distinct challenges that often determine real-world applicability. This paper provides a systematic review of test-time compute o...
Article
We develop an interquantile shrinkage estimation method to examine the underlying commonality structure of regression coefficients across various quantile levels for longitudinal data in a data-driven manner. This method provides a deeper insight into the relationship between the response and covariates, leading to enhanced estimation efficiency an...
Preprint
Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing model capabilities, especially in exploring nonlinear covariate effects under right censoring. However, deep le...
Preprint
Full-text available
After AlphaFold won the Nobel Prize, protein prediction with deep learning once again became a hot topic. We comprehensively explore advanced deep learning methods applied to protein structure prediction and design. It begins by examining recent innovations in prediction architectures, with detailed discussions on improvements such as diffusion bas...
Preprint
Full-text available
Semantic segmentation is a machine learning algorithm that associates a label or category with each pixel of an image. It is used to identify sets of pixels that constitute distinguishable categories. This book offers a comprehensive and practical guide to the principles, methodologies, and applications of semantic segmentation in computer vision....
Preprint
Reinforcement Learning (RL) is a distinct branch of machine learning focused on how agents should take actions in an environment to maximize cumulative rewards. Unlike supervised learning, which relies on labeled datasets, RL is driven by the agent's interactions with its environment, learning optimal behaviors through trial and error. The agent le...
Preprint
Full-text available
Generative Adversarial Networks (GANs) have greatly influenced the development of computer vision and artificial intelligence over the past decade and have also connected art with machine intelligence. This book begins with a detailed introduction to the fundamental principles and historical development of GANs, contrasting them with traditional ge...
Preprint
The integration of bioinformatics predictions and experimental validation plays a pivotal role in advancing biological research, from understanding molecular mechanisms to developing therapeutic strategies. Bioinformatics tools and methods offer powerful means for predicting gene functions, protein interactions, and regulatory networks, but these p...
Preprint
Uncertainty quantification (UQ) is a critical aspect of artificial intelligence (AI) systems, particularly in high-risk domains such as healthcare, autonomous systems, and financial technology, where decision-making processes must account for uncertainty. This review explores the evolution of uncertainty quantification techniques in AI, distinguish...
Preprint
Full-text available
This comprehensive survey explores the theoretical foundations and practical applications of multimodal embedding representations for text, image, audio, and video data. The work begins with an introduction to the fundamental concepts of embeddings, detailing the transformation of high-dimensional discrete data into low-dimensional continuous vecto...
Preprint
Full-text available
The breakthrough achievements of multimodal large language models in recent years have brought researchers reverie. This paper provides an in-depth survey of text embeddings, exploring the evolution of word representations from fundamental continuous and similarity-preserving techniques to advanced contextual models. Beginning with the basics of wo...
Preprint
Full-text available
This work presents a comprehensive survey of the Transformer architecture and its wide-ranging applications. We begin with an in‐depth introduction to deep learning and natural language processing fundamentals, tracing the evolution from traditional rule-based and statistical methods to modern neural network models. Special emphasis is placed on th...
Preprint
Full-text available
In recent years, the field of artificial intelligence (AI) and machine learning (ML) has undergone a transformative shift, with generative models emerging as one of the most significant and impactful areas of research. Generative models, in essence, are models that can generate new data instances that resemble a given set of training data. Unlike d...
Preprint
Full-text available
Contrastive learning is a powerful technique in the field of machine learning, specifically in representation learning. The central idea of contrastive learning is to learn a model by distinguishing between similar and dissimilar data points. This involves pulling similar data points closer in the learned representation space while pushing dissimil...
Preprint
Full-text available
Reinforcement Learning (RL) is a distinct branch of machine learning focused on how agents should take actions in an environment to maximize cumulative rewards. Unlike supervised learning, which relies on labeled datasets, RL is driven by the agent's interactions with its environment, learning optimal behaviors through trial and error. The agent le...
Article
Full-text available
Recurrent events are common in medical practice or epidemiologic studies when each subject experiences a particular event repeatedly over time. In some long-term observations of recurrent events, a terminal event such as death may exist in recurrent event data. Meanwhile, some inspected subjects will withdraw from a study for some time for various...
Preprint
Clinical trials are an indispensable part of the drug development process, bridging the gap between basic research and clinical application. During the development of new drugs, clinical trials are used not only to evaluate the safety and efficacy of the drug but also to explore its dosage, treatment regimens, and potential side effects. This revie...
Preprint
Full-text available
Deep learning has transformed AI applications but faces critical security challenges, including adversarial attacks, data poisoning, model theft, and privacy leakage. This survey examines these vulnerabilities, detailing their mechanisms and impact on model integrity and confidentiality. Practical implementations, including adversarial examples, la...
Preprint
Full-text available
Deep learning-based image generation has undergone a paradigm shift since 2021, marked by fundamental architectural breakthroughs and computational innovations. Through reviewing architectural innovations and empirical results, this paper analyzes the transition from traditional generative methods to advanced architectures, with focus on compute-ef...
Preprint
Deep learning is a subset of machine learning that models data using artificial neural networks with multiple layers. Each layer in the neural network processes the input data, extracts relevant features, and passes it to the next layer. Deep learning techniques have led to significant advancements in areas such as image recognition, natural langua...
Preprint
Full-text available
Explainability in artificial intelligence (AI) has become crucial for ensuring transparency, trust, and usability across diverse application domains, such as healthcare, finance, and autonomous systems. This comprehensive review analyzes the state of research on explainability techniques, categorizing approaches into model-agnostic, model-specific,...
Preprint
Full-text available
Artificial Intelligence (AI) has permeated numerous aspects of our daily lives, from predictive text on our smartphones to complex decision-making systems in healthcare and finance. While AI has shown remarkable accuracy and efficiency, it is often criticized for being a 'black box,' particularly when it comes to complex models like deep learning a...
Preprint
Full-text available
Large Language Models (LLMs) represent a significant advancement in artificial intelligence, capable of understanding and generating human-like text based on extensive training data. These models are trained on vast datasets that encompass various topics, languages, and styles, enabling them to perform a wide range of language-related tasks, such a...
Preprint
Full-text available
Advancements in artificial intelligence, machine learning, and deep learning have catalyzed the transformation of big data analytics and management into pivotal domains for research and application. This work explores the theoretical foundations, methodological advancements, and practical implementations of these technologies, emphasizing their rol...
Preprint
Full-text available
Explainable Artificial Intelligence (XAI) addresses the growing need for transparency and interpretability in AI systems, enabling trust and accountability in decision-making processes. This book offers a comprehensive guide to XAI, bridging foundational concepts with advanced methodologies. It explores interpretability in traditional models such a...
Preprint
This book provides a comprehensive guide to Transformer and other modern language model architectures. It covers influential models such as BERT, GPT, RWKV, RetNet, and Mamba, examining their applications in NLP, computer vision, and beyond. With a focus on both theoretical foundations and practical advancements, the book also explores future direc...
Article
Partially linear models provide a valuable tool for modeling failure time data with nonlinear covariate effects. Their applicability and importance in survival analysis have been widely acknowledged. To date, numerous inference methods for such models have been developed under traditional right censoring. However, the existing studies seldom target...
Article
Full-text available
In multi-site studies, sharing individual-level information across multiple data-contributing sites usually poses a significant risk to data security. Thus, due to privacy constraints, analytical tools that do not use individual-level data have drawn considerable attention from researchers in recent years. In this work, we consider regression analysis...
Preprint
Full-text available
With a focus on natural language processing (NLP) and the role of large language models (LLMs), we explore the intersection of machine learning, deep learning, and artificial intelligence. As artificial intelligence continues to revolutionize fields from healthcare to finance, NLP techniques such as tokenization, text classification, and entity rec...
Article
This study proposes a heterogeneous mediation analysis for survival data that accommodates multiple mediators and sparsity of the predictors. We introduce a joint modeling approach that links the mediation regression and proportional hazards models through Bayesian additive regression trees with shared typologies. The shared tree component is motiv...
Article
Full-text available
This paper develops a novel doubly robust triple cross-fit estimator to estimate the average treatment effect (ATE) using observational and imaging data. The construction of the proposed estimator consists of two stages. The first stage extracts representative image features using the high-dimensional functional principal component analysis model....
Article
Full-text available
Traditional methods used in causal mediation analysis with continuous treatment often focus on estimating average causal effects, limiting their applicability in precision medicine. Machine learning techniques have emerged as a powerful approach for precisely estimating individualized causal effects. This paper proposes a novel method called CGAN-I...
Article
Interval-censored failure time data frequently arise in various scientific studies where each subject experiences periodical examinations for the occurrence of the failure event of interest, and the failure time is only known to lie in a specific time interval. In addition, collected data may include multiple observed variables with a certain degre...
Article
Full-text available
Most classical methods used in causal mediation analysis can estimate only average causal effects and are difficult to apply to precision medicine. Although identifying heterogeneous causal effects has received some attention, these effects are explored using assumed parametric models with limited model flexibility and ana...
Article
Hidden Markov models (HMMs), which can characterize dynamic heterogeneity, are valuable tools for analyzing longitudinal data. The order of HMMs (i.e., the number of hidden states) is typically assumed to be known or predetermined by some model selection criterion in conventional analysis. As prior information about the order is frequently lacking, pairwi...
Article
Alzheimer’s disease (AD) is a progressive neurodegenerative disease frequently associated with memory deficits and cognitive decline. Although AD is irreversible after onset, some studies have revealed that a certain percentage of people are non-susceptible to AD. This study proposes a joint analysis of multivariate longitudinal data, survival d...
Article
We discuss the fundamental issue of identification in linear instrumental variable (IV) models with unknown IV validity. With the assumption of the ‘sparsest rule’, which is equivalent to the plurality rule but becomes operational in computation algorithms, we investigate and prove the advantages of non-convex penalized approaches over other IV est...
Article
Full-text available
Mediation analysis aims at quantifying and explaining the underlying causal mechanism between an exposure and an outcome of interest. In the context of survival analysis, mediation models have been widely used to achieve causal interpretation for the direct and indirect effects on the survival of interest. Although heterogeneity in treatment effect...
Article
Full-text available
Partly interval censoring is frequently encountered in clinical trials when the failure time of an event is observed exactly for some subjects but is only known to fall within an observed interval for others. Although this kind of censoring has drawn recent attention in survival analysis, available methods typically assume that the observed interva...
Article
Current status data arise when each subject under study is examined only once at an observation time, and one only knows the failure status of the event of interest at the observation time rather than the exact failure time. Moreover, the obtained failure status is frequently subject to misclassification due to imperfect tests, yielding misclassifi...
Article
Full-text available
This study considers a functional concurrent hidden Markov model. The proposed model consists of two components. One is a transition model for elucidating how potential covariates influence the transition probability from one state to another. The other is a conditional functional linear concurrent regression model for characterizing the state-spec...
Article
This study proposes a joint modeling approach to conduct causal mediation analysis that accommodates multivariate longitudinal data, dynamic latent mediator, and survival outcome. First, we introduce a confirmatory factor analysis model to characterize a time-varying latent mediator through multivariate longitudinal observable variables. Then, we e...
Article
In modern scientific research, multi‐block missing data emerge when synthesizing information across multiple studies. However, existing imputation methods for handling block‐wise missing data either focus on the single‐block missing pattern or heavily rely on the model structure. In this study, we propose a single regression‐based imputation algor...
Article
This study proposes a joint deep learning method, namely, confounding factor joint decomposition under counterfactual regression (CFJD-CFR), to identify a minimum adjustment covariate set and estimate the average treatment effect (ATE). CFJD-CFR includes two levels: feature learning and prediction model. Feature learning constructs the objective fu...
Article
Full-text available
Conventional hazard regression analyses frequently assume constant regression coefficients and scalar covariates. However, some covariate effects may vary with time. Moreover, medical imaging has become an increasingly important tool in screening, diagnosis, and prognosis of various diseases, given its information visualization and quantitative ass...
Preprint
Full-text available
We develop an interquantile shrinkage estimation method to explore the underlying commonality structure of regression coefficients across different quantile levels for longitudinal data in a data-driven manner. It provides a more insightful view of the relationship between the response and covariates and enhances estimation efficiency and model interpretability. We propose a...
Article
This paper considers a quantile latent factor-on-image (Q-LoI) regression model to comprehensively investigate the relationship between the latent factor of interest and scalar and imaging predictors at different quantiles. The latent factor is characterized by several manifest variables through a confirmatory factor analysis model and then regress...
Article
Recurrent event data with a terminal event commonly arise in many longitudinal follow‐up studies. This article proposes a class of dynamic semiparametric transformation models for the marginal mean functions of the recurrent events with a terminal event, where some covariate effects may be time‐varying. An estimation procedure is developed for the...
Article
The conventional Cox proportional hazards (PH) model typically assumes fully observed predictors and constant regression coefficients. However, some predictors are latent variables, each of which must be characterized by multiple observed indicators from various perspectives. Moreover, the predictor effects may vary with time in practice. Accommoda...
Article
This article considers a joint modeling framework for simultaneously examining the dynamic pattern of longitudinal and ultrahigh-dimensional images and their effects on the survival of interest. A functional mixed effects model is considered to describe the trajectories of longitudinal images. Then, a high-dimensional functional principal component...
Article
Full-text available
Interval-censored failure time and panel count data, which frequently arise in medical studies and social sciences, are two types of important incomplete data. Although methods for their joint analysis have been available in the literature, they did not consider the observation process, which may depend on the failure time and/or panel count of int...
Preprint
We discuss the fundamental issue of identification in linear instrumental variable (IV) models with unknown IV validity. We revisit the popular majority and plurality rules and show that no identification condition can be "if and only if" in general. With the assumption of the "sparsest rule", which is equivalent to the plurality rule but becomes o...
Article
Hidden Markov models (HMMs) describe the relationship between two stochastic processes: an observed process and an unobservable finite-state transition process. Owing to their ability to model dynamic heterogeneity, HMMs are widely used to analyze heterogeneous longitudinal data. Traditional HMMs frequently assume that the number of hidden states (i.e., th...
Article
Full-text available
We propose a joint modeling approach to investigate the observed and latent risk factors of the multivariate failure times of interest. The proposed model comprises two parts. The first part is a distribution-free confirmatory factor analysis model that characterizes the latent factors by correlated multiple observed variables. The second part is a...
Article
In psychological, social, behavioral, and medical studies, hidden Markov models (HMMs) have been extensively applied to the simultaneous modeling of longitudinal observations and the underlying dynamic transition process. However, the existing HMMs mainly focus on constant-coefficient HMMs. This study considers a varying-coefficient HMM, which enab...
Article
As extensions of means, expectiles embrace all the distribution information of a random variable. The expectile regression is computationally friendlier because the asymmetric least square loss function is differentiable everywhere. This regression also enables effective estimation of the expectiles of a response variable when potential explanatory...
Article
A novel feature screening method is proposed to examine the correlation between latent responses and potential predictors in ultrahigh dimensional data analysis. First, a confirmatory factor analysis (CFA) model is used to characterize latent responses through multiple observed variables. The expectation‐maximization algorithm is employed to estima...
Article
Full-text available
This study considers a time-varying coefficient additive hazards model with latent variables to examine potential observed and latent risk factors for survival of interest. The model consists of two parts: confirmatory factor analysis to measure each latent factor through multiple observable variables and a varying coefficient additive hazards mode...
Article
Full-text available
We consider accelerated failure time models with error-prone time-to-event outcomes. The proposed models extend the conventional accelerated failure time model by allowing time-to-event responses to be subject to measurement errors. We describe two measurement error models, a logarithm transformation regression measurement error model and an additi...
Article
Full-text available
Censored quantile regression has elicited extensive research interest in recent years. One class of methods is based on an informative subset of a sample, selected via the propensity score. The propensity score can either be estimated using parametric methods, which pose the risk of misspecification, or obtained using nonparametric approaches, which su...
Article
In many scientific fields, partly interval-censored data, which consist of exactly observed and interval-censored observations on the failure time of interest, appear frequently. However, methodological developments in the analysis of partly interval-censored data are relatively limited and have mainly focused on additive or proportional hazards mo...
Article
Alzheimer's disease (AD) is an incurable and progressive disease that starts from mild cognitive impairment and deteriorates over time. Examining the effects of patients' longitudinal cognitive decline on time to conversion to AD and obtaining a reliable diagnostic model are therefore critical to the evaluation of AD prognosis and early treatment....
Article
A mixture additive hazards cure model with latent variables is proposed to investigate the risk factors of the corporate default issue with a sample of corporate bonds from the Chinese financial market. The proposed model combines confirmatory factor analysis, additive hazards, and cure models to characterize latent attributes, such as profitabilit...
Article
A time-varying coefficient ARCH-in-mean (ARCH-M) model with a dynamic latent variable that follows an AR process is considered. The joint model extends the existing ARCH-M model by considering a dynamic structure of the latent variable for examining a latent effect on the time-varying risk–return relationship. A Bayesian approach coupled with Markov Chai...
Article
Full-text available
A mixture proportional hazards cure model with latent variables is proposed. The proposed model assesses the effects of the observed and latent risk factors on the hazards of uncured subjects and the cure rate through a proportional hazards model and a logistic model, respectively. Factor analysis is employed to measure the latent variables through...
Article
Tobit models (also called “censored regression models” or classified as “sample selection models” in microeconometrics) have been widely applied to microeconometric problems with censored outcomes. However, due to their linear parametric settings and restrictive normality assumptions, the traditional Tobit models fail to capture the pervading no...
Article
Multivariate interval-censored data arise when each subject under study can potentially experience multiple events and the onset time of each event is not observed exactly but is known to lie in a certain time interval formed by adjacent examination times with changed statuses of the event. This type of incomplete and complex data structure poses a...
Article
Full-text available
Current status data occur in many fields including demographical, epidemiological, financial, medical, and sociological studies. We consider the regression analysis of current status data with latent variables. The proposed model consists of a factor analytic model for characterizing latent variables through their multiple surrogates and an additiv...
