Lingfei Wu's research while affiliated with IBM and other places

Publications (37)

Preprint
Full-text available
Conversational recommendation systems (CRS) effectively address information asymmetry by dynamically eliciting user preferences through multi-turn interactions. Existing CRS widely assume that users have clear preferences. Under this assumption, the agent will completely trust the user feedback and treat the accepted or rejected signals as strong...
Article
Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers. Previous works mostly focus on a simple setting, namely generating questions from a single KG triple. In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers. In...
Article
Full-text available
During the past decade, significant advances have been made in the domain of automatic product description generation. As the services provided by e-commerce platforms become diverse, it is necessary to dynamically adapt the patterns of the generated descriptions. The selling point of products is an important type of product description...
Preprint
It has been observed in practice that applying pruning-at-initialization methods to neural networks and training the sparsified networks can not only retain the testing performance of the original dense models, but also sometimes even slightly boost the generalization performance. Theoretical understanding of such experimental observations is yet...
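The sparsification this abstract refers to can be illustrated with a minimal sketch. The function name and the magnitude-based criterion below are illustrative assumptions, not the paper's method; actual pruning-at-initialization techniques typically score weights by gradient saliency rather than raw magnitude:

```python
import random

def prune_at_init(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights before training.

    A toy stand-in for pruning-at-initialization: real methods use
    saliency scores computed from gradients, not raw magnitude.
    """
    # Flatten all magnitudes and find the cut-off that removes the
    # smallest `sparsity` fraction of them.
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    threshold = flat[k] if k < len(flat) else float("inf")
    # Binary mask: 1 keeps a weight, 0 prunes it before training starts.
    mask = [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]
    pruned = [[w * m for w, m in zip(row, mrow)]
              for row, mrow in zip(weights, mask)]
    return pruned, mask

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
pruned, mask = prune_at_init(W, sparsity=0.5)
```

The mask would then be held fixed while the remaining weights are trained as usual.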
Article
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we...
Article
Deep generative models for graphs have recently achieved great successes in modeling and generating graphs for studying networks in biology, engineering, and social sciences. However, they are typically unconditioned generative models that have no control over the target graphs given a source graph. In this article, we propose a novel graph-transla...
Preprint
Graph Structure Learning (GSL) has recently attracted considerable attention for its capacity to optimize the graph structure while simultaneously learning suitable parameters of Graph Neural Networks (GNNs). Current GSL methods mainly learn an optimal graph structure (final view) from single or multiple information sources (basic views), howeve...
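The simplest building block behind such pipelines is deriving a graph "view" from node features rather than taking the adjacency as given. The kNN-on-cosine-similarity rule below is an illustrative stand-in for the learnable similarity metrics used in actual GSL methods:

```python
import math

def knn_graph(features, k=2):
    """Connect each node to its k most similar neighbours by cosine
    similarity, yielding a feature-derived adjacency matrix."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other nodes by similarity and keep the top k.
        sims = sorted(((cos(features[i], features[j]), j)
                       for j in range(n) if j != i), reverse=True)
        for _, j in sims[:k]:
            adj[i][j] = 1
    return adj

features = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]]
adj = knn_graph(features, k=1)
```

A real GSL method would make the similarity function trainable and fuse this learned view with the original graph.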
Chapter
Natural language processing (NLP) and understanding aim to read from unformatted text to accomplish different tasks. While word embeddings learned by deep neural networks are widely used, the underlying linguistic and semantic structures of text pieces cannot be fully exploited in these representations. Graph is a natural way to capture the connect...
Chapter
In this chapter, we first describe what representation learning is and why we need representation learning. Among the various ways of learning representations, this chapter focuses on deep learning methods: those that are formed by the composition of multiple non-linear transformations, with the goal of resulting in more abstract and ultimately mor...
Chapter
Full-text available
Deep Learning has become one of the most dominant approaches in Artificial Intelligence research today. Although conventional deep learning techniques have achieved huge successes on Euclidean data such as images, or sequence data such as text, there are many applications that are naturally or best represented with a graph structure. This gap has d...
Chapter
Graph representation learning aims at assigning nodes in a graph to low-dimensional representations and effectively preserving the graph structures. Recently, a significant amount of progress has been made toward this emerging graph analysis paradigm. In this chapter, we first summarize the motivation of graph representation learning. Afterwards and...
Book
Deep Learning models are at the core of artificial intelligence research today. It is well known that deep learning techniques are disruptive for Euclidean data, such as images, or sequence data, such as text, and not immediately applicable to graph-structured data. This gap has driven a wave of research for deep learning on graphs, including graph r...
Article
In this paper, we propose a novel DynAttGraph2Seq framework to model the complex dynamic transitions of an individual user's activities and the textual information of the posts over time in online health forums, and to learn how these correspond to his/her health stage. To achieve this, we first formulate the transition of user activities as a dynamic a...
Conference Paper
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code and neglect the creation of the documentation in a notebook. In this work, we present a human-centered automation system, Themisto, that can support users to easily creat...
Preprint
During the past decade, deep learning's performance has been widely recognized in a variety of machine learning tasks, ranging from image classification and speech recognition to natural language understanding. Graph neural networks (GNNs) are a type of deep learning designed to handle non-Euclidean problems using graph-structured data that are d...
Preprint
Many data scientists use Jupyter notebooks to experiment with code, visualize results, and document rationales or interpretations. The code documentation generation (CDG) task in notebooks is related to but different from the code summarization task in software engineering, as one documentation (markdown cell) may consist of a text (informative summary or ind...
Preprint
Full-text available
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations, which leads to challenges in sharing their notebooks with others and future selves. Inspire...
Preprint
Prior work on automated question generation has almost exclusively focused on generating simple questions whose answers can be extracted from a single document. However, there is an increasing interest in developing systems that are capable of more complex multi-hop question generation, where answering the questions requires reasoning over multiple...
Conference Paper
Conversational machine comprehension (MC) has proven significantly more challenging compared to traditional MC since it requires better utilization of conversation history. However, most existing approaches do not effectively capture conversation history and thus have trouble handling questions involving coreference or ellipsis. Moreover, when reas...
Preprint
Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive. We argue that, to address these issues, the summarizer should acquire semantic interpretation over input, e.g., via structured representation, to all...
Preprint
Automatic source code summarization is the task of generating natural language descriptions for source code. It is a rapidly expanding research area, especially as the community has taken greater advantage of advances in neural network and AI technologies. In general, source code summarization techniques use the source cod...
Preprint
Graph Neural Networks (GNNs) have boosted the performance of many graph-related tasks such as node classification and graph classification. Recent research shows that graph neural networks are vulnerable to adversarial attacks, which deliberately add carefully crafted, unnoticeable perturbations to the graph structure. The perturbation is usually cr...
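The perturbation model these attacks operate under can be sketched in a few lines. The random flip choice below is purely illustrative; actual attacks select the flips that most increase the GNN's loss, typically via gradients through a surrogate model:

```python
import random

def perturb_edges(adj, budget, seed=0):
    """Flip at most `budget` edges of an undirected graph, the standard
    'unnoticeable perturbation' model for structure attacks on GNNs."""
    rng = random.Random(seed)
    n = len(adj)
    attacked = [row[:] for row in adj]
    # Candidate perturbations: every unordered node pair.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for i, j in rng.sample(pairs, budget):
        # Flip symmetrically: add the edge if absent, remove it if present.
        attacked[i][j] = attacked[j][i] = 1 - attacked[i][j]
    return attacked

clean = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
attacked = perturb_edges(clean, budget=1)
```

The small budget is what keeps the perturbation "unnoticeable": only a handful of entries of the adjacency matrix change.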

Citations

... Training of a DNN question answering model requires a set of text passages and corresponding pairs of questions and answers. Multiple approaches exist for generation of questions (and answers): knowledge-graph-to-question template-based methodology (similar to the context generation) [67,98,136,140], WH questions (e.g., Where, Who, What, When, Why) rule-based approach [80], knowledge graph-based question generation [16,50], and DNN-based models for generating additional types of questions [25,49,117,135]. The rule-based method uses part-of-speech parsing of sentences using the Stanford Parser [59], creates a tree query language and tree manipulation [65], and applies a set of rules to simplify and transform the sentences to a question. ...
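The template-based knowledge-graph-to-question approach the snippet mentions can be sketched as follows. The relation names and templates here are invented for illustration and are not taken from any of the cited systems:

```python
# Each relation maps to a question template and to which element of the
# triple (subject "s" or object "o") serves as the gold answer.
TEMPLATES = {
    "born_in": ("Where was {s} born?", "o"),
    "author_of": ("Who wrote {o}?", "s"),
    "capital_of": ("What is the capital of {o}?", "s"),
}

def triple_to_question(s, r, o):
    """Template-based question generation from one KG triple (s, r, o)."""
    tmpl, answer_slot = TEMPLATES[r]
    question = tmpl.format(s=s, o=o)
    answer = s if answer_slot == "s" else o
    return question, answer

q, a = triple_to_question("Ada Lovelace", "born_in", "London")
```

Rule-based pipelines extend this idea with syntactic parsing and tree-transformation rules rather than a fixed template table.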
... Most notably for this study, AI systems are currently being designed to benefit software development teams with AI technology currently showing incredible promise in its ability to aid in the completion of programming tasks (Nguyen & Nadi, 2022;Sobania et al., 2022;Weisz et al., 2021). Moreover, human-AI collaboration has the potential to benefit various other tasks critical to software development, including the modernization (Houde et al., 2022) and documentation (Barenkamp et al., 2020;Wang et al., 2022) of software systems, which are often repetitive tasks that could be easily completed more rapidly by AI teammates. The above examples demonstrate that the concept of human-AI teaming is not a passing trend but rather a highly appealing application of AI technology that can further optimize modern teaming processes. ...
... In addition, despite a decent upfront denoising operation, the original graph structure may still contain task-irrelevant information undesired for the downstream link prediction task, or even counteractive links that may come from false-negative edges [18]. In practical applications, some edges in the RDRA graph often need to be discarded or weakened to capture the most valuable edges and alleviate over-smoothing. In contrast, the model must emphasize some functional edges depending on the downstream link prediction task. ...
... Conversational recommender systems (CRS) aim to understand user preferences and provide personalized recommendations through conversations. Typical traditional CRS setups include template-based CRS [13,26,37,38,70] and critiquing-based CRS [9,42,67]. More recently, as natural language processing has advanced, the community developed "deep" CRS [10,41,64] that support interactions in natural language. ...
... As showcased by Faez et al. [16], research for such deep graph generators mainly focuses on chemical and bioinformatics [27], [28], social structures [18], [29]- [31], or synthetic datasets [30], [31], created by ER, WS, and others. In this paper, we adopt some techniques from other application fields to evaluate for our use case of WAN generation. ...
... The works in [23], [24] focus on capturing patients' sentiments expressed in their discussions to identify reliable medical knowledge. In order to further aid healthcare organizations in supporting patients, Gao et al. [25], [26] propose to infer patients' health stages based on their online activities. ...
... We also write a monograph [126] about our systematic work on the topic of network embedding. In terms of GNNs, Wu et al. [114] write a book covering more than 20 topics of GNNs in 4 parts: introduction, foundations, frontiers, and applications. Shi et al. [93] write a monograph providing a launch point for discussing the latest trends of GNNs. ...
... The GNN captures the information within the node dependencies, while the recurrent structure captures the information within their temporal evolution. Specifically, we consider implementing our embedding predictor g_ω as an RNN-GNN (Wu et al. 2022). The combination of LSTM and GCN is what we empirically found to be the optimal trade-off between efficiency and capacity in most cases, while the RNN-GNN is generic for many different variants (LSTM-GAT, GRU-GCN, etc.)
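The composition the snippet describes, spatial aggregation over the graph followed by a recurrent update over time, can be sketched minimally. The function name is hypothetical, and the fixed scalar gate `alpha` stands in for the learned LSTM/GRU gates of the actual RNN-GNN variants:

```python
def rnn_gnn_step(adj, x_t, h_prev, alpha=0.5):
    """One step of a minimal RNN-GNN: mean-aggregate each node's
    neighbourhood (a GCN-style spatial operation), then mix the result
    with the previous hidden state via a fixed gate."""
    n = len(adj)
    dim = len(x_t[0])
    agg = []
    for i in range(n):
        # Neighbours of node i, plus a self-loop.
        nbrs = [j for j in range(n) if adj[i][j]] + [i]
        agg.append([sum(x_t[j][d] for j in nbrs) / len(nbrs)
                    for d in range(dim)])
    # Recurrent mix of the aggregated input and the previous hidden state.
    return [[alpha * agg[i][d] + (1 - alpha) * h_prev[i][d]
             for d in range(dim)] for i in range(n)]

adj = [[0, 1], [1, 0]]
h_next = rnn_gnn_step(adj, [[1.0], [3.0]], [[0.0], [0.0]])
```

Swapping the aggregation for attention or the gate for a GRU cell yields the LSTM-GAT and GRU-GCN variants mentioned above.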
... Graph neural networks (GNNs) have gained widespread recognition as strong tools for solving real-world problems in diverse domains. They have been successfully applied in recommendation systems [1][2][3][4], bioinformatics [5][6][7], traffic prediction [8][9][10], natural language processing [11,12], drug discovery [13][14][15], fraud detection [16][17][18], robotics [19][20][21], and social media analysis [22,23], among other fields. The effectiveness of current GNNs can be attributed to their utilization of techniques such as gated recurrent units and the self-attention mechanism. ...
... Using both our directives and open dataset, it will be possible to train machine and/or deep learning models as done by [56;24] and develop tool support for comment-assistance, either automatically generating the comments or advising developers of anti-patterns. Although some studies have been conducted in Jupyter Notebooks (which, just like Python, is also dynamically-typed) [43], they require an understanding of common patterns and anti-patterns [38], and manually-labelled datasets which, before our study, were not available for R programming. Machine learning for knowledge identification in API has been previously used [24]. ...