Liang Qiu

Amazon · Alexa AI

PhD

30
Publications
6,143
Reads
282
Citations
Introduction
I am an Applied Scientist at Amazon Alexa AI. I received my Ph.D. from the Electrical and Computer Engineering Department at UCLA, co-advised by Prof. Song-Chun Zhu of the Center for Vision, Cognition, Learning, and Autonomy (VCLA) and Prof. Achuta Kadambi of the Visual Machines Group. I also work closely with Prof. Zhou Yu of the NLP research group at Columbia University.
Additional affiliations
June 2021 - December 2021
Salesforce.com
Position
  • Research Intern
March 2021 - June 2021
Microsoft
Position
  • Research Intern
September 2017 - June 2020
DMAI
Position
  • Software Engineer
Education
September 2016 - March 2022
University of California, Los Angeles
Field of study
  • Electrical and Computer Engineering
September 2012 - June 2016
Shanghai Jiao Tong University
Field of study
  • Electrical Engineering

Publications (30)
Preprint
Full-text available
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life. The development of artificial intelligence (AI) systems capable of solving math problems and proving theorems has garnered significant interest in the fields of machine learning and na...
Preprint
Full-text available
Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if t...
Preprint
Full-text available
When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box in the case of deep learning models like large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reas...
Conference Paper
Full-text available
Building a socially intelligent agent involves many challenges, one of which is to track the agent's mental state transition and teach the agent to make decisions guided by its value like a human. Towards this end, we propose to incorporate mental state simulation and value modeling into dialogue agents. First, we build a hybrid mental state parser...
Article
Full-text available
Current pre-training methods in computer vision focus on natural images in the daily-life context. However, abstract diagrams such as icons and symbols are common and important in the real world. We are inspired by Tangram, a game that requires replicating an abstract pattern from seven dissected shapes. By recording human experience in solving tan...
Article
Building a socially intelligent agent involves many challenges, one of which is to teach the agent to speak guided by its value like a human. However, value-driven chatbots are still understudied in the area of dialogue systems. Most existing datasets focus on commonsense reasoning or social norm modeling. In this work, we present a new large-scale...
Preprint
Full-text available
Dramatic progress has been made in animating individual characters. However, we still lack automatic control over activities between characters, especially those involving interactions. In this paper, we present a novel energy-based framework to sample and synthesize animations by associating the characters' body motions, facial expressions, and so...
Preprint
Full-text available
Extracting structure information from dialogue data can help us better understand user and system behaviors. In task-oriented dialogues, dialogue structure has often been considered as transition graphs among dialogue states. However, annotating dialogue states manually is expensive and time-consuming. In this paper, we propose a simple yet effecti...
Preprint
Full-text available
Building a socially intelligent agent involves many challenges, one of which is to teach the agent to speak guided by its value like a human. However, value-driven chatbots are still understudied in the area of dialogue systems. Most existing datasets focus on commonsense reasoning or social norm modeling. In this work, we present a new large-scale...
Preprint
Full-text available
Current pre-training methods in computer vision focus on natural images in the daily-life context. However, abstract diagrams such as icons and symbols are common and important in the real world. This work is inspired by Tangram, a game that requires replicating an abstract pattern from seven dissected shapes. By recording human experience in solvi...
Preprint
Full-text available
With the recent success of deep learning algorithms, many researchers have focused on generative models for human motion animation. However, the research community lacks a platform for training and benchmarking various algorithms, and the animation industry needs a toolkit for implementing advanced motion synthesizing techniques. To facilitate the...
Preprint
Full-text available
Current visual question answering (VQA) tasks mainly consider answering human-annotated questions for natural images. However, aside from natural images, abstract diagrams with semantic richness are still understudied in visual understanding and reasoning research. In this work, we introduce a new challenge of Icon Question Answering (IconQA) with...
Preprint
Full-text available
3D teeth reconstruction from X-ray is important for dental diagnosis and many clinical operations. However, no existing work has explored the reconstruction of teeth for a whole cavity from a single panoramic radiograph. Different from single object reconstruction from photos, this task has the unique challenge of constructing multiple objects at h...
Preprint
The cognitive system for human action and behavior has evolved into a deep learning regime, and especially the advent of Graph Convolution Networks has transformed the field in recent years. However, previous works have mainly focused on over-parameterized and complex models based on dense graph convolution networks, resulting in low efficiency in...
Preprint
Full-text available
Inferring social relations from dialogues is vital for building emotionally intelligent robots to interpret human language better and act accordingly. We model the social network as an And-or Graph, named SocAoG, for the consistency of relations among a group and leveraging attributes as inference cues. Moreover, we formulate a sequential structure...
Preprint
Full-text available
Geometry problem solving has attracted much attention in the NLP community recently. The task is challenging as it requires abstract problem understanding and symbolic reasoning with axiomatic knowledge. However, current datasets are either small in scale or not publicly available. Thus, we construct a new large-scale benchmark, Geometry3K, consist...
Preprint
Full-text available
Building a socially intelligent agent involves many challenges, one of which is to track the agent's mental state transition and teach the agent to make rational decisions guided by its utility like a human. Towards this end, we propose to incorporate a mental state parser and utility model into dialogue agents. The hybrid mental state parser extra...
Preprint
Full-text available
Convolutional networks (ConvNets) have achieved promising accuracy for various anatomical segmentation tasks. Despite the success, these methods can be sensitive to data appearance variations. Considering the large variability of scans caused by artifacts, pathologies, and scanning setups, robust ConvNets are vital for clinical applications, while...
Preprint
Full-text available
Patient's understanding on forthcoming dental surgeries is required by patient-centered care and helps reduce fear and anxiety. Due to the gap of expertise between patients and dentists, conventional techniques of patient education are usually not effective for explaining surgical steps. In this paper, we present OralViewer -- the first in...
Preprint
Full-text available
In this paper, we propose a lightweight music-generating model based on variational autoencoder (VAE) with structured attention. Generating music is different from generating text because the melodies with chords give listeners distinguished polyphonic feelings. In a piece of music, a chord consisting of multiple notes comes from either the mixture...
Preprint
Full-text available
Inducing a meaningful structural representation from one or a set of dialogues is a crucial but challenging task in computational linguistics. Advancement made in this area is critical for dialogue system design and discourse analysis. It can also be extended to solve grammatical inference. In this work, we propose to incorporate structured attenti...
Chapter
Full-text available
3D teeth reconstruction from X-ray is important for dental diagnosis and many clinical operations. However, no existing work has explored the reconstruction of teeth for a whole cavity from a single panoramic radiograph. Different from single object reconstruction from photos, this task has the unique challenge of constructing multiple objects at h...
Article
Full-text available
In recent years, Recurrent Neural Networks (RNNs) based models have been applied to the Slot Filling problem of Spoken Language Understanding and achieved the state-of-the-art performances. In this paper, we investigate the effect of incorporating pre-trained language models into RNN based Slot Filling models. Our evaluation on the Airline Travel I...
Preprint
Full-text available
In recent years, Recurrent Neural Networks (RNNs) based models have been applied to the Slot Filling problem of Spoken Language Understanding and achieved the state-of-the-art performances. In this paper, we investigate the effect of incorporating pre-trained language models into RNN based Slot Filling models. Our evaluation on the Airline Travel I...
Thesis
Full-text available
Non-linguistic Vocalization Recognition refers to the detection and classification of non-speech voice such as laughter, sneeze, cough, cry, screaming, etc. It could be seen as a subtask of Acoustic Event Detection (AED). Great progress has been made by previous research to increase the accuracy of AED. On the front end, multiple kinds of features...