S. R. K. Branavan’s research while affiliated with Massachusetts Institute of Technology and other places

Publications (13)


Learning to Win by Reading Manuals in a Monte-Carlo Framework
  • Article

January 2014 · 70 Reads · 56 Citations

Journal of Artificial Intelligence Research

S. R. K. Branavan · David Silver · Regina Barzilay

Domain knowledge is crucial for effective performance in autonomous control systems. Typically, human effort is required to encode this knowledge into a control algorithm. In this paper, we present an approach to language grounding which automatically interprets text in the context of a complex control application, such as a game, and uses domain knowledge extracted from the text to improve control performance. Both text analysis and control strategies are learned jointly using only a feedback signal inherent to the application. To effectively leverage textual information, our method automatically extracts the text segment most relevant to the current game state, and labels it with a task-centric predicate structure. This labeled text is then used to bias an action selection policy for the game, guiding it towards promising regions of the action space. We encode our model for text analysis and game playing in a multi-layer neural network, representing linguistic decisions via latent variables in the hidden layers, and game action quality via the output layer. Operating within the Monte-Carlo Search framework, we estimate model parameters using feedback from simulated games. We apply our approach to the complex strategy game Civilization II using the official game manual as the text guide. Our results show that a linguistically-informed game-playing agent significantly outperforms its language-unaware counterpart, yielding a 34% absolute improvement and winning over 65% of games when playing against the built-in AI of Civilization.
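
Below is a minimal illustrative sketch (not the authors' implementation) of the kind of action selection the abstract describes: action values are estimated from random rollouts, and a text-derived relevance score biases the choice toward actions the manual appears to recommend. The helpers `simulate_game`, `legal_actions`, and `text_relevance` are hypothetical stand-ins for the game simulator and the learned text model.

    # A minimal sketch of text-biased Monte-Carlo action selection.
    # All callables passed in are assumed/hypothetical.
    import random
    from collections import defaultdict

    def monte_carlo_select(state, legal_actions, simulate_game, text_relevance,
                           n_rollouts=100, text_weight=0.5):
        """Pick the action with the best rollout value plus a text-based bias."""
        totals = defaultdict(float)   # summed rollout outcomes per action
        counts = defaultdict(int)     # number of rollouts per action

        for _ in range(n_rollouts):
            action = random.choice(legal_actions(state))
            outcome = simulate_game(state, action)   # e.g. 1 = win, 0 = loss
            totals[action] += outcome
            counts[action] += 1

        def score(action):
            value = totals[action] / counts[action] if counts[action] else 0.0
            # Bias the estimate toward actions the manual text recommends for
            # this state; text_relevance is assumed to return a score in [0, 1].
            return value + text_weight * text_relevance(state, action)

        return max(counts, key=score)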


Learning High-Level Planning from Text

July 2012 · 29 Reads · 51 Citations

Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge of modeling this connection is to ground language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework using feedback automatically obtained from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline's 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline, successfully completing 80% of planning tasks as compared to 69% for the baseline.
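
As a rough illustration of how extracted precondition relations can drive high-level planning, the sketch below (an assumption, not the paper's model) orders subgoals so that every precondition precedes the subgoal that requires it. In the paper this planning is learned jointly with relation extraction using execution feedback, which is omitted here; the toy relation pairs are invented.

    # Turn extracted (precondition, effect) pairs into an ordered high-level plan.
    from graphlib import TopologicalSorter

    def plan_from_preconditions(goal, precondition_pairs):
        """Backward-chain from the goal, then order subgoals so that each
        precondition comes before the subgoal that needs it."""
        requires = {}                              # effect -> set of preconditions
        for pre, eff in precondition_pairs:
            requires.setdefault(eff, set()).add(pre)

        needed, stack = set(), [goal]              # subgoals the goal depends on
        while stack:
            g = stack.pop()
            if g not in needed:
                needed.add(g)
                stack.extend(requires.get(g, ()))

        graph = {g: requires.get(g, set()) & needed for g in needed}
        return list(TopologicalSorter(graph).static_order())

    # Toy example of relations that might be extracted from text:
    pairs = [("have wood", "have pickaxe"), ("have pickaxe", "have stone")]
    print(plan_from_preconditions("have stone", pairs))
    # -> ['have wood', 'have pickaxe', 'have stone']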


Learning to Win by Reading Manuals in a Monte-Carlo Framework

January 2011 · 41 Reads · 53 Citations

This paper presents a novel approach for leveraging automatically extracted textual knowledge to improve the performance of control applications such as games. Our ultimate goal is to enrich a stochastic player with high-level guidance expressed in text. Our model jointly learns to identify text that is relevant to a given game state in addition to learning game strategies guided by the selected text. Our method operates in the Monte-Carlo search framework, and learns both text analysis and game strategies based only on environment feedback. We apply our approach to the complex strategy game Civilization II using the official game manual as the text guide. Our results show that a linguistically-informed game-playing agent significantly outperforms its language-unaware counterpart, yielding a 27% absolute improvement and winning over 78% of games when playing against the built-in AI of Civilization II.


Non-Linear Monte-Carlo Search in Civilization II

January 2011 · 226 Reads · 26 Citations

This paper presents a new Monte-Carlo search algorithm for very large sequential decision-making problems. We apply non-linear regression within Monte-Carlo search, online, to estimate a state-action value function from the outcomes of random roll-outs. This value function generalizes between related states and actions, and can therefore provide more accurate evaluations after fewer rollouts. A further significant advantage of this approach is its ability to automatically extract and leverage domain knowledge from external sources such as game manuals. We apply our algorithm to the game of Civilization II, a challenging multiagent strategy game with an enormous state space and around 10^21 joint actions. We approximate the value function by a neural network, augmented by linguistic knowledge that is extracted automatically from the official game manual. We show that this non-linear value function is significantly more efficient than a linear value function, which is itself more efficient than Monte-Carlo tree search. Our non-linear Monte-Carlo search wins over 78% of games against the built-in AI of Civilization II.
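
The sketch below illustrates the core idea in an assumed, simplified form: a small non-linear regressor is fitted online to rollout outcomes so that value estimates generalize across related state-action pairs. It is a one-hidden-layer network trained by stochastic gradient descent, not the paper's architecture; `features`, `state`, `action`, and `rollout_outcome` in the usage comment are hypothetical.

    # A tiny non-linear value function fitted online to rollout outcomes.
    import numpy as np

    class TinyValueNet:
        """One-hidden-layer regressor trained by SGD on rollout returns."""
        def __init__(self, n_features, n_hidden=32, lr=0.01, seed=0):
            rng = np.random.default_rng(seed)
            self.W1 = rng.normal(0, 0.1, (n_hidden, n_features))
            self.b1 = np.zeros(n_hidden)
            self.w2 = rng.normal(0, 0.1, n_hidden)
            self.b2 = 0.0
            self.lr = lr

        def predict(self, x):
            h = np.tanh(self.W1 @ x + self.b1)
            return float(self.w2 @ h + self.b2), h

        def update(self, x, target):
            # One SGD step on squared error between prediction and rollout outcome.
            y, h = self.predict(x)
            err = y - target
            grad_h = err * self.w2 * (1 - h ** 2)   # back-propagate through tanh
            self.W1 -= self.lr * np.outer(grad_h, x)
            self.b1 -= self.lr * grad_h
            self.w2 -= self.lr * err * h
            self.b2 -= self.lr * err

    # Usage during search (hypothetical helpers): featurize the sampled
    # (state, action), play the rollout, update toward its outcome, then act
    # greedily with respect to the learned value function.
    # net = TinyValueNet(n_features=128)
    # net.update(features(state, action), rollout_outcome)
    # best = max(actions, key=lambda a: net.predict(features(state, a))[0])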


Reading Between the Lines: Learning to Map High-level Instructions to Commands

December 2010 · 77 Reads · 106 Citations

In this paper, we address the task of mapping high-level instructions to sequences of commands in an external environment. Processing these instructions is challenging---they posit goals to be achieved without specifying the steps required to complete them. We describe a method that fills in missing information using an automatically derived environment model that encodes states, transitions, and commands that cause these transitions to happen. We present an efficient approximate approach for learning this environment model as part of a policy-gradient reinforcement learning algorithm for text interpretation. This design enables learning for mapping high-level instructions, which previous statistical methods cannot handle.
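
A toy sketch of the idea of filling in missing steps with an environment model: given a learned transition table mapping (state, command) pairs to next states, breadth-first search recovers a command sequence that reaches the state a high-level instruction asks for. The dictionary-based model and the example states are illustrative assumptions, not the paper's representation.

    # Fill in low-level commands for a high-level goal using a transition model.
    from collections import deque

    def commands_to_goal(start, goal, model):
        """model: dict mapping (state, command) -> next_state."""
        frontier = deque([(start, [])])
        seen = {start}
        while frontier:
            state, commands = frontier.popleft()
            if state == goal:
                return commands
            for (s, command), nxt in model.items():
                if s == state and nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, commands + [command]))
        return None   # goal unreachable under the current model

    # Example with a toy model of a GUI environment:
    model = {
        ("desktop", "open control panel"): "control panel",
        ("control panel", "click add/remove programs"): "programs list",
    }
    print(commands_to_goal("desktop", "programs list", model))
    # -> ['open control panel', 'click add/remove programs']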


Good grief, I can speak it! preliminary experiments in audio restaurant reviews
  • Article
  • Full-text available

December 2010 · 93 Reads · 6 Citations

[...] · S.R.K. Branavan · [...] · Regina Barzilay

In this paper, we introduce a new envisioned application for speech which allows users to enter restaurant reviews orally via their mobile device, and, at a later time, update a shared and growing database of consumer-provided information about restaurants. During the intervening period, a speech recognition and NLP-based system has analyzed their audio recording both to extract key descriptive phrases and to compute sentiment ratings based on the evidence provided in the audio clip. We report here on our preliminary work moving towards this goal. Our experiments demonstrate that multi-aspect sentiment ranking works surprisingly well on speech output, even in the presence of recognition errors. We also present initial experiments on integrated sentence boundary detection and key phrase extraction from recognition output.
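
The following is a deliberately simple illustration (not the paper's system) of multi-aspect sentiment scoring over recognized text: polarity words are counted within sentences that mention an aspect keyword. The aspect and polarity lexicons are made up for the example.

    # Toy multi-aspect sentiment scoring over a recognized transcript.
    ASPECTS = {"food": {"food", "dish", "pizza"},
               "service": {"service", "waiter", "staff"}}
    POLARITY = {"great": 1, "delicious": 1, "friendly": 1,
                "slow": -1, "cold": -1, "rude": -1}

    def aspect_scores(transcript):
        scores = {aspect: 0 for aspect in ASPECTS}
        for sentence in transcript.lower().split("."):
            words = sentence.split()
            polarity = sum(POLARITY.get(w, 0) for w in words)
            for aspect, keywords in ASPECTS.items():
                if keywords & set(words):
                    scores[aspect] += polarity
        return scores

    print(aspect_scores("The pizza was delicious. The waiter was slow and rude."))
    # -> {'food': 1, 'service': -2}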


Content Modeling Using Latent Permutations

October 2009 · 63 Reads · 41 Citations

Journal of Artificial Intelligence Research

We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be effectively represented using a distribution over permutations called the Generalized Mallows Model. We apply our method to three complementary discourse-level tasks: cross-document alignment, document segmentation, and information ordering. Our experiments show that incorporating our permutation-based model in these applications yields substantial improvements in performance over previously proposed methods.
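
A small sketch of the Generalized Mallows Model's view of permutations, under the standard inversion-count parameterization rather than the paper's inference procedure: each position's displacement from the canonical topic order is sampled with probability proportional to exp(-rho_j * v_j), and the permutation is rebuilt by insertion. Small rho values allow more reordering; large values keep orderings close to the canonical one.

    # Sample an ordering from a Generalized Mallows-style model (sketch).
    import math
    import random

    def sample_inversions(n, rhos, rng=random):
        """Draw one inversion count per item j (v_j in 0..n-j-1)."""
        vs = []
        for j in range(n - 1):
            weights = [math.exp(-rhos[j] * v) for v in range(n - j)]
            total = sum(weights)
            r, cum = rng.random() * total, 0.0
            for v, w in enumerate(weights):
                cum += w
                if r <= cum:
                    vs.append(v)
                    break
        return vs

    def permutation_from_inversions(vs, n):
        """Insert items from largest to smallest at their inversion positions."""
        order = []
        for j in range(n - 1, -1, -1):
            v = vs[j] if j < len(vs) else 0   # the largest item has no choice
            order.insert(v, j)
        return order

    vs = sample_inversions(5, rhos=[2.0] * 4)
    print(permutation_from_inversions(vs, 5))   # e.g. [0, 1, 3, 2, 4]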


High compression rate text summarization

January 2009 · 78 Reads

Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008, by Satchuthananthavale Rasiah Kuhan Branavan. Includes bibliographical references (p. 95-97).

This thesis focuses on methods for condensing large documents into highly concise summaries, achieving compression rates on par with human writers. While the need for such summaries in the current age of information overload is increasing, the desired compression rate has thus far been beyond the reach of automatic summarization systems. The potency of our summarization methods is due to their in-depth modelling of document content in a probabilistic framework. We explore two types of document representation that capture orthogonal aspects of text content. The first represents the semantic properties mentioned in a document in a hierarchical Bayesian model. This method is used to summarize thousands of consumer reviews by identifying the product properties mentioned by multiple reviewers. The second representation captures discourse properties, modelling the connections between different segments of a document. This discriminatively trained model is employed to generate tables of contents for books and lecture transcripts. The summarization methods presented here have been incorporated into large-scale practical systems that help users effectively access information online.


Global Models of Document Structure using Latent Permutations.

January 2009 · 134 Reads · 64 Citations

We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be elegantly represented using a distribution over permutations called the generalized Mallows model. Our structure-aware approach substantially outperforms alternative approaches for cross-document comparison and single-document segmentation.


Learning Document-Level Semantic Properties from Free-Text Annotations.

January 2009 · 53 Reads · 87 Citations

Journal of Artificial Intelligence Research

This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations. Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as "a real bargain" or "good value." These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing. To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases. The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews. Our approach is implemented as a hierarchical Bayesian model with joint inference. We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties. Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.
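
As a rough illustration of the paraphrase-clustering step only (the paper uses a hierarchical Bayesian model with joint inference, not this heuristic), the sketch below greedily groups keyphrases by word-overlap similarity; the example phrases and the threshold value are arbitrary.

    # Greedy word-overlap clustering of noisy pros/cons keyphrases (sketch).
    def jaccard(a, b):
        a, b = set(a.split()), set(b.split())
        return len(a & b) / len(a | b)

    def cluster_keyphrases(phrases, threshold=0.2):
        clusters = []
        for phrase in phrases:
            best, best_sim = None, 0.0
            for cluster in clusters:
                sim = max(jaccard(phrase, member) for member in cluster)
                if sim > best_sim:
                    best, best_sim = cluster, sim
            if best is not None and best_sim >= threshold:
                best.append(phrase)
            else:
                clusters.append([phrase])
        return clusters

    print(cluster_keyphrases(["a real bargain", "real bargain price",
                              "good value", "great value for money"]))
    # -> [['a real bargain', 'real bargain price'],
    #     ['good value', 'great value for money']]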


Citations (12)


... For the non-NLP (multi-modal) tasks, most focused on environment-grounded language learning, i.e., driving the agent to associate natural language instructions with the environments and make corresponding reactions, such as selecting mentioned objects from an image/video (Matuszek et al. 2012; Krishnamurthy and Kollar 2013; Puig et al. 2018), following navigational instructions to move the agent (Tellex et al. 2011; Kim and Mooney 2012; Chen 2012; Artzi and Zettlemoyer 2013; Bisk, Yuret, and Marcu 2016), plotting corresponding traces on a map (Vogel and Jurafsky 2010; Chen and Mooney 2011), playing soccer/card games based on given rules (Kuhlmann et al. 2004; Eisenstein et al. 2009; Branavan, Silver, and Barzilay 2011; Babeş-Vroman et al. 2012; Goldwasser and Roth 2011), generating real-time sports broadcast (Chen and Mooney 2008; Liang, Jordan, and Klein 2009), controlling software (Branavan, Zettlemoyer, and Barzilay 2010), and querying external databases (Clarke et al. 2010), among others. Meanwhile, instructions are also widely adapted to help communicate with the system in solving NLP tasks, for example, following instructions to manipulate strings (Gaddy and Klein 2019), classifying e-mails based on the given explanations Mitchell 2017, 2018), and text-to-code generation (Acquaviva et al. 2022). ...

Reference:

Large Language Model Instruction Following: A Survey of Progresses and Challenges
Learning to Win by Reading Manuals in a Monte-Carlo Framework
  • Citing Conference Paper
  • January 2011

... Understanding action preconditions and effects in text is a crucial yet challenging task. Branavan et al. (2012) pioneer work in this area using reinforcement learning to extract high-level planning knowledge from text with the guidance of action preconditions and effects. Dalvi et al. (2018) develop a dataset and models for paragraph comprehension, and highlight the importance of tracking state changes in procedural text. ...

Learning High-Level Planning from Text
  • Citing Conference Paper
  • July 2012

... Another straightforward approach to using natural language for reward shaping can be seen in [26], which describes rules using natural language to decide whether to choose an action. [27][28] also perform language-to-reward mapping by designing rewards based on if the agent arrives at the right location or not by following the instruction provided. These approaches still face the problem of efficiently defining reward functions since it requires an expert programmer who can make decisions about how language instructions are mapped to the environment. ...

Learning to Win by Reading Manuals in a Monte-Carlo Framework
  • Citing Article
  • January 2014

Journal of Artificial Intelligence Research

... A few sentiment analysis systems take as input the sample speech of the user and perform prosodic processing of the speech signals. [16] takes the speech signal as input, converts it into text and then, only considers the text content for the affect analysis. The feature extraction is performed on the text review and adjective-noun pairs are identified as potential sentiment phrases. ...

Good grief, I can speak it! preliminary experiments in audio restaurant reviews

... One of the most well studied class of models on the space of rankings/permutations are the celebrated Mallows models, first introduced in [24]. Since then, Mallows models (and their variants) have received significant attention in statistics ( [9,10,13,14,15,25,28]), probability ( [3,6,12,17,19,29,30]), and machine learning ( [2,8,21,22,23,26,27]). In [28], the authors introduce a class of exponential family models on the space of permutations, which includes some of the commonly studied Mallows models. ...

Content Modeling Using Latent Permutations
  • Citing Article
  • October 2009

Journal of Artificial Intelligence Research

... Before the prevalence of Large Language Models, traditional autonomous agents were primarily implemented through reinforcement learning (Branavan et al., 2009; Shvo et al., 2021; Gur et al., 2022), semantic parsing (Li et al., 2020), and imitation learning (Humphreys et al., 2022) that clones humans' keyboard and mouse actions. The recent trend is to use Large Language Models to generate GUI instructions and actions. ...

Reinforcement Learning for Mapping Instructions to Actions
  • Citing Conference Paper
  • January 2009

... For the non-NLP (multi-modal) tasks, most focused on environment-grounded language learning, i.e., driving the agent to associate natural language instructions with the environments and make corresponding reactions, such as selecting mentioned objects from an image/video (Matuszek et al. 2012; Krishnamurthy and Kollar 2013; Puig et al. 2018), following navigational instructions to move the agent (Tellex et al. 2011; Kim and Mooney 2012; Chen 2012; Artzi and Zettlemoyer 2013; Bisk, Yuret, and Marcu 2016), plotting corresponding traces on a map (Vogel and Jurafsky 2010; Chen and Mooney 2011), playing soccer/card games based on given rules (Kuhlmann et al. 2004; Eisenstein et al. 2009; Branavan, Silver, and Barzilay 2011; Babeş-Vroman et al. 2012; Goldwasser and Roth 2011), generating real-time sports broadcast (Chen and Mooney 2008; Liang, Jordan, and Klein 2009), controlling software (Branavan, Zettlemoyer, and Barzilay 2010), and querying external databases (Clarke et al. 2010), among others. Meanwhile, instructions are also widely adapted to help communicate with the system in solving NLP tasks, for example, following instructions to manipulate strings (Gaddy and Klein 2019), classifying e-mails based on the given explanations Mitchell 2017, 2018), and text-to-code generation (Acquaviva et al. 2022). ...

Reading Between the Lines: Learning to Map High-level Instructions to Commands
  • Citing Conference Paper
  • December 2010

... The originality of the work stems from using the n-gram concept to model nodes in a tree in addition to modeling sequential entities. Branavan, Deshpande and Barzilay (2007) propose a method that automatically generates a table-of-contents structure for long documents such as books. They first segment a document hierarchically and then generate an informative title for each segment. ...

Generating a Table-of-Contents.

... But these models are not adequate when the domain under study has a multitude of aspects, some of which appear only very rarely in the corpus. Thus, some authors [6,10,12,18] have instead adopted methods based on topic modeling. ...

Learning Document-Level Semantic Properties from Free-Text Annotations
  • Citing Conference Paper
  • January 2008

Journal of Artificial Intelligence Research

... To train a classifier, a vast amount of labeled training data is needed. With the advent of segment-annotated datasets [2,3,4,5,6,7], deep neural network based methods started to emerge. A popular approach is to use a hierarchical structure, where the lower layer projects sentences into an embedding space, and the upper layer then classifies whether a sentence marks a topic change or not [5,8]. ...

Global Models of Document Structure using Latent Permutations.
  • Citing Conference Paper
  • January 2009