Janara Christensen’s research while affiliated with University of Washington and other places


Publications (7)


Hierarchical Summarization: Scaling Up Multi-Document Summarization
  • Conference Paper

June 2014 · 132 Reads · 55 Citations

Janara Christensen · Stephen Soderland · Gagan Bansal · [...]


Teaching Classification Boundaries to Humans

June 2013 · 38 Reads · 59 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Given a classification task, what is the best way to teach the resulting boundary to a human? While machine learning techniques can provide excellent methods for finding the boundary, including the selection of examples in an online setting, they tell us little about how we would teach a human the same task. We propose to investigate the problem of example selection and presentation in the context of teaching humans, and explore a variety of mechanisms in the interests of finding what may work best. In particular, we begin with the baseline of random presentation and then examine combinations of several mechanisms: the indication of an example's relative difficulty, the use of the shaping heuristic from the cognitive science literature (moving from easier examples to harder ones), and a novel kernel-based "coverage model" of the subject's mastery of the task. From our experiments on 54 human subjects learning and performing a pair of synthetic classification tasks via our teaching system, we found that we can achieve the greatest gains with a combination of shaping and the coverage model. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
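The winning combination reported in the abstract, shaping (easy-to-hard ordering) plus a kernel-based coverage model, can be illustrated with a short sketch. This is not the authors' implementation: the RBF kernel, the two-level difficulty labels, and the "least-covered example at the easiest remaining level" selection rule are all assumptions made here for illustration.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) similarity between two feature vectors."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * bandwidth ** 2))

def coverage(candidate, shown, bandwidth=1.0):
    """Kernel 'coverage' of a candidate by the examples already shown:
    high when the candidate is close to something the subject has seen."""
    if not shown:
        return 0.0
    return max(rbf_kernel(candidate, s, bandwidth) for s in shown)

def next_example(X, pool, shown, difficulty):
    """Shaping + coverage: restrict to the easiest remaining examples,
    then pick the one least covered by the teaching history so far."""
    easiest = min(difficulty[i] for i in pool)
    frontier = [i for i in pool if difficulty[i] == easiest]
    return min(frontier, key=lambda i: coverage(X[i], shown))

# Toy run: 2-D points with hand-assigned difficulty 0 (easy) or 1 (hard).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
difficulty = {i: int(abs(X[i, 0]) > 1.0) for i in range(len(X))}

pool, shown = set(range(len(X))), []
for _ in range(5):
    i = next_example(X, pool, shown, difficulty)
    print(f"teach example {i} (difficulty {difficulty[i]})")
    shown.append(X[i])
    pool.remove(i)
```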


An Analysis of Open Information Extraction based on Semantic Role Labeling
  • Conference Paper
  • Full-text available

[Figure 3 caption: LargeCorpus — F-measure achieved in a given amount of computation time; the hybrid extractor obtains the best F-measure for binary extractions. Table notes: in SmallCorpus, SRL-IE-Lund has the highest precision; taking the union of the SRL systems and the higher-precision results from TextRunner achieves the highest recall and F-measure; both SRL-based systems require over an order of magnitude more processing time. Bold values indicate the highest value for each metric and relation type.]

June 2011 · 261 Reads · 193 Citations

Open Information Extraction extracts relations from text without requiring a pre-specified domain or vocabulary. While existing techniques have used only shallow syntactic features, we investigate the use of semantic role labeling techniques for the task of Open IE. Semantic role labeling (SRL) and Open IE, although developed mostly in isolation, are quite related. We compare SRL-based open extractors, which perform computationally expensive, deep syntactic analysis, with TextRunner, an open extractor which uses shallow syntactic analysis but is able to analyze many more sentences in a fixed amount of time and thus exploit corpus-level statistics. Our evaluation answers several questions about these systems: can SRL extractors, which are trained on PropBank, cope with the heterogeneous text found on the Web? Which extractor attains better precision, recall, F-measure, or running time? How does extractor performance vary for binary, n-ary, and nested relations? How much do we gain by running multiple extractors? How do we select the optimal extractor given the amount of data, available time, and types of extractions desired?
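One of the questions above, choosing an extractor given corpus size and a time budget, can be made concrete with a small sketch. The per-sentence costs and quality scores below are illustrative placeholders, not measurements from the paper.

```python
# Illustrative per-sentence cost (seconds) and assumed F-measure when an
# extractor can process the whole corpus; these numbers are placeholders.
EXTRACTORS = {
    "textrunner": (0.01, 0.55),  # shallow syntax, fast
    "srl-ie":     (0.50, 0.70),  # deep SRL analysis, slow
}

def pick_extractor(num_sentences, time_budget_s):
    """Prefer the highest-quality extractor that can finish the corpus within
    the budget; otherwise fall back to the fastest one."""
    feasible = [
        (quality, name)
        for name, (cost, quality) in EXTRACTORS.items()
        if cost * num_sentences <= time_budget_s
    ]
    if feasible:
        return max(feasible)[1]
    return min(EXTRACTORS, key=lambda n: EXTRACTORS[n][0])

print(pick_extractor(num_sentences=100, time_budget_s=60))        # deep SRL fits
print(pick_extractor(num_sentences=1_000_000, time_budget_s=60))  # fall back to shallow
```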


Open Information Extraction: The Second Generation.

January 2011 · 312 Reads · 437 Citations

Anthony Fader · Janara Christensen · [...]

How do we scale information extraction to the massive size and unprecedented heterogeneity of the Web corpus? Beginning in 2003, our KnowItAll project has sought to extract high-quality knowledge from the Web. In 2007, we introduced the Open Information Extraction (Open IE) paradigm which eschews hand-labeled training examples, and avoids domain-specific verbs and nouns, to develop unlexicalized, domain-independent extractors that scale to the Web corpus. Open IE systems have extracted billions of assertions as the basis for both common-sense knowledge and novel question-answering systems. This paper describes the second generation of Open IE systems, which rely on a novel model of how relations and their arguments are expressed in English sentences to double precision/recall compared with previous systems such as TEXTRUNNER and WOE.
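The "model of how relations and their arguments are expressed in English sentences" mentioned above corresponds, in second-generation systems such as ReVerb, to constraining relation phrases to verb-centred part-of-speech patterns. The sketch below is a simplified assumption about what such a constraint looks like; the tag classes and the regular expression are approximations for illustration, not the system's actual implementation.

```python
import re

# Simplified POS classes; a real system would work over full Penn Treebank tags.
VERB = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
PREP = {"IN", "TO", "RP"}
WORD = {"NN", "NNS", "JJ", "RB", "PRP", "DT"}

def tags_to_symbols(pos_tags):
    """Map each POS tag to V (verb), P (preposition/particle), W (other word)."""
    out = []
    for t in pos_tags:
        if t in VERB:
            out.append("V")
        elif t in PREP:
            out.append("P")
        elif t in WORD:
            out.append("W")
        else:
            out.append("O")
    return "".join(out)

# Relation phrase = verb | verb + prep | verb + words + prep (simplified).
RELATION_PATTERN = re.compile(r"^V(P|W*P)?$")

def is_relation_phrase(pos_tags):
    """Check whether a candidate phrase's POS sequence looks like a
    verb-centred relation phrase in the ReVerb style."""
    return bool(RELATION_PATTERN.match(tags_to_symbols(pos_tags)))

print(is_relation_phrase(["VBD"]))              # "invented"      -> True
print(is_relation_phrase(["VBZ", "NN", "IN"]))  # "is author of"  -> True
print(is_relation_phrase(["NN", "IN"]))         # "author of"     -> False
```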


Semantic role labeling for open information extraction

January 2010 · 90 Reads · 60 Citations

Open Information Extraction is a recent paradigm for machine reading from arbitrary text. In contrast to existing techniques, which have used only shallow syntactic features, we investigate the use of semantic features (semantic roles) for the task of Open IE. We compare TextRunner (Banko et al., 2007), a state-of-the-art open extractor, with our novel extractor SRL-IE, which is based on UIUC's SRL system (Punyakanok et al., 2008). We find that SRL-IE is robust to noisy, heterogeneous Web data and outperforms TextRunner on extraction quality. On the other hand, TextRunner is over two orders of magnitude faster and achieves good precision in high-locality and high-redundancy extractions. These observations enable the construction of hybrid extractors that output higher-quality results than TextRunner and similar quality to SRL-IE in much less time.
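A hybrid of the kind described, trusting the fast extractor where its output is local and redundant and spending SRL time only on the remaining sentences, might look like the sketch below. The `fast_extract` and `srl_extract` callables stand in for TextRunner and SRL-IE, and the span and count thresholds are assumptions, not the paper's settings.

```python
from collections import Counter

def hybrid_extract(sentences, fast_extract, srl_extract,
                   max_span=8, min_count=2):
    """Keep the fast extractor's triple when it is 'local' (the extraction
    covers a short token span) and 'redundant' (the same triple occurs
    several times in the corpus); route everything else to the slower
    SRL-based extractor."""
    fast_results = [(sent, fast_extract(sent)) for sent in sentences]
    triple_counts = Counter(triple for _, (triple, _) in fast_results if triple)

    extractions, needs_srl = [], []
    for sent, (triple, span_width) in fast_results:
        if triple and span_width <= max_span and triple_counts[triple] >= min_count:
            extractions.append(triple)        # trusted shallow extraction
        else:
            needs_srl.append(sent)            # spend deep-analysis time here

    for sent in needs_srl:
        extractions.extend(srl_extract(sent))
    return extractions

# Toy usage with stand-in extractor functions.
fast = lambda s: (("Edison", "invented", "the phonograph"), 4) if "invented" in s else (None, 0)
srl = lambda s: [("Edison", "was born in", "Ohio")] if "born" in s else []
print(hybrid_extract(["Edison invented the phonograph.",
                      "Edison invented the phonograph.",
                      "Edison was born in Ohio."], fast, srl))
```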


A rose is a roos is a ruusu

January 2009 · 12 Reads

We query Web image search engines with words (e.g., spring) but need images that correspond to particular senses of the word (e.g., flexible coil). Querying with polysemous words often yields unsatisfactory results from engines such as Google Images. We build an image search engine, Idiom, which improves the quality of returned images by focusing search on the desired sense. Our algorithm, instead of searching for the original query, searches for multiple, automatically chosen translations of the sense in several languages. Experimental results show that Idiom outperforms Google Images and other competing algorithms, returning 22% more relevant images.


A Rose is a Roos is a Ruusu: Querying Translations for Web Image Search.

January 2009 · 23 Reads · 5 Citations

We query Web image search engines with words (e.g., spring) but need images that correspond to particular senses of the word (e.g., flexible coil). Querying with polysemous words often yields unsatisfactory results from engines such as Google Images. We build an image search engine, IDIOM, which improves the quality of returned images by focusing search on the desired sense. Our algorithm, instead of searching for the original query, searches for multiple, automatically chosen translations of the sense in several languages. Experimental results show that IDIOM outperforms Google Images and other competing algorithms, returning 22% more relevant images.
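The core idea, querying with several translations of the intended sense and preferring images that multiple translated queries return, can be sketched as follows. Here `image_search` is a hypothetical stand-in for a real image-search API, the example translations are illustrative, and the simple vote-count ranking is an assumption rather than the paper's exact method.

```python
from collections import Counter

def idiom_style_search(sense_translations, image_search, top_k=20):
    """Query the engine with each translation of the intended sense and rank
    images by how many of the translated queries returned them.
    `image_search(query) -> list of URLs` is a hypothetical placeholder."""
    votes = Counter()
    for query in sense_translations:
        for url in image_search(query):
            votes[url] += 1
    return [url for url, _ in votes.most_common(top_k)]

# Toy usage: the spring-as-coil sense queried via illustrative translations.
translations = ["spring coil", "ressort", "Feder", "muelle", "jousi"]
fake_index = {
    "spring coil": ["img/coil1.jpg", "img/coil2.jpg"],
    "ressort":     ["img/coil1.jpg", "img/season1.jpg"],
    "Feder":       ["img/coil2.jpg", "img/feather1.jpg"],
    "muelle":      ["img/coil1.jpg", "img/pier1.jpg"],
    "jousi":       ["img/coil2.jpg", "img/bow1.jpg"],
}
print(idiom_style_search(translations, lambda q: fake_index.get(q, [])))
```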

Citations (6)


... As for many other concepts in machine learning (ML), curriculum learning (CL) owes parts of its appeal to its relatable inspiration from human learning: in the same way that children are first taught clear and distinguishable concepts before more difficult or nuanced ones [1], [2], ML should also benefit from a curriculum structure, in which models are first confronted with easy examples before difficulty is steadily increased. In the broader optimisation background of deep neural networks (DNNs), this idea could be conceptualised as easier samples providing a smoother loss landscape, allowing models to quickly reach favourable parameter regions and finally converge therein [3], [4]. ...

Reference:

Does the Definition of Difficulty Matter? Scoring Functions and their Role for Curriculum Learning
Teaching Classification Boundaries to Humans
  • Citing Article
  • June 2013

Proceedings of the AAAI Conference on Artificial Intelligence

... Other works present automatically derived, hierarchically ordered summaries allowing users to drill down from a general overview to detailed information [36,37]. Therefore, these systems are neither interactive nor do they consider the user's feedback to update their internal summarization models. ...

Hierarchical Summarization: Scaling Up Multi-Document Summarization
  • Citing Conference Paper
  • June 2014

... The tuple includes a relational phrase and a pair (or more) of argument phrases, which are semantically connected by the relational phrase. For the collection of extraction patterns, some studies use hand-crafted rules, 3,16,17 whereas others learn from automatically labeled training datasets. 1,2,4 Additionally, a number of studies improved the accuracy of Open IE by transforming complex sentences that include several clauses into a collection of simplified independent clauses. ...

Semantic role labeling for open information extraction
  • Citing Article
  • January 2010

... It utilizes a set of patterns in order to obtain propositions but does not capture the 'context' of each clause for effective extraction. A follow-up study relies on semantic features (semantic roles) for the OIE task, demonstrating that semantic role labeling (SRL) can be used to increase the precision and recall of OIE [8]. Separately, a greedy parser, which relies on a classifier to predict the correct transition based on a small number of dense features, is used for fast parsing [6]. ...

An Analysis of Open Information Extraction based on Semantic Role Labeling

... We follow [89] and construct a directed graph G = (V, E) for an input word sequence σ = (w_1, ..., w_n) containing a set of words to be disambiguated, based on the lexical and semantic relations found in a given knowledge resource KB, i.e., WordNet or BabelNet in our case. In order to build the graph, we follow the same procedure used to create graphs for estimating mapping probabilities (Section 3.1.3). ...

A Rose is a Roos is a Ruusu: Querying Translations for Web Image Search.
  • Citing Conference Paper
  • January 2009

... Even the largest VQA datasets cannot contain all real-world concepts, so VQA models should know how to acquire knowledge from external knowledge bases (EKBs). Examples of such EKBs are large-scale KBs constructed by human annotation, e.g., DBpedia [11], Freebase [16], Wikidata [127], and KBs built by automatic extraction from unstructured/semi-structured data, e.g., YAGO [48], [80], OpenIE [12], [34], [35], NELL [22], NEIL [25], WebChild [118], ConceptNet [76]. ...

Open Information Extraction: The Second Generation.
  • Citing Conference Paper
  • January 2011