
Yael Ravin
IBM · Thomas J. Watson Research Center
27 Publications
5,452 Reads
755 Citations
Publications (27)
An embodiment of the invention provides a method for security classification applying social norming. More specifically, content is received from a user via an interface; and, a data repository connected to the interface stores the content. A portal connected to the data repository identifies an attempt to access the content from a non-user. A prog...
An embodiment of the invention includes a method and system for content management. More specifically, the system includes a user interface for receiving content from a user and a data repository connected to the user interface for storing the content. The user interface also receives a request to access the content from the user. A program process...
We describe a hybrid approach to improving search performance by providing a natural language front end to a traditional keyword-based search engine. The key component of the system is iterative query formulation and retrieval, in which one or more queries are automatically formulated from the user's question, issued to the search engine, and the r...
This paper first describes the fitting process and gives examples of ill-formed language situations where it is called into play. We then show how a fitted parse allows EPISTLE to carry on its text-critiquing mission where conventional grammars would fail either because of input problems or because of limitations in the grammars themselves. Some in...
To achieve our goal of building a comprehensive lexical database out of various on-line resources, it is necessary to interpret and disambiguate the information found in these resources. In this paper we describe a Disambiguation Module which analyzes the content of dictionary definitions, in particular, definitions of the form "to VERB with We d...
Identifying the occurrences of proper names in text and the entities they refer to can be a difficult task because of the many-to-many mapping between names and their referents. We analyze the types of ambiguity -- structural and semantic -- that make the discovery of proper names difficult in text, and describe the heuristics used to disambigua...
This paper describes an exploration of the implicit synonymy relationship expressed by synonym lists in an on-line thesaurus. A series of automatic steps was taken to properly constrain this relationship. The resulting groupings of semantically related word senses are believed to constitute a useful tool for natural language processing and for work...
This paper describes a toolkit for building multiagent autonomic systems. The IBM Agent Building and Learning Environment (ABLE) provides a lightweight Java™ agent framework, a comprehensive JavaBeans™ library of intelligent software components, ...
A brief discussion about machine intelligence and the Turing test is presented. Natural language understanding (NLU) and machine reasoning (MR) were used for the Turing test. The simulation of human capability to create a store of prior knowledge and a representation of the meaning of the current text was also discussed.
versions of conference papers, as is often the case for edited works, but seem to have been specially commissioned for the purposes of this book, which makes it even more exciting to examine. The book is composed of 11 chapters. It is not formally divided into parts, but chapters dealing more specifically with the computational aspects of polysemy...
A fundamental aspect of knowledge management is capturing knowledge and expertise created by knowledge workers as they go about their work and making it available to a larger community of colleagues. Technology can support these goals, and knowledge portals have emerged as a key tool for supporting knowledge work. Knowledge portals are single-point...
A number of research and software development groups have developed name identification technology, but few have addressed the issue of cross-document coreference, or identifying the same named entities across documents. In a collection of documents, where there are multiple discourse contexts, there exists a many-to-many correspondence between nam...
se structures can be captured as sequences of this form: ((A|N)+|((A|N)*(NP)?)(A|N)*)N where A is an adjective, N is a noun, and P is a preposition [1]. The name extractor [5] considers every sequence of capitalized words (with some exceptions) as a potential name. Names are grouped in sets, associated with a single referent, of a given type, such...
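The part-of-speech pattern quoted in the entry above can be tried out directly as a regular expression over a string of tags. The following is a minimal sketch, not the authors' implementation: it assumes the text has already been tagged by some other tool, that each word carries one of the single-letter tags A (adjective), N (noun), P (preposition) or O (anything else, a hypothetical catch-all), and the function name find_candidate_terms is made up for illustration.

```python
import re

# Pattern as quoted in the abstract: A = adjective, N = noun, P = preposition.
TERM_PATTERN = re.compile(r"((A|N)+|((A|N)*(NP)?)(A|N)*)N")


def find_candidate_terms(tagged_words):
    """Return word sequences whose tag sequence matches TERM_PATTERN.

    tagged_words is a list of (word, tag) pairs; each tag is assumed to be
    a single character, so string offsets equal word offsets.
    """
    tags = "".join(tag for _, tag in tagged_words)
    terms = []
    for match in TERM_PATTERN.finditer(tags):
        words = [word for word, _ in tagged_words[match.start():match.end()]]
        terms.append(" ".join(words))
    return terms


if __name__ == "__main__":
    sentence = [
        ("the", "O"), ("degrees", "N"), ("of", "P"), ("freedom", "N"),
        ("is", "O"), ("large", "A"),
    ]
    print(find_candidate_terms(sentence))
```

As written, the pattern also accepts a single noun, so a practical filter would probably add a minimum length or frequency threshold; that choice is not stated in the excerpt.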
We describe Nominator, a module we developed to extract proper names from natural language text, which is currently being integrated into IBM products and services. Using fast and robust heuristics, Nominator locates names in text, determines what type of entity they refer to -- such as person, place or organization -- and groups together all the variant...
A number of research and software development groups have developed technology for identifying terms and names in documents and associating them with concepts and named entries, but few have addressed coreference of concepts and entities across multiple documents in a collection. Cross-document coreference is challenging, since a collection of docu...
A number of research and software development groups have developed name identification technology, but few have addressed the issue of cross-document coreference, or identifying the same named entities across documents. In a collection of documents, where there are multiple discourse contexts, there exists a many-to-many correspondence between nam...
We describe Nominator, a module we developed to extract proper names from natural language text, which is currently being integrated into IBM products and services. Using fast and robust heuristics, Nominator locates names in text, determines what type of entity they refer to -- such as person, place or organization -- and groups together all the...
Probabilistic ranking uses a unique feature called "Lexical Affinity" (LA). LA between two terms is a correlation measure of their common occurrences in text, as defined by Maareck (1991). The occurrences of correlated pairs of words in a document are ranked higher than the occurrences of the individual words over greater distances. The analyzed que...
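The idea of rewarding word pairs that occur close together can be illustrated with a simple co-occurrence count. The sketch below is only an approximation under stated assumptions: it counts raw co-occurrences inside a fixed window rather than computing the correlation measure the entry attributes to the 1991 definition, and the window size of 5 and the function name lexical_affinities are hypothetical choices, not taken from the excerpt.

```python
from collections import Counter


def lexical_affinities(tokens, window=5):
    """Count unordered word pairs that co-occur within `window` words.

    This is a raw co-occurrence count, used here as a stand-in for a
    proper correlation measure. The window size is an assumed parameter.
    """
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Look only at the next `window` tokens to the right of position i.
        for right in tokens[i + 1:i + 1 + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


if __name__ == "__main__":
    text = "name extraction finds name variants".split()
    for pair, count in lexical_affinities(text, window=2).most_common():
        print(pair, count)
```

A ranking function could then score a document higher when the pairs found in the query also appear as nearby pairs in the document, rather than as isolated words far apart.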
Processing syntactically ill-formed language is an important mission of a text-critiquing system. This chapter discusses how ill-formed input is treated by Epistle, the forerunner of Critique. Misspellings are highlighted by a standard spelling checker; syntactic errors are detected and corrections are suggested; and stylistic infelicities are call...
Introduction: The relationship between syntax and semantics; A restrictive versus a non-restrictive approach; Fillmore's case theory; Chomsky's theory of government and binding; Jackendoff's semantic theory; The MLP correction theory; A theory of semantic decomposition (I); A theory of semantic decomposition (II); An analysis of some event concepts (I); An...
Grammar errors and style weaknesses identified by CRITIQUE, a text processing system developed at the IBM T.J. Watson Research Center, are discussed. Linguistic criteria for distinguishing between grammar and style are drawn first. These criteria are reflected in the messages issued by CRITIQUE to the user. Then, a computational criterion for disti...
This paper describes an exploration of the implicit synonymy relationship expressed by synonym lists in an on-line thesaurus. A series of automatic steps was taken to properly constrain this relationship. The resulting groupings of semantically related word senses are believed to constitute a useful tool for natural language processing and for work...