Towards a machine-learning architecture for lexical functional grammar parsing

Chrupała, Grzegorz (2008) Towards a machine-learning architecture for lexical functional grammar parsing. PhD thesis, Dublin City University DOI:550
Source: OAI

ABSTRACT Data-driven grammar induction aims at producing wide-coverage grammars of human languages. Initial efforts in this field produced relatively shallow linguistic representations such as phrase-structure trees, which only encode constituent structure. Recent work on inducing deep grammars from treebanks addresses this shortcoming by also recovering non-local dependencies and grammatical relations. My aim is to investigate the issues arising when adapting an existing Lexical Functional Grammar (LFG) induction method to a new language and treebank, and to find solutions which will generalize robustly across multiple languages.
The research hypothesis is that by exploiting machine-learning algorithms to learn morphological features, lemmatization classes and grammatical functions from treebanks, we can reduce the amount of manual specification and improve the robustness, accuracy, and domain- and language-independence of LFG parsing systems. Function labels can often be mapped relatively straightforwardly to LFG grammatical functions. Learning them reliably permits grammar induction to depend less on language-specific LFG annotation rules. I therefore propose ways to improve the acquisition of function labels from treebanks and translate those improvements into better-quality f-structure parsing.
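To make the idea of mapping function labels to grammatical functions concrete, here is a minimal sketch. It assumes Penn-Treebank-style node labels such as NP-SBJ; the mapping table, the function names and the choice of tags are illustrative assumptions and are not taken from the thesis, which does not specify its mapping in this abstract.

```python
# Hypothetical table: treebank function tag -> LFG grammatical function.
# The entries are illustrative assumptions, not the thesis's actual mapping.
FUNCTION_TAG_TO_GF = {
    "SBJ": "SUBJ",     # subject
    "TMP": "ADJUNCT",  # temporal modifier
    "LOC": "ADJUNCT",  # locative modifier
    "MNR": "ADJUNCT",  # manner modifier
}


def split_label(label):
    """Split a node label like 'NP-SBJ-1' into ('NP', ['SBJ', '1'])."""
    category, *tags = label.split("-")
    return category, tags


def grammatical_function(label):
    """Return the first mapped grammatical function, or None if no tag is mapped."""
    _, tags = split_label(label)
    for tag in tags:
        if tag in FUNCTION_TAG_TO_GF:
            return FUNCTION_TAG_TO_GF[tag]
    return None


print(grammatical_function("NP-SBJ"))  # SUBJ
print(grammatical_function("PP-LOC"))  # ADJUNCT
print(grammatical_function("NP"))      # None
```

A lookup like this only works once the function labels themselves are present on the tree nodes, which is why learning them reliably matters for the approach described above.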
In a lexicalized grammatical formalism such as LFG, a large amount of syntactically relevant information comes from lexical entries. It is therefore important to be able to perform morphological analysis in an accurate and robust way for morphologically rich languages. I propose a fully data-driven supervised method to simultaneously lemmatize and morphologically analyze text and obtain competitive or improved results on a range of typologically diverse languages.
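As an illustration of how lemmatization can be treated as a classification problem, the sketch below derives a "lemmatization class" from a (word form, lemma) pair as a simple suffix-rewrite rule. This encoding is an assumption chosen for brevity; the abstract does not say how the thesis defines its classes, and the names lemma_class and apply_class are hypothetical.

```python
from os.path import commonprefix  # plain string prefix; works on any strings


def lemma_class(form, lemma):
    """Encode the form -> lemma change as (characters to strip, suffix to add)."""
    prefix_len = len(commonprefix([form, lemma]))
    return (len(form) - prefix_len, lemma[prefix_len:])


def apply_class(form, cls):
    """Apply a (strip, suffix) class to a word form to produce a lemma."""
    strip, suffix = cls
    # Slice by explicit length so that strip == 0 keeps the whole form.
    return form[:len(form) - strip] + suffix


cls = lemma_class("cantamos", "cantar")
print(cls)                           # (3, 'r')
print(apply_class("hablamos", cls))  # hablar
```

Because many word forms share the same rewrite rule, a classifier trained to predict the class for a form in context could, under this assumption, lemmatize forms it has never seen paired with their lemma.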

Keywords

better-quality f-structure parsing
Data-driven grammar induction
encode constituent structure
existing Lexical Functional Grammar
Initial efforts
language-specific LFG annotation rules
lexicalized grammatical formalism
LFG grammatical functions
manual specification
morphological analysis
morphological features
morphologically rich languages
new language
phrase-structure trees
Recent work
research hypothesis
robust way
shallow linguistic representations
typologically diverse languages
wide-coverage grammars