The Automatic Identification of Conceptual Metaphors in Hungarian Texts: A
Corpus-Based Analysis
Anna Babarczy¹, Ildikó Bencze M.¹, István Fekete¹, Eszter Simon¹,²
¹ Budapest University of Technology and Economics
Department of Cognitive Science
H-1111 Budapest, Stoczek u. 2.
E-mail: {babarczy, ibencze, ifekete, esimon}@cogsci.bme.hu
² Research Institute for Linguistics, Hungarian Academy of Sciences
H-1068 Budapest, Benczúr u. 33.
E-mail: eszter@nytud.hu
Abstract
The present study is a corpus-based analysis of literal versus metaphorical language use. Previous corpus linguistic works have
focused on the linguistic characteristics of the metaphorical expressions. The main question of the present paper is whether the
automatic identification of certain conceptual metaphors could be successful taking the embodiment hypothesis as a starting point. 12
widespread conceptual metaphors were selected from Lakoff & Johnson (1980)
and the metaphor index in Kövecses (2002), where
consistent mapping was observed between a concrete (source) domain and an abstract (target) domain. According to our hypothesis, a
metaphoric sentence should include both source-domain and target-domain expressions. This assumption was tested relying on three
different methods of selecting target-domain and source-domain expressions: a psycholinguistic word association method, a dictionary
method and a corpus-based method. The results show that for the automatic identification of metaphorical expressions, the
corpus-based method is the most effective strategy, which suggests that the concept of source and target domains is best characterized by
statistical patterns rather than by psycholinguistic factors.
Keywords: embodiment hypothesis, conceptual metaphors, association, corpus-based, automatic identification
1. The Theory of Metaphor
1.1 The Cognitive Theory of Metaphor
In everyday language use the term metaphor is held to be
a figure of speech which refers to an analogy between two
entities or concepts (e.g., Achilles was a lion). In cognitive
linguistics, in contrast, metaphor is first of all a conceptual
process, thus metaphorical relations are taken to be
conceptual mappings, which characterize not only our
language use but also our everyday life, thought and
behavior (Lakoff & Johnson, 1980). According to the
cognitive linguistic view, conceptual metaphors refer to
the understanding of an abstract concept, also called the
target domain, in terms of a concrete concept of which we
can have direct sensory experience, namely the source
domain. This underlying association between the two
domains is held to be systematic in both language and
thought.
The hypothesis that the representation of abstract concepts
in the mind/brain is grounded in the representation of
concrete knowledge, which in turn is grounded in our
bodily experience of the world, is the main statement of
the embodiment theory in cognitive linguistics (Gibbs,
2006; Kövecses, 2002; Lakoff & Johnson, 1980, 1999).
For example, people universally think and talk about the
abstract concept of “time” with the help of “space”, the
terms of which are acquired through our interaction with
the environment (before, after, under, in etc.).
Consequently, we can argue that the concept of “time” is
structured by the concept of “space”, which means that
there is a TIME IS SPACE conceptual metaphor in our mind.
This hypothesis is supported by psycholinguistic
experiments: it has been shown, for instance, that sensory-
motor experiences influence the interpretation of
metaphorical expressions about “time” (Boroditsky &
Ramscar, 2002), which means that when understanding
metaphors people mentally simulate physical motion,
i.e. they imagine the actions or events
described by metaphorical expressions (Gibbs & Matlock,
2008). However, other experiments did not find evidence
for the necessity of conceptual metaphoric mappings in
comprehension of metaphorical expressions (Keysar et al.,
2000; Szamarasz, 2006). Whether abstract concepts are
independent of concrete concepts in natural language use
thus remains an open question.
1.2 The Statistical Learning Theory
Another approach to the nature of abstract
knowledge is the statistical learning theory, which
argues that people acquire and structure their abstract
concepts with the help of the statistical properties of
language (Burgess & Lund, 1997; Landauer & Dumais,
1997). This means that novel linguistic symbols are
directly abstracted from known symbols without the
interference of metaphorical processes or embodied
schemes.
The two theoretical approaches do not necessarily exclude
one another since it is conceivable that our abstract
knowledge exploits both sources mentioned above.
According to this integrative point of view (Andrews et
al., 2005, 2007), both the attributional and the distributional
properties of words play an important role in symbol
grounding. Attributional properties are non-linguistic
physical attributes associated with a word, while
distributional properties concern the co-occurrence of a
word with other linguistic elements.
Based on our discussion so far, the present paper
investigates whether the automatic extraction of
conceptual metaphors from large corpora can succeed when
the embodiment hypothesis is taken as a starting point, and
which strategy is more effective: the psycholinguistic word
association method or the corpus-linguistic method based on
statistical patterns.
2. Metaphor and Corpus Linguistics
2.1 Corpus-Based Research on Metaphor
Corpus-based studies of metaphorical language use have
already pointed out the inadequacy of the cognitive theory
and also the shortcomings of psycholinguistic experiments.
These critics claim that theoretical and experimental
research neglects the linguistic attributes of metaphorical
expressions and relies on invented examples rather than
natural data, which might be misleading in some cases. For
example, Deignan (2008) demonstrates that according to
corpus-linguistic results the conceptual metaphor AN
ANGRY GROUP OF PEOPLE IS A WILDFIRE is more likely to
occur than the metaphor ANGER IS THE PRESSURE OF
HEATED FLUID IN A CONTAINER, even though it is the latter
that is ubiquitously listed in works in cognitive theory.
Observed metaphorical patterns (Stefanowitsch, 2006)
and collocations (Deignan, 2005, 2008) also have
characteristic grammatical features. Similarly, Deignan
(2005) demonstrates that in metaphoric usage words
have less grammatical freedom than in their literal
occurrences. For example, words belonging to the source
domain of a metaphorical mapping tend to denote
actions and properties, and thus they occur mainly as
verbs and adjectives. These results show that the logical
relations between concrete entities are not simply
mirrored in abstract language use but undergo some kind
of change. This fact supports the so-called blending theory
(Fauconnier & Turner, 2002), which contends that during
metaphoric language use people create a mixed or blended
domain that has a proper structure and relations, and thus
proper linguistic features.
Taking all the evidence into account, it is clear that the
conceptual theory of metaphor alone is not able to explain
all the phenomena found in texts.
2.2 Methodological Problems in Automatic
Conceptual Metaphor Identification
The default method of metaphor annotation is manual
processing: based on their linguistic intuitions, researchers
mark expressions that they perceive as metaphorical in a
given corpus. Since this method is very labor-intensive
and time-consuming, it is worth experimenting with at
least partly automated techniques, such as searching a
corpus for expressions belonging to the source domain
(e.g., Deignan, 2008) or to the target domain
(Stefanowitsch, 2006) and manually checking the
extracted sentences for metaphoricity. Finally, it is also
possible to search the corpus for sentences containing
characteristic words from both the source and the target
domains of a given conceptual metaphor (e.g., Martin,
2006). The disadvantage of this method is that in this way
we can test only predetermined metaphorical mappings,
and, in contrast to the technique used by Stefanowitsch
(2006), the recovery of novel metaphors is precluded.
However, it has the advantage of a higher level of
automation in the annotation process allowing the
processing of larger corpora. It is this latter strategy that
our study attempts to enhance.
The first step of any of the above three (semi-)automated
methods is that expressions that are likely to characterize
either the source domain or the target domain of a given
metaphor type need to be collected. However, the
identification of the linguistic cues that may characterize a
particular domain is not a straightforward question. A
problem facing automatic metaphor annotation is that, in
general, the domains of conceptual mappings discussed in
the cognitive literature are associated with concepts rather
than specific linguistic forms. Our paper undertakes to
address this issue by testing three different methods of
compiling word lists characterizing the source versus the
target domains of a set of conceptual metaphors. The first
two methods rely on experimental psycholinguistic
evidence and on lexicographic data, while the third
approach is based on the manual analysis of a reference
corpus. In addition to the practical import of the results for
corpus analysis, the experiments also shed light on the
language theoretical issue discussed in Section 1. If either
of the first two methods proves to be more successful, we
have some support for the embodiment hypothesis. If,
however, the third method leads to the best results, the
statistical approach to metaphor proves to be more
plausible.
3. The Study: Automatic Identification of
Metaphors
The main question addressed by the present study is,
therefore, whether the automatic identification of certain
conceptual metaphors is feasible taking the concept of
source-to-target domain mapping as a starting point.
The experiment involved the following phases:
• A set of conceptual metaphors was selected from
the cognitive linguistic literature.
• A corpus was compiled using a variety of text
types.
• Word lists characterizing the source and the target
domains of the selected conceptual metaphors
were compiled using three different methods.
This resulted in three separate sets of source-
target word lists.
• Sentences containing at least one source-domain
word and at least one corresponding target-
domain word were automatically extracted from
the corpus. The three sets of word lists were used
in separate runs.
• The results were manually checked for precision
and recall.
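The extraction phase listed above can be illustrated with a minimal sketch. It assumes that every sentence has already been reduced to a list of lemmas and that each conceptual metaphor has one source-domain and one target-domain word list. The names `PROFILES` and `tag_sentence` are hypothetical, the two toy word lists reuse Hungarian word pairs mentioned in the Conclusions, and assigning gerjeszt/harag to ANGER IS HEAT is an assumption made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical lexical profiles: metaphor -> (source-domain words, target-domain words).
# Real lists would contain many more lemmas; these toy entries are illustrative only.
PROFILES: Dict[str, Tuple[Set[str], Set[str]]] = {
    "TIME IS MONEY": ({"pazarol"}, {"idő"}),
    "ANGER IS HEAT": ({"gerjeszt"}, {"harag"}),  # assumed assignment
}


def tag_sentence(
    lemmas: List[str],
    profiles: Dict[str, Tuple[Set[str], Set[str]]] = PROFILES,
) -> List[str]:
    """Return the metaphors for which the sentence contains at least one
    source-domain word AND at least one target-domain word."""
    present = set(lemmas)
    return [
        name
        for name, (source, target) in profiles.items()
        if present & source and present & target
    ]


# A sentence with both 'pazarol' (waste) and 'idő' (time) is tagged;
# a sentence with only the target word is not.
print(tag_sentence(["ne", "pazarol", "az", "idő"]))
print(tag_sentence(["az", "idő", "szép"]))
```

A pure co-occurrence filter of this kind takes no account of syntax or word sense, so any sentence that happens to contain words from both lists is tagged.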
3.1 Resources and Methods
3.1.1. The Conceptual Metaphors
12 widespread conceptual metaphors were selected from
Lakoff & Johnson (1980) and the metaphor index in
Kövecses (2002). The criteria for the selection process
were the following:
• The metaphor had to be general enough to be
found in many types of texts,
• The domains had to be suitable for providing
associations in a psycholinguistic experiment,
and
• There had to be a mapping from a concrete source
domain to an abstract target domain.
Based on the above, the following 12 conceptual
metaphors were chosen:
1. ANGER IS HEAT
2. CHANGE IS MOTION
3. CONFLICT IS FIRE
4. CONTROL IS UP
5. CREATION IS BUILDING
6. MORE IS UP (LESS IS DOWN)
7. POLITICS IS WAR
8. PROGRESS IS MOTION FORWARD
9. RESOURCES ARE FOOD
10. THE MIND IS A MACHINE
11. THEORIES ARE BUILDINGS
12. TIME IS MONEY
3.1.2. The Corpus
The corpus was compiled observing two criteria: a variety
of genres should be represented; and the texts should be
accessible for research purposes in four different
languages. The genres include modern fiction from digital
libraries, popular science articles from the National
Geographic magazine and movie subtitles, the latter of
which was included as a representation of quasi-spoken
language. The criterion of multilingual availability was
needed in view of future plans of creating a multilingual
parallel corpus (Hungarian, English, Spanish and Italian)
with metaphor annotation. As the analysis has only been
completed for the Hungarian texts, the results described in
this paper apply to the Hungarian corpus. The sizes of the
Hungarian texts from the different genres are shown in
Table 1.
Text types            Number of text words
National Geographic   68,997
Subtitles             32,148
Fiction               208,384
Total                 309,529
Table 1: The content of the corpus.
The texts were converted to plain text format with UTF-8
character encoding. The part-of-speech tagger Hunpos
(Halácsy et al., 2007) was used to tag the Hungarian texts.
Hunpos was chosen because it is an open-source Hidden
Markov Model-based part-of-speech tagger, which can tag
any language once it has been trained on a pre-tagged
corpus. As the next step, the tagged corpus was converted
to XML format, which was our working format for
metaphor identification.
3.1.3. The Baseline Corpus
In order to obtain an estimate of the performance expected
from an automatic metaphor annotation method, a baseline
corpus was constructed on which human inter-annotator
agreement was measured.
The baseline corpus was created by extracting 10%
(approximately 30,000 words) of the entire corpus in
which each genre was represented in the same proportion
as in the main corpus. The baseline corpus was
independently annotated for metaphors by two human
annotators.
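A proportional 10% sample of this kind could be drawn as in the sketch below. The sampling unit (sentences), the random selection and the function name `stratified_sample` are all assumptions made for illustration, since the text only states that each genre keeps its share of the main corpus.

```python
import random
from typing import Dict, List


def stratified_sample(
    corpus: Dict[str, List[str]],
    fraction: float = 0.10,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Draw the same fraction of sentences from every genre, so that the
    genre proportions of the sample mirror those of the full corpus."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample: Dict[str, List[str]] = {}
    for genre, sentences in corpus.items():
        k = round(len(sentences) * fraction)
        sample[genre] = rng.sample(sentences, k)
    return sample


# Toy corpus with placeholder sentences (not real data).
toy = {
    "fiction": [f"f{i}" for i in range(200)],
    "subtitles": [f"s{i}" for i in range(30)],
}
baseline = stratified_sample(toy)
print({genre: len(sents) for genre, sents in baseline.items()})
```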
The manual annotation followed a pre-defined procedure
based on the criteria defined by the Pragglejaz Group (2007).
For example, classical idioms, i.e., fixed collocations which
are not decomposable (e.g., pop the question), and “dead
metaphors”, or those which are metaphorical only in an
etymological sense (e.g., the word depression), were not
classed as metaphorical. A rule was further defined for each
type of conceptual metaphor. For example, in the case of the
MORE IS UP conceptual metaphor we applied the following
rules: “Every
expression with a ‘quantity’ meaning which can be
visualized as moving along a vertical scale, e.g., price,
lease, temperature, should be annotated as a potential
target domain expression. Every sentence which contains
the word csúcs (‘top’) e.g., csúcsteljesítmény (‘top
performance’), csúcstechnológia (‘peak technology’)
should be annotated as metaphorical.”
At the first attempt, inter-annotator agreement was only
17%. After refining the annotation instructions, we made
a second attempt, which resulted in an agreement level of
48%, which is still a strikingly low value. These results
indicate that the definition of “metaphoricity” is
problematic in itself.
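To make agreement figures of this kind concrete, the sketch below computes one possible overlap measure. It assumes, purely for illustration, that agreement is the share of sentences marked as metaphorical by either annotator that were marked by both; the function name and the sentence-ID representation are hypothetical, and other agreement measures are equally conceivable.

```python
from typing import Set


def overlap_agreement(annotator_a: Set[int], annotator_b: Set[int]) -> float:
    """Share of the sentences marked by either annotator that both marked.

    Each set holds the IDs of the sentences one annotator tagged as
    metaphorical. Two empty sets count as perfect agreement.
    """
    either = annotator_a | annotator_b
    if not either:
        return 1.0
    return len(annotator_a & annotator_b) / len(either)


# Toy example: annotator A marks sentences 1-4, annotator B marks 3-5.
print(overlap_agreement({1, 2, 3, 4}, {3, 4, 5}))
```

Because the denominator contains only sentences that at least one annotator marked, the many sentences that both annotators left unmarked do not inflate the score.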
Some typical sources of disagreement between the
annotators are the following:
• In the absence of a statistical measure of semantic
distance, it was difficult to draw the line between words
directly referring to a concept belonging to the source
domain and those indirectly referring to it. For example,
in the case of the conceptual metaphors ANGER IS HEAT
or CONFLICT IS FIRE, the source-domain word should be an
expression referring to a sort of “heated thing”. However,
in some cases, one or the other annotator included words
indirectly suggesting the presence of heat, such as kiolt
('extinguish'), kihűl ('get cold') etc. Another case in point
is the phrase a memória élesítése ('the sharpening of one's
memory'), which may or may not be an instance of the
conceptual metaphor THE MIND IS A MACHINE, depending
on whether the annotator is prepared to accept the indirect
association between machines and acts of sharpening.
• A second source of discrepancies was the fuzzy nature
of the boundary between ambiguous words having an
established abstract sense and metaphorical uses of
unambiguous words. For example, the expression
eljutottam a mai napig ('I've gotten to this day') may or
may not represent a CHANGE IS MOTION metaphor
depending on whether the Hungarian verb jut (literally:
'get somewhere, reach a place by moving the entire body')
is taken only to denote physical movement or to be
ambiguous. The verb alapul ('be founded on something'),
which is derived from the noun alap ('foundation'), is
similarly problematic since, although az elmélet alapjai
('foundations of the theory') is a good example of
THEORIES ARE BUILDINGS, the verb derived from the
concrete noun can only have an abstract sense. The
question is, therefore, how far we should go in diachronic
or morphological analysis when making a decision on
metaphoricity.
• The level of inter-annotator agreement was further
lowered by discrepancies in the classification of
metaphorical expressions. Consider the following
example from the novel The Master and Margarita: "az
öreg előbb megdöntötte mind az öt bizonyítékot, és aztán,
mintegy magamagából csúfot űzve, ő maga felállított egy
hatodikat" ('the old man first demolished all five
arguments and then, as if mocking himself, constructed
a sixth of his own'). This phrase was classified by one of
the annotators as a THEORIES ARE BUILDINGS metaphor,
while the other considered it to pertain to the CREATION IS
BUILDING type. Similarly, it is difficult to make an
informed decision on whether the following example
contains a CHANGE IS MOTION or a PROGRESS IS MOTION
FORWARD metaphor, neither of which appears to be an
intuitively correct choice: a járvány végigsöpört
szülővárosukon ('the epidemic swept through their
hometown').
3.1.4. The Compilation of the Word Lists
For the automatic identification of metaphors, we
searched the corpus for sentences containing one or more
words characterizing the source domain and one or more
words representing the target domain of a given
conceptual metaphor. Three different methods of
compiling the word lists were tested: a) a word association
experiment, b) a dictionary of synonyms, and c) a reference
corpus.
The first method is based on the assumption that the
expressions people associate with a key word for the
source domain and a key word for the target domain can
provide a lexical profile for a given metaphor type. The
word associations were collected in an online experiment.
138 students from the Budapest University of Technology
and Economics participated in the experiment. One key
word for each source and target domain (e.g., anger,
building, change, up, war) appeared on the screen one at a
time in randomized order and the participants had one
minute to type words they associated with the key word.
When the minute was up, the keyword disappeared and
participants were instructed to click a button when they
were ready for the next key word.
The lists obtained in the association experiment were
normalized: multiword expressions, proper names and
antonyms were filtered out, abbreviations were
completed, and finally, the words were stemmed by the
Hunmorph open source morphological analyzer (Trón et
al., 2005).
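The normalization steps just described might be sketched as follows. Everything here is an assumption made for illustration: the capital-letter test for proper names, the externally supplied antonym set and abbreviation table, the removal of duplicates, and the `stem` callback, which merely stands in for a morphological analyzer and does not call Hunmorph.

```python
from typing import Callable, Dict, Iterable, List, Set


def normalize_associations(
    responses: Iterable[str],
    antonyms: Set[str],
    abbreviations: Dict[str, str],
    stem: Callable[[str], str] = lambda word: word,  # placeholder stemmer
) -> List[str]:
    """Filter and normalize the raw responses given for one key word."""
    seen: Set[str] = set()
    result: List[str] = []
    for raw in responses:
        word = raw.strip()
        if not word or " " in word:  # drop empty and multiword responses
            continue
        if word[0].isupper():  # crude proper-name heuristic (assumption)
            continue
        word = abbreviations.get(word, word)  # complete known abbreviations
        if word in antonyms:  # drop antonyms of the key word
            continue
        word = stem(word)
        if word not in seen:  # keep each normalized word once
            seen.add(word)
            result.append(word)
    return result


# Toy responses for a 'heat' key word; the abbreviation 'hőm.' is invented.
raw = ["tűz", "forró víz", "Budapest", "hideg", "hőm.", "tűz"]
print(normalize_associations(raw, antonyms={"hideg"},
                             abbreviations={"hőm.": "hőmérséklet"}))
```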
For each of the 12 conceptual metaphors, the resulting
two word association lists (one containing associations
provided for the source domain, and another providing
associations for the target domain) were taken to
constitute the metaphor’s lexical profile.
For the second method, the word lists obtained from the
association experiment were expanded with the synonyms
listed for the association words in the Magyar szókincstár
[Hungarian Word Thesaurus] (Kiss, 2007). Dialectal,
slang and obsolete expressions were omitted. Compared
to the association list, the size of the word lists
substantially increased (see Table 2). For the third,
corpus-based method, the word lists for each source and
target domain were extracted from the manually annotated
baseline corpus. Due to the low level of inter-annotator
agreement obtained for the baseline corpus, the union of
the sentences annotated as metaphorical by the two annotators
was used for compiling the corpus-based lists of source
and target domain words.
Words / Method   Psycholinguistic   Synonyms   Corpus-based
Source domain    1239               6348       126
Target domain    674                5094       120
Table 2. Number of words in source- and target-domain
lists compiled by the three methods
3.1.5. The Annotation Process and its Verification
Based on the three sets of word lists, the XML test corpus
was automatically annotated producing three files in
which the sentences were marked with tags showing the
type of conceptual metaphor the system identified. Each
of the three annotation versions was then verified
manually using the graphical interface of the GATE
application (Cunningham et al., 2002). Because of time
constraints, the manual verification was completed for
10% of the test corpus, where the different genres were
represented in the same proportion as in the entire corpus.
In this sub-corpus, a total of 155 sentences were identified
as metaphorical by two human annotators.
3.2 Results
The results of the three methods were quantified by the
precision and recall measures (Table 3). Precision is
the proportion of the sentences tagged as metaphorical by
the automatic system that are indeed metaphorical, while
recall is the percentage of metaphorical sentences
successfully identified by the system. The F-measure is
the weighted harmonic mean of these values, i.e. the final
indicator of the system’s performance.
Method        Recall   Precision   F-measure
Association   3.8%     7.5%        5.6%
Dictionary    18.1%    4.5%        11.3%
Corpus        31.3%    55.4%       43.3%
Table 3: Results of the three methods.
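As a cross-check on Table 3, the sketch below recomputes the F-measure from the reported recall and precision values. The function name `f_measure` and the equal weighting (beta = 1) are assumptions, since no weighting is stated. With equal weights the harmonic mean comes out at roughly 5.0%, 7.2% and 40.0% for the three methods, whereas the F-measure column of Table 3 coincides, to within rounding, with the arithmetic mean of recall and precision.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta = 1 weights the two equally; beta > 1 favours recall.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (recall %, precision %) exactly as reported in Table 3.
TABLE_3 = {
    "Association": (3.8, 7.5),
    "Dictionary": (18.1, 4.5),
    "Corpus": (31.3, 55.4),
}

for method, (recall, precision) in TABLE_3.items():
    harmonic = f_measure(precision, recall)
    arithmetic = (precision + recall) / 2
    print(f"{method}: harmonic = {harmonic:.2f}%, arithmetic = {arithmetic:.2f}%")
```

Whichever mean is used, the ranking of the three methods stays the same.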
The results reveal that the association method covered
substantially fewer metaphorical sentences containing
both a source and a target expression than the other two
methods. This psycholinguistic method also performed
very poorly in terms of precision. When the association
word lists were expanded with synonyms, recall
somewhat improved but only at the cost of a decline in
precision. The corpus-based method was very clearly the
most successful of the three strategies. Taking all our
results into account, we must conclude that the hypothesis
that the co-occurrence of psycholinguistically typical
source domain and target domain words in a sentence is a
good predictor of metaphoricity receives no empirical
support. Exploiting the statistical properties of texts leads
to considerably better but still unsatisfactory results.
3.3 Problem Cases
It is clear from the above discussion that deciding whether
a sentence is metaphorical or not is far from being a
straightforward task. Our general experience is that
elements which are difficult for a human language user to
find in a text are also identified poorly by automatic
methods. One
problem is that in several cases we must look beyond a
single sentence. The manual annotation identified several
sentences that were metaphorical but did not contain
words from both the source and the target domains, i.e.
they were problematic with regard to recall. There were
sentences in which a word denoting a concrete action in
its literal interpretation (source domain) referred to a
metaphorical event, which could only be deduced from
the extra-sentential context.
In other cases, the metaphoricity of the sentence was
signaled by a single word which incorporated both the
source and the target meaning.
Precision values were lowered by the frequent occurrence
of sentences which contained both a source and a target
expression but were not metaphorical. A typical example
is given below:
Mérnökök és vezetők tanakodnak kisebb csoportokban a
23 emelet magas fúrótorony tövében. (‘Small groups of
engineers and managers are discussing their options at
the base of the 23-storey tall oil-rig.’)
The word manager is a target-domain expression and the
adjective tall is a source-domain expression for the
metaphor CONTROL IS UP, but the two words are
conceptually unrelated in this particular sentence.
4. Conclusions
The present paper investigated the automatic
identification of conceptual metaphors using corpus-
linguistic analyses, and found that the concept of source
and target domains is best characterized by statistical
patterns rather than by psycholinguistic factors. Since the
main objective of our study was to find the most effective
way of automatically identifying conceptual metaphors in
natural texts, we did not carry out a detailed grammatical
analysis of the examples or explore the possible
connection between the type of texts and the type of
metaphors occurring in them. However, it seems that our
research supports previous results of corpus-linguistic
analyses, in particular those regarding collocations and the
linguistic form of metaphorical expressions. This is also
confirmed by the fact that, while the lists compiled on the
basis of the association experiment had very weak
predictive power, the targeted selection of words
characteristic of the conceptual domains brought the best
result, which means that metaphoricity is signaled not by
every association but only by the common co-occurrence of
certain words and expressions. For example, in Hungarian
the co-occurrence of the verb pazarol ('waste') and the
noun idő ('time'), or the verb gerjeszt ('induce') and the noun
harag ('anger') within a single sentence almost always
signals a metaphor.
Our analyses also found several examples highlighting the
importance of grammatical form: for example, in the case
of the conceptual metaphor RESOURCES ARE FOOD,
according to the reference corpus method the source
domain is represented mainly by verbs (fogyaszt
‘consume’, felfal ‘devour’, táplál ‘feed’), while the
majority of words collected in the association experiment are
nouns (edény ‘dish’, fagylalt ‘ice cream’, reggeli
‘breakfast’ etc.). This observation supports the results
obtained by Deignan (2005) showing that for the majority
of metaphorical expressions, words referring to the source
domain are verbs or adjectives. The author argues that this
is because in metaphorical language use people try to
describe abstract entities, thus they take words denoting
behaviors, features or actions from the concrete source
domains. Of course, the confirmation of these hypotheses
requires a more comprehensive analysis of the metaphors
found so far. Our plans for the future involve the
expansion of the reference corpus and the extraction of a
larger word list for source and target domains. At the
same time, we intend to analyze the English, Spanish and
Italian versions of the texts, and to compare the results
with the Hungarian data, since cross-linguistic analyses
might reveal important factors in the conceptual nature of
metaphorical expressions.
5. References
Andrews, M., Vigliocco, G., Vinson, D. (2005). The role
of attributional and distributional information in
semantic representation. In B. Bara, L. Barsalou, & M.
Bucciarelli (Eds.), Proceedings of the Twenty Seventh
Annual Conference of the Cognitive Science Society.
Andrews, M., Vinson, D., Vigliocco, G. (2007).
Evaluating the Contribution of Intra-Linguistic and
Extra-Linguistic Data to the Structure of Human
Semantic Representations. In Proceedings of the
Cognitive Science Society.
Boroditsky, L., Ramscar, M. (2002). The roles of body
and mind in abstract thought. Psychological Science,
13, pp. 185–188.
Burgess, C., Lund, K. (1997). Representing abstract words
and emotional connotation in high-dimensional
memory space. In Proceedings of the Cognitive Science
Society, Hillsdale, NJ: Lawrence Erlbaum Associates,
pp. 61–66.
Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V.
(2002). GATE: A Framework and Graphical
Development Environment for Robust NLP Tools and
Applications. In Proceedings of the 40th Anniversary
Meeting of the Association for Computational
Linguistics (ACL'02), Philadelphia.
Deignan, A. (2005). Metaphor and corpus linguistics,
Amsterdam/Philadelphia: John Benjamins.
Deignan, A. (2008). Corpus linguistics and metaphor. In
R.W. Gibbs Jr. (Ed.), The Cambridge Handbook of
Metaphor and Thought, Cambridge: Cambridge
University Press, pp. 280–294.
Fauconnier, G., Turner, M. (2002). The way we think:
conceptual blending and the mind’s hidden
complexities. New York: Basic Books.
Gibbs, R.W. (2006). Embodiment and cognitive science,
New York: Cambridge University Press.
Gibbs Jr., R.W., Matlock, T. (2008). Metaphor,
imagination and simulation. Psycholinguistic evidence.
In R.W. Gibbs Jr., (Ed.), The Cambridge Handbook of
Metaphor and Thought, Cambridge: Cambridge
University Press, pp. 161–176.
Keysar, B., Shen, Y., Glucksberg, S., Horton, W.S.
(2000). Conventional language: How metaphorical is
it? Journal of Memory and Language, 43, pp. 576–593.
Kiss, G. (2007). Magyar szókincstár [Hungarian Word
Thesaurus], Budapest: Tinta.
Kövecses, Z. (2002). Metaphor. A Practical Introduction,
Oxford: Oxford University Press.
Lakoff, G., Johnson, M. (1980). Metaphors we live by,
Chicago: University of Chicago Press.
Lakoff, G., Johnson, M. (1999). Philosophy in the Flesh:
The Embodied Mind and Its Challenge to Western
Thought, New York, NY: Basic Books.
Landauer, T.K., Dumais, S.T. (1997). A solution to Plato's
problem: the Latent Semantic Analysis theory of
acquisition, induction and representation of knowledge.
Psychological Review, 104(2), pp. 211–240.
Martin, J.H. (2006). A corpus-based analysis of context
effects on metaphor comprehension. In A.
Stefanowitsch & S.Th. Gries (Eds.), Corpus-based
approaches to metaphor and metonymy, Berlin/New
York: Mouton de Gruyter, pp. 214–236.
Halácsy, P., Kornai, A., Oravecz, Cs. (2007). HunPos - an
open source trigram tagger. In Proceedings of the 45th
Annual Meeting of the Association for Computational
Linguistics Companion Volume Proceedings of the
Demo and Poster Sessions, Association for
Computational Linguistics, Prague, Czech Republic,
pp. 209–212.
Pragglejaz Group. (2007). MIP: A method for identifying
metaphorically used words in discourse. Metaphor and
Symbol, 22(1), pp. 1–39.
Stefanowitsch, A. (2006). Words and their metaphors: a
corpus-based approach. In A. Stefanowitsch & S.Th.
Gries (Eds.), Corpus-based approaches to metaphor
and metonymy, Berlin/New York: Mouton de Gruyter,
pp. 63–105.
Szamarasz, V.Z. (2006). Az idő téri metaforái: a
metaforák szerepe a feldolgozásban. Világosság,
47(8-9-10), pp. 99–109.
Trón, V., Németh, L., Halácsy, P., Kornai, A., Gyepesi,
Gy., Varga, D. (2005). Hunmorph: open source word
analysis. In Proceedings of the ACL 2005 Workshop on
Software, pp. 77–85.