
The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods



We are in the midst of a technological revolution whereby, for the first time, researchers can link daily word use to a broad array of real-world behaviors. This article reviews several computerized text analysis methods and describes how Linguistic Inquiry and Word Count (LIWC) was created and validated. LIWC is a transparent text analysis program that counts words in psychologically meaningful categories. Empirical results using LIWC demonstrate its ability to detect meaning in a wide variety of experimental settings, including to show attentional focus, emotionality, social relationships, thinking styles, and individual differences.
Journal of Language and Social Psychology
29(1) 24 –54
© 2010 SAGE Publications
DOI: 10.1177/0261927X09351676
Yla R. Tausczik1 and James W. Pennebaker1
computerized text analysis, LIWC, relationships, dominance, deception, attention,
James J. Bradac (1986, 1999) celebrated the many ways that scientists could simulta-
neously study both language and human communication. He understood the value of
highly controlled laboratory studies and, at the same time, the importance of exploring
the ways people naturally talk in the real world. Of particular importance to him, how-
ever, was that language research replicates its theories and findings across a wide
array of methods and samples. This article draws heavily from Bradac’s approach to
research by applying a new array of computer-based text analysis tools to the study of
everyday language.
1University of Texas at Austin, Austin, TX, USA
Corresponding Author:
James W. Pennebaker, Department of Psychology, University of Texas at Austin, Austin, TX 78712, USA
The words we use in daily life reflect who we are and the social relationships we
are in. This is neither a new nor surprising insight. Language is the most common and
reliable way for people to translate their internal thoughts and emotions into a form
that others can understand. Words and language, then, are the very stuff of psychology
and communication. They are the medium by which cognitive, personality, clinical,
and social psychologists attempt to understand human beings.
The simultaneous development of high-speed personal computers, the Internet, and
elegant new statistical strategies has helped usher in a new age of the psychological
study of language. By drawing on massive amounts of text, researchers can begin to
link everyday language use with behavioral and self-reported measures of personality,
social behavior, and cognitive styles. Beginning in the early 1990s, we stumbled on
the remarkable potential of computerized text analysis through the development of our
own computer program—Linguistic Inquiry and Word Count (LIWC; Pennebaker,
Booth, & Francis, 2007). We are now witnessing new generations of text analysis
coming from computer sciences and computational linguistics.
This article is divided into three sections. The first is a brief history of text analysis
in psychology. The second focuses on our own efforts to develop LIWC along with
some of the basic psychometrics of words. The third explores the links between word
usage and basic social and personality processes.
Computerized Text Analysis: A Brief History
The roots of modern text analysis go back to the earliest days of psychology. Freud
(1901) wrote about slips of the tongue whereby a person’s hidden intentions would
reveal themselves in apparent linguistic mistakes. Rorschach and others (e.g.,
Holtzman, 1950; Rorschach, 1921) developed projective tests to detect people’s
thoughts, intentions, and motives from the way they described ambiguous inkblots.
McClelland and a generation of thematic apperception test (TAT) researchers (e.g.,
McClelland, 1979; Winter, 1998) found that the stories people told in response to
drawings of people could provide important clues to their needs for affiliation, power,
and achievement. In all cases, trained raters read the transcripts of people’s descrip-
tions and tagged words or phrases that represented the dimensions the investigators
were studying.
More general and less stimulus-bound approaches began to evolve in the 1950s.
Gottschalk and his colleagues (e.g., Gottschalk & Gleser, 1969; Gottschalk, Gleser,
Daniels, & Block, 1958) developed a content-analysis method by which to track
Freudian themes in text samples. The original Gottschalk method required patients to
talk in a stream-of-consciousness way into a tape recorder for 5 minutes. The language
samples were transcribed and broken down into grammatical phrases. Judges then
evaluated each phrase to determine the degree to which it might reflect one or more themes
related to anxiety (e.g., death, castration), hostility toward self or others, and various
interpersonal and psychological topics. The Gottschalk method later was used in the
psychiatric diagnoses of cognitive impairments, alcohol abuse, brain damage, and
mental disorders. Attempts to translate the original Gottschalk–Gleser scoring scheme
to a computer program have proven difficult with modest correlations to the judge-
based “gold standard” (e.g., Gottschalk & Bechtel, 1993).
The first general purpose computerized text analysis program in psychology was
developed by Philip Stone and his colleagues (Rosenberg & Tucker, 1978; Stone,
Dunphy, Smith, & Ogilvie, 1966). Using a mainframe computer, the authors built a
complex program that adapted McClelland’s need-based coding schemes to any open-
ended text. The program, called General Inquirer, relied on a series of author-developed
algorithms. The General Inquirer and other programs like it (e.g., Hart’s, 1984, DICTION
program; Martindale, 1990) have proven valuable in distinguishing mental disorders,
assessing personality dimensions, and evaluating speeches. One limitation of these
approaches is that they have relied on the manipulation and weighting of language
variables that were not visible to the user.
The first truly transparent text analysis method was pioneered by Walter Weintraub
(1981, 1989). Weintraub, a physician by training, became fascinated by the everyday
words people used—words such as pronouns and articles. Over the span of a decade,
he hand-counted people’s words in texts such as political speeches and medical inter-
views. He noticed that first-person singular pronouns (e.g., I, me, my) were reliably
linked to people’s levels of depression. Although his methods were straightforward
and his findings consistently related to important outcome measures, his work was
largely ignored. His observation that the simple words of everyday speech reflected
psychological state nevertheless was prescient. (See also the work of Mergenthaler,
1996, who developed a computer program TAS/C that taps abstraction and emotion in
psychotherapy sessions.)
The Development of LIWC and
the Psychometrics of Words
In the 1980s, we discovered that when people were asked to write about emotional
upheavals in their lives they subsequently evidenced improvements in physical health
(e.g., Pennebaker & Beall, 1986). The first group of writing studies generated hun-
dreds of writing samples that revealed deeply moving human stories. Intuitively, the
ways the stories were written should have been related to whether people’s health
improved or not. In an attempt to link the stories with health outcomes, judges were
asked to read the emotional essays and to rate them along multiple dimensions. Some
of the categories included the degree to which the stories were organized, coherent,
personal, emotional, vivid, optimistic, and evidenced insight.
Relying on judges’ ratings yielded three important findings: (a) even with in-depth
training, judges do not agree with each other in rating most dimensions when evaluat-
ing a broad range of deeply personal stories; (b) rating essays by multiple judges is
extremely slow and expensive; and (c) judges tend to get depressed when reading
depressing stories.
To find a more efficient evaluation method, we turned to the promise of computer-
ized text analysis programs to assess the essays. At the time, no simple text analysis
program existed. Consequently, Martha Francis and the second author began the task
of developing one. Our goal was to create a program that simply looked for and
counted words in psychology-relevant categories across multiple text files. The result
has been an ever-changing computer program named Linguistic Inquiry and Word
Count, or LIWC (pronounced “Luke”).
The Logic and Development of LIWC
The LIWC program has two central features—the processing component and the diction-
aries. The processing feature is the program itself, which opens a series of text files—which
can be essays, poems, blogs, novels, and so on—and then goes through each file word by
word. Each word in a given text file is compared with the dictionary file.
For example, if LIWC were analyzing the first line of the novel Paul Clifford by
Edward Bulwer-Lytton (1842):
It was a dark and stormy night
the program would first look at the word “it” and then see if “it” was in the dictionary.
It is, and it is coded as a function word, a pronoun, and, more specifically, an impersonal
pronoun. All three of these LIWC categories would then be incremented. Next,
the word “was” would be checked and would be found to be associated with the cat-
egories of verbs, auxiliary verbs, and past tense verbs.
After going through all the words in the novel, LIWC would calculate the percent-
age of each LIWC category. So, for example, we might discover that 2.34% of all the
words in a given book were impersonal pronouns and 3.33% were auxiliary verbs. The
LIWC output, then, lists all LIWC categories and the rates that each category was used
in the given text.
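
The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the LIWC program itself: `TOY_DICTIONARY`, `tokenize`, and `category_rates` are hypothetical names introduced here, the tokenizing rule is an assumption, and the dictionary holds only the codings for “it” and “was” given in the text.

```python
import re
from collections import Counter

# Hypothetical two-word dictionary; only the codings for "it" and "was"
# come from the text. The real LIWC dictionaries are far larger.
TOY_DICTIONARY = {
    "it": ["function word", "pronoun", "impersonal pronoun"],
    "was": ["verb", "auxiliary verb", "past tense verb"],
}

def tokenize(text):
    """Lowercase the text and split it into word tokens (an assumed, simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

def category_rates(text, dictionary):
    """Return each matched category's share of all words, as a percentage."""
    words = tokenize(text)
    counts = Counter()
    for word in words:
        # Every category attached to a matched word is incremented.
        for category in dictionary.get(word, []):
            counts[category] += 1
    total = len(words)
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}

rates = category_rates("It was a dark and stormy night", TOY_DICTIONARY)
```

With seven words in the sentence and each of the six categories matched once, every rate in `rates` is one seventh of the words, or about 14.3%.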
The dictionaries are the heart of the LIWC program. A dictionary refers to the col-
lection of words that define a particular category. When LIWC was first created, the
goal was fairly modest. We simply wanted the computer to calculate the percentage of
positive and negative emotion words within a text. To do this, we needed to specify
exactly which words to look for. Based on our judges’ ratings, we also wanted to include
measures of thinking styles—for example, signs of self-reflection, and causal think-
ing. Over several weeks, the number of categories we were interested in expanded
from the original 2 to more than 80.
Across the 80 categories, several language dimensions are straightforward. For
example, the category of articles is made up of three words: “a,” “an,” and “the.”
Other dimensions are more subjective. For example, the emotion word categories
required human judges to evaluate which words were suited for which categories. For
all subjective categories, an initial selection of word candidates was gleaned from
dictionaries, thesauruses, questionnaires, and lists made by research assistants. Groups
of three judges then independently rated whether each word candidate was appropriate
to the overall word category.
All category word lists were updated by the following set of rules: (a) a word
remained in the category list if two out of three judges agreed it should be included;
(b) a word was deleted from the category list if at least two of the three judges agreed
it should be excluded; and (c) a word was added to the category list if two out of three
judges agreed it should be included. This entire process was then repeated a final time
by a separate group of three judges. The final percentages of judges’ agreement for this
second rating phase ranged from 93% to 100% agreement.
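
The two-out-of-three rule can be expressed compactly in code. The following Python sketch is only an illustration of the decision rule as stated above; the function names and the example votes are hypothetical and do not come from the original judging data.

```python
def keep_word(votes):
    """Apply the two-out-of-three rule: True if at least two judges vote to include."""
    if len(votes) != 3:
        raise ValueError("expected exactly three judges' votes")
    return sum(bool(v) for v in votes) >= 2

def update_category(candidates):
    """Return the sorted words whose votes meet the majority rule.

    `candidates` maps each candidate word to the three judges'
    include (True) or exclude (False) votes.
    """
    return sorted(word for word, votes in candidates.items() if keep_word(votes))

# Hypothetical votes for an illustrative positive-emotion category.
example_votes = {
    "love": (True, True, True),
    "nice": (True, True, False),
    "table": (False, False, True),
}
kept = update_category(example_votes)
```

Here `kept` contains “love” and “nice”; “table” is dropped because two of the three hypothetical judges voted to exclude it.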
The initial LIWC judging took place between 1992 and 1994. A significant LIWC
revision was undertaken in 1997 and again in 2007 to streamline the original program
and dictionaries. Text files from several dozen studies, totaling more than 100 million
words, were analyzed. Some low base rate word categories were deleted and others
were added. For details of the process and specific findings, see Pennebaker, Chung,
Ireland, Gonzales, and Booth (2007).
The Psychometrics of Word Usage
Verifying the validity and reliability of word usage is trickier than doing so for a
typical new measurement instrument. Consider how psychologists typically
develop and test a new measurement instrument. For questionnaires, for example,
after specific questions have been generated and initially tested, the investigator com-
putes reliability statistics to be sure that all items are correlated with the sum of the
remaining items. Generally, a factor analysis of the items is run to see if the items reflect
more than one dimension. Next, the investigator computes the test–retest reliability of
the questionnaire. And, finally, there are a series of validation tests to see if the question-
naire correlates with or predicts real-world behaviors that it is supposed to measure.
Word categories are unlike questionnaire items. Words are rarely normally distrib-
uted, they generally have low base rates, and standard measures of reliability are not
always appropriate. Consider, for example, the category of articles—“a,” “an,” and
“the.” All three words serve the same function, which is to signal the upcoming use of
a concrete noun. From a classically trained psychometric perspective, for us to con-
sider “articles” to be a coherent, internally consistent category, use of the three words
should be highly correlated with each other, with Cronbach’s α of at least .60 or .70,
it is hoped. Tragically, words do not adhere to traditional psychometric laws that we
see in questionnaires. For example, our lab frequently relies on a random assortment
of about 2,800 text files that includes a wide range of text genres, including blogs,
experimental essays, poetry, books, science articles, and natural speech transcripts to
examine the psychometrics of words. Within this text corpus, articles represent 5.43%
of all words used (where “a” = 1.96, “an” = 0.19, “the” = 3.27). The intercorrelation
among these words is low but highly significant (“a” with “an” = .13, “a” with
“the” = .09, “an” with “the” = .09), resulting in Cronbach’s α of .14 (for a summary of
all reliability statistics, see Pennebaker et al., 2007).
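
For readers who want to see how such a coefficient is obtained, the sketch below computes Cronbach’s α from per-text usage rates with the standard variance-based formula. It is an assumption-laden illustration: the function name `cronbach_alpha` and the five sets of rates are hypothetical, and the made-up numbers are not meant to reproduce the value reported above.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores over the same cases."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n_cases = len(items[0])
    if any(len(item) != n_cases for item in items):
        raise ValueError("all items must cover the same cases")
    # Total score per case (here: the summed rate of the three articles per text).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical per-text percentages for "a", "an", and "the" in five texts.
rate_a = [1.8, 2.4, 1.5, 2.1, 2.0]
rate_an = [0.2, 0.1, 0.3, 0.2, 0.1]
rate_the = [3.5, 2.9, 3.1, 3.6, 3.2]
alpha = cronbach_alpha([rate_a, rate_an, rate_the])
```

When the items rise and fall together across texts, the variance of the totals is large relative to the summed item variances and α approaches 1; when the items are nearly unrelated, as with the three articles, α stays low.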
Note that assessing the psychometrics of word use is even more complicated than
what the above statistics suggest. To get reliability data for a questionnaire, we typi-
cally give people the same test of often-redundant questionnaire items on two occasions.
In theory, the questionnaire has exactly the same meaning on the two administrations.
Asking people to, say, describe themselves on two occasions will generally evoke different
types of responses. First, within the open-ended response itself, people
generally don’t repeat themselves (meaning one rarely gets good split-half reliability).
Second, if people tell an experimenter who they are today, they will likely change their
stories next time either because they have changed a bit or they want the experimenter
to have a fuller sense of who they were from the previous time. Furthermore, saying
the same thing as they did to the person on the first occasion would be redundant and,
perhaps, a bit rude. In short, the psychometrics of word use pose a new set of problems
that questionnaires avoid.
Content Versus Style Words
When LIWC was first developed, the goal was to devise an efficient system that could
tap both psychological processes and the content of what people were writing or talk-
ing about. Within a few years, it became clear that there are two very broad categories
of words that have different psychometric and psychological properties. Content words
are generally nouns, regular verbs, and many adjectives and adverbs. They convey the
content of a communication. To go back to the phrase “It was a dark and stormy night,”
the content words are “dark,” “stormy,” and “night.” Intertwined through these content
words are style words, often referred to as function words. Style or function words are
made up of pronouns, prepositions, articles, conjunctions, auxiliary verbs, and a few
other esoteric categories. In the phrase, these words are “it,” “was,” “a,” and “and.”
Although we tend to have almost 100,000 English words in our vocabulary, only
about 500 (or 0.5%) are style words. Nevertheless, style words make up about 55%
of all the words we speak, hear, and read. Furthermore, content and style words tend
to be processed in the brain very differently (Miller, 1995).
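
The split between style and content words in the example phrase can be reproduced with a short Python sketch. The word list below is hypothetical and deliberately tiny: it contains only the four style words the text identifies, and treating every other word as a content word is a simplifying assumption.

```python
# Hypothetical, deliberately tiny list of style (function) words; it covers
# only the four style words the text identifies in the example phrase.
STYLE_WORDS = {"it", "was", "a", "and"}

def split_style_and_content(sentence):
    """Partition a sentence's words into style words and content words."""
    words = sentence.lower().split()
    style = [w for w in words if w in STYLE_WORDS]
    # Simplifying assumption: anything not listed as a style word is content.
    content = [w for w in words if w not in STYLE_WORDS]
    return style, content

style, content = split_style_and_content("It was a dark and stormy night")
style_share = 100.0 * len(style) / (len(style) + len(content))
```

In this one phrase, four of the seven words are style words, so `style_share` is about 57%.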
From a psychological perspective, style words reflect how people are communicat-
ing, whereas content words convey what they are saying. It is not surprising, then, that
style words are much more closely linked to measures of people’s social and psycho-
logical worlds. Indeed, the ability to use style words requires basic social skills.
Consider the sentence, “I will meet you here later.” Although grammatically correct,
the sentence has no real meaning unless the reader knows who “I” and “you” refer to.
Where is “here” and what is meant by “later”? These are all referents that are shared
by two people in a particular conversation taking place at a particular time. To say this
implies that the speaker knows that the listener shares the same knowledge of these
style words (cf. Chung & Pennebaker, 2007).
Caveats concerning computer text analysis. Psychologists are always looking for mea-
sures that reveal the secret, hidden, or distorted “real” self. Freud’s popularity was
partly attributable to his assertion that subconscious thoughts, emotions, and experi-
ences drove our behavior. People continue to be enthralled with his methods of dream
analysis, slips of the tongue, and other psychoanalytic claims. This trend continues
with a new generation of measures and theories that rely on a host of implicit measures
such as the implicit association test (IAT; Greenwald, McGhee, & Schwartz, 1998),
priming strategies, and various imaging techniques such as functional MRI that all
hold out the promise of discovering the “real” person. Many people consider the anal-
ysis of language—especially function or style words—to do the same. And, indeed,
they sometimes can reveal social psychological processes that people are not able to
easily conceal.
Despite the appeal of computerized language measures, they are still quite crude. Pro-
grams such as LIWC ignore context, irony, sarcasm, and idioms. The word “mad,” for
example, is currently coded as an anger word. When people say things such as “I’m mad
about him,” or “He’s as mad as a hatter” the meaning and intent of their utterances will be
miscoded. LIWC, like any computerized text analysis program, is a probabilistic system.
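
The miscoding problem is easy to demonstrate with a context-blind counter. The sketch below is an illustration only: the one-word `ANGER_WORDS` list and the function name are hypothetical, and the text states only that “mad” is coded as an anger word.

```python
# Hypothetical one-word anger list; the text states only that "mad"
# is currently coded as an anger word.
ANGER_WORDS = {"mad"}

def count_anger_words(text):
    """Count anger words by simple lookup, ignoring context entirely."""
    words = [w.strip(".,!?\"'") for w in text.lower().split()]
    return sum(1 for w in words if w in ANGER_WORDS)

affection = count_anger_words("I'm mad about him")
idiom = count_anger_words("He's as mad as a hatter")
```

Both utterances register one anger word, although neither expresses anger, which is why counts of this kind are best read as probabilistic signals over large amounts of text.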
The study of word use as a reflection of psychological state is in its earliest stages.
As described below, studies are providing evidence that function words can detect
emotional and biological states, status, honesty, and a host of individual differences.
Nevertheless, the imprecise measurement of word meaning and psychological states
themselves should give pause to anyone who relies too heavily on accurately detecting
people’s true selves through their use of words.
The Social and Psychological
Meaning of Words
The words we use in daily life reflect what we are paying attention to, what we are
thinking about, what we are trying to avoid, how we are feeling, and how we are orga-
nizing and analyzing our worlds. The 80 language categories in LIWC have been
linked in hundreds of studies to interesting psychological processes. In this section, we
give a brief discussion of psychological processes and a small set of related language
categories. The section concludes with a comprehensive summary of findings
about the correlates of word categories from a large group of studies.
Attentional Focus: Pronouns and Verb Tense
Tracking people’s attention reveals information about their priorities, intentions, and
thoughts. Infants, for example, focus on objects that display novelty, complexity, and
motion (Berlyne, 1960), which shows the extent to which they are focused on learning. Our
attention can oscillate from our external worlds to our internal feelings or sensations (e.g.,
Pennebaker, 1982). If we are playing a game of tennis, we might bruise our arm and not
notice because our full attention is on the game itself. Alternatively, if the injury is signifi-
cant, the pain may be so attention grabbing that we no longer are aware of the game at all.
Tracking language use, like tracking people’s gaze, can tell us where they are
attending. At the most superficial level, content word categories explicitly reveal
where individuals are focusing. Those thinking about death, sex, money, or friends
will refer to them in their writing or conversation. Function words, such as personal
pronouns, also reflect attentional allocation. People who are experiencing physical or
emotional pain tend to have their attention drawn to themselves and subsequently use
more first-person singular pronouns (e.g., Rude, Gortner, & Pennebaker, 2004). When
people sit in front of a mirror and complete a questionnaire, they use more words
such as “I” and “me” than when the mirror is not present (Davis & Brock, 1975). As
we might expect, positive ads focus on the political candidate producing the ad and
negative ads focus on their opponent; use of pronouns quickly reveals these differences
(Gunsch, Brownlow, Haynes, & Mabe, 2000). Gunsch and colleagues show that more
self-references (e.g., “I,” “we”) were present in positive political ads compared with
mixed and negative political ads, whereas more other-references (e.g., “he,” “she,”
“they”) were present in negative ads compared with positive and mixed ads.
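
A comparison of self-references and other-references like the one just described can be sketched as follows. The pronoun sets contain only the examples named in the text, not the full categories used by Gunsch and colleagues, and the two ad texts are hypothetical sentences written for this illustration.

```python
import re

# Example pronouns taken from the text; the full category lists are not given there.
SELF_REFERENCES = {"i", "we"}
OTHER_REFERENCES = {"he", "she", "they"}

def reference_rates(text):
    """Return (self, other) reference rates as percentages of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0, 0.0
    self_count = sum(w in SELF_REFERENCES for w in words)
    other_count = sum(w in OTHER_REFERENCES for w in words)
    return 100.0 * self_count / len(words), 100.0 * other_count / len(words)

# Hypothetical ad copy written for this illustration.
positive_ad = "I will fight for our schools and we will win"
negative_ad = "He raised taxes and they paid the price"

positive_rates = reference_rates(positive_ad)
negative_rates = reference_rates(negative_ad)
```

For these made-up sentences, the positive ad contains only self-references (2 of 10 words) and the negative ad only other-references (2 of 8 words).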
Attention can reveal not just who someone is attending to but how they are processing
the situation. Students who wrote about their experiences with teasing varied in the pro-
nouns they used depending on whether they were teasing others or were being teased by
others (Kowalski, 2000). Participants used more first-person singular and fewer third-
person pronouns (e.g., “he,” “she”) when describing an event when they were being teased
compared with when they described an event where they were teasing someone else. In both
cases, the focus is on the person who was teased, the victim of the event. There was a
significant interaction with sex and use of third-person pronouns; male participants used
more third-person pronouns when describing an event in which they were being teased
than female participants. Compared with women, men may focus more on the perpetrator
of the event when they are the victim, although it remains unclear why this is the case.
Whereas personal pronouns provide information about the subject of attention,
analyses of the tense of common verbs can tell us about the temporal focus of atten-
tion. In the same study of political ads, the authors found that positive ads used more
present and future tense verbs, and negative ads used more past tense verbs (Gunsch
et al., 2000). From the tense of the verbs and the personal pronouns used, we can infer
that negative ads focus on past actions of the opponent, and positive ads focus on the
present and future acts of the candidate.
Studying attention also gives us a deeper understanding of how people are processing
a situation or event. Participants were asked to either recall an event that they had dis-
cussed with someone else, or an undisclosed event; there were significant differences in
the verb tense used in the two conditions (Pasupathi, 2007). Participants used greater past
tense in discussing a disclosed event and greater present tense in discussing an undis-
closed event. Verb tense differences could indicate increased psychological distance and
a higher degree of resolution for disclosed events compared with undisclosed events.
Pronouns and verb tense are useful linguistic elements that can help identify focus,
which, in turn, can show priorities, intentions, and processing. Some care should be
taken in evaluating how pronouns and verbs are used. An exception to the
pronoun-attention rule concerns first-person plural pronouns: “we,” “us,” and “our.” Sometimes
“we” can signal a sense of group identity: when couples are asked to evaluate
their marriages to an interviewer, the more the participants use “we,” the better their
marriage (Simmons, Gordon, & Chambless, 2005). “We” can also be used as the Royal
We, such as when the advisor announces to his or her graduate students that “we need
to analyze that data.” The use of “we” in this case actually means “you students” rather
than “you students and I” (see also use of the Royal We by political figures, such as
Rudolph Giuliani in Pennebaker & Lay, 2002).
Emotionality: Positive and Negative Emotions
The degree to which people express emotion, how they express emotion, and the
valence of that emotion can tell us how people are experiencing the world. People
react in radically different ways to traumatic or important events; how people react
may say a lot about how they cope with the event and the extent to which the event
plays a role in the future. At the heart of reacting and coping with events is people’s
emotional response.
Research suggests that LIWC accurately identifies emotion in language use. For
example, more positive emotion words (e.g., love, nice, sweet) are used in writing about a
positive event, and more negative emotion words (e.g., hurt, ugly, nasty) are used in
writing about a negative event (Kahn, Tobin, Massey, & Anderson, 2007). LIWC rat-
ings of positive and negative emotion words correspond with human ratings of the
writing excerpts (Alpers et al., 2005).
Use of emotion words has also been used as a measure of the degree of immersion.
Holmes et al. (2007) found that among women trying to cope with intimate partner
violence, using more positive and negative emotion words to describe the violence led
to increased feelings of physical pain over the four writing sessions. The authors con-
clude that higher use of emotion words showed more immersion in the traumatic
event, which led to increased experience of physical pain.
Language emotionality extends beyond the simple expression of more or less emotion;
use of emotion words relates to other key language elements. In an examination of
the random assortment of around 2,800 texts described earlier, emotion words were
negatively correlated with articles (r = -.33), prepositions (r = -.38), and relativity
words (r = -.40). These language features, as we discuss later, may be important in
cognitive complexity and thinking styles. Emotion words were positively correlated with
pronoun use (r = .29), auxiliary verb use (r = .29), and negation use (r = .32). All
correlations are highly significant, p < .001. The nature of these correlations suggests a deeper
link between the expression of emotion, thinking styles, and social awareness.
Social Relationships
The most basic function of language is to communicate. Words provide information
about social processes—who has more status, whether a group is working well
together, if someone is being deceptive, and the quality of a close relationship. Word
choice provides information about person perception (Semin & Fiedler, 1988). Certain
language clues give away relationships. Pronouns reveal how an individual is refer-
encing those in the interaction and outside of it. Word count explains who is dominating
the conversation and how engaged they are in the conversation. Assents and positive
emotion words measure levels of agreement. Other language cues are specific to the
interaction; here we offer a few situations that have been studied.
Status, Dominance, and Social Hierarchy
Higher-status individuals speak more often and freely make statements that involve
others. Lower-status language is more self-focused and tentative. In a study of groups
of three crew members, a captain, a first lieutenant, and a second lieutenant engaging
in several flight simulations, the use of greater first-person plural correlated with
higher rank (Sexton & Helmreich, 2000). The authors found the opposite pattern for
question marks: Higher-ranked crew members asked fewer questions compared with
lower-ranked crew members. Across five studies in which status was either experi-
mentally manipulated, determined by partner ratings, or based on existing titles,
increased use of first-person plural was a good predictor of higher status, and in four of
the studies increased use of first-person singular was a good predictor of lower status
(Kacewicz, Pennebaker, Davis, Jeon, & Graesser, 2009). Leshed, Hancock, Cosley,
McLeod, and Gay (2007) reported that members of small groups are rated as being
more involved and task focused by their teammates if they use more words, supporting
the assertion that total word count may also indicate status.
Social Coordination and Group Processes
More communication, more unity, and positive feedback may promote better group
performance. Word count can act as a proxy for amount of communication; in some
circumstances, more first-person plural may show group cohesion; and assents
and question marks show how individuals are responding to each other. In the
study of flight crews simulating easy and difficult flights, increased group word count,
increased use of first-person plural, and increased use of question marks in early simu-
lations predicted better team performance (Sexton & Helmreich, 2000). However,
groups of 4 to 6 participants working on a joint task that used less first-person plural
rated their group as having more group cohesion, although first-person plural was
unrelated to group performance (Gonzales, Hancock, & Pennebaker, in press). The
type of first-person plural pronoun may be important: if “we” is being used to promote
interdependence, as in “we can do this,” it may increase group cohesion; if, on the
other hand, it is being used to indirectly assign tasks, it may lead to resentment.
Increased use of assents (e.g., agree, OK, yes) could signal increased group consensus
and agreement; however, the timing of assents is important. Later in a group task, assents
may signal consensus; early assents may indicate blind agreement by unmotivated group
members (Leshed, Hancock, Cosley, McLeod, & Gay, 2007).
Honesty and Deception
Deceptive statements compared with truthful ones are moderately descriptive, dis-
tanced from self, and more negative. Newman, Pennebaker, Berry, and Richards
(2003) investigated lying behavior in five experiments; in each experiment, lying was
operationalized differently. Across the studies, when participants were lying, they used
more negative emotion words, more motion words (e.g., arrive, car, go), fewer exclusion
words, and less first-person singular. More motion words and fewer third-person pro-
nouns were also significant predictors of deception by prisoners instructed to lie or tell
the truth about videos they had watched (Bond & Lee, 2005). Hancock, Curry, Goorha,
and Woodworth (2008) expanded these findings to study lying within pairs of partici-
pants over instant messenger. They found a similar pattern of language use when a
participant was lying. They also found that the people being deceived, the partners of
the lying participants, changed their language as well. When one participant was lying,
both partners used a higher total word count, less first-person singular, and more sense words.
Motion, exclusion, and sense words all indicate the degree to which an individual
elaborated on the description of the scenario. Deceptive statements are balanced in
descriptiveness because enough description is required to convince the other person of
an untruthful statement but too much information might reveal inaccuracies. Using
different linguistic measures, researchers found that non-naïve individuals assigned to
be deceptive, compared with non-naïve individuals assigned to be truthful or naïve
individuals who were truthful, used some language features that showed less diversity
and complexity (Zhou, Burgoon, Nunamaker, & Twitchell, 2004). Exclusive words
are also a marker of complexity. Complexity may be reduced in deceptive speech
because of the cognitive load required to maintain a story that is contrary to experi-
ence, and the effort taken to try to convince someone else that something false is true.
Close Relationships
Pronoun use is very important in showing the quality of a close relationship, because
it shows how individuals are referring to each other. Surprisingly, first-person plural
(“we”) has not been found to be related to higher relationship quality; instead, use of
second person (“you”) is more important in predicting lower-quality relationships.
Simmons, Chambless, and Gordon (2008) found that use of second-person pronouns
was negatively related to relationship quality. They found in a study of relatives of
participants suffering from either obsessive–compulsive disorder or panic attacks with
agoraphobia that there were differences in the use of pronouns and that these differ-
ences signaled the extent to which they had a poor relationship with the patient.
Relatives who used more second person in a taped interview with the patient scored
higher on measures of criticism and having an overinvolved emotional reaction to the
patient’s condition. In this study, use of second person showed hostility and willingness
to confront the patient. A study of archived instant message conversations
between heterosexual romantic partners showed a marginal trend in which increased use of
second person by the male participant predicted lower ratings of relationship satisfaction
(Slatcher, Vazire, & Pennebaker, 2008). Researchers have hypothesized that
increased use of first-person plural in conversations between romantic partners should
lead to increased ratings of relationship satisfaction and stability. In fact, the study
of instant message transcripts of romantic partners showed that increased use of
first-person singular by the women led to higher ratings of satisfaction for both
individuals, whereas use of first-person plural was unrelated to satisfaction. Greater use of
positive emotion words by men led to increased relationship satisfaction as well.
These are only a few possible interactions and related language categories. Patterns
of language use are a rich tool in studying interactions, because so much of the inter-
play between individuals is carried out through language. However, language use
depends on the situational context. For example, in a cooperative coordination context,
higher total word count may signal better communication and agreement, whereas in
a negotiation context it may signal a breakdown in agreement.
Thinking Styles: Conjunctions, Nouns, Verbs,
and Cognitive Mechanisms
Language can track what information people are selecting from their environment by
monitoring attentional focus. By the same token, natural language use provides impor-
tant clues as to how people process that information and interpret it to make sense of
their environment. Thinking can vary in depth and complexity; this is reflected in the
words people use to connect thoughts. Language changes when people are actively
reevaluating a past event. It can also differ depending on the extent to which an event
has already been evaluated.
Depth of thinking can vary between people and situations; certain words can reveal
these differences. Cognitive complexity can be thought of as a richness of two compo-
nents of reasoning: the extent to which someone differentiates between multiple competing
solutions and the extent to which someone integrates among solutions (Tetlock, 1981).
These two processes are captured by two LIWC categories—exclusion words and con-
junctions. Exclusive words (e.g., but, without, exclude) are helpful in making distinctions.
Indeed, people use exclusion words when they are attempting to make a distinction
between what is in a category and what is not in a category. Exclusive words are used at
higher rates among people telling the truth (Newman et al., 2003) and by Gore compared
with Kerry and Edwards (Pennebaker, Slatcher, & Chung, 2005). Conjunctions (e.g.,
and, also, although) join multiple thoughts together and are important for creating a
coherent narrative (Graesser, McNamara, Louwerse, & Cai, 2004).
Prepositions (e.g., to, with, above), cognitive mechanisms (e.g., cause, know,
ought), and words greater than six letters are all also indicative of more complex lan-
guage. Prepositions, for example, signal that the speaker is providing more complex
and, often, concrete information about a topic. “The keys are in the box by the lamp
under the painting.” Within published journal articles, authors use more prepositions
in the discussion than the introduction or abstract. Discussions are often the most
complex part of an article because results must be integrated and differentiated from
past findings (Hartley, Pennebaker, & Fox, 2003).
The use of causal words (e.g., because, effect, hence) and insight words (e.g., think,
know, consider), two subcategories of cognitive mechanisms, in describing a past
event can suggest the active process of reappraisal. In a reanalysis of six expressive
writing studies, Pennebaker, Mayne, and Francis (1997) found that increasing use of
causal and insight words led to greater health improvements. This finding suggests
that changing from not processing to actively processing an event in combination with
emotional writing leads to better outcomes. In these experiments, increasing use of
causal and insight words may be analogous to making reconstrual statements. In other
work, use of reconstrual in combination with discussion of a traumatic event has
been shown to have the best health outcomes (Kross & Ayduk, 2008). Participants
describing a painful relationship breakup used more cognitive mechanism words,
particularly causal words, in describing the breakup and postbreakup compared with the
prebreakup (Boals & Klein, 2005). The authors argue that causal words are used in the
most traumatic parts, the breakup and postbreakup, because they are being used to
create causal explanations to organize the participant’s thoughts.
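One way to make the idea of "increasing use of causal and insight words" concrete is sketched below in Python. It is only an illustration under stated assumptions: the word lists hold just the example words quoted above rather than the LIWC categories, the function names are hypothetical, and the first-to-last-session difference is an assumed simplification, not the measure reported by Pennebaker, Mayne, and Francis (1997).

```python
import re

# Only the example words quoted in the text; the LIWC categories are larger.
CAUSAL = {"because", "effect", "hence"}
INSIGHT = {"think", "know", "consider"}


def cognitive_pct(text):
    """Percentage of words in a text that are causal or insight words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in CAUSAL or w in INSIGHT)
    return 100.0 * hits / len(words)


def change_across_sessions(sessions):
    """Last-session percentage minus first-session percentage.

    A positive value means causal and insight words became more frequent.
    """
    rates = [cognitive_pct(s) for s in sessions]
    return rates[-1] - rates[0]


# Hypothetical writing samples from two sessions by the same participant.
sessions = [
    "It happened and I felt awful.",
    "I think it happened because I was not ready.",
]
change = change_across_sessions(sessions)
```

In this toy example the first sample contains no causal or insight words and the second contains two ("think" and "because"), so the change is positive.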
The language that people use to discuss an event can reveal something about the extent
to which a story may have been established or is still being formed. When people are
uncertain or insecure about their topic, they use tentative language (e.g., maybe,
perhaps, guess) and more filler words (e.g., blah, I mean, you know). Participants who
recounted an event that they had already disclosed to someone else used fewer words
from the tentative category than participants who recounted an undisclosed event
(Pasupathi, 2007). Possibly, higher use of tentative words suggests that a partici-
pant has not yet processed an event and formed it into a story. Similarly, Beaudreau,
Storandt, and Strube (2006) found that in recounting a personal story younger partici-
pants used more filler words compared with older participants. However, there was no
difference in filler words when the two groups described a story based on a picture. In
this experiment, use of filler words may suggest the degree to which the story was well
formed; presumably, older participants had more perspective on the personal life events
and may have recounted them many more times than the younger participants.
Individual Differences
The self-focus, cognitive complexity, social references, and emotional tone inherent in
language use can help identify individual differences. These linguistic characteristics
differ with age, sex, personality, and mental health. Language use, like any behavioral
manifestation, can reflect individual differences. These language features can be used
to make predictions about individuals and also may underlie causal processes that
create some individual differences.
As people age, they become less self-focused, refer more to the moment, and do not
decline in verbal complexity. Pennebaker and Stone (2003) examined the writing of
participants of varying ages in emotional writing studies. In a second experiment, the
authors examined the texts of published authors across the span of their writing careers.
Across these two studies, first-person singular decreased with time, whereas insight
words, future tense verbs, and exclusive words increased. The authors observe these pat-
terns of language use both in studies of different individuals at different points in their
lives, and of authors over the course of their life. From the results, they reason that there
are shifts in self-focus as people age and, counter to expectations, attention to time is
more present and future oriented, and verbal complexity may increase or at least stay the
same as people age, evidenced by insight words and exclusive words.
Sex differences in language use show that women use more social words and refer-
ences to others, and men use more complex language. A meta-analysis of the texts
from many studies shows that the largest language differences between males and
females are in the complexity of the language used and the degree of social references
(Newman, Groom, Handelman, & Pennebaker, 2008). Males had higher use of large
words, articles, and prepositions. Females had higher use of social words, and pro-
nouns, including first-person singular and third-person pronouns. There were also
large effect sizes for use of swear words, feeling words, and present tense verbs. The
fact that there are predictable differences in language used between sexes makes it
possible to predict the sex of the user without knowledge of the true sex. What it means
when a participant uses sex-atypical language remains an open research question.
Studies measuring personality in participants through writing samples (Pennebaker
& King, 1999) and spoken dialogue (Mehl, Gosling, & Pennebaker, 2006) have shown
that some LIWC categories correspond with big-five personality traits. For example,
Mehl and colleagues found that for both males and females higher word count and fewer
large words predicted extraversion. Pennebaker and King showed that other LIWC cat-
egories showing complexity of language (such as articles, exclusive words, causal
words, and negations) were less frequent in the writing of people who scored high on
extraversion. Social and emotional language also differed with respect to extraversion;
people who scored high on extraversion used more social words, more positive emotion,
and less negative emotion. The findings from these two studies partially support tradi-
tional personality models. Models of extraversion would predict that extraverts engage
in more social interaction, and have a more positive response to that engagement. Also,
these models would predict that people high in extraversion would be less inhibited in
their language production, possibly leading to less complex language.
Depressed and suicidal individuals are more self-focused, express more negative
emotion, and sometimes use more death-related words. Studies on depression and suicide
show that language features can be markers of mental health. Depressed patients
are more likely to use more first-person singular and more negative emotion words
than participants who have never been depressed in emotional writings (Rude et al.,
2004). Suicidal poets in their published works compared with matched nonsuicidal
poets use more first-person singular and more death-related words (Stirman &
Pennebaker, 2001). This individual difference may show an attentional difference,
that is, more self-focus in response to emotional pain, or it may indicate a thinking
pattern that is a predilection for experiencing depression (see also work by Wolf,
Sedway, Bulik, & Kordy, 2007, dealing with the language of anorexia).
The function and emotion words people use provide important psychological cues to
their thought processes, emotional states, intentions, and motivations. We have sum-
marized some of the LIWC dimensions that reflect language correlates of attentional
focus, emotional state, social relationships, thinking styles, and individual differences.
This review is, by definition, brief and selective. Word use is highly contextual and
many of the findings may not hold with different groups of people or across a wide
range of settings. Most of the research results have come from labs in the United
States working with college-aged students, often in highly contrived settings. Very
little work has explored the differences between spoken and written language.
As can be seen in the appendix, an increasing number of studies are beginning to
link daily word use to broader social and psychological processes. What is most strik-
ing has been the relatively fast growth of the language–behavior research endeavor.
The connections between language and social psychology are changing at an accel-
erating rate. When journals such as the Journal of Language and Social Psychology
were founded, most research was based on written text or transcriptions of spoken
text, all of which were hand-typed, hand-scored, and stored in a filing cabinet for later
analyses. Researchers interested in language and social processes have historically
been trained in laboratory methods whereby participants were run, one at a time, in
highly controlled settings to best capture the links between language use, cognitive
processing, and communication dynamics.
Innovations in word analysis—as exemplified by Google and Yahoo—are challeng-
ing the social psychological methodologies most of us have grown up with. In the
amount of time it takes to run a single participant in a social psychology language study,
we can now download thousands of personal writings, interaction transcripts, or other
forms of text that can be analyzed in seconds. The Internet world provides a far more
diverse population from which to draw as well as access to a wide range of languages.
The availability of natural language use and our computational resources are trans-
forming language analysis and modern social science. LIWC represents only a
transitional text analysis program in the shift from traditional language analysis to a
new era of language analysis. Newer text analysis will be able to analyze more com-
plex language structure while retaining LIWC’s transparency. Studies have begun to
look at n-grams, groups of two or more words together, in the same way we have used
LIWC to look at frequencies of single words (Oberlander & Gill, 2006). Text analysis
methods should also increase in flexibility, allowing the researcher to examine lan-
guage categories specific to his or her research program. New techniques to
automatically extract conceptually related words should be expanded to incorporate
related patterns of language style with related content words. From research using
LIWC, it has become clear that language style information is critical to understanding
a person’s state of mind.
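To illustrate what counting n-grams involves, the following Python sketch tallies groups of n adjacent words in the same way single words are tallied. It is a minimal sketch, not the method of Oberlander and Gill (2006): the function name ngram_counts is hypothetical, the tokenizer is simplified, and n-grams are allowed to cross sentence boundaries.

```python
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count each run of n adjacent words in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    # zip over n shifted copies of the word list yields every window of n words.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


counts = ngram_counts("we can do this and we can do more", n=2)
most_frequent = counts.most_common(2)
```

With n=1 the same function reduces to single-word frequencies, which makes the parallel with word counting explicit.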
Research using these new text analysis methods will also be expanded to capture
cultural differences mirrored in language use. Language style conveys subtle information
about social relations. The relevant social information can vary greatly between
languages and cultures (cf. Maass, Karasawa, Politi, & Suga, 2006). Indeed, some of
the most striking cultural differences in language—such as markers of politeness, for-
mality, and social closeness—are inherent in function words rather than content words
(Boroditsky, Schmidt, & Phillips, 2003).
We are standing on the threshold of a new era of language analysis. One can easily
imagine how Jim Bradac would have celebrated the possibilities of tracking natural lan-
guage across hundreds of millions of people and an unknown number of contexts. The
expanding galaxy of computer-based text analysis methods has the potential to add to
our current ways of thinking about language and, in Bradac’s (1999) words, “burn ever
brighter and illuminate the universe increasingly from their different places” (p. 11).
Appendix
Summary Table Linking LIWC Word Categories to Published Research Studies
Linguistic processes
Word count
Dictionary words
Words >6 letters
Total function words
Total pronouns
Personal pronouns
First-person singular
First-person plural
Second person
Third-person plural
(Percentage of all words
captured by the
(Percentage of all words
longer than 6 letters)
I, them, itself
I, them, her
I, me, mine
We, us, our
You, your, thou
She, her, him
They, their, they’d
Words in Category
Psychological Correlates
Talkativeness, verbal fluency
Verbal fluency, cognitive
Informal, nontechnical
Education, social class
Informal, personal
Personal, social
Honest, depressed,
low status, personal,
emotional, informal
Detached, high status,
socially connected to
group (sometimes)
Social, elevated status
Social interests, social
Social interests, out-group
awareness (sometimes)
Published Articles
2, 9, 18, 19, 20, 24, 32, 35, 36, 39, 40, 48,
53, 54, 57, 60, 66, 70, 72, 73, 74, 86,
89, 103, 115
3, 7, 39, 43
19, 42, 43, 65, 66, 85, 89
3, 19, 20, 27, 35, 36, 42, 43, 73, 74, 79,
89, 90, 93, 103, 115
1, 19, 36, 43, 55, 89, 90, 119
58, 79
1, 3, 4, 5, 11, 13, 18, 27, 35, 36, 46, 55,
56, 64, 65, 66, 68, 69, 72, 73, 74, 78,
80, 81, 87, 89, 90, 92, 93, 94, 100, 101,
105, 108, 109, 112, 113, 115
1, 4, 13, 18, 35, 46, 55, 64, 65, 74, 78, 81,
87, 90, 93, 94, 97, 100, 103, 104, 105,
106, 113
1, 18, 27, 41, 55, 90, 100, 105, 106
1, 3, 14, 36, 39, 55, 64, 66, 80, 87, 88,
90, 95
1, 3, 14, 39, 55, 64, 80, 87, 88, 95
Indefinite pronouns
Common verbs
Auxiliary verbs
Past tense
Present tense
Future tense
Swear words
Psychological processes
Social processes
Affective processes
It, it’s, those
A, an, the
Walk, went, see
Am, will, have
Went, ran, had
Is, does, hear
Will, gonna
Very, really, quickly
To, with, above
And, but, whereas
No, not, never
Few, many, much
Second, thousand
Damn, piss, fuck
Mate, talk, they, child
Daughter, husband
Buddy, friend, neighbor
Adult, baby, boy
Happy, cried, abandon
Words in Category
Psychological Correlates
Use of concrete nouns,
interest in objects and
Informal, passive voice
Focus on the past
Living in the here and now
Future and goal oriented
Education, concern with
Informal, aggression,
Social concerns, social
Published Articles
19, 36, 43, 74, 79, 80, 89, 92, 115
58, 79
1, 13, 37, 62, 68, 73, 79, 87, 89, 91, 93,
13, 36, 37, 42, 62, 68, 73, 87, 89, 90, 93,
13, 26, 37, 41, 62, 64, 76, 90, 93, 114
43, 79, 89, 92, 115
24, 39, 40, 48, 79, 89, 90, 114, 115
19, 79
58, 73, 74, 81, 98
1, 18, 23, 27, 32, 35, 41, 55, 78, 79, 85,
88, 89, 90, 93, 95, 97, 115, 116
18, 95
18, 95
1, 11
12, 27, 28, 32, 33, 34, 40, 44, 50, 54, 57,
58, 60, 62, 69, 77, 85, 86, 119
Positive emotion
Negative emotion
Cognitive processes
Love, nice, sweet
Hurt, ugly, nasty
Worried, nervous
Hate, kill, annoyed
Crying, grief, sad
Cause, know, ought
Think, know, consider
Because, effect, hence
Should, would, could
Maybe, perhaps, guess
Always, never
Block, constrain, stop
And, with, include
But, without, exclude
Words in Category
Psychological Correlates
Social/verbal skills,
emotional stability
Cognitive complexity,
Published Articles
2, 3, 4, 5, 6, 8, 10, 12, 15, 17, 21, 22, 23, 25, 28, 30,
31, 33, 36, 37, 38, 41, 45, 46, 47, 48, 49, 50, 51,
53, 54, 55, 57, 59, 60, 61, 62, 64, 66, 67, 68, 69,
70, 71, 73, 74, 75, 76, 77, 81, 82, 85, 89, 91, 93,
94, 96, 99, 107, 108, 109, 110, 113, 115, 117, 118
2, 3, 4, 6, 10, 12, 13, 16, 17, 20, 21, 22, 25, 28, 29,
30, 31, 33, 35, 37, 40, 44, 45, 46, 47, 48, 50, 51,
52, 53, 55, 57, 59, 61, 62, 63, 64, 66, 67, 70, 71,
72, 73, 74, 76, 79, 80, 81, 82, 84, 85, 89, 91, 92,
93, 94, 96, 99, 102, 107, 113, 115, 117, 119, 121
6, 28, 50, 66, 68, 77, 84, 85, 92
6, 28, 33, 50, 58, 66, 72, 74, 92
6, 28, 33, 38, 50, 63, 66, 77, 84, 90
2, 3, 5, 8, 13, 18, 21, 23, 31, 32, 34, 46, 47, 49, 55,
58, 61, 68, 69, 71, 75, 83, 84, 85, 86, 89, 92, 93,
102, 104, 119, 120
1, 4, 18, 19, 25, 35, 37, 45, 53, 59, 68, 73, 76, 89, 90,
91, 92, 93, 97, 99, 111, 113, 115, 118, 119, 121
10, 13, 16, 20, 35, 37, 39, 45, 53, 72, 76, 89, 90,
91, 93, 97, 99, 115, 121, 122
10, 16, 18, 19, 49, 63, 74, 89, 115
18, 19, 24, 37, 38, 49, 73, 87, 89, 98, 115
1, 16, 18, 19, 49, 90, 111
41, 60, 73, 74, 89, 115
24, 49, 73, 80, 89, 92, 93, 115
Perceptual processes
Biological processes
Personal concerns
Spoken categories
Observing, heard, feeling
View, saw, seen
Listen, hearing
Feels, touch
Eat, blood, pain
Cheek, hands, spit
Clinic, flu, pill
Horny, love, incest
Dish, eat, pizza
Area, bend, go
Arrive, car, go
Down, in, thin
End, until, season
Job, majors, xerox
Earn, hero, win
Cook, chat, movie
Apartment, kitchen,
Audit, cash, owe
Altar, church, mosque
Bury, coffin, kill
Agree, OK, yes
Er, hm, umm
Blah, Imean, yaknow
Words in Category
Psychological Correlates
Agreement, passivity
Informal, Unprepared
Published Articles
14, 37, 120
13, 41
13, 88
34, 36, 37, 49, 116
36, 94, 96, 112
68, 94
49, 110
14, 37, 80
14, 120
1, 13, 41, 64, 93, 119, 120
36, 60, 103
41, 94
1, 2, 4, 35, 64, 68, 91, 94
48, 60, 81
9, 74
References Cited in the Table
1. Alexander-Emery, S., Cohen, L. M., & Prensky, E. H. (2005). Linguistic analysis of
college aged smokers and never smokers. Journal of Psychopathology and Behavioral
Assessment, 27, 11-16.
2. Alvarez-Conrad, J., Zoellner, L. A., & Foa, E. B. (2001). Linguistic predictors of trauma
pathology and physical health. Applied Cognitive Psychology, 15, 159-170.
3. Arguello, J., Butler, B. S., Joyce, E., Kraut, R., Ling, K. S., Rosé, C., et al. (2006). Talk to
me: Foundations for successful individual-group interactions in online communities.
In Proceedings of the CHI’06 conference on human factors in computing systems
(pp. 959-968). New York: Association for Computing Machinery Press.
4. Baddeley, J. L., & Singer, J. A. (2008). Telling losses: Functions and personality correlates
of bereavement narratives. Journal of Research in Personality, 42, 421-438.
5. Baikie, K. A., Wilhelm, K., Johnson, B., Boskovic, M., Wedgwood, L., Finch, A., et al.
(2006). Expressive writing for high-risk drug dependent patients in a primary care clinic:
A pilot study. Harm Reduction Journal, 3, 34-42.
6. Bantum, E. O., & Owen, J. E. (2009). Evaluating the validity of computerized content
analysis programs for identification of emotional expression in cancer narratives.
Psychological Assessment, 21, 79-88.
7. Barnes, D. H. (2007). Letters from a suicide. Death Studies, 31, 671-678.
8. Batten, S. V., Follette, V. M., Rasmussen Hall, M. L., & Palm, K. M. (2002). Physical
and psychological effects of written disclosure among sexual abuse survivors. Behavior
Therapy, 33, 107-122.
9. Beaudreau, S. A., Storandt, M., & Strube, M. J. (2006). A comparison of narratives told
by younger and older adults. Experimental Aging Research, 32, 105-117.
10. Beevers, C. G., & Scott, W. D. (2001). Ignorance may be bliss, but thought suppression
promotes superficial cognitive processing. Journal of Research in Personality, 35,
11. Block-Lerner, J., Adair, C., Plumb, J. C., Rhatigan, D. L., & Orsillo, S. M. (2007). The
case for mindfulness-based approaches in the cultivation of empathy: Does nonjudgmental,
present-moment awareness increase capacity for perspective-taking and empathic
concern? Journal of Marital & Family Therapy, 33, 501-516.
12. Blonder, L. X., Heilman, K. M., Ketterson, T., Rosenbek, J., Raymer, A., Crosson, B.,
et al. (2005). Affective facial and lexical expression in aprosodic versus aphasic stroke
patients. Journal of the International Neuropsychological Society, 11, 677-685.
13. Boals, A., & Klein, K. (2005). Word use in emotional narratives about failed romantic
relationships and subsequent mental health. Journal of Language and Social Psychology,
24, 252-268.
14. Bond, G. D., & Lee, A. Y. (2005). Language of lies in prison: Linguistic classification
of prisoners’ truthful and deceptive natural language. Applied Cognitive Psychology,
19, 313-329.
15. Bono, J. E., & Ilies, R. (2006). Charisma, positive emotions and mood contagion. The
Leadership Quarterly, 17, 317-334.
16. Brett, J. M., Olekalns, M., Friedman, R., Goates, N., Anderson, C., & Lisco, C. C.
(2007). Sticks and stones: Language, face, and online dispute resolution. Academy of
Management Journal, 50, 85-99.
17. Broderick, J. E., Junghaenel, D. U., & Schwartz, J. E. (2005). Written emotional
expression produces health benefits in fibromyalgia patients. Psychosomatic Medicine,
67, 326-334.
18. Burke, P. A., & Dollinger, S. J. (2005). A picture’s worth a thousand words: Language use
in autophotographic essay. Personality and Social Psychology Bulletin, 31, 536-548.
19. Centerbar, D. B., Schnall, S., Clore, G. L., & Garvin, E. D. (2008). Affective incoherence:
When affective concepts and embodied reactions clash. Journal of Personality and Social
Psychology, 94, 560-578.
20. Chung, C. K., & Pennebaker, J. W. (2008). Variations in the spacing of expressive writing
sessions. British Journal of Health Psychology, 13, 15-21.
21. Cohen, A. S., Minor, K. S., Baillie, L. E., & Dahir, A. M. (2008). Clarifying the linguistic
signature: Measuring personality from natural speech. Journal of Personality Assessment,
90, 559-563.
22. Cohn, M. A., Mehl, M. R., & Pennebaker, J. W. (2004). Linguistic markers of psychological
change surrounding September 11, 2001. Psychological Science, 15, 687-693.
23. Corter, A. L., & Petrie, K. J. (2008). Expressive writing in context: The effects of a
confessional setting and delivery of instructions on participant experience and language
in writing. British Journal of Health Psychology, 13, 27-30.
24. Creswell, J. D., Lam, S., Stanton, A. L., Taylor, S. E., Bower, J. E., & Sherman, D. K.
(2007). Does self-affirmation, cognitive processing, or discovery of meaning explain
cancer-related health benefits of expressive writing? Personality and Social Psychology
Bulletin, 33, 238-250.
25. DiNardo, A. C., Schober, M. F., & Stuart, J. (2005). Chair and couch discourse: A study
of visual copresence in psychoanalysis. Discourse Processes, 40, 209-238.
26. Dino, A., Reysen, S., & Branscombe, N. R. (2009). Online interactions between group
members who differ in status. Journal of Language and Social Psychology, 28, 85-94.
27. Djikic, M., Oatley, K., & Peterson, J. B. (2006). The bitter-sweet labor of emoting: The
linguistic comparison of writers and physicists. Creativity Research Journal, 18, 191-197.
28. D’Souza, P., Lumley, M., Kraft, C., & Dooley, J. (2008). Relaxation training and written
emotional disclosure for tension or migraine headaches: A randomized, controlled trial.
Annals of Behavioral Medicine, 36, 21-32.
29. Eid, J., Johnsen, B., Helge, R. N., & Saus, E. R. (2005). Trauma narratives and emotional
processing. Scandinavian Journal of Psychology, 46, 503-510.
30. Epstein, E. M., Sloan, D. M., & Marx, B. P. (2005). Getting to the heart of the matter:
Written disclosure, gender, and heart rate. Psychosomatic Medicine, 67, 413-419.
31. Friedman, S. R., Rapport, L. J., Lumley, M., Tzelepis, A., VanVoorhis, A., Stettner, L.,
et al. (2003). Aspects of social and emotional competence in adult attention-deficit/
hyperactivity disorder. Neuropsychology, 17, 50-58.
32. Gill, A. J., French, R. M., Gergle, D., & Oberlander, J. (2008). The language of emotion
in short blog texts. In Proceedings of the CSCW’08 computer supported cooperative
work (pp. 299-302). New York: Association for Computing Machinery Press.
33. Gillis, M. E., Lumley, M. A., Mosley-Williams, A., Leisen, J. C. C., & Roehrs, T. (2006).
The health effects of at-home written emotional disclosure in fibromyalgia: A randomized
trial. Annals of Behavioral Medicine, 32, 135-146.
34. Gortner, E. M., & Pennebaker, J. W. (2003). The archival anatomy of a disaster: Media
coverage and community-wide health effects of the Texas A&M bonfire tragedy. Journal
of Social and Clinical Psychology, 22, 580-603.
35. Groom, C. J., & Pennebaker, J. W. (2005). The language of love: Sex, sexual orientation,
and language use in online personal advertisements. Sex Roles, 52, 447-461.
36. Guastella, A. J., & Dadds, M. R. (2006). Cognitive-behavioral models of emotional
writing: A validation study. Cognitive Therapy and Research, 30, 397-414.
37. Hamilton-West, K. E. (2007). Effects of written emotional disclosure on health outcomes
in patients with ankylosing spondylitis. Psychology & Health, 22, 637-657.
38. Hancock, J. T., Curry, L. E., Goorha, S., & Woodworth, M. (2008). On lying and
being lied to: A linguistic analysis of deception in computer-mediated communication.
Discourse Processes, 45, 1-23.
39. Hancock, J. T., Landrigan, C., & Silver, C. (2007). Expressing emotion in text-based
communication. In Proceedings of the CHI’07 conference on human factors in computing
systems (pp. 929-932). New York: Association for Computing Machinery Press.
40. Handelman, L. D., & Lester, D. (2007). The content of suicide notes from attempters and
completers. Crisis, 28, 102-104.
41. Hartley, J. (2003). Improving the clarity of journal abstracts in psychology: The case for
structure. Science Communication, 24, 366-379.
42. Hartley, J., Pennebaker, J. W., & Fox, C. (2003). Abstracts, introductions and discussions:
How far do they differ in style? Scientometrics, 57, 389-398.
43. Heberlein, A. S., Adolphs, R., Pennebaker, J. W., & Tranel, D. (2003). Effects of damage
to right-hemisphere brain structures on spontaneous emotional and social judgments.
Political Psychology, 24, 705-726.
44. Hemenover, S. H. (2003). The good, the bad, and the healthy: Impacts of emotional
disclosure of trauma on resilient self-concept and psychological distress. Personality and
Social Psychology Bulletin, 29, 1236-1244.
45. Hoyt, T., & Pasupathi, M. (2008). Blogging about trauma: Linguistic measures of
apparent recovery [Electronic version]. Journal of Applied Psychology, 4.
46. Jones, S. M., & Wirtz, J. G. (2006). How does the comforting process work? An
empirical test of an appraisal-based model of comforting. Human Communication
Research, 32, 217-243.
47. Joyce, E., & Kraut, R. E. (2006). Predicting continued participation in newsgroups.
Journal of Computer-Mediated Communication, 11, 723-747.
48. Junghaenel, D. U., Smyth, J. M., & Santner, L. (2008). Linguistic dimensions of
psychopathology: A quantitative analysis. Journal of Social and Clinical Psychology,
27, 36-55.
49. Kahn, J. H., Tobin, R. M., Massey, A. E., & Anderson, J. A. (2007). Measuring emotional
expression with the Linguistic Inquiry and Word Count. American Journal of Psychology,
120, 263-286.
50. Kiesler, S., Lee, S., & Kramer, A. D. I. (2006). Relationship effects in psychological
explanations of nonhuman behavior. Anthrozoos, 19, 335-352.
51. King, E. B., Shapiro, J. R., Hebl, M. R., Singletary, S. L., & Turner, S. (2006). The stigma of
obesity in customer service: A mechanism for remediation and bottom-line consequences
of interpersonal discrimination. Journal of Applied Psychology, 91, 579-593.
52. Klein, K., & Boals, A. (2001). Expressive writing can increase working memory capacity.
Journal of Experimental Psychology: General, 130, 520-533.
53. Knight, J. L., & Hebl, M. R. (2005). Affirmative reaction: The influence of type of
justification on nonbeneficiary attitudes toward affirmative action plans in higher
education. Journal of Social Issues, 61, 547-568.
54. Kramer, A. D. I., Oh, L. M., & Fussell, S. R. (2006). Using linguistic features to
measure presence in computer-mediated communication. In Proceedings of the CHI’06
conference on human factors in computing systems (pp. 913-916). New York: Association
for Computing Machinery Press.
55. Kross, E., & Ayduk, O. (2008). Facilitating adaptive emotional analysis: Distinguishing
distanced-analysis of depressive experiences from immersed-analysis and distraction.
Personality and Social Psychology Bulletin, 34, 924-938.
56. Lambie, J. A., & Baker, K. L. (2003). Intentional avoidance and social
understanding in repressors and nonrepressors: Two functions for emotion experience?
Consciousness and Emotion, 4, 17-42.
57. Lee, C. H., Kim, K., Seo, Y. S., & Chung, C. K. (2007). The relations between personality
and language use. Journal of General Psychology, 134, 405-413.
58. Lepore, S. J. (1997). Expressive writing moderates the relation between intrusive thoughts
and depressive symptoms. Journal of Personality and Social Psychology, 73, 1030-1037.
59. Leshed, G., Hancock, J. T., Cosley, D., McLeod, P. L., & Gay, G. (2007). Feedback for
guiding reflection on teamwork practices. In Proceedings of the GROUP’07 conference
on supporting group work (pp. 217-220). New York: Association for Computing
Machinery Press.
60. Lieberman, M. A. (2008). Effects of disease and leader type on moderators in online
support groups. Computers in Human Behavior, 24, 2446-2455.
61. Liehr, P., Takahashi, R., Nishimura, C., Frazier, L., Kuwajima, I., & Pennebaker, J. W.
(2002). Expressing health experience through embodied language. Journal of Nursing
Scholarship, 34, 27-32.
62. Liess, A., Simon, W., Yutsis, M., Owen, J. E., Piemme, K. A., Golant, M., et al. (2008).
Detecting emotional expression in face-to-face and online breast cancer support groups.
Journal of Consulting and Clinical Psychology, 76, 517-523.
63. Lightman, E. J., McCarthy, P. M., Dufty, D. F., & McNamara, D. S. (2007). Using
computational text analysis tools to compare the lyrics of suicidal and non-suicidal
songwriters. In D. S. McNamara & G. Trafton (Eds.), Proceedings of the 29th Annual
Cognitive Science Society. Hillsdale, NJ: Erlbaum.
64. Lillard, A., Nishida, T., Massaro, D., Vaish, A., Ma, L., & McRoberts, G. (2007). Signs
of pretense across age and scenario. Infancy, 11, 1-30.
65. Lockenhoff, C. E., Costa, P. T., Jr., & Lane, R. D. (2008). Age differences in descriptions
of emotional experiences in oneself and others. Journals of Gerontology Series B:
Psychological Sciences and Social Sciences, 63, 92-99.
66. Luterek, J. A., Orsillo, S. M., & Marx, B. P. (2005). An experimental examination
of emotional experience, expression, and disclosure in women reporting a history of
childhood sexual abuse. Journal of Traumatic Stress, 18, 237-244.
67. Lyons, E. J., Mehl, M. R., & Pennebaker, J. W. (2006). Pro-anorexics and recovering
anorexics differ in their linguistic Internet self-presentation. Journal of Psychosomatic
Research, 60, 253-256.
68. Mackenzie, C. S., Wiprzycka, U. J., Hasher, L., & Goldstein, D. (2007). Does expressive
writing reduce stress and improve health for family caregivers of older adults? The
Gerontologist, 47, 296-306.
69. Manne, S. (2002). Language use and post-traumatic stress symptomatology in parents of
pediatric cancer survivors. Journal of Applied Social Psychology, 32, 608-629.
70. McCullough, M. E., Root, L. M., & Cohen, A. D. (2006). Writing about the benefits of
an interpersonal transgression facilitates forgiveness. Journal of Consulting and Clinical
Psychology, 74, 887-897.
71. Mehl, M. R. (2006). The lay assessment of subclinical depression in daily life.
Psychological Assessment, 18, 340-345.
72. Mehl, M. R., Gosling, S. D., & Pennebaker, J. W. (2006). Personality in its natural
habitat: Manifestations and implicit folk theories of personality in daily life. Journal of
Personality and Social Psychology, 90, 862-877.
73. Mehl, M. R., & Pennebaker, J. W. (2003). The sounds of social life: A psychometric
analysis of students’ daily social environments and natural conversations. Journal of
Personality and Social Psychology, 84, 857-870.
74. van Middendorp, H., & Geenen, R. (2008). Poor cognitive-emotional processing may
impede the outcome of emotional disclosure interventions. British Journal of Health
Psychology, 13, 49-52.
75. van Middendorp, H., Sorbi, M. J., van Doornen, L. J. P., Bijlsma, J. W. J., & Geenen, R.
(2007). Feasibility and induced cognitive-emotional change of an emotional disclosure
intervention adapted for home application. Patient Education and Counseling, 66, 177-187.
76. Morgan, N. P., Graves, K. D., Poggi, E. A., & Cheson, B. D. (2008). Implementing an
expressive writing study in a cancer clinic. The Oncologist, 13, 196-204.
77. Neff, K. D., Kirkpatrick, K. L., & Rude, S. S. (2007). Self-compassion and adaptive
functioning. Journal of Research in Personality, 41, 139-154.
78. Newman, M. L., Groom, C. J., Handelman, L. D., & Pennebaker, J. W. (2008). Gender
differences in language use: An analysis of 14,000 text samples. Discourse Processes,
45, 211-236.
79. Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words:
Predicting deception from linguistic styles. Personality and Social Psychology Bulletin,
29, 665-675.
80. Oliver, E. J., Markland, D., Hardy, J., & Petherick, C. M. (2008). The effects of
autonomy-supportive versus controlling environments on self-talk. Motivation and
Emotion, 32, 200-212.
81. Orsillo, S. M., Batten, S. V., Plumb, J. C., Luterek, J. A., & Roessner, B. M. (2004). An
experimental study of emotional responding in women with posttraumatic stress disorder
related to interpersonal violence. Journal of Traumatic Stress, 17, 241-248.
82. Owen, J. E., Giese-Davis, J., Cordova, M., Kronenwetter, C., Golant, M., & Spiegel,
D. (2006). Self-report and linguistic indicators of emotional expression in narratives as
predictors of adjustment to cancer. Journal of Behavioral Medicine, 29, 335-345.
83. Owen, J. E., Klapow, J. C., Roth, D. L., Shuster, J. L., Bellis, J., Meredith, R., et al.
(2005). Randomized pilot of a self-guided Internet coping group for women with early-
stage breast cancer. Annals of Behavioral Medicine, 30, 54-64.
84. Owen, J. E., Klapow, J. C., Roth, D. L., & Tucker, D. C. (2004). Use of the internet
for information and support: disclosure among persons with breast and prostate cancer.
Journal of Behavioral Medicine, 27, 491-505.
85. Owen, J. E., Yarbrough, E. J., Vaga, A., & Tucker, D. C. (2003). Investigation of the
effects of gender and preparation on quality of communication in Internet support groups.
Computers in Human Behavior, 19, 259-275.
86. Pasupathi, M. (2007). Telling and the remembered self: Linguistic differences in memories
for previously disclosed and previously undisclosed events. Memory, 15, 258-270.
87. Pennebaker, J. W., Groom, C. J., Loew, D., & Dabbs, J. M. (2004). Testosterone as a
social inhibitor: two case studies of the effect of testosterone treatment on language.
Journal of Abnormal Psychology, 113, 172-175.
88. Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual
difference. Journal of Personality and Social Psychology, 77, 1296-1312.
89. Pennebaker, J. W., & Lay, T. C. (2002). Language use and personality during crises:
Analyses of Mayor Rudolph Giuliani’s press conferences. Journal of Research in
Personality, 36, 271-282.
90. Pennebaker, J. W., Mayne, T. J., & Francis, M. E. (1997). Linguistic predictors of
adaptive bereavement. Journal of Personality and Social Psychology, 72, 863-871.
91. Pennebaker, J. W., Slatcher, R. B., & Chung, C. K. (2005). Linguistic markers of
psychological state through media interviews: John Kerry and John Edwards in 2004, Al
Gore in 2000. Analyses of Social Issues and Public Policy, 5, 197-204.
92. Pennebaker, J. W., & Stone, L. D. (2003). Words of wisdom: Language use over the life
span. Journal of Personality and Social Psychology, 85, 291-301.
93. Pennebaker, J. W., & Stone, L. D. (2004). What was she trying to say? A linguistic
analysis of Katie’s diaries. In D. Lester (Ed.), Katie’s diary: Unlocking the mystery of a
suicide (pp. 55-80). New York: Brunner-Routledge.
94. Pressman, S. D., & Cohen, S. (2007). Use of social words in autobiographies and
longevity. Psychosomatic Medicine, 69, 262-269.
95. Rellini, A. H., & Meston, C. M. (2007). Sexual desire and linguistic analysis: A
comparison of sexually-abused and non-abused women. Archives of Sexual Behavior,
36, 67-77.
96. Rew, L. (2007). A linguistic investigation of mediators between religious commitment and
health behaviors in older adolescents. Issues in Comprehensive Pediatric Nursing, 30, 71-86.
97. Robertson, K., & Murachver, T. (2006). Intimate partner violence: Linguistic features
and accommodation behavior of perpetrators and victims. Journal of Language and
Social Psychology, 25, 406-422.
98. Rogers, L. J., Wilson, K. G., Gohm, C. L., & Merwin, R. M. (2007). Revisiting written
disclosure: The effects of warm versus cold experimenters. Journal of Social and Clinical
Psychology, 26, 556-574.
99. Rohrbaugh, M. J., Mehl, M. R., Shoham, V., Reilly, E. S., & Ewy, G. A. (2008). Prognostic
significance of spouse we talk in couples coping with heart failure. Journal of Consulting
and Clinical Psychology, 76, 781-789.
100. Rude, S., Gortner, E. M., & Pennebaker, J. (2004). Language use of depressed and
depression-vulnerable college students. Cognition & Emotion, 18, 1121-1133.
101. Schwartz, L., & Drotar, D. (2004). Linguistic analysis of written narratives of caregivers
of children and adolescents with chronic illness: Cognitive and emotional processes and
physical and psychological health outcomes. Journal of Clinical Psychology in Medical
Settings, 11, 291-301.
102. Sexton, J. B., & Helmreich, R. L. (2000). Analyzing cockpit communications: The
links between language, performance, and workload. Human Performance in Extreme
Environments, 5, 63-68.
103. Sharp, W. G., & Hargrove, D. S. (2004). Emotional expression and modality: An analysis
of affective arousal and linguistic output in a computer versus paper paradigm. Computers
in Human Behavior, 20, 461-475.
104. Simmons, R. A., Chambless, D. L., & Gordon, P. C. (2008). How do hostile and
emotionally overinvolved relatives view relationships? What relatives’ pronoun use tells
us. Family Process, 47, 405-419.
105. Simmons, R. A., Gordon, P. C., & Chambless, D. L. (2005). Pronouns in marital
interaction. Psychological Science, 16, 932-936.
106. Slatcher, R. B., & Pennebaker, J. W. (2006). How do I love thee? Let me count the words:
The social effects of expressive writing. Psychological Science, 17, 660-664.
107. Slatcher, R. B., Vazire, S., & Pennebaker, J. W. (2008). Am “I” more important than
“we”? Couples’ word use in instant messages. Personal Relationships, 15, 407-424.
108. Sloan, D. M. (2005). It’s all about me: Self-focused attention and depressed mood.
Cognitive Therapy and Research, 29, 279-288.
109. Soliday, E., Garofalo, J. P., & Rogers, D. (2004). Expressive writing intervention for
adolescents’ somatic symptoms and mood. Journal of Clinical Child and Adolescent
Psychology, 33, 792-801.
110. Stephenson, G. M., Laszlo, J., Ehmann, B., Lefever, R. M. H., & Lefever, R. (1997). Diaries
of significant events: Socio-linguistic correlates of therapeutic outcomes in patients with
addiction problems. Journal of Community and Applied Social Psychology, 7, 389-411.
111. Stirman, S. W., & Pennebaker, J. W. (2001). Word use in the poetry of suicidal and
nonsuicidal poets. Psychosomatic Medicine, 63, 517-522.
112. Stone, L. D., & Pennebaker, J. W. (2002). Trauma in real time: Talking and avoiding
online conversations about the death of Princess Diana. Basic and Applied Social
Psychology, 24, 173-183.
113. Swaab, R. I., Phillips, K. W., Diermeier, D., & Husted Medvec, V. (2008). The pros and
cons of dyadic side conversations in small groups: The impact of group norms and task
type. Small Group Research, 39, 372-390.
114. Taylor, P. J., & Thomas, S. (2008). Linguistic style matching and negotiation outcome.
Negotiation and Conflict Management Research, 1, 263-281.
115. Tsai, J. L., Simeonova, D. I., & Watanabe, J. T. (2004). Somatic and social: Chinese Americans
talk about emotion. Personality and Social Psychology Bulletin, 30, 1226-1238.
116. Tull, M. T., Medaglia, E., & Roemer, L. (2005). An investigation of the construct validity
of the 20-Item Toronto Alexithymia Scale through the use of a verbalization task. Journal
of Psychosomatic Research, 59, 77-84.
117. VandeCreek, L., Janus, M. D., Pennebaker, J. W., & Binau, B. (2002). Praying about
difficult experiences as self-disclosure to God. International Journal for the Psychology
of Religion, 12, 29-39.
118. Vedhara, K., Morris, R. M., Booth, R., Horgan, M., Lawrence, M., & Birchall, N. (2007).
Changes in mood predict disease activity and quality of life in patients with psoriasis
following emotional disclosure. Journal of Psychosomatic Research, 62, 611-619.
119. Vrij, A., Mann, S., Kristen, S., & Fisher, R. P. (2007). Cues to deception and ability to detect
lies as a function of police interview styles. Law and Human Behavior, 31, 499-518.
120. Warner, L. J., Lumley, M. A., Casey, R. J., Pierantoni, W., Salazar, R., Zoratti, E. M., et al.
(2006). Health effects of written emotional disclosure in adolescents with asthma: A
randomized, controlled trial. Journal of Pediatric Psychology, 31, 557-568.
121. Watkins, E. (2004). Adaptive and maladaptive ruminative self-focus during emotional
processing. Behaviour Research and Therapy, 42, 1037-1052.
Authors’ Note
The original version of this article was presented as part of the James J. Bradac Memorial
Lecture at the University of California at Santa Barbara in 2008.
Declaration of Conflicting Interests
The text analysis program, LIWC, is a commercial product co-owned by Pennebaker. Proceeds
from his share of the profits are all donated to the University of Texas at Austin. The authors
declared no other conflicts of interests with respect to authorship and/or publication of this
article.
Funding
The authors disclosed receipt of the following financial support for the research and/or authorship
of this article:
Army Research Institute (W91WAW-07-C002), DOD-CIFA (H9C104-07-C0019), and
Sandia National Laboratories (26-3963-70).
References
Alpers, G. W., Winzelberg, A. J., Classen, C., Roberts, H., Dev, P., Koopman, C., et al. (2005).
Evaluation of computerized text analysis in an Internet breast cancer support group. Computers
in Human Behavior, 21, 361-376.
Beaudreau, S. A., Storandt, M., & Strube, M. J. (2006). A comparison of narratives told by
younger and older adults. Experimental Aging Research, 32, 105-117.
Berlyne, D. E. (1960). Conflict, arousal, and curiosity. New York: McGraw-Hill.
Boals, A., & Klein, K. (2005). Word use in emotional narratives about failed romantic
relationships and subsequent mental health. Journal of Language and Social Psychology, 24,
Bond, G. D., & Lee, A. Y. (2005). Language of lies in prison: Linguistic classification of
prisoners’ truthful and deceptive natural language. Applied Cognitive Psychology, 19, 313-329.
Boroditsky, L., Schmidt, L. A., & Phillips, W. (2003). Sex, syntax, and semantics. In D. Gentner
& S. Goldin-Meadow (Eds.), Language in mind: Advances in the study of language and
thought (pp. 61-79). Cambridge: MIT Press.
Bradac, J. J. (1986). Threats to generalization in the use of elicited, purloined, and contrived
messages in human communication research. Communication Quarterly, 34, 55-65.
Bradac, J. J. (1999). Language₁…ₙ and Social Interaction₁…ₙ: Nature abhors uniformity.
Research on Language and Social Interaction, 32, 11-20.
Bulwer-Lytton, E. (1842). Paul Clifford. Leipzig, Germany: B. Tauchnitz.
Chung, C. K., & Pennebaker, J. W. (2007). The psychological function of function words. In
K. Fiedler (Ed.), Social communication: Frontiers of social psychology (pp. 343-359). New York:
Psychology Press.
Davis, D., & Brock, T. C. (1975). Use of first person pronouns as a function of increased
objective self-awareness and performance feedback. Journal of Experimental Social
Psychology, 11, 381-388.
Freud, S. (1901). Psychopathology of everyday life. New York: Basic Books.
Gonzales, A. L., Hancock, J. T., & Pennebaker, J. W. (in press). Language indicators of social
dynamics in small groups. Communication Research.
Gottschalk, L. A., & Bechtel, R. (1993). Computerized content analysis of natural language or
verbal texts. Palo Alto, CA: Mind Garden.
Gottschalk, L. A., & Gleser, G. C. (1969). The measurement of psychological states through the
content analysis of verbal behavior. Berkeley: University of California Press.
Gottschalk, L. A., Gleser, G. C., Daniels, R., & Block, S. (1958). The speech patterns of
schizophrenic patients: a method of assessing relative degree of personal disorganization and
social alienation. Journal of Nervous and Mental Disease, 127, 153-166.
Graesser, A. C., McNamara, D. S., Louwerse, M. M., & Cai, Z. (2004). Coh-Metrix: Analysis
of text on cohesion and language. Behavior Research Methods, Instruments, & Computers,
36, 193-202.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual
differences in implicit cognition: The implicit association test. Journal of Personality and Social
Psychology, 74, 1464-1480.
Gunsch, M. A., Brownlow, S., Haynes, S. E., & Mabe, Z. (2000). Differential linguistic content of
various forms of political advertising. Journal of Broadcasting & Electronic Media, 44, 27-42.
Hancock, J. T., Curry, L. E., Goorha, S., & Woodworth, M. (2008). On lying and being lied to: A
linguistic analysis of deception in computer-mediated communication. Discourse Processes,
45, 1-23.
Hart, R. P. (1984). Verbal style and the presidency: A computer-based analysis. New York:
Academic Press.
Hartley, J., Pennebaker, J. W., & Fox, C. (2003). Abstracts, introductions and discussions: How
far do they differ in style? Scientometrics, 57, 389-398.
Holmes, D., Alpers, G. W., Ismailji, T., Classen, C., Wales, T., Cheasty, V., et al. (2007).
Cognitive and emotional processing in narratives of women abused by intimate partners. Violence
Against Women, 13, 1192-1205.
Holtzman, W. H. (1950). Validation studies of the Rorschach test: Shyness and gregariousness
in the normal superior adult. Journal of Clinical Psychology, 6, 343-347.
Kacewicz, E., Pennebaker, J. W., Davis, M., Jeon, M., & Graesser, A. C. (2009). The language
of social hierarchies. Manuscript submitted for publication.
Kahn, J. H., Tobin, R. M., Massey, A. E., & Anderson, J. A. (2007). Measuring emotional
expression with the Linguistic Inquiry and Word Count. American Journal of Psychology,
120, 263-286.
Kowalski, R. M. (2000). “I was only kidding!” Victims’ and perpetrators’ perceptions of
teasing. Personality and Social Psychology Bulletin, 26, 231-241.
Kross, E., & Ayduk, O. (2008). Facilitating adaptive emotional analysis: Distinguishing
distanced-analysis of depressive experiences from immersed-analysis and distraction.
Personality and Social Psychology Bulletin, 34, 924-938.
Leshed, G., Hancock, J. T., Cosley, D., McLeod, P. L., & Gay, G. (2007). Feedback for guiding
reflection on teamwork practices. In Proceedings of the 2007 international ACM conference on
supporting group work (pp. 217-220). New York: Association for Computing Machinery Press.
Maass, A., Karasawa, M., Politi, F., & Suga, S. (2006). Do verbs and adjectives play different
roles in different cultures? A cross-linguistic analysis of person representation. Journal of
Personality and Social Psychology, 90, 734-750.
Martindale, C. (1990). The clockwork muse: The predictability of artistic change. New York:
Basic Books.
McClelland, D. C. (1979). Inhibited power motivation and high blood pressure in men. Journal
of Abnormal Psychology, 88, 182-190.
Mehl, M. R., Gosling, S. D., & Pennebaker, J. W. (2006). Personality in its natural habitat:
Manifestations and implicit folk theories of personality in daily life. Journal of Personality
and Social Psychology, 90, 862-877.
Mergenthaler, E. (1996). Emotion-abstraction patterns in verbatim protocols: A new way of
describing psychotherapeutic processes. Journal of Consulting and Clinical Psychology, 64,
Miller, G. (1995). The science of words. New York: Scientific American Library.
Newman, M. L., Groom, C. J., Handelman, L. D., & Pennebaker, J. W. (2008). Gender
differences in language use: An analysis of 14,000 text samples. Discourse Processes, 45, 211-236.
Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words:
Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29, 665-675.
Oberlander, J., & Gill, A. J. (2006). Language with character: A stratified corpus comparison of
individual differences in e-mail communication. Discourse Processes, 42, 239-270.
Pasupathi, M. (2007). Telling and the remembered self: Linguistic differences in memories for
previously disclosed and previously undisclosed events. Memory, 15, 258-270.
Pennebaker, J. W. (1982). The psychology of physical symptoms. New York: Springer-Verlag.
Pennebaker, J. W., & Beall, K. S. (1986). Confronting a traumatic event: Toward an
understanding of inhibition and disease. Journal of Abnormal Psychology, 95, 274-281.
Pennebaker, J. W., Booth, R. J., & Francis, M. E. (2007). Linguistic Inquiry and Word Count:
LIWC [Computer software]. Austin, TX:
Pennebaker, J. W., Chung, C. K., Ireland, M., Gonzales, A., & Booth, R. J. (2007). The
development and psychometric properties of LIWC2007 [LIWC manual]. Austin, TX:
Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual
difference. Journal of Personality and Social Psychology, 77, 1296-1312.
Pennebaker, J. W., & Lay, T. C. (2002). Language use and personality during crises:
Analyses of Mayor Rudolph Giuliani’s press conferences. Journal of Research in Personality,
36, 271-282.
Pennebaker, J. W., Mayne, T. J., & Francis, M. E. (1997). Linguistic predictors of adaptive
bereavement. Journal of Personality and Social Psychology, 72, 863-871.
Pennebaker, J. W., Slatcher, R. B., & Chung, C. K. (2005). Linguistic markers of psychological
state through media interviews: John Kerry and John Edwards in 2004, Al Gore in 2000.
Analyses of Social Issues and Public Policy, 5, 197-204.
Pennebaker, J. W., & Stone, L. D. (2003). Words of wisdom: Language use over the life span.
Journal of Personality and Social Psychology, 85, 291-301.
Rorschach, H. (1921). Psychodiagnostik. Leipzig, Germany: Ernst Bircher Verlag.
Rosenberg, S. D., & Tucker, G. J. (1978). Verbal behavior and schizophrenia: The semantic
dimension. Archives of General Psychiatry, 36, 1331-1337.
Rude, S., Gortner, E. M., & Pennebaker, J. (2004). Language use of depressed and depression-
vulnerable college students. Cognition & Emotion, 18, 1121-1133.
Semin, G. R., & Fiedler, K. (1988). The cognitive functions of linguistic categories in
describing persons: Social cognition and language. Journal of Personality and Social Psychology,
54, 558-568.
Sexton, J. B., & Helmreich, R. L. (2000). Analyzing cockpit communications: The links
between language, performance, and workload. Human Performance in Extreme
Environments, 5, 63-68.
Simmons, R. A., Chambless, D. L., & Gordon, P. C. (2008). How do hostile and emotionally
overinvolved relatives view relationships? What relatives’ pronoun use tells us. Family
Process, 47, 405-419.
Simmons, R. A., Gordon, P. C., & Chambless, D. L. (2005). Pronouns in marital interaction.
Psychological Science, 16, 932-936.
Slatcher, R. B., Vazire, S., & Pennebaker, J. W. (2008). Am “I” more important than “we”?
Couples’ word use in instant messages. Personal Relationships, 15, 407-424.
Stirman, S. W., & Pennebaker, J. W. (2001). Word use in the poetry of suicidal and nonsuicidal
poets. Psychosomatic Medicine, 63, 517-522.
Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (1966). The general inquirer: A
computer approach to content analysis. Cambridge: MIT Press.
Tetlock, P. E. (1981). Pre- to post-election shifts in presidential rhetoric: Impression
management or cognitive adjustment. Journal of Personality and Social Psychology, 41,
Weintraub, W. (1981). Verbal behavior: Adaptation and psychopathology. New York: Springer.
Weintraub, W. (1989). Verbal behavior in everyday life. New York: Springer.
Winter, D. G. (1998). A motivational analysis of the Clinton first term and the 1996 presidential
campaign. The Leadership Quarterly, 9, 367-376.
Wolf, M., Sedway, J., Bulik, C. M., & Kordy, H. (2007). Linguistic analyses of natural written
language: Unobtrusive assessment of cognitive style in eating disorders. International
Journal of Eating Disorders, 40, 711-717.
Zhou, L., Burgoon, J. K., Nunamaker, J. F., & Twitchell, D. (2004). Automating linguistics-
based cues for detecting deception in text-based asynchronous computer-mediated
communications. Group Decision and Negotiation, 13, 81-106.
Yla R. Tausczik is a doctoral student in the Department of Psychology at the University of
Texas at Austin. She received her BA at the University of California at Berkeley in 2005. Her
research interests include using language to understand group dynamics and natural language
use in the workplace.
James W. Pennebaker (PhD, University of Texas, Austin) is a professor and chair of the
Department of Psychology at the University of Texas at Austin. He is the author of multiple books,
including Opening Up: The Healing Power of Expressing Emotions (1997). He has recently
published in Science, Psychological Science, and Journal of Personality and Social Psychology.
... The words' semantic space revealed by this analysis was used to create a trust lexicon. Word sentiment lexicons can estimate people's emotional state (Tausczik and Pennebaker, 2010) and attitudes on social media (Pang and Lee, 2008) or in conversations (Li et al., 2020). Word sentiment lexicons are typically created through tedious manual labeling of each word in the dictionary, which produces a sentiment rating for each word. ...
Full-text available
Introduction Trust has emerged as a prevalent construct to describe relationships between people and between people and technology in myriad domains. Across disciplines, researchers have relied on many different questionnaires to measure trust. The degree to which these questionnaires differ has not been systematically explored. In this paper, we use a word-embedding text analysis technique to identify the differences and common themes across the most used trust questionnaires and provide guidelines for questionnaire selection. Methods A review was conducted to identify the existing trust questionnaires. In total, we included 46 trust questionnaires from three main domains (i.e., Automation, Humans, and E-commerce) with a total of 626 items measuring different trust layers (i.e., Dispositional, Learned, and Situational). Next, we encoded the words within each questionnaire using GloVe word embeddings and computed the embedding for each questionnaire item, and for each questionnaire. We reduced the dimensionality of the resulting dataset using UMAP to visualize these embeddings in scatterplots and implemented the visualization in a web app for interactive exploration of the questionnaires ( ). Results At the word level, the semantic space serves to produce a lexicon of trust-related words. At the item and questionnaire level, the analysis provided recommendation on questionnaire selection based on the dispersion of questionnaires’ items and at the domain and layer composition of each questionnaire. Along with the web app, the results help explore the semantic space of trust questionnaires and guide the questionnaire selection process. Discussion The results provide a novel means to compare and select trust questionnaires and to glean insights about trust from spoken dialog or written comments.
... These terms are organized hierarchically under concepts and subconcepts for the lexicon. The lexicons are fundamentally different from the well-established Linguistic Inquiry and Word Count (LIWC) lexicon (Tausczik and Pennebaker 2009). Instead of using a priori general categories as in LIWC (e.g., universal emotions), the domain expert generates a posteriori (i.e., data-driven) categories, and their ascribed semiotics are rooted in the specific corpus, as detailed previously. ...
Full-text available
There is growing agreement among researchers and developers that in certain machine-learning (ML) tasks, it may be advantageous to keep a “human in the loop” rather than rely on fully autonomous systems. Continual human involvement can mitigate machine bias and performance deterioration while enabling humans to continue learning from insights derived by ML. Yet a microlevel theory that effectively facilitates joint and continual learning in both humans and machines is still lacking. To address this need, we adopt a design science approach and build on theories of human reciprocal learning to develop an abstract configuration for reciprocal human-ML (RHML) in the context of text message classification. This configuration supports learning cycles between humans and machines who repeatedly exchange feedback regarding a classification task and adjust their knowledge representations accordingly. Our configuration is instantiated in Fusion, a novel technology artifact. Fusion is developed iteratively in two case studies of cybersecurity forums (drug trafficking and hacker attacks), in which domain experts and ML models jointly learn to classify textual messages. In the final stage, we conducted two experiments of the RHML configuration to gauge both human and machine learning processes over eight learning cycles. Generalizing our insights, we provide formal design principles for the development of systems to support RHML. This paper was accepted by D. J. Wu, special issue on the human-algorithm connection. Funding: This work was supported by the Israel’s Ministry of Defence [Grant R4441197567] and the Israel’s Ministry of Science and Technology [Grant 207076]. Supplemental Material: The data files are available at .
... 4. LIWC. The LIWC dictionary is widely used in computational linguistics as a source of features for psychological and psycholinguistic analysis [44]. The Spanish LIWC2007 dictionary includes around 70 word categories to analyze different language dimensions like emotions (e.g., sadness, anger), self-references, and words for perceptual, cognitive, or biological processes in each text. ...
This study examines the communications of English- and Spanish-speaking Twitter users through traditional and deep learning algorithms to automatically recognize whether they live with one of nine mental health conditions. We created two datasets in English and Spanish. The “diagnosed” set comprises the timeline of 1,500 users who explicitly reported in one or more of their posts having been diagnosed with one of the following: ADHD, Anxiety, Autism, Bipolar, Depression, Eating disorders, OCD, PTSD, and Schizophrenia. The “control” set comprises the timeline of 1,700 randomly selected users who had not disclosed a diagnosis. We extracted a variety of text features from the collected data, such as n-grams, q-grams, Part-of-speech (POS) tags, topic modeling, Linguistic Inquiry and Word Count (LIWC), and word embeddings, and trained traditional machine-learning and deep learning classifiers for two tasks: binary classification, to distinguish between diagnosed and non-diagnosed users, and multiclass classification, to identify the specific diagnosis. Overall, XGBoost and convolutional neural network (CNN) performed the best in the two classification tasks. Moreover, lexical features based on n-grams and q-grams performed well on both datasets. Using our collected datasets, for binary classification, we achieved an AUC of 0.835 on the Spanish Twitter dataset using word n-grams of length one to three (UBT) and 0.846 on the English Twitter dataset with a character 5-gram (C5) model. In multiclass classification, we obtained an AUC of 0.712 and 0.697 in the Spanish and English Twitter datasets, respectively.
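The lexical features named in the abstract above (word n-grams of length one to three and character 5-grams) can be illustrated with a minimal sketch. This is not the study's pipeline: the function names `word_ngrams` and `char_qgrams`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration only.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=3):
    """Count word n-grams for every n from n_min to n_max (inclusive).

    Tokenization is a plain lowercase whitespace split, an assumption
    made for this sketch rather than the tokenizer used in any study.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def char_qgrams(text, q=5):
    """Count overlapping character q-grams, spaces included."""
    s = text.lower()
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


if __name__ == "__main__":
    sample = "i feel so tired today"  # hypothetical example post
    print(word_ngrams(sample).most_common(3))
    print(char_qgrams(sample).most_common(3))
```

Count vectors of this kind would typically be fed to a classifier as sparse features. Any weighting scheme or vocabulary cutoff applied on top of them is outside what the abstract states.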
Establishing and maintaining mutual understanding in everyday conversations is crucial. To do so, people employ a variety of conversational devices, such as backchannels, repair, and linguistic entrainment. Here, we explore whether the use of conversational devices might be influenced by cross‐linguistic differences in the speakers’ native language, comparing two matched languages—Danish and Norwegian—differing primarily in their sound structure, with Danish being more opaque, that is, less acoustically distinguished. Across systematically manipulated conversational contexts, we find that processes supporting mutual understanding in conversations vary with external constraints: across different contexts and, crucially, across languages. In accord with our predictions, linguistic entrainment was overall higher in Danish than in Norwegian, while backchannels and repairs presented a more nuanced pattern. These findings are compatible with the hypothesis that native speakers of Danish may compensate for its opaque sound structure by adopting a top‐down strategy of building more conversational redundancy through entrainment, which also might reduce the need for repairs. These results suggest that linguistic differences might be met by systematic changes in language processing and use. This paves the way for further cross‐linguistic investigations and critical assessment of the interplay between cultural and linguistic factors on the one hand and conversational dynamics on the other.
Consumers are drawn to the inherent quality of a product or service and how it is portrayed and described. This study investigated the impact of language used in property titles on pricing strategies and financial performance in peer-to-peer (P2P) accommodations by adopting the Language Expectancy Theory (LET) as a theoretical framework. The findings of this study demonstrate the significance of linguistic styles in property titles for determining P2P room rates, rental volume, and overall performance. Property titles that exhibit formal, logical, and hierarchical thinking are perceived as more sincere, personalized, and informative, resulting in higher rates, greater rental volumes, and improved overall performance. On the contrary, properties with titles that convey expertise and confidence, or adopt a positive and upbeat style, tend to set lower room rates and yield lower performance yet generate higher volumes. This study expands the application of LET to P2P communication between an Airbnb host and a potential guest.
Online discussions can fuel perceptions of misalignment, disagreement, conflict or even polarization. In this study, we look at everyday diplomatic expressions that could buffer this. We use automated and manual coding to analyze diplomatic behaviour in online discussions and its consequences for discussion sentiment. We analyze Reddit forums with differing norms: civil (N = 4594 comments), incivil (N = 2126) and social support subreddits (N = 1401). The automated content analysis shows that diplomatic behaviour occurs but does not affect the subsequent discussion. The manual analysis reveals why: discussions consist of disjointed statements rather than dialogue, making diplomacy inconsequential. These results have consequences for the field. First, what appears to be an escalating dialogue might actually be a string of personal attitudes broadcasted in a shared space. Second, the usefulness of automated content analysis in studying interaction dynamics is limited because of difficulties distinguishing broadcasting from dialogue.
Conversation—a verbal interaction between two or more people—is a complex, pervasive, and consequential human behavior. Conversations have been studied across many academic disciplines. However, advances in recording and analysis techniques over the last decade have allowed researchers to more directly and precisely examine conversations in natural contexts and at a larger scale than ever before, and these advances open new paths to understand humanity and the social world. Existing reviews of text analysis and conversation research have focused on text generated by a single author (e.g., product reviews, news articles, and public speeches) and thus leave open questions about the unique challenges presented by interactive conversation data (i.e., dialogue). In this article, we suggest approaches to overcome common challenges in the workflow of conversation science, including recording and transcribing conversations, structuring data (to merge turn-level and speaker-level data sets), extracting and aggregating linguistic features, estimating effects, and sharing data. This practical guide is meant to shed light on current best practices and empower more researchers to study conversations more directly—to expand the community of conversation scholars and contribute to a greater cumulative scientific understanding of the social world.
The focus of many psychologists today is not so much on the traits and long-term characteristics of the people who participate in our research as on their reactions to events and situations. Psychologists are concerned with changing transitory psychological states, but have not yet developed fully effective techniques for their assessment. Content analysis of verbal communications can be helpful in assessing such states. Content analysis is based on the assumption that the language in which people choose to express themselves contains information about the nature of their psychological states. This assumption implies a representational or descriptive model of language, in contrast to the instrumental or functional model preferred, for example, by Mahl [1]. Content analysis can be applied only to verbal, not to nonverbal communications. However, although content analysis cannot be applied to nonverbal communications, inferences can be made about people’s states through objective and systematic identification of specified characteristics of their verbal communications [2, 3]. Content analysis of verbal communications is a way of listening to and interpreting people’s communicated accounts of events. When agreement between independent interpretations is achieved, the essential requirement of scientific endeavor (intersubjective agreement) is met [4].
Hypotheses derived from face theory predict that the words people use in online dispute resolution affect the likelihood of settlement. In an event history model, text data from 386 disputes between eBay buyers and sellers indicated a higher likelihood of settlement when face was affirmed by provision of a causal account and a lower likelihood of settlement when face was attacked by expression of negative emotions or making commands. These aspects of language and emotion accounted for settlement likelihood even when we controlled for structural aspects of disputes, such as negative feedback filings and the filer's role as buyer or seller.
Attempting to understand the body’s signals is similar to trying to interpret the noises and sensations of the automobile that we drive. We do not have a computer printout of either the current physiological status of our body or the condition of the various systems of our car. Given this, we are in the position of attempting to understand a large array of ambiguous sensations about which we have at best a modicum of knowledge. Whether we are dealing with human bodies or inanimate cars, the awareness and reporting of symptoms are dependent on psychological or perceptual processes. Throughout this book, a large number of studies have outlined some of the parameters that determine when and why symptoms are reported. Before discussing some of the implications of symptom research, we present the following brief review of our current knowledge about the perception of physical symptoms.
Recent studies in social psychology have found that the frequency of certain words in people's speech and writing is related to psychological aspects of their personal health. We investigated whether counts of “self” and “other” pronouns used by 59 couples engaged in a problem-solving discussion were related to indices of marital health. One spouse in each couple had a diagnosis of obsessive-compulsive disorder or panic disorder with agoraphobia; 50% of the patients and 40% of their spouses reported marital dissatisfaction. Regardless of patients' diagnostic status, spouses who used more second-person pronouns were more negative during interactions, whereas those who used more first-person plural pronouns produced more positive problem solutions, even when negative behavior was statistically controlled. Moreover, use of first-person singular pronouns was positively associated with marital satisfaction. These findings suggest that pronouns used by spouses during conflict-resolution discussions provide insight into the quality of their interactions and marriages.
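Pronoun counts of the kind described in the abstract above can be approximated with a small dictionary lookup, in the spirit of LIWC's category counting. The sketch below is an illustration under stated assumptions: the word lists in `PRONOUN_CATEGORIES` are short hypothetical stand-ins, not the LIWC dictionary or the coding scheme used in the study, and `pronoun_rates` is an invented helper name.

```python
import re
from collections import Counter

# Hypothetical, deliberately small word lists. The real LIWC categories
# are larger and are not reproduced here.
PRONOUN_CATEGORIES = {
    "first_singular": {"i", "me", "my", "mine", "myself"},
    "first_plural": {"we", "us", "our", "ours", "ourselves"},
    "second_person": {"you", "your", "yours", "yourself", "yourselves"},
}


def pronoun_rates(text):
    """Return each category's share of all tokens, as a percentage.

    Tokens are runs of letters and apostrophes, so contractions such as
    "I'm" stay whole and are not matched by this simplified lookup.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for category, words in PRONOUN_CATEGORIES.items():
            if tok in words:
                counts[category] += 1
    return {
        category: (100.0 * counts[category] / total if total else 0.0)
        for category in PRONOUN_CATEGORIES
    }


if __name__ == "__main__":
    # Hypothetical utterance from a problem-solving discussion.
    print(pronoun_rates("We can fix this if you help us"))
```

Reporting rates as a percentage of total words, rather than raw counts, keeps speakers who talk more from appearing to use more pronouns simply because of transcript length.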