A Corpus-Based Study of Register and Collocational Variation in the Semiotics of Sexuality

Iowa State University
From the SelectedWorks of Kirk Marshall Wilkins
April, 2014
A Corpus-Based Study of Register and
Collocational Variation in the Semiotics of
Sexuality
Kirk Marshall Wilkins
Available at: http://works.bepress.com/kirkwilkins/2/
A CORPUS-BASED STUDY OF VARIATION IN THE SEMIOTICS OF SEXUALITY 1
The Opening of Doors into New Rooms of One's Own: A Corpus-Based Study of Register and
Collocational Variation in the Semiotics of Sexuality
Kirk Wilkins
Iowa State University
Abstract
A number of organizations, including GLAAD (2010), have expressed displeasure with the use
of the term “homosexual,” noting it features problematic connotations. To analyze how this
problematic term is instantiated within language and its recent evolution in comparison to other
terminology for the gay community in discourse, a corpus-based study of recent diachronic,
register, and disciplinary variation in COCA was undertaken. This study also considered the
discourse prosodies that these different terms may establish through their different collocations.
Data derived indicated that the discursive evolution in the representation of the gay community is
not monolithic, but rather myriad. While there has been a decline in “homosexual” overall, the
academic register differentiates itself through a growth in word frequencies for other terms, in
contrast to other registers characterized by mostly universal decline. Particular disciplines,
especially religion, may discuss sexuality especially frequently, but there is also intriguing
variation indicative of equality within medical discourse. “Gay” and “homosexual” collocate
in highly divergent manners and give rise to qualitatively different discourse prosodies, with the
former far more positive and open-ended than the latter, which tends to connote salaciousness.
This study suggested that the linguistic representation of the gay community is neither singular
nor absolute and is instead implicated in a process of resignification.
Introduction
Recently the gay rights movement has achieved much, including marital equality, in the
United States. Over a dozen states have now come to legalize same-sex marriage. More and
more people have come to recognize this new reality: a study
done in 2013 by Pew Research Center indicated over seventy percent of the population now
believes the legalization of same-sex marriage to be inevitable. Besides the fact that people
acknowledge gay marriage as a part of life, people also affirm it: a Gallup poll last year indicated
that over fifty percent of the population now advocates for and expresses support for same-sex
marriages as being equivalent to traditional ones (2013).
Despite questions that have been raised about the problems with this data, with social
desirability bias perhaps affecting the self-reports of attitudes and views (Powell, 2013), support
for and the movement toward equality for LGBTQ populations across society seem underway.
Within a matter of decades, public opinion has evolved significantly, with acceptance of and
support for gay rights becoming increasingly common. As a result of these trends, LGBTQ
populations have experienced a rapid shift in the treatment they have received in society and
have now gained access into a room of their own in a manner not possible throughout the history
of the nation.
As culture has evolved and changed, one would assume that the discourse and language
surrounding LGBTQ populations should also have evolved and changed. As society has come to
view LGBTQ persons more positively and with less judgment and stereotyping, language
perhaps could serve as a means of indexing this shift. The discourse surrounding this population
may not only reflect, but also even constitute evolving attitudes and beliefs regarding it.
Alongside a more positive and welcoming treatment of LGBTQ persons by communities may
come a more positive and welcoming treatment in language. One possible way by which this
attitudinal evolution may manifest itself is through the differential use of the terms "gay" and
"homosexual", for the former has now come to gain a more positive connotation in contrast to
the negative, judgmental, and narrow connotations associated with the latter.
The latter term is exceedingly problematic. A recent New York Times article noted that
the word is "loaded" for LGBTQ populations, carrying with it an air of judgment and narrow-
minded understanding of the population in question (Peters, 2014). Besides being frequently
used by conservatives to refer to this population, the word "homosexual" originated as a medical
term to "Other" this population and prevent its full recognition as human beings beyond
sexuality. Instances of the word may thus strike those to whom it applies as alienating,
disparaging, and stifling. The Gay and Lesbian Alliance Against Defamation advises the media
to refrain from using the word "homosexual" in its media reference guide (2010). Describing the
term as "derogatory" and "offensive" (p. 6), GLAAD contends it focuses excessively, if not
exclusively, on sexuality, indexing criminality and shame (p. 16), and overlooks the myriad other
factors constituting the lives and lifestyles of this population. This pejorative may now even
function as a slur in discourse.
"Gay" and "homosexual" may connote two different persons and meanings altogether,
based on this interpretation. The capacity of language to position and determine who one is and
how one functions within and performs a role has received substantial discussion and elaboration
in the theoretical literature and constitutes a critical concern for those engaged in queer theory.
Language does not simply reflect the world in an objective manner, sufficiently and wholly
describing what is "out there"; instead, it may affect and encourage particular ideologies and
worldviews, generating the very matters of the world of which it speaks. A poststructuralist
account of language, discourse, gender, and sexuality will provide a fruitful framework for
understanding the implications of and clarify what is at stake in the use of “gay” as opposed to
“homosexual.” Such an account of the consequences of the use of these terms upon the social
universe would explicate the lives lived by this population in terms of how they are discursively
positioned within communities.
Literature Review
Theoretical Background: Sexual Orientation as Discursive Formation
As communicative creatures, humans need language and discourse in order to interact
with one another and, more importantly, understand the worlds they inhabit and their places
within them. "Language enables us to make sense of the world, ourselves and others" (Baker
2008, p. 263). However, language does not always follow, but rather may precede, the world it
is alleged to represent and the human agent perceived as its author. As GLAAD has noted,
discourse does far more than simply represent and reflect reality. In addition, it may come to
constitute and create it, legislating what is within the realm of possibility and the normal, coming
to position persons in the world differentially. The constructs and objects of discussion made
available in and through language do not perhaps precede that language, but rather may ensue
from it. Always already there, the language and discourse communities into which we enter
come to determine not only how we may speak, but even who we are.
Discourse is perhaps a critical avenue by which power and the bourgeoisie may preserve
power and the status quo, coming to foreclose and determine possibilities and locking doors into
rooms of one's own. Hegemonic forces may maintain themselves through means other than the
military and the police, otherwise known as the Repressive State Apparatuses, and may utilize an
even more insidious and subtle avenue for the reinforcement and reification of their ideologies:
culture and discourse, what Althusser refers to as Ideological State Apparatuses (2001). Through
Ideological State Apparatuses, the ideology of those with power comes to be diffused and spread
across society in such a manner that the proletariat accept the framework articulated as not only
inevitable, but also natural and normal.
One critical function ideology performs through this process is that of interpellation, of
discursively assigning individuals to particular subject-positions that
were always already there. Through discourse people come to learn about, accept, and take for
granted the roles and functions they serve in society, seeing their positions as natural and normal
through repetition. Besides helping us understand who we ourselves are, interpellation may also
help us understand who the "Other" is. By defining a deviant, by articulating a party through
which we can differentiate ourselves, interpellation provides an understanding of the broader
world we inhabit and how we fit within it. An Althusserian analysis of the human agent would
argue that it is not the cause of, but rather the result of discourse. The human subject is thus not
so much the producer of discourse as it is the product of discourse. "The power imposed upon
one is the power that animates one's emergence" (Butler, 1997, p. 198). The human subject is a
discursive formation, so to speak.
Like Althusser, Michel Foucault argues that discourse is not determined by so much as it
in itself determines the world and the reality we perceive (2012). In addition, he rejects the
primacy of the human subject. The "regimes of truth" for a particular community at a particular
time will provide the discursive framework that determines what remains epistemologically
available and allowable and permits what is possible. According to him, discourses are
"practices which systematically form the objects of which they speak" (1972, p. 42). These
particular regimes do not describe but rather prescribe the world, not serving to reflect and
represent it so much as construct and constitute it. Calling into question these regimes of truth is
perhaps a problematic exercise, however, for to question the truth of them is to question the truth
of oneself (Butler, 2005, p. 23): discourse and the ensuing processes of interpellation are what
allow for one to become possible, but also impossible. The truth of "homosexuals", according
to Foucault, rests in the fact that they have not always existed as they do in contemporary
discourse. Not until relatively recently in Western culture was "homosexuality" considered a
part of a person's identity, and instead it was perceived as an ephemeral act, a state, that did not
reflect or represent anything permanent, a trait, about a person (2012), not indexing a stable and
inalterable characteristic into which one is born.
How discourse positions and solidifies LGBTQ populations is not a trivial matter for
LGBTQ populations, for "language is the key process by which we develop accounts of sexuality
and gender" (Baker, 2008, p. 15). This process often involves heteronormativity: much of
culture and discourse tends to take heterosexuality for granted and assume it is the norm, treating
it as the expectation, while presenting homosexuality as the exception and the deviation, as the
abnormality and the "Other." Heteronormativity, in short, is a naturalized political and
hegemonic institution that manifests itself throughout myriad discursive practices. As a result,
one is faced with the necessity of "coming out" as gay but never as heterosexual.
Because of it, people identifying as heterosexual are under no compulsion and experience
no pressure to come out of the closet, for they already are fully welcome in a room of their own
that has always already been there. This pervasive tendency to discursively naturalize and reify
heterosexuality, described by Rich as compulsory heterosexuality (1980), has only served to
stifle LGBTQ populations and prevented their full flourishing, instead confining them to the
closet because every room is designed in such a way as to explicitly welcome only heterosexuals
without consciously meaning to. Language and discourse may serve to stifle and limit
possibilities before queer individuals, but they may also serve to empower and emancipate them.
The fact that gender is not reflected and represented but rather created and constituted
through behavior and language creates myriad possibilities not only for maintenance, but also for
subversion. Butler in Gender Trouble argues that the performative behaviors associated with
gender and sexuality ultimately construct, constitute, and form gender and sexuality rather than
derive from them (2006). Drawing upon Searle's theory of speech acts, she suggests that
gender, like language, is not something that is existent, but rather something that is done.
Though discourse may perpetuate and naturalize compulsory heterosexuality and gendered
expectations and norms through repetition, the fact that gender and sexuality are ultimately
performative constitutes an avenue for liberation.
The act of parodying gendered expectations and norms, such as the use of drag, is an
avenue of imploding, interrogating, and challenging the dominant discursive frameworks on
gender and sexuality. Through this process, we could slowly but surely encourage and enable
society to approach gender and sexuality differently, embracing a new discursive framework and
articulating different discursive formations. By doing this, by practicing this work of
"resignification," people could ensure signifiers and signifieds in discourse would lose the chains
associating them with one another, resulting in freeplay, a proliferation of possible signifiers
with the endless différance of the transcendental signified that never existed, and an opening up
of possibilities not only for LGBTQ populations, but also for all individuals. "One might wonder
what use 'opening up possibilities' finally is, but no one who has understood what it is to live in
the social world as what is 'impossible,' illegible, unrealizable, unreal, and illegitimate is likely to
pose that question" (Butler, 2006, p. viii). Through the resignification of what gender means
and entails in discourse, Butler suggests, we could overcome the force of compulsory
heterosexuality.
Discourse may position LGBTQ populations harmfully, categorizing and leading to the
positioning of them in society in unjust ways, perhaps marginalizing or repressing them. Perhaps
underneath the level of awareness of users, perhaps explicitly and publicly, language has resulted
in the normalization of traditional gender roles and of heterosexuality, excluding queer
minorities from the table and the conversation of society and culture or perhaps separating them
as deviant and abnormal, as the “Other” through which to differentiate oneself. Linguistic acts
of differentiation and exclusion may serve as a means by which to continue to oppress, exclude,
or stereotype specific populations. If sexuality is discursively formed and implicated, then we
must analyze the language involving the matter in order to better understand how discourse is
performing its work on behalf of power.
However, discourse also constitutes a site for social change and movement toward a more just world.
As increasing acceptance and inclusion in society become the norm, how discourse represents
and constructs these populations should track and follow this attitudinal evolution. Language
functions in such a way as "to construct, maintain, or challenge what are variously referred to by
researchers in different traditions as attitudes, ideologies, interpretive repertoires, or discourses"
(Baker, 2010, p. 121). In order to examine and approach overall discourse in a meaningful
manner, a critical discourse analysis based in corpus linguistics becomes necessary. However,
this methodological approach is problematic and raises many issues and questions.
Research Background: Critical Discourse Analysis and Corpus-Based Sociolinguistics
As a means for understanding how discourse may position LGBTQ populations, either to
oppress or to affirm them, the application of critical discourse analysis may prove fruitful.
Critical discourse analysis focuses on the relationship between language and social inequity and
how it creates and reproduces possibly problematic dynamics of power (Fairclough, 2010, p. 8).
Through a study of ideology and its instantiation in language, critical discourse analysis aspires
to establish how society is discursively structured and arranged, drawing attention to and
criticizing injustices and seeking to open rooms whose doors have been locked traditionally.
The fact that critical discourse analysis is designed so as to explore social injustices and
thus remedy them has resulted in a number of questions being raised and criticisms being
advanced with regard to its methodological rigor and objectivity. Baker notes that it is not
neutral "but motivated by the desire to inspire or cause some sort of social change" (2010, p.
122). It is thus possible that studies undertaken may only serve to confirm passionate interests or
hypotheses, raising questions about objectivity and research limitations. In response to this
criticism, Baker argues that those working in critical discourse analysis are at least explicit about
the positions they have adopted and hold (2008, p. 102). In addition, Baker observes that the
"objective researcher" functions just like any other discursive formation under a particular
regime of truth (2005, p. 10).
Alongside critical discourse analysis, corpus linguistics may serve as an avenue for the
analysis of ideology and power in discourse and language. Corpus linguistics involves the
collection of a large body of texts and the electronic preservation of them in such a manner as to
enable linguistic analysis through particular functions and features. The recent proliferation of
digital technologies has eased the process of collecting and analyzing significant numbers
of texts. A quantitative analysis of the frequency of particular tokens and other linguistic
features across a large body of texts may allow a researcher to form generalizations about
different kinds of variation, such as diachronic or regional variation, in the use of lexical items
and other features of language.
In addition, corpus linguistics constitutes a strong and fruitful avenue for critical
discourse analysis. Through a substantive electronic corpus, researchers may form conclusions
not only about how people use language, but also about how language represents people and thus
identify with more clarity and with more accuracy how ideology is discursively manifesting and
instantiating itself. The large number of texts held in corpora allows for the
analysis and study of ideological behaviors and dispositions within society and within a
particular discourse community. Enabling researchers to make assertions about discourses with
increased confidence, corpus data may illuminate what discursive formations are permitted, what
subject-positions are given, and what rooms are open.
One possible avenue for using corpora to analyze gender and sexuality in discourse is a
consideration of frequency counts. More instances of a particular term may indicate that
particular term is operating in such a way as to differentiate an "Other" and, through identifying
it, reify and naturalize the "normal" as the taken for granted and the expected. Frequency counts
may serve to indicate markedness (Baker, 2010, p. 125). If this is the case, one would expect
"gay man" to appear in a corpus more frequently than "heterosexual man", which is the case
(Baker, 2005, p. 36). The deviant identity category, the "marked" term, will surface more across
corpora than the expected and normal identity category, the "unmarked" term. In this manner,
through discursively identifying the marked but neglecting to draw attention to the unmarked,
compulsory heterosexuality may continue to reproduce and maintain itself. However, it is also
important to note that simply achieving equal frequencies may not result in equal representation
(Baker, 2010, p. 69).
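The frequency comparisons described here rest on normalizing raw counts by corpus size so that terms can be compared across corpora or sections of different lengths. A minimal sketch of that normalization follows; the function name `per_million` and the counts passed to it are hypothetical and invented purely for illustration, not drawn from any corpus.

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalized frequency: occurrences per one million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Hypothetical counts (assumptions for illustration, not corpus data):
# a "marked" term and an "unmarked" term in a 90-million-word corpus.
marked = per_million(raw_count=450, corpus_size=90_000_000)
unmarked = per_million(raw_count=45, corpus_size=90_000_000)
print(marked, unmarked)
```

A higher normalized value for the marked term would be consistent with the markedness account, although, as noted above, frequency alone does not establish how a term is being used.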
For this reason, an analysis of collocations and their ensuing connotations is necessary.
Discourse prosodies (certain connotations that may arise through collocations) may indicate the
power dynamics underlying a word. The words surrounding a term may surface automatically
and unconsciously, but still serve to index and indicate ideologies and discursive frameworks
(Baker, 2010, p. 25). Through an analysis of the words immediately preceding or following a
particular term, a more refined and clearer understanding of how a particular term operates may
emerge. For example, a corpus of debates in Parliament indicated that the right-hand
collocations for “gay” tend to emphasize it as a matter of identity or a trait, while right-hand
collocations for “homosexual” tend to position it as a behavior. While one is externally oriented,
the other is internally oriented: one indexes outward behavior and the other an internal
characteristic of one’s identity (Baker, 2008, p. 44).
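The idea of inspecting the words immediately preceding or following a term can be sketched as a simple window count. This is only an illustrative approximation, not the procedure used in the studies cited: the function `collocates`, its `left` and `right` window parameters, and the toy sentence are all assumptions introduced here.

```python
from collections import Counter


def collocates(tokens, node, left=0, right=1):
    """Count words within `left` tokens before and `right` tokens after each `node`."""
    lowered = [t.lower() for t in tokens]
    node = node.lower()
    counts = Counter()
    for i, tok in enumerate(lowered):
        if tok != node:
            continue
        start = max(0, i - left)
        # Words before the node plus words after it, excluding the node itself.
        window = lowered[start:i] + lowered[i + 1:i + 1 + right]
        counts.update(window)
    return counts


# Invented toy sentence, used only to exercise the function.
sample = "the gay community met while the gay rights debate continued".split()
right_hand = collocates(sample, "gay", left=0, right=1)
left_hand = collocates(sample, "gay", left=1, right=0)
print(right_hand.most_common(), left_hand.most_common())
```

Setting `left=0, right=1` isolates right-hand collocates of the kind discussed above; widening either window would capture a broader collocational environment at the cost of more noise.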
Meanwhile, in an analysis of collocational environments around "homosexual" in a
corpus of articles from The Daily Mail, Baker found there was a significant relationship between
that term and "alleged" for that particular publication, reflective of its journalistic nature (2014):
this particular source depicted homosexuality as a matter of secrets, shame, and gossip. In a
diachronic analysis of this corpus, Baker articulates seven discourse prosodies that surface
throughout The Daily Mail's discussion of gays and homosexuals: shamelessness, a focus on
practice and acts, the transitory nature of relationships, the existence of a distinct culture, access
to children, being politically militant, and being effeminate (2014, p. 117). The discourse
prosodies of different sources of texts may index their divergent approaches to and constructions
of sexual orientation. In another study (2005), Baker examined collocations in two British
periodicals, The Mirror and The Daily Mail, and found that their underlying attitudes toward gay
individuals became manifested through their collocational environments. While the former
source was more concerned with politics, the latter was more concerned with the consequences of equality (p.
90). In addition, the latter was more inclined to focus on homosexuality as behavior than the
former.
Through the analysis of discourse prosodies, through an identification and analysis of
collocations and the connotations they carry, corpus linguists may come to better understand and
explain how specific subjects are positioned and constructed within discourse. The Daily Mail,
based on Baker's research, advanced a particular view of this population not through word
frequency counts, but rather through the collocations for the words in question. As this example
demonstrates, an analysis of discourse prosodies may help to illuminate the myriad subtle ways
in which LGBTQ populations still continue to be "Othered" and compulsory heterosexuality is
maintained.
Word frequency counts, when combined with an analysis of discourse prosodies, may
allow for a corpus-based study of discourse in highly particular and specific ways, revealing
what might otherwise be overlooked. These two approaches, an analysis
of word frequency counts and an analysis of discourse prosodies, inform, supplement, and
enhance one another. Through applying both, corpus linguists could critically examine discourse
and study the ways in which it interpellates and forms particular parties. This combined
approach should help begin to address a critical need in sociolinguistic research and in society in
general, the "need for critical consciousness of the power of everyday language to shape
discourses of gender and sexuality" (Baker, 2008, p. 263). Baker notes there has been a dearth of
research applying corpus linguistics to the two areas of gender and language and queer
linguistics (2005, p. 6). In addition, the aforementioned corpus-based studies relied upon
corpora derived from texts in the United Kingdom. There does not appear to be a
substantial quantity of research in this area using corpora derived from the United States.
Preliminary Corpus Analysis through Google's N-Gram and COHA
One would assume, as society has become more lenient and liberal with regard to matters
of gender and sexuality, the use of the problematic term "homosexual" would have declined. If
this alleged pejorative positions the population in question in a harmful manner, its use should,
one would assume, inversely correlate with increasingly liberal social views. On the other hand,
“gay”, a term apparently more positive and holistic, may have grown in use. Since it represents a
trait rather than a behavior (Baker, 2005) and thus presents a less denigrating portrait of this population
(Gay and Lesbian Alliance Against Defamation, Inc., 2010), a growth in this term would not
constitute as much of an issue and problem for LGBTQ populations. A cursory analysis of
Google's N-Gram Viewer from the Google Books Project suggests that “homosexual” has fallen
out of use, while "gay" has come to occupy a place as the more common and more acceptable
term for indexing this population. The recent growth of "gay" as the discursive norm stands in
contrast to the year 1978, when it appeared in Google's corpus with approximately the same
frequency as "homosexual" (0.000740% for the former as opposed to 0.000745% for the latter).
The gap between these two terms expanded from 0.000005% to 0.00152% over the next
twenty years: Google's corpus indicates that "gay" surged to
0.002443% in terms of frequency, more than tripling in the frequency of instances, while
"homosexual" grew only slightly to 0.000923%. Since then, the use of the two terms has substantially
declined, with "gay" falling to a frequency of 0.002131% and "homosexual" to one of
0.000536%. In sum, in a span of thirty years these two terms have come to be increasingly
differentiated in the frequency of use, with one now coming to appear in this particular corpus
roughly four times as frequently as the other.
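The arithmetic behind these comparisons can be checked with a short sketch. The percentages are those reported in the text; the dictionary keys `"1978"`, `"peak"`, and `"latest"` are labels assumed here for the three points of comparison, since the text gives an exact year only for the first.

```python
# Percentages of the Google Books corpus as reported in the text.
# Keys are illustrative labels (assumptions), not years taken from the source.
gay = {"1978": 0.000740, "peak": 0.002443, "latest": 0.002131}
homosexual = {"1978": 0.000745, "peak": 0.000923, "latest": 0.000536}

gap_1978 = abs(gay["1978"] - homosexual["1978"])  # gap when the terms were near parity
gap_peak = gay["peak"] - homosexual["peak"]       # gap roughly twenty years later
growth_gay = gay["peak"] / gay["1978"]            # growth factor for "gay"
ratio_latest = gay["latest"] / homosexual["latest"]  # how many times as frequent

print(round(gap_1978, 6), round(gap_peak, 6), round(growth_gay, 2), round(ratio_latest, 2))
```

The computed gaps match the figures given above, the growth factor for "gay" is a little over three, and the final ratio comes to just under four.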
Table 1
Normalized Frequency (Per Million) for Terms since 1900 in COHA

Time Period    “Homosexual”    “Gay”    “Heterosexual”
1900           0               38.33    0.09
1910           0.04            34.80    0
1920           0.19            43.93    0.08
1930           0.73            38.78    0.37
1940           1.97            36.51    0.21
1950           2.32            27.79    0.53
1960           7.72            18.81    1.21
1970           8.27            17.22    1.39
1980           9.95            15.37    3.32
1990           4.97            31.57    3.08
2000           4.33            36.05    3.21

COHA, the Corpus of Historical American English, corroborates these findings on the
decline of "homosexual" in Google's corpus. Peaking at a normalized frequency of 9.95
instances per million words in 1980, the term fell over the course of the next two decades to a
frequency of 4.33 instances in 2000. After rising in the second half of the previous century, the
term appears to have entered into a process of gradually falling out of use. As it declines in
frequency, the term "homosexual" may eventually appear in the corpus with the same frequency
as "heterosexual". Table 1 displays the normalized frequency for both "homosexual" and
"heterosexual" in COHA over time. In the year 2000, the difference in frequency for the terms
was only 1.12 per million words. In contrast, "gay" has rapidly grown in use, according to
COHA. In 1980 "gay" appeared in the corpus with a normalized frequency of 15.37, but in 2000
that surged to 36.05, more than doubling in use.
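The observations drawn from Table 1 can be reproduced with a brief sketch. The values are copied from the table; the helper `peak_decade` and the variable names are introduced here only for illustration.

```python
# Normalized frequencies (per million words) by decade, copied from Table 1.
coha = {
    1900: {"homosexual": 0.00, "gay": 38.33, "heterosexual": 0.09},
    1910: {"homosexual": 0.04, "gay": 34.80, "heterosexual": 0.00},
    1920: {"homosexual": 0.19, "gay": 43.93, "heterosexual": 0.08},
    1930: {"homosexual": 0.73, "gay": 38.78, "heterosexual": 0.37},
    1940: {"homosexual": 1.97, "gay": 36.51, "heterosexual": 0.21},
    1950: {"homosexual": 2.32, "gay": 27.79, "heterosexual": 0.53},
    1960: {"homosexual": 7.72, "gay": 18.81, "heterosexual": 1.21},
    1970: {"homosexual": 8.27, "gay": 17.22, "heterosexual": 1.39},
    1980: {"homosexual": 9.95, "gay": 15.37, "heterosexual": 3.32},
    1990: {"homosexual": 4.97, "gay": 31.57, "heterosexual": 3.08},
    2000: {"homosexual": 4.33, "gay": 36.05, "heterosexual": 3.21},
}


def peak_decade(term):
    """Return the decade in which `term` reaches its highest frequency."""
    return max(coha, key=lambda decade: coha[decade][term])


peak = peak_decade("homosexual")
gap_2000 = coha[2000]["homosexual"] - coha[2000]["heterosexual"]
gay_growth = coha[2000]["gay"] / coha[1980]["gay"]
print(peak, round(gap_2000, 2), round(gay_growth, 2))
```

Because these are decade-level figures, a sketch like this can confirm the peak, the narrowing gap, and the growth factor, but it says nothing about variation within a decade.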
Though Google's corpus and COHA would appear to corroborate and support the theory
that language would index increasingly progressive social views and beliefs, using these corpora
to generalize about the relationship between discourse and culture is problematic. While
GLAAD would surely celebrate these findings, they are tenuous at best and call for a
richer and more thorough analysis of corpora. Google Books does not allow for a more nuanced
analysis by register or genre, and neither Google Books nor COHA considers spoken discourse or
disciplinary communities; it is very much possible that these specific discourse communities and
registers are interwoven with these broader frequency trends in variegated and myriad ways.
Such an analysis would help better identify, explore, and explain the intersections of particular
discourse communities and registers and how they are implicated within and involved with these
broader shifts over time. For this reason, the Corpus of Contemporary American English may
help to illuminate and explore these matters further.
Research Methodology
The preliminary analysis of corpus data from Google's N-Gram Viewer and the Corpus of
Historical American English served to inform and assist with the generation of research
questions and hypotheses prior to the use of the Corpus of Contemporary American English.
This corpus, containing 450 million words and spanning over two decades
through coverage of the years 1990 to 2012, was selected as the primary vehicle for this study
because of the affordances for analysis it provides. Besides allowing for a refined analysis by
registers, disciplinary communities, and genres, it also allows for a diachronic analysis across the
time period from which it has derived texts. This corpus-based analysis of data in COCA
emerged and began with the broad research question of the general variation and difference
between the terms "gay" and "homosexual" as they are represented and used in the corpus in
relation to the findings provided through Google's N-Gram Viewer and COHA. As the
researcher interacted with COCA further and utilized different functions available within it, the
following more refined research questions surfaced and came to guide the research.
RQ1. How are specific registers implicated in these recent shifts regarding the frequency
of instances of terminology for gay populations?
RQ2. How are specific disciplinary discourse communities implicated in these recent
shifts regarding the frequency of instances of terminology for gay populations?
RQ3. What different discourse prosodies for gay populations may emerge through the
collocational connotations for "gay" and "homosexual" respectively?
RQ3a. How may these discourse prosodies vary among discourse communities?
To answer RQ1, the "chart" function within COCA allowed for an analysis of variation
across registers over time. COCA provided raw and normalized frequency counts (per million)
that differentiated between registers and time periods. This allowed for an analysis of both
overall register variation and recent diachronic variation among registers. In addition to register
and diachronic variation, the "chart" function afforded the opportunity to access the actual
instances of use in their environments and undertake a qualitative analysis. For RQ2, the "list"
function within COCA was used with the relevant discourse communities selected through the
"sections" feature. Through doing this, both raw and normalized frequency counts for the use of
a particular term within a specific discourse community were obtained. Finally, for RQ3, the
"compare" function was used to examine the various collocations that surfaced between and
differentiated "gay" and "homosexual"; the collocation feature of the system was configured to
restrict results to particular parts of speech, such as adjectives and nouns. The analysis
sorted the collocational data both by relevance (how much more frequently a particular collocate
occurs with one term than with the other term) and by raw frequency.
As noted, frequency counts served as a means for exploring and answering the above
research questions. However, depending on and generalizing from these alone is a complicated
and problematic endeavor, for quantitative data alone does not index and indicate sufficiently
what discourse is doing. For example, it is very much possible that the more frequent use of
"gay" over time may serve to mark this population and perpetuate compulsory heterosexuality,
maintaining heterosexuality as the taken-for-granted and expected sexual orientation. However,
it is just as possible that the growth in the use of the term may derive from an increased
awareness in society of the needs of and injustices experienced by this population; higher
frequency counts in discourse, indicative of the drawing of societal attention to this group’s
marginalization and repression, may just as much and just as strongly suggest a drive and
movement toward social justice and marriage equality. The tenuousness of generalizing
from frequency counts renders a quantitative analysis insufficient on its own and necessitates the
addition of a qualitative analysis, conducted for each research question through the “KWIC”
function.
Though this qualitative analysis cannot possibly account for and examine all the texts, it may
begin to problematize and bring more nuance to the quick assumptions that a mere quantitative
analysis may enable and encourage. Therefore, both approaches are essential to an analysis of
corpora and contributed to the research undertaken.
Results
RQ1: Frequency by Register
Earlier the possibility that different registers may discursively approach and construct
sexuality in divergent and various ways was raised. Frequency counts provide a basis for
beginning to understand how these different registers may represent sexual orientation and
LGBTQ populations. Providing the normalized frequency counts for particular terms as they are
used across the five different registers in COCA, Table 2 indicates a substantially higher
frequency for the use of “gay” across registers. One particular register also emerges as featuring
higher frequencies: in general the academic texts sampled by COCA tend to use all of the terms
more frequently than other registers, with the only exception being “gay.” This particular term is
used most frequently in newspapers and spoken discourse.
Notably, the academic register contains the lowest ratio of “gay” to “homosexual.” This
indicates that within the entire corpus of academic texts in COCA “homosexual” and “gay”
appear with the least difference between them, in contrast to other registers, where the
difference is considerably wider. The discourse of newspapers is characterized by the greatest
ratio of “gay” to “homosexual.” In contrast to academic discourse, the register of fiction
demonstrates a tendency to use all these terms least. Because fiction may use avenues other
than explicit language to index sexual orientation and relies on the practice of showing rather
than telling, it was decided that this register merits a content analysis instead, given the
inherent limitations of a corpus-based discourse analysis.
Table 2
Register     “Hm”    “Hmy”   “Ht”    “Hty”   “L”     “G”     “Q”    “G”/“Hm”
Spoken       9.85    7.81    4.43    0.38    18.03   64.64   1.04   6.56
Fiction      2.28    0.81    1.73    0.25    8.37    18.59   5.51   8.15
Magazine     6.60    7.78    3.76    0.28    12.64   40.48   2.28   6.13
Newspaper    7.61    8.56    3.93    0.22    20.34   69.39   1.81   9.12
Academic     18.25   11.49   19.26   1.6     25.29   42.80   5.80   2.35
Notes: Throughout this table and other tables, codes indicate particular terms: “Hm” for
homosexual, “Hmy” for homosexuality, “Ht” for heterosexual, “Hty” for heterosexuality, “L”
for lesbian(s), “G” for gay, and “Q” for queer. Results for “lesbian” and “lesbians” have been
combined to account for the fact that “gay” may precede both singular and plural nouns, as it
generally operates as an adjective.
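
As an illustrative sketch, and not part of the original analysis, the final column of Table 2 can be recomputed from the normalized frequencies for “gay” and “homosexual.” Python is used purely for illustration; the dictionary layout and the function name are hypothetical, while the figures are copied from Table 2.

```python
# Normalized frequencies (per million words) for "homosexual" and "gay",
# copied from Table 2. The data layout and names are illustrative only.
TABLE_2 = {
    "Spoken": {"homosexual": 9.85, "gay": 64.64},
    "Fiction": {"homosexual": 2.28, "gay": 18.59},
    "Magazine": {"homosexual": 6.60, "gay": 40.48},
    "Newspaper": {"homosexual": 7.61, "gay": 69.39},
    "Academic": {"homosexual": 18.25, "gay": 42.80},
}


def gay_to_homosexual_ratio(row):
    """Divide the normalized frequency of "gay" by that of "homosexual"."""
    return row["gay"] / row["homosexual"]


ratios = {register: gay_to_homosexual_ratio(row) for register, row in TABLE_2.items()}
lowest = min(ratios, key=ratios.get)    # register with the narrowest gap
highest = max(ratios, key=ratios.get)   # register with the widest gap

for register, ratio in ratios.items():
    print(f"{register:<10} {ratio:.2f}")
print("lowest:", lowest, "| highest:", highest)
```

Under these figures the academic register yields the lowest ratio and newspapers the highest, in line with the discussion of Table 2.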
However, the data set in Table 2 is limited in that it excludes the term “straight,” as
searching for all instances of this term in its sexual-orientation sense is impractical given
the myriad denotations the word may carry. As with “gay,” the use of “straight” may avoid
certain connotations that may surface with the use of “heterosexual.” A more refined
analysis of this term was performed through an analysis of phrases referring to males of both
orientations. Shown in Table 3, this analysis indicated that “straight” was generally less used to
identify one’s sexual orientation than was “gay,” despite the fact that straight men are more
common in the population. Yet again, academic discourse features an exceptional ratio, this one
indicating that “gay” is far more frequently used than “straight.” Both Tables 2 and 3, when
combined, may support Baker’s contention that the frequency counts may indicate marked
populations and categories, with the sexual orientation of LGBTQ populations far more
frequently indexed in a linguistically explicit manner than for heterosexual populations.
However, these higher frequency counts may also derive from the fact that as society
consciously progresses and evolves on issues of gender and sexuality, a conversation about that
shift will inevitably arise within discourse.
Table 3
Normalized Frequency (Per Million) of Terms for Males in COCA
Register     “Gay Man”   “Straight Man”   “Gay Men”   “Straight Men”   Combined “Gay”/“Straight” Ratio
Spoken       1.48        0.58             2.61        0.24             4.99
Fiction      0.29        0.34             0.80        0.15             2.22
Magazine     0.70        0.43             3.66        0.39             5.32
Newspaper    1.45        0.41             4.49        0.29             8.49
Academic     0.71        0.10             7.50        0.31             20.02
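
The source does not state how the combined ratio in Table 3 was calculated. The sketch below assumes, as one plausible reading, that the two “gay” frequencies were summed and divided by the sum of the two “straight” frequencies; this assumption reproduces the reported values to within rounding. The function name and data layout are hypothetical.

```python
# Per-million frequencies from Table 3, in the order:
# ("gay man", "straight man", "gay men", "straight men").
TABLE_3 = {
    "Spoken": (1.48, 0.58, 2.61, 0.24),
    "Fiction": (0.29, 0.34, 0.80, 0.15),
    "Magazine": (0.70, 0.43, 3.66, 0.39),
    "Newspaper": (1.45, 0.41, 4.49, 0.29),
    "Academic": (0.71, 0.10, 7.50, 0.31),
}


def combined_ratio(gay_man, straight_man, gay_men, straight_men):
    """Assumed formula: summed "gay" phrases over summed "straight" phrases."""
    return (gay_man + gay_men) / (straight_man + straight_men)


combined = {register: combined_ratio(*freqs) for register, freqs in TABLE_3.items()}

for register, ratio in combined.items():
    print(f"{register:<10} {ratio:.2f}")
```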
Table 4
Normalized Frequency (Per Million) of Terms over Time in COCA

“Homosexual”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     4.66     2.13       2.62        4.56
1995-1999     1.95     1.19       1.66        4.23
2000-2004     1.26     1.56       1.36        3.56
2005-2009     0.81     0.76       0.72        2.80
2010-2012     0.79     0.92       0.79        1.89

“Heterosexual”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     2.05     1.13       1.15        2.13
1995-1999     0.75     0.92       0.68        2.47
2000-2004     0.58     0.79       0.79        4.46
2005-2009     0.34     0.32       0.52        4.81
2010-2012     0.71     0.62       0.69        6.28

“Lesbian(s)”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     7.85     3.71       4.96        1.59
1995-1999     2.91     3.74       4.38        4.41
2000-2004     2.56     1.72       3.13        5.18
2005-2009     1.63     1.23       3.1         5.02
2010-2012     3.41     2.6        3.01        12.17

“Gay”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     15.09    9.95       14.34       2.63
1995-1999     11.74    7.81       12.93       6.28
2000-2004     14.23    7.26       13.77       10.39
2005-2009     10.67    7.89       13.28       11.78
2010-2012     16.18    9.11       14.68       13.54

“Queer”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     0.24     0.85       0.28        0.16
1995-1999     0.13     0.26       0.27        0.91
2000-2004     0.39     0.58       0.39        0.44
2005-2009     0.19     0.30       0.60        1.85
2010-2012     0.06     0.23       0.15        3.53

“G”/“Hm”
Time Period   Spoken   Magazine   Newspaper   Academic
1990-1994     3.24     4.67       5.47        0.58
1995-1999     6.02     6.56       7.79        1.48
2000-2004     11.29    4.65       10.13       2.92
2005-2009     13.17    10.38      18.44       4.21
2010-2012     20.48    9.90       18.58       7.16
The analysis of registers within COCA so far has not accounted for diachronic variation
within and among registers. Besides qualifying and further illuminating the frequencies found
above, diachronic variation in the frequency counts within registers would help to uncover how
discourse is changing and evolving within particular communities. Society has significantly
changed since the earliest texts included in
COCA were composed, with acceptance of LGBTQ populations now becoming the norm. For
this reason, one may assume that individual registers have changed as well, with frequencies
rising or falling based on how their discursive formations are tracking the broader culture’s own
evolution. Table 4 presents the normalized frequencies for “homosexual,” “heterosexual,”
“lesbian(s),” “gay,” and “queer” over time across different registers. In addition, the table
documents the ratio of the normalized frequencies of “gay” to the normalized frequencies of
“homosexual.”
COCA indicates that the difference in frequency between “homosexual” and “gay” has widened
over the last two decades within all registers. In addition, both “homosexual” and “heterosexual”
have generally declined across registers. These two declines have not proceeded at the same
rate, however: “heterosexual” has declined less rapidly than “homosexual,” with the result that
the two are approaching almost equal normalized frequency counts. Though the academic register
has also demonstrated a decline in the frequency of “homosexual,” it distinguishes itself with the
term “heterosexual,” which has grown in usage in the corpus. The academic register likewise
deviates from other registers with the terms “lesbian(s)” and “queer”: here too the frequency of
the terms has grown, whereas in other registers they have declined and are used less frequently.
The term “gay” has blossomed in academic discourse, now being used with a frequency similar to
that of the other registers after being far less common in the 1990-1994 time period. Meanwhile,
other registers demonstrate overall stability with the term “gay.”
Academic discourse, in general, defies the typical patterns demonstrated by the other
registers, with use of terminology generally rising, with the exception of “homosexual.”
Alongside Google’s N-Gram Viewer and COHA, COCA indicates this term is in the process of
falling out of use not only generally, but also within the academic register. Any term other than
“homosexual,” however, appears to have experienced a rise in use in academic discourse and is
coming to be used with substantially more frequency in comparison to other registers. The
academic register, according to COCA, has developed differently from other registers in terms of
frequency counts for LGBTQ populations. It is the exception.
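
The diachronic contrast described above can be illustrated with a rough endpoint comparison. The sketch below is not part of the original analysis: it takes only the first (1990-1994) and last (2010-2012) values for “homosexual” and “heterosexual” from Table 4 and computes the percentage change for each register. The names are hypothetical, and an endpoint comparison ignores the intervening periods.

```python
# (1990-1994, 2010-2012) normalized frequencies copied from Table 4.
TABLE_4_ENDPOINTS = {
    "homosexual": {
        "Spoken": (4.66, 0.79),
        "Magazine": (2.13, 0.92),
        "Newspaper": (2.62, 0.79),
        "Academic": (4.56, 1.89),
    },
    "heterosexual": {
        "Spoken": (2.05, 0.71),
        "Magazine": (1.13, 0.62),
        "Newspaper": (1.15, 0.69),
        "Academic": (2.13, 6.28),
    },
}


def percent_change(first, last):
    """Percentage change from the first period to the last period."""
    return (last - first) / first * 100.0


changes = {
    term: {register: percent_change(*pair) for register, pair in registers.items()}
    for term, registers in TABLE_4_ENDPOINTS.items()
}

for term, registers in changes.items():
    for register, change in registers.items():
        print(f"{term:<13} {register:<10} {change:+.1f}%")
```

On these figures “homosexual” falls in every register, “heterosexual” falls more slowly in the spoken, magazine, and newspaper registers, and “heterosexual” rises only in the academic register.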
However, generalizing about all of academic discourse based on this data is problematic
for the reason that disciplines may diverge from one another in how they discursively construct
and represent sexual orientation. It is possible certain disciplines may account for the above
growth in frequency counts. An analysis of particular discourse communities is thus needed in
order to fully explicate and understand what is transpiring.
RQ2: Frequency by Disciplinary Discourse Community
Academic disciplines may articulate and construct discursive formations differently.
Using different frameworks for understanding reality, disciplines may demonstrate variation in
the necessity or nature of approaches to and representations of sexual orientation. Certain
disciplines may not experience as much need to draw attention to or discuss LGBTQ populations
due to the other concerns and issues that compel current research and conversations, a fact that
word frequency counts may reflect. Table 5 provides these normalized frequency counts within
particular academic disciplines for the same terms from Table 2, as they appear within COCA.
In addition, it provides the ratio of the normalized frequency of “gay” over “homosexual.”
Table 5
Normalized Frequency (Per Million) of Terms for Sexual Orientation by Discipline in COCA
Discipline                  “Hm”    “Hmy”   “Ht”    “Hty”   “L”     “G”     “Q”     “G”/“Hm”
Education                   4.98    3.98    13.03   0.42    20.55   44.48   0.53    8.93
History                     7.51    8.98    3.35    0.41    5.14    10.37   1.80    1.38
Geography/Social Sciences   35.48   17.68   52.29   3.52    72.87   96.97   8.41    2.73
Law/Political Science       4.77    1.86    7.21    0.12    5.82    25.35   0       5.31
Humanities                  17.27   14.76   16.01   2.52    16.85   40.33   23.06   2.34
Philosophy/Religion         55.04   30.12   29.82   3.86    26.56   64.09   1.93    1.16
Science/Technology          4.33    1.78    1.71    0.14    5.4     9.88    0       2.28
Medicine                    6.87    5.82    35.07   0       11.79   29.25   0.15    4.26
The data provided by this table illuminates how particular academic disciplines differ
from one another in the terms for LGBTQ populations that they may employ more frequently.
Religious discourse is characterized by a ratio of almost one for “gay” and “homosexual,” while
the ratios are higher for all other disciplines. Religious discourse also distinguishes itself
by using the words “homosexual” and “homosexuality” more frequently than other disciplinary
areas, with the social sciences following. The social sciences are an intriguing case, however,
for they also use “gay” more frequently than any other discipline and, like other areas within the
human sciences and liberal arts (education and law/political science), tend to use “heterosexual”
more than “homosexual,” though “gay” is still far more frequently used than either. Another
discipline that uses “heterosexual” more than “homosexual” is medicine. Medical discourse
distinguishes itself from other disciplines in that it uses “heterosexual” more than “gay.” Not to
be overlooked, the humanities are worth noting for their abundance of instances of “queer.”
This variation within academic disciplines has helped to identify the particular discourse
communities that have substantially contributed to the significantly higher frequency counts for
the academic register provided in Table 2. It is possible that magazine genres may follow the
patterns for variation found for academic disciplines, with some discussing and approaching
LGBTQ populations in highly particular ways different from others. Table 6 provides the
normalized frequency counts from COCA for magazine genres. Religious magazines, like
religious academic writing, are characterized by an exceptionally frequent use of the terms
“homosexual” and “homosexuality.” Religious magazines, like religious academic writing, are
also characterized by a low ratio of frequencies of “gay” to frequencies of “homosexual” in the
corpus. In addition, something that distinguishes the discourse of magazines from academic
discourse is the fact that there are more instances of “homosexuality” than of “homosexual”
within most genres, a pattern that corroborates the higher normalized frequency count for
“homosexuality” compared to “homosexual” for magazines in Table 2.
Table 6
Normalized Frequency (Per Million) of Terms for Sexual Orientation by Genre in COCA
Genre                “Hm”    “Hmy”   “Ht”    “Hty”   “L”     “G”      “Q”    “G”/“Hm”
News/Opinion         15.27   16.72   6.50    0.59    14.51   83.21    4.84   5.45
Financial            0.19    0.57    0.19    0.00    0.19    5.33     0      28.05
Science/Technology   4.43    7.91    3.96    0.08    6.1     14.80    0.40   3.34
Society/Arts         2.74    1.83    0.91    0.13    6.93    14.77    2.09   5.39
Religion             45.19   53.15   13.82   0.23    41.21   110.52   0.23   2.45
Sports               0.18    0.28    0.18    0.00    1.57    9.70     0.46   53.89
Entertainment        9.09    8.84    3.93    0.74    20.89   134.15   5.90   14.76
Home/Health          0.19    0.19    0.44    0.13    0.56    4.39     0.38   23.11
Women/Men            3.28    4.51    7.90    0.41    45.53   55.17    5.64   16.82
African-American     3.30    6.61    8.26    1.10    24.22   77.07    1.65   23.35
Children             0       0       0.61    0       0       6.73     6.12   NA
Terms other than “homosexual” and “homosexuality” should receive consideration. The
frequency counts for “gay” indicate that it is most frequent within the genre of entertainment,
with religious magazines following. “Heterosexual” is also more frequently used than
“homosexual” within magazines for women and men and for African-Americans, though “gay” is
far more frequently used than either. In addition, while there is an unsurprising dearth of
terminology involving “sexual” as the root word in children’s magazines, there are some
instances of “gay” and “queer.” An analysis of the textual environments from which these
instances came indicated that none of the occurrences indexed LGBTQ populations; all uses
functioned within older, more traditional meanings of the terms. The one remaining instance,
the single appearance of “heterosexual,” came from a magazine article addressing STD
transmission and intended for adults that was misplaced in the corpus. Children’s magazines, it
appears, do not index LGBTQ populations in a linguistically explicit manner.
RQ3: Discourse Prosodies
An analysis of a corpus of debates in Parliament identified divergent discourse prosodies for the
terms “gay” and “homosexual” (Baker, 2005). An analysis of collocations in the entire body of
COCA led to similar results. Rather than merely considering the collocations that occurred most
frequently, however, the analysis considered collocations in terms of relevance, that is, the extent
to which a particular collocation was used more frequently with either “gay” or “homosexual”
in comparison to the other term. Table 7 provides a listing of the right-hand noun collocations
(within no more than two words) for “gay” and “homosexual” that were indexed by COCA as
being relevant. Despite being grounded in relevance, this analysis did consider frequency: only
relevant collocations that were among the hundred most frequent collocations were selected for
analysis, as their higher frequency would ensure a fuller and more accurate representation of the
corpus.
Table 7
Discourse Prosodies Based on Relevant Noun Collocations (R2) in COCA

Discourse Prosodies for “Gay”
Fact of Existence: guy (21.5), lesbian (15.6), Americans (7.5), people (5.5), man (3.8),
guys (3.3), men (3.2), gene (2.3)
Political Activism & Power: pride (57.2), liberation (17.7), activist (8.5), movement (5.0),
activism (4.3), rights (3.9), advocates (2.2)
Community & Organizing: bar (50.8), bars (37.3), club (15.1), clubs (14.3), leaders (2.8),
organization (2.1), world (2.1)
Intersection with Other Roles: Republicans (13.2), member (10.2), head (9.8), Republican (7.9),
vote (7.9), troops (7.2), bishop (4.0), friends (3.2), soldier (2.2)
Minors & Access to Them: youths (42.0), adoption (10.9), youth (5.0), teenagers (3.6),
kids (3.4), students (2.6), student (2.4)
Media & Public Visibility: parade (32.4), film (12.8), festival (7.2), porn (6.4), movie (3.8),
newspaper (3.8), character (2.0)
Subject of Social Debate & Conversation: bashing (17.7), controversy (10.9), studies (8.1),
ban (3.8), issue (2.2)

Discourse Prosodies for “Homosexual”
Externally-Oriented Sexual and/or Transient Behavior: acts (818.3), conduct (542),
practices (318.8), activity (276.3), sodomy (201.9), relations (188.6), rape (148.8),
copulation (138.2), intercourse (116.9), contact (85), behavior (82.7), practice (74.4),
act (63.8), stimuli (63.8), experiences (42.5), activities (31.9), encounters (31.9),
encounter (17.7), experience (17.5), prostitution (15.9), affair (6.2), lifestyles (4.0),
lifestyle (3.9), lover (3.7), sex (2.4)
Relationships/Community: relationship (7.4), unions (4.7), relationships (4), love (3.9)
Internal Dispositions & Thoughts: inclination (180.7), feelings (148.8), orientation (101),
tendencies (95.9), desires (95.6), fantasies (74.4), attraction (53.1), behaviors (53.1),
fantasy (53.1), desire (47.8), inclinations (26.6), identity (2.1)
Power & Visibility: lobby (12.8), subculture (7.1), themes (6.4), agenda (3.2)
Something Suffered: stigma (393.2), prejudice (74.4), nature (63.8), condition (26.6),
panic (13.3), victims (12.4), victim (8.9)
Intersection with Other Roles: persons (101), subjects (53.1), photographer (26.6),
teacher (21.3), teachers (9.3), prostitute (4.6), citizens (3.8), sailor (3.8), child (2.2),
males (2.2), priest (2.1)
Note: “Cowboy” was placed underneath the discourse prosody category of “Media and Public
Display” due to the fact that most of the instances of the term came from discussions of the film
Brokeback Mountain. “Games” was excluded from the data as it references the proper noun Gay
Games.
The numbers following each term indicate the score for it identified by COCA; this score
accounts for the differential frequencies of the terms in the corpus and normalizes them in the
ratios of use. “Gay” occurs 5.31 times as often as “homosexual” in the corpus, and, conversely,
“homosexual” occurs 0.19 times as often as “gay.” For example, there are 154 raw instances of
“acts” following “homosexual,” in contrast to only one instance of “acts” following “gay.”
However, as there are only 0.19 as many instances of “homosexual” as there are instances of
“gay,” the normalized ratio for this collocation, according to COCA, is actually 818.3. Thus,
COCA indicates that, relative to the overall frequency of each term, homosexual acts is far more
frequent in discourse than is gay acts. This particular collocation may align with others and
thus constitute a pattern, indicating a discourse prosody. These collocations then underwent
analysis to identify possible patterns and discourse prosodies. Through a qualitative analysis,
similar and synonymous terms were combined. The relevant collocations were thus sorted into
categories that indexed particular discourse prosodies that seemed present in the data. The
particular discourse prosodies that emerged are indicated in Table 7.
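
The arithmetic behind the “acts” example can be sketched as follows. This is an illustration inferred from the worked example in the text, not a description of COCA's internal calculation: it assumes the score is the ratio of raw collocation counts divided by the ratio of the two terms' overall frequencies. The function and constant names are hypothetical.

```python
def relevance_score(count_with_term, count_with_other, term_share_of_other):
    """Assumed score: ratio of collocation counts, divided by how often
    the term occurs overall relative to the other term."""
    return (count_with_term / count_with_other) / term_share_of_other


# "Gay" occurs about 5.31 times as often as "homosexual" (figure from the text),
# so "homosexual" occurs about 1 / 5.31 (roughly 0.19) times as often as "gay".
GAY_PER_HOMOSEXUAL = 5.31

# 154 instances of "homosexual acts" against 1 instance of "gay acts".
score = relevance_score(154, 1, 1 / GAY_PER_HOMOSEXUAL)
print(round(score, 1))
```

With the rounded figure of 5.31, the sketch yields a value of roughly 817.7, close to the 818.3 reported by COCA; the small gap presumably reflects rounding in the 5.31 figure.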
According to the collocational environments indexed as relevant by COCA, the
discourse prosodies for these two terms differ qualitatively from one another. For “homosexual”
the categories that developed through the qualitative analysis are as follows: externally-oriented
sexual and/or transient behavior, relationships and community, internal dispositions and
thoughts, power and visibility, something suffered, and intersections with other roles. The
discourse prosodies for “homosexual” tend to emphasize sexual behavior, often indexing shame,
transience, or secrecy. Besides actual behavior, “homosexual” often collocates so as to express
lust and internal dispositions. In addition, “homosexual” often collocates in such a way as to
connote a political threat, as is the case of homosexual agenda. Another discourse prosody that
appears is one indexing suffering, as if sexual orientation were a burden upon one’s being and
thus a justification and call for compassion, as is instantiated with homosexual condition. One
particular collocation that is of note is “persons,” whose ratio when compared to gay persons is a
significant 101: homosexual persons is much more frequently used across the texts sampled
within the corpus.
Meanwhile, for “gay,” the categories that emerged are as follows: fact of existence, political
activism and power, community and organizing, intersections with other roles, minors and access
to them, media and public visibility, and being a subject of social debate and conversation.
While the noun collocations for “homosexual” may proceed in a direction of salacious
connotations, noun collocations for “gay” may focus on empowerment, belonging, and visibility.
For the term “gay,” noun collocations appear to often align sexual orientation with other areas of
identity, such as being American. These collocations index the fact that gay individuals often
belong to other identity categories besides their sexual orientation.
Discourse prosodies for “gay” do more than indicate that there is more to identity than
sexual orientation. In addition, the word often appears before nouns indexing political
aspirations or community. Additional discourse prosodies index the fact that young LGBTQ
populations are present in our schools, denoting the reality that there exist adolescents within our
communities who identify as such. LGBTQ individuals are present and visible in our schools,
but also within the rest of society and the media, as another discourse prosody indicates. This
discourse prosody connotes the fact that there is general public visibility of this community,
especially in the media through such avenues as films and characters in texts, alongside a
representation in porn available on the internet and elsewhere. One additional discourse prosody
involves the fact that this population remains the subject of social controversy and conversation,
in addition to academic study and analysis. These discourse prosodies are generally more
positive.
Needless to say, these categories for discourse prosodies in association with both of these
words are not absolute with firm boundaries; they are tenuous with overlap among them. Some
terms, such as “student” or “Americans,” could fall within multiple categories for “gay.” There
were some curious items, such as “world,” whose specific instances in the corpus underwent
analysis to determine and clarify the context of their use to place them accordingly. Besides the
nouns that follow instances of “gay” and “homosexual,” though, the adjectives that precede them
may also give rise to particular discourse prosodies.
Adjectival collocations are far less frequent in the corpus: while the first right-hand noun
collocation for “gay” (men) has a raw frequency of 1804, the first left-hand adjectival collocation
for “gay” (other) has a raw frequency of only eighty-four (the actual most frequent adjective
collocating with “gay” in the corpus, National, was removed from the data set, as it references
the National Gay and Lesbian Task Force). Similarly, the most occurring right-hand noun
collocation for “homosexual” is acts, with a raw frequency of 154, while the raw frequency for
the most occurring left-hand adjectival collocation (male) is only forty-seven. Due to the smaller
sample of instances of adjectival collocations in the corpus, identifying a normalized ratio may
not accurately represent language and discourse in society.
Nevertheless, an analysis of the raw frequency counts for the most occurring collocations
may still indicate particular discourse prosodies that may occur with these two terms. Table 8
provides the most occurring adjectival collocations for “gay” and “homosexual.” The adjectival
collocations are consistent with the collocations found in Table 7 with regard to the connotations
of the words indexed and the discourse prosodies that arise. Adjectives that precede “gay” often
indicate relationships to other identity categories, such as race or age. “Homosexual,” on the
other hand, often follows adjectives that indicate whether or not one has acted upon one’s sexual
orientation; like the noun collocations, these adjectival collocations tend to indicate external
behavior or internal tendencies one experiences. The discourse prosodies that these adjectival
collocations indicate are congruent with the discourse prosodies found through noun
collocations, suggesting overall that these particular terms of “gay” and “homosexual” carry
divergent connotations and discursively mark this population in different manners.
Table 8
Raw Frequencies for Adjectival Collocations (L1) in COCA
“Homosexual”   Frequency   “Gay”      Frequency
Male           47          Other      84
Female         13          Young      83
Closeted       10          Black      69
Active         8           American   41
Practicing     7           Older      33
Open           7           Largest    29
Consensual     6           Local      28
Only           5           Modern     23
Passive        5           White      23
Repressed      5           Closeted   19
One should begin to interpret and analyze this collocational data with some caution in
mind, however. As noted earlier, generalizing about language and discourse based on this data is
a problematic and tenuous enterprise due to the fact that different registers, disciplines, and
genres may contribute differently to the overall corpus. An analysis of collocations and their
ensuing discourse prosodies within disciplines is perhaps necessary, but also problematic in that
there is a small quantity of instances provided by the corpus. Due to the small number of texts
sampled and the low frequency of instances overall within disciplines, finding normalized ratios
through the “score” function is problematic. Therefore, the analysis of frequency counts and
normalized ratios, while tempting, should take place with caution and prudence. The corpus may
not adequately and accurately represent these specific discourse communities as a whole.
Nevertheless, it is possible that some disciplinary variation in collocations may exist, variation
that COCA may indicate despite the aforementioned limitations.
Table 9
Ten Most Occurring Noun Collocations (R2) by Raw Frequency by Discipline in COCA

“Homosexual”
Religion/Philosophy   Social Sciences    Medicine
Acts - 52             Stigma - 36        Men - 8
Teacher - 36          Activity - 24      Man - 4
Persons - 19          Behavior - 18      Adolescents - 2
Teachers - 18         Men - 17           Intercourse - 2
Activity - 16         Community - 15     Relationships - 1
Orientation - 13      Experiences - 14   Propensity - 1
Men - 12              Experience - 13    Practices - 1
Couples - 8           Friend - 11        Patient - 1
Relations - 8         Stimuli - 11       Partnerships - 1
Males - 7             Relations - 9      Identity - 1

“Heterosexual”
Religion/Philosophy   Social Sciences      Medicine
Couples - 20          Men - 79             Women - 93
Men - 16              Women - 49           Men - 31
Candidates - 13       Activity - 22        Transmission - 15
Women - 10            Couples - 20         Youths - 12
Relationships - 5     Relationships - 17   Counterparts - 7
Parents - 5           Intercourse - 15     Participants - 5
Marriage - 5          Users - 15           Intercourse - 5
Jobs - 5              Contact - 13         Peers - 4
Homosexual - 4        Groups - 13          Individuals - 4
Person - 4            Group - 12           Sex - 3
Table 9 provides the raw frequency counts for noun collocations for “homosexual” and
“heterosexual” for the disciplines of religion/philosophy, geography/social sciences, and
medicine, as they are sampled within COCA. These particular disciplines were selected based
on the higher normalized frequency counts reported earlier. As part of this analysis, the term
“heterosexual” was selected in favor of “gay” to better identify how disciplines may distinguish
gay individuals from heterosexual ones in their discourse prosodies, due to the lower frequency
counts.
Due to the limitations of this corpus in terms of the number of texts sampled and resulting
low frequency counts for these collocations, forming general discourse prosodies is not possible.
For example, all of the instances of homosexual teacher and homosexual teachers in religious
discourse derive from one particular text and thus cannot speak for all of religious discourse.
However, there seems in general to be equality in the representation of both sexual orientations,
with sexual activity and general relationships and belonging indexed through collocations across
disciplines. Religious discourse, however, does differentiate itself in that, while it does draw
attention to the sexual behavior of gay individuals through its noun collocations, it does not
similarly highlight the sexual behavior of heterosexual individuals. The “top ten” list for
“heterosexual” in religion/philosophy is the only one lacking any reference to sexual activity.
The first noun collocations serving such a function are heterosexual contact, heterosexual
intimacy, and heterosexual acts, each with only two instances in religious academic writing, in
contrast to the fifty-two instances of homosexual acts.
Table 10
Ten Most Occurring Noun Collocations (R2) by Raw Frequency for Religious Magazines
“Homosexual” | Count | “Heterosexual” | Count
Persons | 17 | Marriage | 5
Inclination | 10 | People | 5
Behavior | 10 | Person | 3
Acts | 7 | Marriages | 3
Priests | 7 | Priests | 2
Tendencies | 6 | Women | 1
Person | 5 | Unions | 1
People | 5 | Spouses | 1
Orientation | 5 | Relationships | 1
Men | 4 | Couples | 1
The question then surfaces as to whether religious discourse in general draws attention to
and marks the sexual activity of gay individuals, while ignoring and leaving unmarked and
unremarked the sexual activity of heterosexual ones. Table 10 contains the ten most occurring
noun collocations for “heterosexual” and “homosexual” within religious magazines. Two of the
top ten most occurring nouns collocating with “homosexual” align with the discourse prosody of
externally-oriented sexual and/or transient behavior, while another three align with the discourse
prosody of internal dispositions and thoughts. Yet again there is an absence of any noun
collocation denoting sexual activity with “heterosexual.” One should also note that “persons” is
especially frequent as a noun collocation with “homosexual” but not with “heterosexual,” as is
also the case with religious academic writing, according to Table 9. While this plural noun is
frequent with the discussion of gay individuals, it does not appear frequently in the discussion of
heterosexual individuals.
Discussion
Evolving Heteronormativity in Discourse
This study took as its theoretical basis the discursive and cultural pervasiveness of
compulsory heterosexuality: heterosexuality is unmarked, expected as the norm, and
considered normal, if not inevitable. The corpus data suggested that the tenacity of
compulsory heterosexuality may still linger: LGBTQ populations generally remain marked and
the explicit subject of much more frequent conversation in society than heterosexuals. However,
as noted earlier, it is problematic to interpret higher frequency counts as being indicative of
social injustice and differential treatment of groups, because, besides indicating markedness,
higher frequency counts may also indicate attention is being drawn within society to the
experience and marginalization of a particular group. The higher frequencies found for academic
discourse may support the latter interpretation. As society embraces more accepting views on
sexual orientation, the myriad implications of this shift in media, policy, and society are going to
result in increased presence of the topic in discourse.
The three research questions sought to analyze the register, disciplinary, genre, and
collocational variation for the two terms of “gay” and “homosexual.” The latter term has been
seen as problematic and as a pejorative in both the popular and the academic literature. Though
“homosexual” appears to have entered a process of falling out of use across all registers, it
appears it may not be alone: other terms, such as “lesbian” and “heterosexual,” have generally
declined, though they have experienced a slight rise in frequency since 2010. Nevertheless, it
appears terminology for sexual orientation has generally been falling over time according to the
normalized frequency counts, with the exception of “gay.”
The use of “gay” has for the most part remained stable over time across most of the
registers, coming to be used with more frequency in comparison to the declining “homosexual,”
although “gay” has come to acquire increased frequency over time in the academic register. In
addition, the academic register has come to use all of these terms, besides “homosexual,” much
more frequently over time, surpassing other registers in terms of frequency and demonstrating
substantially higher use of the terms. Especially interesting is the growth of “heterosexual” in
contrast to the decline of “homosexual” within the academic register.
The Curious Case of Academic Discourse and Disciplines
Interpreting these results for academic discourse is generally problematic, for conclusions
are tenuous and mostly based on intuition. One could argue that this diachronic variation in
frequency is either due to academic discourse coming to discuss matters of sexuality more
frequently or due to a general increased tendency toward explicit indication of all sexual
orientations in the literature. For example, the context for one particular recent token was this:
“This commission of inquiry, albeit unofficial and inchoate, was a political and legal response to
the escalating fears expressed by Indian Agents, missionaries, and colonial bureaucrats regarding
heterosexual contacts [italics added] between European men and aboriginal women” (Mawani,
2010, pp. 487-488). Collocational environments such as these imply the possibility that discourse
is coming to explicitly indicate any sexual orientation, including in the absence of contextual
reference to homosexuality. However, it is also possible that this variation may derive from
specific disciplines and not reflect academic discourse as a whole.
The results indicated variation among academic disciplines with regard to the frequency
of particular tokens. The variation among frequency counts for terms across disciplines may
index different discourses for sexual orientation and sexuality across these. All disciplines
demonstrate heteronormativity, with generally higher frequencies for terms for LGBTQ
populations than for heterosexual ones. However, there is stark variation among them in how
this occurs. Important to note is that within the disciplines of education, geography/social
sciences, law/political science, and medicine more instances of “heterosexual” surface in the
corpus when compared to “homosexual,” though “gay” is still a far more abundant term across
these. There is one interesting exception to this pattern, however: medicinal discourse.
Within this particular discipline, as represented in the texts sampled within COCA, the
terms “heterosexual” and “gay” appear with relatively similar frequency. Medicinal discourse
differentiates itself from the other disciplines in that there are relatively equitable normalized
frequency counts for these two terms. What this indicates about medicinal discourse is open to
interpretation. However, an analysis of the tokens through the KWIC function indicated “gay”
and “heterosexual” often occur near each other or together in research analyzing both
sexual orientations. Medicinal research, it appears, may incorporate and consider both sexual
orientations in its research designs, comparing and contrasting the two.
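The kind of KWIC (key word in context) check described above can be approximated in a few lines of Python. This is a minimal sketch, not a reproduction of the COCA interface: the function names `kwic` and `co_occurrences`, the five-token window, and the sample text are all hypothetical and chosen only for illustration.

```python
import re


def kwic(text, node, window=5):
    """Return (left context, node, right context) for each token equal to `node`."""
    # Crude tokenization: lowercase alphabetic strings only (an assumption).
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append((left, tok, right))
    return lines


def co_occurrences(text, node, other, window=5):
    """Count concordance lines for `node` that contain `other` within the window."""
    return sum(
        1
        for left, _, right in kwic(text, node, window)
        if other in left or other in right
    )


# Invented sample text, NOT drawn from COCA.
sample = (
    "The survey compared gay and heterosexual men on reported stress. "
    "Heterosexual participants and gay participants completed the same scale. "
    "A separate section discusses gay community organizations."
)

total_lines = len(kwic(sample, "gay"))
paired_lines = co_occurrences(sample, "gay", "heterosexual")
print(f"{paired_lines} of {total_lines} concordance lines pair the two terms")
```

Note that this sketch lets the context window cross sentence boundaries; a stricter analysis would segment sentences first.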
The Great Flood of “Homosexual*” in Religious Discourse
Besides medicinal discourse, especially important to note is philosophical and religious
academic writing, which used 19.56 more instances of “homosexual” per million than the
immediately following discipline in terms of normalized frequency, the social sciences.
Religious discourse tends to use the term “homosexual” with about the same frequency as “gay,”
according to the “G”/“Hm” ratios provided in Tables 5 and 6. Additionally, religious discourse,
according to COCA, uses “homosexuality” more frequently: 12.44 more instances of
“homosexuality” per million appear for religion than for the social sciences. If the various forms
of “homosexual” are problematic, as both Baker (2005) and GLAAD (2010) have argued,
religious discourse is characterized by an excessive use of these problematic terms.
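The per-million figures and “G”/“Hm” ratios discussed above rest on simple normalization arithmetic. The following is a minimal sketch in Python, assuming that the ratio is the normalized frequency of “gay” divided by that of “homosexual”; the discipline labels, sub-corpus sizes, and raw counts are hypothetical placeholders and are not COCA figures.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw frequency to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Hypothetical sub-corpus sizes (in words) and raw counts, for illustration only.
disciplines = {
    "discipline_a": {"size": 5_000_000, "gay": 200, "homosexual": 180},
    "discipline_b": {"size": 8_000_000, "gay": 640, "homosexual": 160},
}

# Normalized frequency of each term within each discipline.
rates = {
    name: {term: per_million(d[term], d["size"]) for term in ("gay", "homosexual")}
    for name, d in disciplines.items()
}

for name, r in rates.items():
    # Assumed definition of the "G"/"Hm" ratio: gay per million / homosexual per million.
    ratio = r["gay"] / r["homosexual"]
    print(f"{name}: gay={r['gay']:.2f}, homosexual={r['homosexual']:.2f}, G/Hm={ratio:.2f}")

# Difference in per-million frequency of "homosexual" between the two disciplines.
gap = rates["discipline_a"]["homosexual"] - rates["discipline_b"]["homosexual"]
print(f"gap: {gap:.2f} more instances per million in discipline_a")
```

A ratio near 1 corresponds to the pattern reported for religious discourse, in which the two terms appear with about the same frequency; larger ratios indicate that “gay” predominates.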
Understanding why these higher frequency counts or the generally similar quantity of use
of “gay” and “homosexual” exist in religious discourse is a challenging enterprise; such a
qualitative pursuit would depend heavily on intuition and conjecture. One possibility that some
may propose is that, if “homosexual*” connotes sexual activity, religious discourse in its higher
frequency counts betrays an unusual and hypocritical preoccupation with the activities in which
gay individuals partake in the bedroom. Freudian reaction formation offers a tempting and
possible explanation, considering the research indicating that homophobia is associated with
reaction formation (Weinstein et al., 2012). Reaction formation is a “process of adopting values
or beliefs, or engaging in behaviors, that are in opposition to feelings or impulses experienced
within oneself that are deemed unacceptable” (Weinstein et al., 2012). The analysis of
noun and adjectival collocations within religious writing (Tables 9 and 10) suggests there is more
interest in same-sex sexual activity than in opposite-sex sexual activity in religious discourse.
Such an interpretation, while possible, is also convenient and overlooks the nuances and
complexity of religious discourse.
The fact of the matter is that religious discourse naturally considers matters of the human
condition, ethics, and behavior extensively. For this reason, it may inform opposition to gay
rights. Much of the opposition to the gay rights movement uses arguments based in religion,
contending that homosexuality is a sin, an unnatural act, and an “objective disorder”
(Congregation for the Doctrine of the Faith, 1986); stronger religiosity has been associated with
opposition to gay marriage (Newport, 2012). In this regard, the higher frequencies found for the
use of “homosexual” reported are to be expected: if this behavior and this identity are considered
wrong by a particular person or discourse, a more pejorative and negative term may come to be
used more frequently. Religious communities are especially concerned about the ethics of sexual
behavior, something that may contribute to the use of these terms.
One cannot assume either that most if not all of the instances of “homosexual” originate
from those expressing opposition to the gay rights movement, as intuitive as such an assumption
may at first appear. A KWIC analysis of the instances of “homosexual” in this discipline
indicated a significant quantity of the instances, numbering in the dozens, came from The
Humanist in texts expressing fervently strong support for the LGBTQ community and fiercer
opposition to the role religion itself has come to play in society. Though supporting gay rights,
The Humanist frequently utilized the problematic term “homosexual” in its own writing, perhaps
reflecting the concern with sexual ethics that may pervade religious writing.
Queer Collocations and Discourse Prosodies as a Site of Resignification
This study indicated that religious discourse tends to use the term “homosexual” more
frequently than other discourse communities. Though some would argue that this term is
problematic in that it explicitly emphasizes sexuality at the expense of other components of the
gay experience, it becomes especially problematic through its discourse prosodies. The
connotations that arise through the nouns and adjectives with which “homosexual” collocates are
salacious at best, indexing external and perhaps temporal behavior or prurient internal
inclinations. The lasciviousness with which “homosexual” comes to be associated through these
discourse prosodies positions these individuals as excessively preoccupied with and engaged in
sexual activity at best and as perverts at worst.
In addition, “homosexual” may collocate in such a way as to represent and construct this
population as a threat possibly impinging upon the status quo of the traditional world we have
known, despite the fact that another discourse prosody for this term encourages us to perhaps
take pity upon and have compassion and mercy for this population. The discourse prosodies for
“homosexual” conflict with and contradict one another, articulating homosexuality as
something suffered but also something shameful and criminal. One curious collocation that
appears more frequently with “homosexual” is homosexual persons. This plural noun is rare in
language, serving a function similar to the much more frequent “people”: “people” appears in
COCA 786,082 times, compared to 15,608 times for “persons.” Why this specific plural noun
receives significantly more use in this context is a valid question, as the word carries rather cold
and impersonal connotations. Altogether, all of these discourse prosodies do not endow a
positive image and connotation upon this population.
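The contrast between “persons” and “people” can be made concrete with a short calculation. The sketch below uses only figures reported in this paper: the COCA totals cited above and the Table 10 counts for religious magazines. The variable names are illustrative, and because the two pairs of counts come from different samples (the whole corpus versus one magazine genre), the comparison is indicative rather than a formal test.

```python
# COCA-wide totals cited in the text.
people_coca = 786_082
persons_coca = 15_608

# Table 10 (religious magazines): noun collocates of "homosexual".
homosexual_persons = 17
homosexual_people = 5

# How many times more frequent "people" is than "persons" across COCA.
people_per_persons_overall = people_coca / persons_coca

# How many times more frequent "persons" is than "people" after "homosexual".
persons_per_people_after_homosexual = homosexual_persons / homosexual_people

print(f"COCA overall: {people_per_persons_overall:.1f} 'people' per 'persons'")
print(
    "After 'homosexual' (religious magazines): "
    f"{persons_per_people_after_homosexual:.1f} 'persons' per 'people'"
)
```

In other words, “people” outnumbers “persons” by roughly fifty to one in COCA as a whole, whereas after “homosexual” in religious magazines the preference is reversed, with “persons” more than three times as frequent as “people.”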
In contrast, the analysis of connotations for “gay” led to different discourse prosodies,
ones that are far more positive and ameliorative in nature. This finding aligns with previous
research performed by Baker (2005, p. 44). These discourse prosodies, one could contend,
indicate rather mediocre, mundane, and monotonous realities: the fact that gay (young) people
exist, the fact that these people belong to other identity categories and are more than their sexual
orientation, and that they belong to and are visible within local and wider communities. Rather
than focusing on one’s sexuality, it seems “gay” emphasizes one’s humanity. Based on the
corpus data and this analysis of collocations and discourse prosodies afforded by COCA, the Gay
and Lesbian Alliance Against Defamation is certainly justified in expressing concerns about how
the term “homosexual” may depict and connote this population.
Needless to say, GLAAD would thus take much satisfaction in the decline in
“homosexual” reported across different corpora and registers. Rather than discursively indexing
what one does, culture is transitioning toward discursively indexing who one is. “Gay” marks
who one is, while “homosexual” marks what one does. “Gay” provides “an alternative, more
positive conceptualisation of homosexuality” (Baker, 2005, p. 2001). By removing the linguistic
and connotational emphasis on how these people’s sexuality is performed, “gay” may provide a
more flexible, more malleable, and more open basis for identifying and describing this
population, not entailing and indexing particular practices in a negative and judgmental way.
New rooms are opening as a result. This population may come to inhabit a room of its own at
last in language.
Conclusion
One could thus argue that the evolving discursive practices adopted and used by society
to describe LGBTQ populations may reflect a process of resignification already underway, with
new signifiers and new signifieds coming to index this population in the freeplay of signs. New
discursive formations are coming to be adopted and represented, created and constructed,
embraced and welcomed. These repetitious shifts in language may reflect a shift in how LGBTQ
populations are interpellated and identified in culturally intelligible ways within discourse and
culture. “Indeed, to understand identity as a practice and as a signifying practice, is to
understand culturally intelligible subjects as the resulting effects of a rule-bound discourse that
inserts itself in the pervasive and mundane and signifying acts of linguistic life” (Butler, 2006, p.
198). The mundane act of using “gay” rather than “homosexual” in discourse will repeat and
reinforce the trends already underway and aid in the discursive positioning of this population in a
rejuvenated and more holistic manner.
However, discourse is not monolithic. Particular discourse communities may
demonstrate variation and diverge from one another in how they treat sexuality. This study has
indicated that to generalize about discourse as a whole will overlook myriad conceptual
differences. The findings on academic register and particular disciplines within it indicate that
there exist variegated and multiple discourses surrounding sexuality, with various processes for
conceptualizing and understanding sexuality and relationships. For example, medicine
demonstrated a discursive tendency toward equal representation and analysis for both sexual
orientations. On the other hand, one particular discourse community that raises many questions
and concerns is the religious discourse community, an area that merits further analysis. Still,
generalizing about discourse communities from this data is problematic for a number of reasons.
Research Limitations and Directions for Future Research
This corpus-based study has sought to explicate and interpret some of the findings in the
results. While these interpretations seek to support and substantiate themselves through
evidence, they do demonstrate an inevitable issue with corpus linguistics: corpora do not explain
and make sense of themselves (Baker, 2005, p. 36). Though the data provided by the corpus
would have framed and informed such interpretations, inevitably prejudices and biases on my
part would have motivated and affected the conclusions made. Individual philosophies,
attitudes, and experiences undoubtedly may shape how the corpus data gains meaning and
significance. I concede that my own experiences and friendships with LGBTQ individuals have
led me to empathize with and advocate for this population. In addition, this may have led me to
frame and interpret the data provided by the corpus in a particular way in this discussion. Were I
instead to object to LGBTQ individuals and their lifestyles, highly different interpretations
would have perhaps emerged. This incontrovertible reality about the analysis of corpus data
raises the possibility that certain patterns or trends have been overlooked and ignored, while
others have been pursued and discussed thoroughly. Analyzing a corpus poses a challenge in
and of itself, but another issue is the corpus selected.
This study depended on COCA for all the data found. However, it is possible that the
analysis of particular disciplines, especially for their collocational environments, is not
representative of the disciplines as a whole, due to the number of texts sampled by COCA. This
study sought to analyze whether different discourse prosodies appear within particular discourse
communities. Low frequencies prevent firm generalizations and instead serve as a basis for the
creation of additional corpora, containing more texts from particular academic disciplines and
magazine genres. Through an analysis of these, increased accuracy and clarity with regard to
unique disciplinary discourse prosodies for sexuality may emerge. In addition, such a corpus
would perhaps prove substantive enough to allow for a broader analysis of diachronic variation
among disciplines.
By relying on COCA, this study also provides a basis for examining the discursive
variation in the semiotics of sexuality only in the United States. Geographic variation in the
representation of LGBTQ populations within other countries and languages did not take place
with this particular study, but constitutes a fertile opportunity for analyzing how different
cultures discursively form and frame sexual orientation. A cursory analysis of information
provided by GloWBE indicates that Canada may tend to use terminology for sexual orientation
less than other Western English-speaking countries, but forming generalizations based on this
fact alone is problematic in that much of the online discourse for the United States sampled in
GloWBE may involve the ongoing legalization of same-sex marriage across the country.
Meanwhile, Canada has had legal same-sex marriage nationwide for almost ten years now. This
difference may account for the different frequency counts, as each country is situated in different
sociopolitical contexts. Nevertheless, future studies could analyze how countries are different
from or similar to one another in how they convey sexuality in language.
Besides accounting for geographic variation around the world, additional studies could
develop corpora that consider the age or political orientation and views of the producer of the
discourse. Polls have indicated that people younger in age tend to support gay rights more than
people older in age; possibly their discourse will reflect this attitudinal difference and merits
analysis. For this reason, political views should also inform the design of future research and
construction of specialized corpora into this matter.
In sum, this study was limited in terms of the social variables that it considered,
examining only register, discourse community, and (recent) time period. There are a number of
social variables not included and analyzed in this study that should guide future research.
Besides social variables, there are also additional linguistic variables. This study did not account
for and include all possible terminology available and groups within the LGBTQ community.
This study was male-centric in that, while it included “lesbian” in its analyses, it did not explore
the term in depth and detail, something that future studies should pursue. Additionally, future
research should analyze language for bisexual, pansexual, asexual, transsexual, and genderqueer
populations. This study, it is recognized, suffers from a bias toward and privileges monosexual
gay male identity as the norm for the LGBTQ community and thus has not sufficiently delved
into how discourse forms these other, more polysexual groups. Language may treat and signify
these specific populations in ways qualitatively different from those identified here. Research
should consciously and deliberately focus on these discursive formations.
As queer individuals come to be included in society, it is perhaps inadequate to limit a
corpus analysis solely to how sexual orientation is represented. As Baker explains,
“Queer…does not want to be ‘tolerated’ or assimilated. Queer is against the ‘normal,’ not the
heterosexual; and because of this, it is able to transcend categories” (Baker, 2008, p. 193). As
LGBTQ populations come to be included in society and new understandings of sexual
orientation emerge, new understandings of gender may also emerge. Heteronormativity, the
tenacious means by which compulsory heterosexuality discursively perpetuates itself, contains
implications for and constructs both gender and sexual orientation. Though this study has
suggested that heteronormativity remains common if not prevalent and pervasive, the increasing
inclusion of the queer community within society may eventually help interrogate and challenge
its hold and relevance. Language and discourse may correspond with this evolution, reflecting
an increase in the use of gender-inclusive and gender-neutral language, besides language that
does not betray gendered stereotypes and expectations. Some research into this area (Baker,
2010) has indicated that this process is beginning to take place.
Many possibilities have been raised for corpus-based studies. Such analyses of corpora
may present information about trends and changes in language, but they overlook the fact that
language may not explicitly indicate sexual orientation. For this reason, the register of fiction
was excluded from further analysis, as content analyses may operate better in this area. Whether
the text is a novel or a newspaper, a magazine or a journal article, language still features an
audience in addition to an author. Because of this, future research could also analyze the
psychological impact of and reactions to particular terms and collocations in individuals. Such a
study would complement a consideration of word frequency counts and discourse prosodies,
drawing attention to the negative and pejorative or positive and ameliorative connotations and
implications of particular language and discursive practices. This attitudinal and psychological
research would further emphasize what is at stake with these particular signifiers and the
signifieds that they encode.
The Discursive Resignification of Sexuality
The signifiers used within language to indicate sexual orientation constitute a far from
inconsequential site, this analysis of data from COCA has found. The two signifiers at play –
“gay” and “homosexual” – discursively form LGBTQ individuals in highly distinctive fashions
and signify qualitatively different facts of identity, leading these individuals into different rooms
(or perhaps closets). “Homosexual” and “gay” may indeed index two different constructs,
signifying and forming within discourse differently. The terms construct this population in
starkly disparate ways, indicating that there is a freeplay in terms of how LGBTQ individuals are
perceived and positioned within society. By displaying this freeplay of signification and the
contested natures of queer individuals, discourse is caught in the midst of proliferation of
possibilities.
With “homosexual” declining in use, more open-ended and emancipatory language is
growing in use. The declining frequency counts in general indicate that society as a whole is
moving toward an understanding, a signification, that does not inhibit but rather empowers.
Rather than try to constrain, repress, and mark this heteroglossia and diversity of sexuality and
relationships, discourse is perhaps embracing it. Academic discourse may have become more
inclined to discuss the spectrum of sexual orientations, while other registers are moving in
another direction in their acceptance: the gradual disappearance of terms marking sexual
orientation across most registers suggests the possibility that a resignification of sexuality is
underway and that compulsory heterosexuality is losing its force. Though the use of “gay” has
remained stable for the most part according to COCA, it also is often associated with other
identity categories, indicating that sexual orientation is not all that one is. In contrast,
“homosexual*” may not account for all that one is and for all that one can be, instead focusing
on one particular act and not on the existential possibilities beyond the bedroom. As a result,
“gay” is opening new rooms beyond the bedroom for the discursive formation of sexual
orientation.
Within discourse, one is thus not born, but rather becomes gay or homosexual. This
identity is neither absolute nor binding, neither inevitable nor homogeneous. Rather, sexual
orientation is discursively formed and constitutes an evolving and variegated construct. In
embracing the less limited and narrow term “gay,” discourse is expanding the range of existential
possibilities and rooms available to this population. Recognizing the malleability and diversity
within the LGBTQ population, discourse no longer insists, no longer controls, and no longer
asks. Instead of closing doors, it opens them. New, more humane subjectivities into which to be
interpellated have emerged as possible.
If […] new modes of subjectivity become possible, this does not follow from the fact
there are individuals with especially creative capacities. Such modes of subjectivity are
produced when the limiting conditions by which we are made prove to be malleable and
replicable, when a certain self is risked in its intelligibility and recognizability in a bid to
expose and account for the inhuman ways in which ‘the human’ continues to be done and
undone (Butler, 2005, pp. 133-134).
Bibliography
Althusser, L. (2001). Ideology and ideological state apparatuses (notes towards an investigation). Lenin and philosophy and other essays (B. Brewster, Trans.). New York: Monthly Review Press. (Original work published 1970).
Baker, P. (2005). Public discourses of gay men. London: Routledge.
Baker, P. (2008). Sexed texts: Language, gender, and sexuality. London: Equinox.
Baker, P. (2010). Sociolinguistics and corpus linguistics. Edinburgh: Edinburgh University Press.
Baker, P. (2010). Will Ms ever be as frequent as Mr?: A corpus-based comparison of gendered terms across four diachronic corpora of British English. Gender and Language, 4(1), 125-149.
Baker, P. (2014). Using corpora to analyze gender. London: Bloomsbury.
Bourdieu, P. (1984). Language & symbolic power. Cambridge, MA: Harvard University Press.
Butler, J. (1997). The psychic life of power: Theories in subjection. Stanford: Stanford University Press.
Butler, J. (2005). Giving an account of oneself. New York: Fordham University Press.
Butler, J. (2006). Gender trouble: Feminism and the subversion of identity (2nd ed.). New York: Routledge.
Congregation for the Doctrine of the Faith. (1986, October 1). Letter to the bishops of the Catholic church on the pastoral care of homosexual persons. Retrieved from http://www.vatican.va/roman_curia/congregations/cfaith/documents/rc_con_cfaith_doc_19861001_homosexual-persons_en.html
Davies, M. (2008-). The corpus of contemporary American English: 450 million words, 1990-present. Retrieved from http://corpus.byu.edu/coca/
Davies, M. (2010-). The corpus of historical American English: 400 million words, 1810-2009. Retrieved from http://corpus.byu.edu/coha/
Davies, M. (2013). Corpus of global web-based English: 1.9 billion words from speakers in 20 countries. Retrieved from http://corpus2.byu.edu/glowbe/
De Beauvoir, S. (2012). The second sex (C. Borde & S. Malovany-Chevallier, Trans.) [Kindle DX Version]. Retrieved from amazon.com. (Original work published 1948).
Fairclough, N. (2013). Critical discourse analysis: The critical study of language (2nd ed.). New York: Routledge.
Fairclough, N. (2013). Language and power (2nd ed.) [Kindle DX version]. Retrieved from amazon.com.
Foucault, M. (2012). The archaeology of knowledge and the discourse on language [Kindle DX Version]. Retrieved from amazon.com.
Foucault, M. (2012). The history of sexuality: An introduction: 1 (R. Hurley, Trans.) [Kindle!
DX Version]. Retrieved from amazon.com. (Original work published 1978)!
Foucault, M. (2012). The order of things: An archaeology of human sciences [Kindle DX!
Version]. Retrieved from amazon.com.!
Friginal, J. & Hardy, J. (2014). Corpus-based sociolinguistics: A guide for students. New!
York: Routledge.
Gamson, J. (1995, August). Must identity movements self-destruct? A queer dilemma. Social
Problems, 42(3), 390-407.!
Gay and Lesbian Alliance Against Defamation, Inc. (2010, May). Media reference guide (8th!
ed.). Retrieved from http://www.glaad.org/files/MediaReferenceGuide2010.pdf!
Gramsci, A. (1971). Selections from the prison notebooks. London: Lawrence & Wishart. !
Michel, J., Shen, Y., Aiden, A., Veres, A., Gray, M., Brockman, W., The Google Books Team,
Pickett, J., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M., &
Aiden, E. (2010, December 16). Quantitative analysis of culture using millions of digitized
books. Science. Advance online publication.
Jones, J. (2013, May 13). Same-sex marriage support solidifies above 50% in U.S. Retrieved
from http://www.gallup.com/poll/162398/sex-marriage-support-solidifies-above.aspx
Mawani, R. (2010, October). ‘Half-breeds,’ racial opacity, and geographies of crime: Law’s
search for the original ‘Indian.’ Cultural Geographies, 17(4), 487-506.
Newport, F. (2012, December 5). Religion big factor for Americans against same-sex marriage.
Retrieved from
http://www.gallup.com/poll/159089/religion-major-factor-americans-opposed-sex-marriage.aspx
Peters, J. (2014, March 21). The decline and fall of the 'H' word. The New York Times.
Retrieved from
http://www.nytimes.com/2014/03/23/fashion/gays-lesbians-the-term-homosexual.html?_r=0
Pew Research Center. (2013, June 6). In gay marriage debate, both supporters and opponents
see legal recognition as 'inevitable.' Retrieved from
http://www.people-press.org/2013/06/06/in-gay-marriage-debate-both-supporters-and-opponents-see-legal-recognition-as-inevitable/
Powell, R. (2013, September). Social desirability bias in polling on same-sex marriage ballot
measures. American Politics Research, 41(6), 1052-1070.
Rich, A. (1980). Compulsory heterosexuality and lesbian existence. Signs, 5, 631-660.
Weinstein, N., Ryan, W. S., DeHaan, C. R., Przybylski, A. K., Legate, N., & Ryan, R. M.
(2012). Parental autonomy support and discrepancies between implicit and explicit sexual
identities: Dynamics of self-acceptance and defense. Journal of Personality and Social
Psychology, 102(4), 815–832.