Article

The Frequency and Use of Lexical Bundles in Conversation and Academic Prose


Abstract

The abstract for this document is available on CSA Illumina.


... Importantly, the items are processed as individual words that perform different (pragmatic or discoursal) functions (Hyland, 2012; Wood, 2010;. They have a long history in linguistics scholarship dating back to Jespersen (1924), Firth (1951) and Hymes (1968) (Conrad & Biber, 2005, Hyland, 2008a. ...
... Adult speakers utilise formulaic language as a production approach to speak with less effort and focus (Yorio, 1980). The items help speakers keep their fluency in the face of processing demands (Conrad & Biber, 2005) as they convey such information as: physical description; presence or existence; intangible attributes; extensive procedures or occurrences (Biber et al., 1999, p. 298). They allow learners to grasp realistically pertinent aspects of language at an early stage in spite of finding other language acquisition concepts difficult (Wood, 2015), as there is a limited range of formulaic sequences for foreign learners to master their language usage (Howarth, 1998; Kogan et al., 2018). ...
... From the psycholinguistic point of view, the items are a crucial factor to consider when creating language descriptions (Conrad & Biber, 2005). Commentators conclude that a student's inability to grasp the word sequence's "textual and interpersonal functions" will affect the student's performance in both spoken and written language situations (Biber & Barbieri, 2007). ...
Article
Humans conceptualise and present their views, feelings, etc. through language. Despite the importance of language, they do not always use freshly minted words or expressions every time they communicate. Instead, they often use readymade chunks (recurrent fixed sequences of two to five words) that linguists term lexical bundles. Adopting a corpus-driven approach, this study examined the use of four-word lexical bundles in university inaugural lectures in order to determine their presence, usage, structure and the functions they serve in the lectures. Using AntConc 4.2.4, the study searched a purpose-built specialised corpus of 120 inaugural lectures from 12 Faculties. It discovered 40 four-word lexical bundles, used 3264 times in the corpus. "Mr. Vice-Chancellor Sir," was the most used (312 instances, used by all 12 faculties), while "in the course of", "in terms of the" and "Vice Chancellor sir, the" were the least used (50 instances apiece, with a range of 10/11). The study then compared its findings to the British Academic Written English (BAWE) corpus to prove that lexical bundles are genre-specific items. Twenty-two bundles were extracted from the BAWE, 11 of which tallied with the 25 found in the inaugural corpus. The study also established that the structural patterns of the bundles in the inaugural lectures corpus comprised noun phrase + of, prepositional phrase + of, anticipatory it + verb/adjective, etc., and that the bundles functioned as stance and referential expressions, discourse organisers and honorifics in the lectures. The study will be useful to academic studies designers, as it points out the roles lexical bundles play in signalling a variety of communicative functions in academic discourse and helps in studying the language patterns of that discourse.
... In other words, the acquisition of chunks is responsible for the appropriate production of the target language (TL), and consequently is a major step in the learning process (Burgos, 2015). To this end, since words are acquired in chunks rather than singly in L1, formulaicity is central to the acquisition of languages, and at the same time is responsible for the production of fluent language through retrieving word sequences that are adequately stored in memory (Conrad & Biber, 2005; Perez-Llantada, 2014). ...
... Lexical bundles, which are one type of formulaic language, are known in the literature by various names such as "recurrent word combinations, clusters, phrasicon, n-grams" (Herrnandez, 2013, p. 187). Because they come in the form of incomplete structures, lexical bundles are also identified by the term prefabricated units of language, and they are not marked by certain positions in phrases or clauses (Allan, 2016; Conrad & Biber, 2005; Lenko-Szymanska, 2014). Extended collocations is another name used to refer to lexical bundles, since they are composites of a minimum of 3 words that co-occur statistically in a genre without being distinguished by their idiomaticity or structure (Allen, 2010; Huang, 2015; Lenko-Szymanska, 2014). ...
... Functionally, the analysis of lexical bundles relies on their place in texts. In many cases, they have an apparent function even out of context, such as "it is necessary to", which expresses a formal attitude of obligation (Conrad & Biber, 2005). A bundle can also be ascribed more than one function, as in "the end of the", which can imply time or place (Conrad & Biber). ...
... While both Key-LBs in HD and YD are mostly VP-based (i.e., consisting of a verb component), HD has a higher proportion of VP-based Key-LBs (75.54%) than that of YD (61.40%). The result shows that HD is closer to Conrad and Biber's (2005) finding that 90% of the LBs used in spoken British English involve verb components. On the other hand, a higher proportion of PP-based Key-LBs (i.e., bundles starting with a preposition) is found in YD (17.54%) than HD (7.91%). ...
... Excerpts 1 and 2 are just two of the many examples contrasting Hawkes's and the Yangs' preferences for subject-predicate and topic-comment structures, respectively. Overall, we can safely conclude that Hawkes's Key-LBs follow the spoken English convention in which most of the LBs involve verb components (Conrad and Biber 2005) structured in the form of personal pronouns + verb (Biber 2009). "So now your creditors have gone, you can come out of hiding. You ought to be getting back now in any case. ...
... Hawkes's VP-based Key-LBs, such as "I think you" and "ought to be", are manifestations of subject prominence in English; the Yangs' PP-based Key-LBs, such as "if not for" and "for no reason", may be influenced by topic prominence, in which preposition phrases often serve as adverbials in Chinese. This supports previous research (e.g., Yip 1995; Biber and Barbieri 2007, 2009; Conrad and Biber 2005) that LBs in spoken English are mostly verb phrases, and Chinese speakers tend to use prepositional phrases to topicalize the bare nouns or noun phrases when they speak English. ...
Chapter
The use of lexical bundles (LBs) has been affirmed to be a reliable indicator of translators’ style, as they can reveal the idiosyncrasies beyond the use of words (Mastropierro 2018). Using LBs as an indicator, the current study investigates how fictional dialogues in two full-length English translations of Hongloumeng diverge in style. This corpus-assisted study is based on the first 80 chapters of two full-length Hongloumeng translations, that is, one translated by the British sinologist David Hawkes (who translated the first 80 chapters) and John Minford (who translated the remaining 40 chapters), and the other co-translated by the Chinese translator Xianyi Yang and his British wife, Gladys Yang. The results of our study show that Hawkes used more tokens and types of LBs than did the Yang couple. Further structural and functional analysis revealed that Hawkes overused verb phrases and stance markers, whereas the Yangs overused prepositional phrases and referential markers. The divergences in style are discussed with reference to the translators’ language backgrounds, life experiences, and translation purposes.
... Rarely has light been shed on spoken discourse, despite the fact that lexical bundles are a key element of spoken communication (McCarthy & Carter, 1997) and such bundles occur in daily communication (Conrad & Biber, 2005; O'Keeffe, McCarthy, & Carter, 2007). To date, studies of lexical bundles in spoken discourse are not as extensive as those in written discourse, since lexical bundles are still viewed in relation to academic discourse. ...
... The result shows that 21 lexical bundles (20.79%) function as discourse organizers in the Doctor Talk corpus. This is in line with previous research (e.g., Biber et al., 2004;Biber & Barbieri, 2007;Conrad & Biber, 2005) showing that discourse organizers are less common than stance bundles in spoken discourse. These lexical bundles serve both sub-types of discourse organizers: topic introduction and topic elaboration/clarification. The bundles were used to introduce topics (15 occurrences, 14.85%) more so than elaborating or clarifying topics (six occurrences, 5.94%). ...
... The other bundle, "thank you very much", was found to show the politeness of the speaker. The results support the evidence found in the study by Conrad and Biber (2005) that a small number of special conversational functions are found in spoken discourse because they are very purposive. ...
Article
Lexical bundles, or recurrent word strings, are one of the key elements in increasing fluency of linguistic production and in mastering second language learning. In most previous works, lexical bundles were analyzed in specific disciplines. Little research has paid attention to spoken discourse, particularly doctors' conversations (hereafter referred to as "Doctor Talks"). This study aimed to investigate four-word lexical bundles in Doctor Talks and their operationalized functions. A Doctor Talk corpus was compiled from a famous medical TV series, Grey's Anatomy, consisting of approximately one million running words (269 episodes from 12 seasons over 11 years). Four-word lexical bundles were identified using WordSmith Tools version 7.0, and their discourse functions were analyzed using Biber et al.'s (2004) functional taxonomy as a framework. The results reveal that 99 bundle types are present in the Doctor Talk corpus. Stance bundles are common in this spoken conversation corpus, while lexical bundles expressing special conversational functions account for the smallest proportion. The results also show some particular functions used in Doctor Talks as discourse organizers.
... They must be mastered and utilized as fixed terms in suitable settings to develop a conversation that sounds natural and native-like. Conrad and Biber's (2005) theoretical model was the one most prevalently used in the classification of formulaic expressions. However, this study did not aim to classify such expressions. ...
... Studies adopted Conrad and Biber's (2005) theoretical model while analyzing the gathered data. Khoiriyah and Mujiyanto (2022) explored the formulaic expressions of Kampung Inggris Pare students. ...
Article
This research aims to investigate the impact of learning lexical chunks on the speaking fluency of Saudi learners of English as a Foreign Language (EFL) aged 13 to 17. The study uses an intervention with intermediate Saudi learners comprising lexical chunks based upon the books Collocation in Use and Common Idioms in English. Findings obtained from the post-test show that the experimental groups scored significantly better when compared to their performance in the pre-test of speaking fluency. In contrast, the difference in the performance of the control group between the pre- and post-tests is not significant as far as speaking fluency is concerned. The findings also show that the experimental group participants had favorable sentiments regarding explicit lexical chunk training. The research has theoretical and practical implications for teaching and learning a foreign/second language.
... By grasping the meanings of words, students can express themselves beyond their current level of proficiency and expand their vocabulary knowledge. Secondly, although lexical bundles exhibit intricate and incomplete structural patterns, they serve as vital building blocks of typical discourse (Conrad & Biber, 2005;Gil & Caro, 2019). While linguists may not recognize them intuitively, they fulfill an essential function. ...
... Thirdly, lexical bundles constitute the basic construction of discourse in academic registers, particularly in spoken and written forms (Conrad & Biber, 2005). These bundles are crucial for English language students, since they comprise simple expressions and are easy to learn within the normal language acquisition process (Biber et al., 1999; Northbrook et al., 2022). ...
Article
Background. Lexical bundles in textbooks are of paramount importance in foreign language learning. They provide a framework for new vocabulary acquisition and help to build fluency. Despite many studies on lexical bundles, investigations into their usage in EFL textbooks in the Indonesian context are still rare. Purpose. This corpus-based study examines the patterns and structural classifications of lexical bundles in EFL course textbooks for Indonesian senior high school students. As such, it could yield ready-made chunks of English which could be incorporated into students’ spoken and written communication. Method. The AntConc software version 3.5.9 was used to extract lexical bundles from five Indonesian Senior High School English textbooks. These books were endorsed by the government to be used across the country. The corpus revealed that the textbooks had 54,009 lexical bundles. In addition, the bundles were categorized into patterns and structural classifications based on Biber et al. (1999). Results. The results showed the patterns included three-word lexical bundles with 32,527 occurrences, four-word with 11,620, five-word with 6,073, and six-word with 3,789. Furthermore, eleven structural classifications of lexical bundles were found in the textbooks: “noun phrase + of phrase fragment” with 173 occurrences; “noun phrase + other post modifier fragment” with 44; “other noun phrases fragment” with 157; “prepositional phrase + of” with 13; “other prepositional phrases” with 243; “anticipatory it + verb phrase/adjective phrase” with 13; “passive verb + prepositional phrase” with 19; “copula be + noun phrase/ adjective phrase” with 30; “(verb phrase +) that- clause” with 59; and “(verb/adjective +) to- clause” with 239. Conclusion. Three-word lexical bundles were the most frequent in the senior high English textbooks. High frequency implies repetition of the bundles. Also, the other prepositional phrase fragment was the most frequent structural classification. 
Short bundles may have been intended to help students retain vocabulary and recall the bundles in use. This study, therefore, provides valuable insights into the most common groups of words used in the Indonesian EFL textbooks. Pedagogically speaking, repeated bundles in English textbooks can familiarize EFL students with the patterns, and they can use them in spoken and written communication.
... To begin with, studies focusing on the variable of registers thrived before 2010 (e.g., Conrad and Biber, 2005; Biber and Barbieri, 2007; Biber, 2009; Kim, 2009). For example, Conrad and Biber (2005) contrasted bundle use between conversation and academic prose. ...
... They found that the most frequent structural type in conversation is 'personal pronoun + lexical VP (+ complement clause)', which takes up 44% of the bundles, whereas ...
Article
This pilot study aims to investigate the differences between varying lengths of bundles in structure and function by comparing the 100 most frequent three-, four-, and five-word bundles in a self-built corpus of dissertations which contains about 3.5 million words. The findings reveal considerable variances between bundles of different lengths in terms of both structure and function. In general, the variances between three- and four-word bundles are greater than those between four- and five-word bundles, and three- and five-word bundles. Structurally, three- and four-word bundles differ significantly in all six main categories. Four- and five-word bundles vary in five categories, while three- and five-word bundles are only different in four categories. Functionally, noticeable variances were observed in research-, text- and participant-oriented bundles between three- and four-word bundles, and three- and five-word bundles. However, four- and five-word bundles only differ significantly in text- and participant-oriented bundles. Interestingly, bundles of varying lengths also vary in patterns that are used to perform the same functions. The results of this study might inform researchers that they need to take bundle lengths into consideration when making generalizations of their findings or comparing bundles between various studies.
... Lexical bundles, which are recurring word sequences that occur in high frequency across texts (Biber et al., 1999), are useful linguistic units that contribute to fluent linguistic production and effective communication (Hyland, 2008). According to Conrad and Biber (2005), approximately 28% of words in conversation and 20% of words in academic prose are produced in the form of three-word and four-word lexical bundles. Professional writers and student writers tend to use different types of lexical bundles in academic writing (Cortes, 2004; Römer, 2009). ...
... They are identified based on two criteria: the frequency of occurrences and the range of texts in which the bundles appear. The frequency threshold in previous studies ranges from 10 to 40 times per million words (Conrad & Biber, 2005;Hyland, 2008;Pan et al., 2016). The frequency threshold ensures that only the most frequently occurring word sequences are identified as lexical bundles. ...
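The two identification criteria described above (normalised frequency and range across texts) can be illustrated with a minimal Python sketch. This is not the procedure of any study cited here. The function name `extract_bundles`, the whitespace tokenisation, and the default cut-offs (20 occurrences per million words, at least 3 texts) are assumptions chosen for illustration; the only figure taken from the excerpt is that frequency thresholds in previous studies range from 10 to 40 per million words.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_per_million=20.0, min_range=3):
    """Return {bundle: (raw_count, rate_per_million, n_texts)}.

    A word sequence qualifies as a bundle only if it meets both the
    normalised frequency threshold and the range (dispersion) threshold.
    The thresholds are illustrative assumptions, not values from a study.
    """
    counts = Counter()               # raw frequency of each n-gram
    text_range = defaultdict(set)    # indices of texts containing each n-gram
    total_tokens = 0

    for idx, text in enumerate(texts):
        # Simplification: lowercase and split on whitespace only;
        # punctuation handling is left out of this sketch.
        tokens = text.lower().split()
        total_tokens += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            counts[gram] += 1
            text_range[gram].add(idx)

    if total_tokens == 0:
        return {}

    bundles = {}
    for gram, raw in counts.items():
        rate = raw * 1_000_000 / total_tokens   # occurrences per million words
        spread = len(text_range[gram])          # number of texts it appears in
        if rate >= min_per_million and spread >= min_range:
            bundles[gram] = (raw, rate, spread)
    return bundles


if __name__ == "__main__":
    sample = [
        "on the other hand we see",
        "on the other hand it is",
        "but on the other hand no",
    ]
    for bundle, stats in extract_bundles(sample).items():
        print(bundle, stats)
```

On a toy corpus this small, the per-million rate is inflated for every sequence, so the range criterion does most of the filtering; with a corpus of realistic size the frequency threshold becomes the stricter of the two.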
Article
Lexical bundles are frequently recurring word sequences (e.g. as can be seen) that function as building blocks of discourse. This corpus-based study examined the use of four-word lexical bundles in business emails written by three groups of writers: intermediate business English learners, advanced business English learners, and working professionals. The prominent structural and functional characteristics of lexical bundles expressed in business emails were identified and compared across the three groups. The results showed that lexical bundles were related to the extent to which formality and politeness were expressed in written business communications. The advanced business English learners and working professionals used more structural and functional characteristics of lexical bundles that are characteristic of written conventions than did intermediate business English learners. Both intermediate and advanced learner groups used functionally different lexical bundles from those produced by the working professionals.
... When looking at the length of these sequences, it is seen that nine out of 10 were 3-word FSs, with only one 4-word sequence in sub-corpus 2 of Group 1, whereas all of the sequences in sub-corpus 1, 3, 4 and 5 were 3-word FSs. This finding is supported by Conrad and Biber (2005), who found that 3-word FSs are more frequent in the corpora as evidenced in NES written corpora of academic prose. Table 5 displays the most common formulaic sequences in all sub-corpora of Group 2 over time. ...
... The examination of the most frequent FSs across two semesters gave the researcher various FSs differing in length and type. This finding is consistent with the finding of Conrad and Biber (2005), who found that 3-word FSs are more frequent in the corpora as evidenced in NES written corpora of academic prose. The main 4-word formulaic sequences that were frequently used in learner corpora were "on the other hand", "is one of the", "I strongly believe that", "I firmly believe that" and "one of the most". ...
Article
In recent years, advances in computer technology and software tools have made more sophisticated and fully operational facilities available for corpus linguistics. Thanks to these developments, large collections of naturally occurring texts can be compiled more accurately. In line with these developments, the current study aimed to investigate the usage patterns of three- to four-word sequences in a learner corpus composed of two semesters of written data from 85 English as a Foreign Language (EFL) learners. The data was analysed by examining collective trends in terms of usage patterns of formulaic sequences across different time intervals. In the collection of data, the frequency approach was used and the most frequent three- and four-word recurrent formulas were extracted from each sub-corpus of the learner corpora in two groups and these sequences were classified structurally and functionally. Then, the use of these sequences was compared across native (LOCNESS) and non-native data by using the Sketch Engine corpus tool. The findings suggested that although formulaic sequences were used frequently in both learner groups, the frequency and type of these formulaic sequences were less diverse, and the number of formulaic sequences was limited when compared with the native data.
... In addition, Biber et al. found that genres vary in terms of the functional type of LBs they employ (2004). While conversations mostly use stance bundles, a greater number of referential bundles are used in the written register, due principally to the prominence of factual information in academic contexts (Conrad and Biber 2005). ...
... The LBs functioning as referential framing attributes are the most frequent in the present study. Conrad and Biber (2005) and Biber and Barbieri (2007) also report that referential LBs in general outnumber the other types in the written academic register, which they ascribe to the importance of presenting factual information in academic contexts. The higher frequency of LBs functioning as referential framing attributes in RAs is to be expected, since as Biber and Conrad point out, RAs discuss a specific research topic and therefore this type of LB is necessary to convey the exact identification of referents (2019). ...
Article
The present article is a corpus-based descriptive/comparative study of lexical bundles (LBs) in two university genres: textbooks (TBs) and research articles (RAs) on applied linguistics. It aims to identify the LBs used in the two genres, compare them on the basis of their functional type and frequency and explore how they are related to genre. To this end, four-word LBs were identified in two corpora drawn from applied linguistics TBs and RAs. The comparative analysis revealed that there are interesting differences between the two genres in terms of discourse functions: the occurrence of LBs in the TBs was lower than in the RAs; attitudinal/modality LBs occurred more frequently in the TBs than in the RAs; epistemic LBs occurred more frequently in the RAs than in the TBs; discourse organizers occurred more frequently in the RAs than in the TBs; and time, place and text reference LBs occurred almost twice as frequently in the RAs. The findings build on research into the variations of genres in terms of the use and functions of LBs in discipline-specific corpora.
... Notably, some but not all criteria must be present to be identified as a familiar item (Wray, 2017). For instance, Wray's criteria could exclude easy to produce, single-word utterances such as "yep" and "right?", but these items are justifiably counted as familiar or automatic by aphasiologists because of their frequent usage and pragmatic role in turn-taking (Code et al., 2009;Conrad & Biber, 2005;House, 2013;Van Lancker Sidtis & Rallon, 2004). In another example, frequency is sometimes prioritized to identify familiar language as used in automated, frequency-based methods (Bruns et al., 2018;Zimmerer et al., 2018), but a number of familiar language types are low in frequency; idioms and proverbs, song lyrics, and trending slang are all low frequency items and yet easily recognized by native speakers and used to signal group membership (Hallin & Van Lancker Sidtis, 2017;Rammell et al., 2017;Van Lancker Sidtis, personal communication, April 5, 2021). ...
... are more difficult to identify because they are less defined in the literature and may be included in other familiar language subtypes such as phrasal interjections, speech formulas, or even pause fillers (e.g., Crible & Pascual, 2020;Fuller, 2000). Two types of multi-word utterances that are considered connotation-free have also been distinguished: formulaic sequences (e.g., first of all), which are structurally complete with a unitary semantic meaning, and lexical bundles (e.g., that depends on), which span phrasal boundaries to perform a bridging function (e.g., Conrad & Biber, 2005;Jeong & Jiang, 2019;Nekrasova, 2009). Although there are other less described familiar language types in the literature, the current study includes the above-mentioned nine types as reviewed in the methods section. ...
Article
Background: It is well-established that individuals with nonfluent aphasia produce proportionally more familiar or non-propositional language than neurotypical adults. Much less is known about the types of familiar language used or about the effects of either language context or impairment on usage patterns. Aims: The purpose of this study was to identify and compare types of familiar language across several spontaneous speech contexts in individuals with and without aphasia in order to refine models of familiar language use for clinical application. Methods & Procedures: Language transcripts from Aphasiabank of 154 individuals with moderate to severe post-stroke Broca’s aphasia and gender- and age-matched controls were coded to identify and classify nine types of familiar language. Language samples included a story-telling task and three conversational topics. Non-parametric comparisons and Spearman’s correlations were used to analyze usage patterns. Outcomes & Results: Individuals with aphasia produced significantly higher proportions of formulaic expressions (context-bound, stereotyped utterances) as compared to controls, but proportions of lexical bundles (connotation-free, multi-word utterances) did not significantly differ. Familiar language usage varied by language contexts and level of severity for individuals with aphasia, whereas production patterns of healthy controls were remarkably stable. Conclusions: This study offers insights into patterns of familiar language usage affected by linguistic ability and language context. A theoretical framework for conceptualising familiar language will result in improvements to existing interventions.
... The use of LBs has been explored in various genres and registers across diverse academic disciplines in terms of their structure and function (e.g. 2005; Cortes, 2004; Hyland, 2008a, 2008b; Nasrabady, 2020; Zare & Valipouri, 2022). ...
Article
This study aims to investigate the frequency of four-word lexical bundles (LBs) in research papers authored by faculty members from the English Department at King Khalid University. Additionally, it seeks to classify the functions of these bundles based on Biber’s (2004) taxonomy. The study’s corpus comprises 171 research papers published between 2016 and 2022. Lexical bundles were identified using three key criteria: frequency, range, and function. WordSmith 4.0 software was employed to extract and analyze the LBs from the corpus. The results reveal variation in the use of LBs, with referential bundles being the most common, followed by discourse organizers and stance bundles. These findings align with previous research in this area and offer valuable insights for studies on English for Academic Purposes (EAP). They may also benefit EAP instructors, curriculum developers, and policymakers. Keywords: corpus; lexical bundles; academic discourse.
... While it now seems increasingly clear that MWEs are faster and easier to process than nonformulaic language, what still remains open is whether the use of MWEs is influenced by other factors. In register theory, it is widely assumed that speakers adjust their language according to the particular communicative situation (Biber, 2012;Biber and Conrad, 2019;Conrad and Biber, 2005). One of the parameters describing the communicative situation is the relationship between the speaker and the recipient. ...
... Additionally, by avoiding vague or imprecise expressions, authors in academic writing emphasize accuracy and objectivity, which is crucial in presenting factual information in academic contexts. [40] Hence, scholars prefer to use more precise and specific language in academic papers to ensure the credibility and scientific validity of their research or arguments. Another possible reason is that our samples are not large enough to cover all negative markers in the two corpora, with the result that the frequencies of some negative markers are less than 3 in a specific move. ...
Article
Extensive studies have been conducted on various sections of research articles, including the abstract, introduction, discussion, and other segments. Remarkably, the conclusion, a component of this academic genre, has received relatively scant attention in genre analysis. Conclusions, serving as the concluding segment, play a crucial role in recalling the previously addressed issues, highlighting key research findings, acknowledging limitations, and suggesting implications for further research. In light of this, authors employ an array of interactive resources to engage with their readership effectively. Although it is an essential part of these interactive resources, negation has tended to be neglected in discourse analysis. Hence, this study aims to explore the relation between negation and moves in research conclusions. Furthermore, this study seeks to examine how negation contributes to rhetorical persuasion with a focus on its functions and distribution across disciplines. This study shows the rhetorical functions of negation and describes the distribution of negation across disciplines and moves. The findings not only indicate that authors from distinct disciplines exhibit varying preferences in the utilization of negation in their research article conclusions, but also provide some pedagogical implications.
... Research on LBs in academic written genres includes studies on LBs in research articles (Shahriari, 2017), theses and dissertations (Hyland, 2008a), textbooks, and student writing (Cortes, 2008; Durrant, 2017), with some research investigating LBs across these written genres (Shirazizadeh & Amirfazlian, 2021). In contrast, research on LBs in spoken genres includes studies on university spoken registers (Biber & Barbieri, 2007), conversation and academic prose (Conrad & Biber, 2005), and spoken academic EFL genres (Wang, 2017). There are also studies on LBs in professional domains, such as legal genres (Breeze, 2013) and pharmaceutical discourse (Grabowski, 2015). ...
Article
See the full text at https://so04.tci-thaijo.org/index.php/LEARN/article/view/274081. Lexical bundles and moves are essential for vloggers to communicate clearly and purposefully within travel vlog discourse. It is crucial for L2 learners and practitioners aiming to enter the industry to master these bundles and understand the moves used in creating travel vlogs. This corpus-based study compiled a list of 239 four-word lexical bundles serving as fixed slots and their 98 variable slots from the Travel Vlog Corpus, which comprises 434,809 running words. These bundles were categorised by function: 79 as stance expressions, 75 as discourse organisers, 80 as referential expressions, and 5 as special conversational functions. The study also identified four move types and their 19 component steps necessary for creating travel vlogs. It emphasised that lexical bundles and moves are critical knowledge with important functions for generating travel vlog discourse. The study concluded by proposing pedagogical implications and discussing future research directions.
... Expressions are learned rapidly in development with natural exposure (Reuterskiöld & Van Lancker Sidtis, 2012), are known to the native speaker, and comprise a significant proportion of naturalistic speech, with estimates at approximately between 20 (Sorhus, 1977) and 24% (Kuiper, 2009;Van Lancker Sidtis & Rallon, 2004). For lexical bundles (non-nuanced expressions), incidence is placed at 50% in conversation (Conrad & Biber, 2004). ...
Article
Communication, specifically the elements crucial for typical social interaction, can be significantly affected in psychiatric illness, especially depression. Of specific importance to conversational competence are familiar expressions (prefabricated expressions known to the language community) including formulaic expressions (conversational speech formulas and idioms; these are high in nuance) and lexical bundles (fixed linguistic segments that are prevalent in naturalistic conversation; low in nuance). The goals of this study were to examine familiar language production in the naturalistic, conversational speech of individuals with treatment-resistant depression before and after receiving surgical deep brain stimulation of the subcallosal cingulate white matter pathways and to compare their performance to healthy adults’ familiar language use. Results revealed fewer conversational speech formulas (typically nuanced) produced by those with depression pre- and post-operatively as compared to healthy controls. There was an increase in the production of non-nuanced familiar expressions (largely lexical bundles) and a decrease in nuanced expressions (formulaic expressions) post-operatively when compared to the pre-operative condition for those with depression. These results conform to a recent model that distinguishes three distinct classes of familiar language, based on linguistic and neurological criteria. This study offers a first look at familiar language in depression and provides a foundation for further study into the pragmatic components of communication to help address the often-reported diminished social connectedness experienced by those with depression.
... Criteria for extracting bundles include chiefly the ngram length and distribution (their occurrence in different texts); normally, no punctuation marks interrupt these ngrams (Grabowski, 2018: 60). Many studies, such as Dalili and Dastjerdi's (2013), Esfandiari and Moein's (2016) and Conrad and Biber's (2004), to name a few, set the cut-off frequency for any word or ngram and their distribution in the corpus to guard against speaker idiosyncrasies. ...
Thesis
Full-text available
The overarching aim of this thesis is to delineate the place of English in Kuwait using Schneider’s (2003; 2007) Dynamic Model. The model posits that postcolonial varieties of English go through five phases: Foundation, Exonormative Stabilisation, Nativisation, Endonormative Stabilisation and Differentiation. Each phase is assessed on the basis of four parameters (historical factors, sociolinguistic features, identity construction and linguistic features) and types of data that allow each parameter to be properly investigated. The historical parameter entails investigating, on the basis of historical data, the arrival and local history of English and other low- and high-intensity contact periods. Sociolinguistic factors entail investigating norms, beliefs, local identity formation processes and the position and role of English in the linguistic contexts from the first contact until the present. The last parameter entails analysing a corpus of samples of actual language use. The levels of analysis are incorporated in my study: a) historical analysis (desk research) to assess the historical parameter, b) corpus-assisted discourse analysis (CDA) largely based on Edwards (2018) to explore the sociolinguistic factors and identity constructions parameters and c) pattern-driven analysis (PDA) (Tyrkkö & Kopaczyk, 2018) to assess the linguistic parameter. As for CDA, it analyses the interview subcorpus to explore a) the beliefs about English in Kuwait and the identities it indexes and b) the norms of the English language. Pattern-driven analysis sets to flesh out the linguistic specificities and developments of Kuwaiti English. That is, it identifies distinctive features of English in Kuwait: analysis of word distributions, functions, and meanings of selected linguistic features.
To systematically arrive at the analysis of the best linguistic features, I made use of a widely known corpus linguistic methodology in terminology known as "knowledge-rich contexts" (Meyer, 2001) and contextualised it within World Englishes. Accordingly, after conducting four case studies (structural and lexical), pragmatic features (i.e. discourse-pragmatic markers) turned out to be the most salient and thus the ones investigated. The main findings that emerged from the analysis are as follows: Historical facts do not provide clear evidence for indigenisation. Nevertheless, they suggest nativisation is progressing slowly as metropolitan English models keep reasserting themselves through education and media. This parameter suggests that English in Kuwait is somewhere between the second stage and the third stage of the Dynamic Model. On the sociolinguistic end, investigation of language ideological matters and usage patterns suggest a faster rate of nativisation as many younger Kuwaitis who are in great numbers educated through English are currently not only speaking English to expats and non-Arabic speakers but also among themselves in certain contexts such as at home between siblings. This is a fertile ground for the emergence of an indigenised variety or a stage four state. Linguistic facts suggest that English became an important learning target after the Iraqi invasion and due to sustained and increasing use has started to indigenise, in that it is acquiring a unique bundle of features. The analysis of linguistic features suggests that Kuwaiti English is placed between stage 2 and stage 3, as the investigated features (discourse-pragmatic markers), be they superstrate- or substrate-derived components, exhibit patterns distinct from British English and American English.
... Hubbard (2010), for example, has described the use of an exemplary science thesis as the source for developing pedagogical materials for the learning of core academic discourse functions such as defining, contrasting, attribution, hedging, and expressing conditions and findings, among others. More general analyses of spoken and written academic discourse have helped to isolate frequent and recurrent patterns of language use that occur significantly more often in academic than in non-academic contexts (e.g., Biber, 2006;Conrad & Biber, 2004;Simpson-Vlach & Ellis, 2010), the results of which have subsequently been used to develop English for academic purposes instructional resources and which would inform the evolution and implementation of the LIKE approach. ...
... Many linguists have conducted studies on lexical bundles in spoken discourse (Conrad and Biber, 2005; Heng, Kashiha and Tan, 2014; Darweesh and Ali, 2017; Sykes, 2017; Wang;. However, few comparative studies of lexical bundles have been conducted in spoken discourse (Kwon and Lee, 2014; Kashiha and Heng, 2015). ...
Article
Lexical bundles are multi-word expressions that usually hang together. They are considered a main factor in building fluency in academic discourse, helping to shape meanings and coherence in a text. The objectives of the study are to analyse non-native and native English teachers’ talk in order to explain (1) the use of structural and functional types of lexical bundles in non-native and native English teachers’ talk, (2) the similarities and differences of lexical bundles used in the talk, and (3) the relation between structural and functional types of lexical bundles used in the talk. This study is a qualitative study designed as a classroom discourse analysis. The data are non-native and native English teachers’ talk. The results reveal that non-native and native English teachers used all structural and functional types of lexical bundles. Both groups of teachers mostly used lexical bundles in the form of verb phrases, and these mostly functioned as stance expressions. However, the two groups differed in terms of the sub-types. Non-native English teachers used more 1st/2nd person pronoun+VP fragments while native English teachers employed more WH-question fragments. Functionally, non-native English teachers used lexical bundles more to show ability while native English teachers used them more to show intention/prediction. The use of lexical bundles is important for teachers to achieve native-like fluency and improve their oral proficiency.
... As shown in Figure 5, the top ranked APPs do substantially more work than others but the threshold of relevance remains unclear, as is the case with all so-called frequency based studies (Biber, 1999, 2006; Biber & Barbieri, 2007; Breeze, 2013; Bybee, 2000; Conrad & Biber, 2004; Cortes, 2004; Hyland, 2008, 2009). What is common among these studies is that once a threshold has been established, observations are confined to that window of consideration. ...
Preprint
Full-text available
This manuscript describes a usage-based approach to the visual representation of phonemic inventories. For decades, the International Phonetic Alphabet (IPA) consonant and vowel charts have served as referential representation when it comes to the structural description and study of sound systems of the world’s languages. As a fundamental tool in the phonetician’s analytical toolbox, these charts have facilitated documentation of the phonological segments of individual languages as well as theorization about linguistic universals and factors involved in speech processing. This paper reviews established practices before introducing usage-based perspectives and what they can contribute to current understanding. Subsequently, a usage-based framework for description is introduced and an analysis which exemplifies its application is presented.
... Genre identification aids readers in retrieving messages conveyed in given contexts more promptly (11). Likewise, the recognition of medical abstracts is enabled by the use of conventions in the form of recurrent rhetorical choices, which mainly involve decisions regarding the overall organization of discourse and the linguistic resources employed to reflect their communicative purposes (12). Extensive research has been conducted on using formulaic language in written academic genres (7,13,14), especially in medical science research article abstracts (1,2,15). ...
Article
Full-text available
Abstracts are critical in medical contexts. They contain formulaic building blocks called Lexical Frames (LFs), which are high-frequency word sequences with variable slots that can be formed around collocation nodes. LFs are abundant in written academic discourse and, for this reason, have great importance for the production of abstracts. Extensive research has been conducted on formulaic language, especially on medical genres. Fewer studies, however, have focused on LFs from specialty-specific corpora (e.g., epidemiology) and their relationship with the rhetorical structure of abstracts. Objective: This study aims to fill this gap by describing the structure of epidemiology abstracts, presenting their rhetorical functions, and identifying the LFs that linguistically realize these functions to help researchers write more conventional abstracts. Methods: We put together three corpora of abstracts in the field, published in English in peer-reviewed journals, and combined genre analysis and Corpus Linguistics principles to identify the linguistic realizations of the rhetorical functions in the texts. First, the rhetorical structure was described; then, the LFs were identified and analyzed. Results: 92% of the texts follow a pre-established pattern, whose structure consists of five to nine sections. Eight saliently frequent nodes (study, result, method, conclusion, review, analysis, patients, and findings) around which the LFs are constructed were identified. Conclusion: Even though both the content and function words that make up the LFs show some variation, it is possible to notice that the LFs elicited typify the linguistic realizations of the corresponding sections' rhetorical functions and, thus, are suitable to the observation of a pattern. For that reason, the data obtained in this study were used to inform the creation of a support framework for the writing of specialty-specific medical abstracts.
... They examined the structure and discourse functions of these LBs and pointed out how many of these sequences were not perceptually salient. They were often found to function as a bridge between two structural units which later caused them to name LBs as "the building blocks of discourse" (Biber & Barbieri, 2007;Biber, 2006;Conrad & Biber, 2004;Chen & Baker, 2010). The main characteristics of LBs are that they are recurrent, fixed, non predefined, and not perceptually salient. ...
Article
Full-text available
The study of phraseology with respect to continuous and discontinuous frames in academic writing has gathered increasing research attention over the past decade. Their prevalence in expert writing and the influence of discipline and genre on their frequency and type have led to studies that attempted to identify the most productive discontinuous frames in specific disciplines. The aim of this study is to investigate the pattern of the N of (the) N, a prolific pattern in expert academic writing, in two Omani corpora of undergraduate Civil Engineering genres, Case Studies (CS) and Methodology Recounts (MR). The two strands of inquiry involve 1) a comparison between the semantic noun categories of the first (N1) and second noun (N2) used in this pattern and; 2) the N1-N2 sequences in the pattern which realize specific discourse functions in these two genres. Strings belonging to this pattern were retrieved from the two corpora through the corpus interface, Sketch Engine. Findings indicate the prevalence of this frame in the two genres and genre influence on the choice of semantic noun categories. It was also found that the N1-N2 sequences in the pattern are used to realise distinct discourse functions in the two genres. This is one of the first corpus-based studies on university student writing in the Middle East and considering that English is the language of instruction and assessment in many of these countries, these findings have significant pedagogical implications. EFL students in such lingua franca contexts can be supported by a more discipline-specific approach.
... The consensus definition of formulaic language, an umbrella term for over 50 terms to describe this concept, is that it has the following three components: a) it consists of two or more words; b) it has a single meaning or function; and c) it is usually, but not always, stored and retrieved as a unitary whole (Schmitt, 2004;Wood, 2015;Wray, 2002). Conrad and Biber (2004) define a lexical bundle as the most frequent recurring fixed lexical sequence in a register that is used by multiple authors or writers. This adds two more important elements to the definition of formulaic language for the purposes of this study, that it has high frequency in addition to widespread usage. ...
Article
Although there are several studies examining second language email writing, few large-scale studies have focused on learning commonly used formulaic sequences such as, Thank you in advance for your assistance, in English emails. A total of 462 Japanese university students were taught how to write emails in English, including learning 27 formulaic sequences, over a 4-week period. The current study uses an experimental design, with a pretest, posttest and delayed posttest, to determine which formulaic sequences taught were correctly produced the most. Results show that after the treatment there was measured success in the initial acquisition of the forms. After the delayed posttest conducted three months later, however, most students could not correctly produce a majority of the email formulaic sequences taught. A discussion of teaching methods regarding the learning and usage of the 27 formulaic sequences is also included.
Article
Full-text available
Lexical bundle research has recently come to the forefront of corpus-driven studies. Previous corpus studies have documented conflicting results regarding the frequency and function of lexical bundles (LBs) in academic prose. To date, however, no study has exclusively investigated LBs in the "discussion" sections of research articles generated by professional native English authors. The current study addressed this gap by examining the frequency, structure, and function of the most frequent four-word LBs. The corpus was composed of the discussions of published research papers authored by native (L1) writers. The data were extracted from five reputable international journals in the field of applied linguistics, consisting of over 300,000 words. Using AntConc, all the lexical sequences were retrieved with a frequency of 10 and a range of 5. The results revealed that LBs were predominantly used by English writers. Structurally, it was found that phrasal bundles were the most frequent in our corpus. The findings also demonstrated that functionally, referential bundles were extensively employed. In addition, stance bundles and discourse organizing bundles were the most prevalent after referential bundles. Finally, the findings are discussed in terms of the implications for non-native writers regarding the use of LBs in academic prose.
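The frequency and range cut-offs mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' AntConc procedure: the function name `extract_bundles`, the simple regex tokenizer, and the reading of "frequency of 10" as a raw count (rather than a rate per million words) are all assumptions made for illustration. Range is taken to mean the number of distinct texts in which a sequence occurs.

```python
import re
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_freq=10, min_range=5):
    """Return {sequence: (frequency, range)} for n-word sequences.

    Hypothetical helper: frequency is a raw count over all texts and
    range is the number of distinct texts containing the sequence.
    """
    freq = Counter()
    seen_in = defaultdict(set)
    for text_id, text in enumerate(texts):
        # Split on punctuation first so no sequence spans a punctuation mark.
        for segment in re.split(r"[.,;:!?()\[\]\"]+", text.lower()):
            # Crude tokenizer: ASCII letters and apostrophes only.
            words = re.findall(r"[a-z']+", segment)
            for i in range(len(words) - n + 1):
                gram = " ".join(words[i:i + n])
                freq[gram] += 1
                seen_in[gram].add(text_id)
    # Keep only sequences that pass both thresholds.
    return {
        gram: (count, len(seen_in[gram]))
        for gram, count in freq.items()
        if count >= min_freq and len(seen_in[gram]) >= min_range
    }
```

Requiring a minimum range alongside a minimum frequency keeps a sequence that one writer repeats many times from being counted as a bundle of the register as a whole.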
Article
Full-text available
This study focused on two distinct Chinese groups of English as a Foreign Language (EFL) learners: those with advanced proficiency and those with basic proficiency. The objective was to explore the mechanisms used by Chinese EFL learners in processing and comprehending verb-noun collocations in an online context. The key metrics under consideration were learner response time, the influence of collocation appropriateness, and the alignment between their native language (L1) and English (L2). The findings reveal that: 1) The advanced learners demonstrated significantly faster reaction times overall, particularly when dealing with both appropriate and inappropriate collocations. 2) Accuracy was higher for appropriate verb-noun collocations compared to inappropriate ones, and the advanced group outperformed the basic group in this regard. 3) Notably, there was no significant difference in processing time between the two proficiency levels for both appropriate and inappropriate collocations. These results provide valuable empirical insights into the factors influencing EFL learners’ online comprehension of verb-noun collocations, highlighting the role of L2 proficiency and the congruence with their L1.
Article
Full-text available
Recent research has suggested that the use of formulaic language such as lexical bundles may be important for helping second language (L2) English students construct arguments and achieve higher proficiency scores in testing situations. However, more research is needed that investigates such issues with learners of lower‐level proficiencies. This study investigates the use of lexical bundles across the argumentative writing of beginning‐ and intermediate‐level L2 English learners (N = 780). Using the Yonsei English Language Corpus, this study examines the frequency, structural features, and functional characteristics of three‐word lexical bundles and their role in achieving six rhetorical moves, including position, claim, counterclaim, rebuttal, evidence, and conclusion. The findings reveal that intermediate learners used more lexical bundles in two moves (i.e., claims and evidence). There were also differences both in the bundle structures used by beginners and intermediate learners, and in the functions realized through those bundles. How lexical bundles are used across lower levels of L2 proficiency, and the implications of these findings for L2 writing instruction, are discussed.
Article
Full-text available
This study aimed to analyse the usage of English lexical bundles among university students who transitioned to Korean universities after extended periods in English-speaking countries. It was assumed that the duration of their stay would lead to distinct English language usage patterns. Ten participants were recruited: those who resided abroad for more than 5 years (six learners) and those who lived abroad for less than 5 years (four learners). Interviews were conducted, yielding an average of 30 minutes for each interview. The results revealed significant differences in word frequency and trigram usage in terms of students’ duration of stay. In other words, the residency experience in English-speaking countries influenced their language use patterns even though their current proficiency remained the same. Students with more extended stays showed greater trigram diversity and native-like patterns with VP and NP-PP fragments. In comparison, students with shorter stays displayed a prevalence of skewed VP fragment use and had narrower trigram usage. The factors contributing to their differing language use patterns should be investigated further when maintaining similar proficiency levels. Despite some limitations, such as excluding the impact of individual motivation, the findings highlight the importance of individualised language learning approaches, even among learners with similar proficiency levels.
Article
Corpus-based studies of lexical bundles have opened new avenues for language teaching research. The fact that naturally occurring language consists of patterns of lexical repetition and multi-word units has given rise to the question of chunkiness in learner language. This study was designed to examine lexical bundles and their functions in a small, specialized learner corpus of opinion paragraphs written by Saudi English as a Foreign Language (EFL) students at the University of Jeddah. The study takes a frequency-driven approach to identify common lexical bundles. A learner corpus of 237 writing tasks produced by Saudi undergraduate students from 11 different sections is compiled and explored. The primary aim was to identify high-frequency five-word lexical bundles and explore their functions in the learner corpus, as well as investigate any distributional differences in bundle use across the various student sections. The findings revealed that learners utilized lexical bundles primarily to serve four key functions: expressing stance, supporting a point, introducing an item, and making recommendations. Notably, variations were observed in the distribution of these functional categories among the different student groups. The study concludes by outlining some pedagogical applications for educators and language practitioners, highlighting the value of learner corpus-informed approaches to enhancing learners’ awareness and mastery of lexical patterning in academic writing. By better understanding the role of formulaic language in learner production, instructors can tailor their teaching to more effectively support students’ linguistic development.
Article
Full-text available
This study investigates the functional similarities and differences of four-word chunks in the academic discourse of aquaculture by Chinese and international scholars based on Hyland’s functional classification method within a corpus-driven approach. The findings reveal that, compared to their international counterparts, Chinese scholars significantly utilize more four-word chunks. Functionally, Chinese scholars frequently employ quantification, structure, framing, and engagement chunks, underscoring the importance they assign to the logic of discourse and the interaction between authors and readers. The infrequent use of description chunks suggests that it is essential for Chinese scholars to fully appreciate the significance of describing research objects, methods, and results in order to convey the foundational and experimental nature of hard science research. Furthermore, the structures of chunks used by Chinese and international scholars to express the same discourse functions differ. The expression of data indication among Chinese scholars appears more solidified. These research results can offer valuable references for academic writing.
Article
Full-text available
Press releases are official (electronic) statements written by corporations and institutions to deliver significant information to the media and the general public. Although theoretically informative, press releases are a self-promotional tool because the pieces of information they deliver are produced by the organization – the source – that writes the press release. The purpose of this paper is to discuss the source of information as linguistically realized in terms of evidentiality and patterns of agency in AstraZeneca’s press releases delivered during the pandemic. More specifically, this paper will offer a corpus-based analysis of all the press releases (62) issued by AstraZeneca during the pandemic to identify the patterns of agency and evidentiality, with the purpose of detecting the extent to which, if any, the company (re)constructs its image before and after the deaths supposedly linked to the Covid vaccine. The results seem to indicate that different rhetorical and persuasive strategies, as well as image-restoring strategies, are employed: while promotion may require booster devices, hedging devices are necessary whenever the press release seems over-confident in conveying the pieces of information. As usual, caution is necessary not only to diminish negative face threats but also to prevent possible attacks from future investigations denying cognitive consensus.
Article
The vocabulary of a language contains not only discrete, individual words but also various formulaic language units that consist of more than one word, recur frequently in the language, and are fixed to a greater or lesser degree. Formulaic language units vary across languages in structure, meaning, function, and other linguistic respects. Lexical bundles are one type of formulaic language unit. Lexical bundles have discourse functions that are effective in constructing texts specific to a particular register, text type, and field of specialization. It has been claimed that lexical bundles in Turkish mostly consist of three words, and it is mostly three-word lexical bundles that have been identified. This study argues that Turkish lexical bundles may consist of two words as well as three or more, and that a substantial portion of two- and three-word lexical bundles in particular have become lexicalized and idiomatized with specific discourse functions, depending on frequency and fixedness. While some lexical bundles constitute grammatically incomplete multi-word units with a textual function, others may consist of lexicalized and idiomatized formulaic language units. In this study, two-unit lexical bundles were identified on the basis of the Turkish National Corpus (Türkçe Ulusal Derlemi, TUD), and a corpus-based hybrid approach was adopted. Two-unit lexical bundles that are important from pedagogical, lexicological, and psycholinguistic perspectives were identified in general spoken and written Turkish, independently of register, text type, and field of specialization.
Article
An emerging body of corpus-based genre analysis studies has examined the connection between different types of formulaic language and rhetorical moves in various genres of academic writing. The current study extends this body of research into the understudied genre of narrative stories and the understudied phraseological unit of lexical collocations. Specifically, we compiled a corpus of narrative stories written by expert writers, extracted a list of frequent collocations from the corpus, developed a rhetorical move framework for narrative stories, examined the distribution of rhetorical stages and moves in the corpus, and explored the connection between collocations and rhetorical moves in the corpus. The findings of our research culminated into an online interface for searching the corpus for collocations and exploring their use in sentences realizing different rhetorical stages and moves in context. We discuss the potential pedagogical value of our findings and the resulting online interface for promoting learner awareness of the connection between linguistic features and rhetorical functions in narrative stories in genre-based pedagogy.
Article
Full-text available
Background & Aims: The importance of "publish or perish" in academic contexts, especially for faculties and graduate students, is an undeniable problem because of its role in determining university achievement around the world. To deal with such problems, academic writers must be fluent in language repertoires (e.g., lexical bundles), which are an essential component of scholarly writing and necessary for creating publishable research articles (RAs). Material & Methods: Hence, the present study reviews 85 empirical RAs that have been done to extract highly frequent 4-word lexical bundles (LBs) published between 2008 and 2021 in ISI and Scopus-indexed journals across various hard sciences disciplines including medical sciences. Additionally, it offers a list of the general academic four-word LBs in the various sections of hard sciences RAs that can be used as a reference list of general LBs for scholarly writing in hard sciences. Results: The review revealed that experts use discipline-specific bundles in each discipline. The findings also revealed that academic writing structurally relies heavily on phrasal bundles and functionally on referential bundles. Conclusion: The current study concludes that it is essential to explore disciplinary linguistic features such as LBs in academic writing to enhance academic success and RA literacy. The results may also be useful in developing appropriate educational materials and activities on LBs for academic writing in hard sciences such as medical sciences. Keywords: English for Medical Purposes, Hard Sciences, Lexical Bundles, Research Articles
Article
Full-text available
This study was conducted to examine the lexical bundles used by nonnative speakers of English and explore any potential L1 influence on L2 lexical bundle use. Following a corpus-based approach, the frequency and types of English four-word lexical bundles in the postgraduate academic writing of Turkish and American students were analyzed, and the bundles unique to the Turkish students were compared with Turkish lexical bundles produced by Turkish post-graduate students. For this purpose, three sub-corpora were compiled: English MA/PhD theses by Turkish, English MA/PhD theses by American, and Turkish MA/PhD theses by Turkish students, all from the area of language teaching. Data analysis showed that the Turkish students used twice as many types of four-word lexical bundles in their English theses (N = 125) as the American students (N = 69). Moreover, 62 lexical bundles were significantly overused by Turkish students, and 37 of these lexical bundles never occurred in the theses of American students. With respect to cross-linguistic influence, the findings showed that Turkish postgraduate students were likely to transfer 24.8% of lexical bundles from their native language, Turkish, to a foreign language, English. Moreover, four-word lexical bundles that were very frequent in Turkish theses were also found to be very frequent in English theses of Turkish students. These findings are discussed in light of previous studies, and pedagogical implications are offered.
Article
Lexical complexity, generally understood as a multidimensional construct consisting of lexical density, sophistication, and diversity, has been recognized as an important construct in first language acquisition and second language acquisition. A large variety of lexical complexity measures have been proposed by researchers to study its relationship to second language learners’ writing and/or speaking proficiency. While this line of research has generated fruitful results for English and other Indo-European languages, less is known about how it may apply to a typologically distant language such as Chinese. In this paper, we report the design of a computational tool for automating the measurement of lexical complexity with 25 indices. The Batch Mode of the software supports analyses of a large number of .txt files and outputs results to a .CSV file suitable for importation into statistical packages for further analyses. In an example application, we analyzed 87 texts from three distinct registers (academic prose, fiction, and press reportage) from the Lancaster Corpus of Mandarin Chinese (McEnery and Xiao, The Lancaster Corpus of Mandarin Chinese: A corpus for monolingual and contrastive language study, in Proc. Fourth Int. Conf. Language Resources and Evaluation, eds. M. Lino, M. Xavier, F. Ferreira, R. Costa and R. Silva, (European Language Resources Association, Paris, 2014), pp. 1175–1178) and explored how register variation may manifest in lexical complexity. The ANOVA analyses showed a linear increase in multiple lexical sophistication and diversity measures across academic prose, fiction, and press reportage. The results showed that press reportage has lower lexical density, but higher lexical sophistication and diversity, than academic prose; fiction also has higher lexical diversity than academic prose. The paper concludes with a discussion of potential pedagogical applications for Chinese language teaching and learning.
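As a rough illustration of two of the dimensions named in this abstract, the sketch below computes a type-token ratio (a simple diversity measure) and lexical density. It is not the tool described above: the abstract does not list its 25 indices, and the function names, the assumption that the text is already word-segmented and POS-tagged, and the set of tags treated as content words are all hypothetical.

```python
def type_token_ratio(tokens):
    """Lexical diversity: distinct word types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tagged_tokens, content_tags=frozenset({"n", "v", "a"})):
    """Lexical density: share of tokens whose POS tag marks a content word.

    The default tag set (noun, verb, adjective) is an illustrative assumption.
    """
    if not tagged_tokens:
        return 0.0
    content = sum(1 for _, tag in tagged_tokens if tag in content_tags)
    return content / len(tagged_tokens)


# Hypothetical pre-segmented, pre-tagged sample: (word, POS tag) pairs.
sample = [("我", "r"), ("喜欢", "v"), ("书", "n"), ("我", "r")]
words = [word for word, _ in sample]
print(type_token_ratio(words))  # 0.75 (3 types over 4 tokens)
print(lexical_density(sample))  # 0.5 (2 content words over 4 tokens)
```

A raw type-token ratio falls as texts get longer, so comparisons across registers usually need equal-length samples or a length-corrected variant.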
Article
In this article we investigate the representation of orality in the nineteenth-century Spanish novel, using a corpus approach. Specifically, we carried out a cluster analysis on a corpus of 111 novels (c. 9.3 million words) by nine canonical authors: Pedro Antonio de Alarcón, Vicente Blasco Ibáñez, Leopoldo Alas Clarín, Luis Coloma, Armando Palacio Valdés, Emilia Pardo Bazán, José María de Pereda, Benito Pérez Galdós, and Juan Valera. The analysis centres on examples used exclusively in direct-speech constructions. These examples perform stylistically relevant functions related to describing the interpersonal relationships of the characters who populate the fictional worlds, and they help explain the sense of realism and authenticity that characterizes the Realist and Naturalist movements. The analysis also serves to demonstrate the potential of corpus approaches for analysing literary texts in Spanish.
Article
Background: Lexical Bundles (LBs) have become the focus of many recent corpus linguistics studies. Research has found that the use of LBs varies in quality and quantity across linguistic groups and registers. Still, there is a paucity of research investigating Arab EFL writers’ use and development of this feature. Purpose: This study investigates the use and development of 4-word LBs by Arab EFL learners and expert writers in a corpus of 250,000 words, in terms of frequency, function, and structure. Methods: Two corpora were compiled, one for Arab learners and one for Arab scholars. The LB use of the two groups was compared to investigate the development of LB use. Further, the Arab corpus was analysed against a native reference corpus extracted from the British Academic Written English (BAWE) corpus to compare LB use across the two corpora. Results and Implications: The results imply that postgraduate education and professional practice have no noticeable effect on the use of LBs. The other results, however, are in line with the previous literature in that native speakers’ use of LBs differs in quantity and quality from that of non-native speakers. The findings reveal that stance LBs are more frequent in the native corpus and that native writers tend to use more VP-based clausal LBs than their non-native counterparts. These findings offer empirical evidence that EFL writing quality is lower despite the academic writing instruction learners currently receive. They therefore indicate the need to extend academic writing instruction programs to include training on using LBs in learners’ writing at both Bachelor and postgraduate levels. The results are also expected to raise teachers’ awareness of how EFL learners use LBs to develop their writing quality, so that teachers can adapt their teaching strategies accordingly. Moreover, Arab scholars are called on to reconsider their use of effective writing techniques, including LBs, for more effective writing.
Book
This book reviews 50 years of studies of familiar language, including formulaic expressions, lexical bundles, and collocations, providing a psycholinguistic and neurological model of familiar and novel language processing. This dual model proposes separate brain systems for familiar and novel language. Classification systems, definitions, myriad examples, and models are presented.
Article
The purpose of this corpus-based study was to investigate whether different sections in chemistry research articles, i.e., abstract, introduction, and results and discussion, rely on different sets of lexical bundles. Lexical bundles, associated with the above sections, were extracted from a corpus of 4 million words, comprising 1,185 chemistry research articles, using WordSmith Tools 5.0, and were categorized according to their functions. Altogether, 197 key bundles were identified in the three sections of chemistry research articles, 15 in the abstract, 99 in the introduction, and 83 in the results and discussion section. Two functions also emerged for lexical bundles in chemistry research articles, including purpose-oriented bundles, which refer to the aim/aims of the study; and literature-oriented bundles, which are used to refer to the literature. Altogether, the results showed that various sections in chemistry RAs are associated with specific sets of lexical bundles and, as such, deal with different rhetorical functions.
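The kind of extraction this abstract describes can be sketched in a few lines: count every contiguous four-word sequence, then keep those that meet a normalized frequency threshold and occur in enough different texts. This is a minimal sketch only. The study itself used WordSmith Tools 5.0, and the function name, the whitespace tokenization, and both threshold values below are illustrative assumptions rather than the study's settings.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_per_million=20.0, min_texts=3):
    """Return {bundle: count} for n-word sequences meeting both thresholds.

    min_per_million and min_texts are illustrative cut-offs (assumptions).
    """
    counts = Counter()
    text_ids = defaultdict(set)  # which texts each bundle occurs in
    total_tokens = 0
    for text_id, text in enumerate(texts):
        tokens = text.lower().split()  # naive whitespace tokenization
        total_tokens += len(tokens)
        for i in range(len(tokens) - n + 1):
            bundle = " ".join(tokens[i:i + n])
            counts[bundle] += 1
            text_ids[bundle].add(text_id)
    if total_tokens == 0:
        return {}
    # Convert the per-million threshold into a raw count for this corpus.
    min_count = min_per_million * total_tokens / 1_000_000
    return {
        bundle: count
        for bundle, count in counts.items()
        if count >= min_count and len(text_ids[bundle]) >= min_texts
    }


# Hypothetical toy corpus of three "texts".
toy_texts = [
    "on the other hand we see",
    "on the other hand it is",
    "but on the other hand no",
]
print(extract_bundles(toy_texts))  # {'on the other hand': 3}
```

The dispersion filter (`min_texts`) keeps a sequence that one author repeats many times in a single text from being counted as a bundle.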
Article
Despite the importance and the ubiquity of medical patient information in many healthcare systems in the world, existing approaches to its production do not seem to result in an effective product as readability measures continually judge most materials too difficult for patients to comprehend. Radiography is one medical setting where understanding patient information materials is particularly important in view of the rising numbers of examinations being performed and the potential risks involved from radiation, though studies consistently show that patients lack basic knowledge regarding the common radiographic exams. The gap in the literature and the concerns relating to patient understanding of radiation risk means there is a pressing need to investigate the linguistic characteristics and the language demands of radiography patient information, though to date very few studies have been carried out of the register and none that use a lexical bundles analysis. This study describes an analysis of 4-word lexical bundles in a corpus of 221 patient information leaflets for radiography which revealed a predominance of bundles more common to dense informational text and classroom instruction than the conversational, everyday language that healthcare writers are encouraged to use. A high frequency of passive structures - usually considered too complex for patients to process and flagged up by readability measures – was also found. An investigation of the discourse functions of the bundles reveals that the underlying purpose of radiography patient information is to instruct, suggesting that a conflict may exist between this and the concepts of patient-centred healthcare.
Book
This book offers a detailed account of how a mixed methods approach, combining corpus linguistics and discourse analysis, can shed light on educational practice. The book is based on a 22,000-word corpus of mathematics lessons in a multicultural secondary school in Ireland with the analysis of classroom data supported by insights from reflective meetings with the participating teacher. It demonstrates how examination of video recordings of lessons and reflective conversations facilitate discursive changes in the classroom and increase teacher awareness of classroom interaction.
Article
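Drawing on spoken corpora, this paper describes the discourse-pragmatic functions of the high-frequency Cantonese word doze (多謝, ‘many thanks’) and examines the development of this discourse marker from the perspective of Emergent Grammar. The data come from three corpora of spoken Hong Kong Cantonese, totalling about 56 hours, and the analysis focuses on the interactional functions of doze in conversation. The results show that doze forms a grammatical continuum across its conversational structures and has been bleached to the point of serving functions beyond thanking. Its ideational function is "expressing gratitude to others"; its textual function is "stating the reason for thanking"; and its interpersonal functions are "responding to the receipt of a tangible benefit", "responding to an unexpected promise from another person", "responding to congratulations", "responding to a compliment", and "declining an invitation". On the basis of these results, the paper discusses the roles of Relevance Theory and the Politeness Principle in the development of discourse markers, showing that doze centres on the benefit of both conversational participants and that its multiple functions emerge through interpersonal interaction. It also finds that the pragmatic differences between doze and mgoi (唔該) are related to the expected value of social behaviour. Finally, the paper shows that synchronic interpersonal interaction and conventionalization drive change in language use, and it illustrates dynamic language change, that is, Emergent Grammar. This paper describes the discourse-pragmatic functions of doze (‘many thanks’) in spoken Cantonese discourse. Two functions of doze were found which have not previously been described. The audio/video data comes from spoken corpora and consists of 107 instances. The study shows that doze is a multifunctional discourse marker: doze functions as a marker of response to tangible receiving, response to congratulations or a compliment, response for declining an invitation, and response to unexpected promise making. This study suggests that doze is originally a stative verb of expressing gratitude; it retains the meaning of thanking and encodes during conversational interaction the attitudes between the speaker and the listener toward the beneficial event. Some grammaticalization effects involving conventional inferencing are witnessed and demonstrate the evolution of a stative verb to a beneficial reception marker via conversational development. This paper also captures the subtle differences between doze and mgoi in Hong Kong Cantonese. It contributes to the understanding of grammaticalization, Emergent Grammar, and politeness in the Asian context.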
Article
The present study investigated and analysed the structures and functions of 3-word to 6-word lexical bundles in 120 pieces of English argumentative writing by Chinese EFL students, using a framework based on Biber et al.'s structural classification and Hyland's functional classification of lexical bundles. Both structurally and functionally, there was generally a negative correlation between the frequency and the length of lexical bundles, although there were some fluctuations in certain specific categories. Results indicated that the participants did not have a good command of lexical bundles, which affected the quality of their English argumentative writing in various ways. First, the sampled students used a limited range of lexical bundle types frequently. They generally lacked lexical richness when employing specific lexical bundles to express their opinions and text-oriented lexical bundles to convey transitional signals. Second, they relied heavily on the anticipatory it structure and showed little awareness of hedges and boosters when expressing their attitude. Third, they made virtually no use of lexical bundles involving attributive clauses. Fourth, they were inclined to use colloquial language in writing. The paper includes implications for teaching the effective use of lexical bundles in argumentative writing.