Does learning a programming language require learning English? A
comparative analysis between English and programming languages
Abstract: Introductory computer programming education is a worldwide, contemporary
phenomenon, propelled by a demand for skilled individuals in the technology industry as well as
the ever-increasing availability and decrease in costs of computing devices. Statistics show that
English is the most spoken language in the world and widely used in the computer programming
field, even though spoken natively only by approximately 5% of the world population. In this
study, we examine the relationship between English and programming languages to establish the
extent to which learning a programming language that uses English adds a burden to non-native
English-speaking (NNES) learners. Learning English is required, desired, or both, depending on the
characteristics of learners' environment. It is beneficial, as English proficiency is a desirable skill.
However, it also presents an overload, as learners need to understand the concepts as well as the
language in which these are taught.
Keywords: computer science education, novice learners, programming languages, English as a
Introductory computer programming education is a worldwide, contemporary phenomenon, propelled by a
demand for skilled individuals in the technology industry as well as the ever-increasing availability and decrease in
costs of computing devices. Governments are currently recognizing the importance of being able to harness the
knowledge to control these devices and create new uses for them (Bajarin, 2014; Dredge, 2014). This can be
observed in the inclusion of programming classes at early stages in the education curriculum, often starting at the
elementary school level (CBC News, 2015; Dredge, 2014; Silcoff, 2016).
Teaching approaches used for introductory programming take into account the cognitive/psychological
aspects of learning (Aktunc, 2013; Ali & Smith, 2014; Robins, Rountree, & Rountree, 2003), but often neglect the
relationship between computer programming and the natural language used in the programming language, in its
instruction, and spoken natively by learners.
As suggested by Crystal (2012), English is the most spoken language in the world, widely used in fields
such as international trade, knowledge dissemination, and any other that potentially includes people from different
language backgrounds, including computer programming. Nevertheless, the latest world language statistics report
(Simons & Fennig, 2017) indicates that English is spoken natively only by approximately 5% of the world
population, behind Chinese (a group of related varieties of languages spoken in China) and Spanish with circa 17% and 6%, respectively. This means that while native
speakers have the benefit of having the most spoken language in the world as their first language, non-native
speakers have to make an extra effort and employ alternate resources (cognition, time, and money) that the former
do not have to.
A survey of the most used programming languages (TIOBE Software BV, 2017) indicates that the most
widely used all make use of English in their learning and support resources as well as in their written components,
also known as keywords. These written components complement the logical and mathematical symbols and
notations that are used to create instructions. Thus, native and non-native English-speaking computer programming
learners are subject to a difference in assimilation capabilities (Goldenberg, 2008; Harper & Jong, 2004; Janzen,
2008; Lee, 2005) created by the discrepancy between English not being the most natively-spoken natural language
while being the most widely used in programming languages (Ruby & David, 2016).
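As a small illustration of how English appears in the written components of a language, the sketch below lists the reserved words of Python, used here only as one example of a widely used language. It relies on the standard `keyword` module; the helper `is_reserved` and the Portuguese word "enquanto" ("while") are our own illustrative choices and do not come from the surveyed sources.

```python
import keyword

# Python's reserved words, as exposed by the standard library.
keywords = keyword.kwlist

# Keep the keywords that are plain lowercase alphabetic words, e.g. "while".
wordlike = [kw for kw in keywords if kw.isalpha() and kw.islower()]


def is_reserved(word):
    """Return True if `word` is a reserved word of the language."""
    return keyword.iskeyword(word)


print(len(keywords), "reserved words, for example:", wordlike[:5])

# The English word is reserved; its Portuguese counterpart is not.
print(is_reserved("while"), is_reserved("enquanto"))
```

A learner therefore has to memorize "while" as a construct of the language regardless of whether the word carries any meaning in his or her native language.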
In this study, we examine the relationship between a natural language, English, and programming
languages from the perspective of non-native English-speaking (NNES) computer programming learners. We
present a comparative analysis of the languages, considering language structure, the acquisition process and social
elements of each. Our intent is to analyze the extent to which learning a programming language that employs
English adds a burden to non-native English-speaking learners.
The resources considered in the writing of this paper comprise journal articles, reports, and book
chapters retrieved through initial searches on Google Scholar, later broadened using the ERIC, PsycInfo,
and Education Source databases, as well as news articles and blog posts in the public domain. We conducted the database
search using multiple keywords, limiting the search to articles published in the last 15 years to keep the results
time-relevant. The exceptions were cases where the documents described information that served as a foundation for the
knowledge being presently implemented, such as theories, models, and theoretical frameworks. The searches were
not limited by a geographical location or language, but the sources considered were all written in English and mostly
from studies conducted in the United States of America.
A programming language is a formal, constructed language used to create a program (a list of instructions)
to control the behavior of a machine or to express algorithms. This facility has been of immense value to mankind
since, by assigning tasks to a machine, computations that would otherwise take a long time (sometimes even years)
for a human to perform can be done in fractions of seconds by a machine.
Throughout history, there have been many developments in the domain of software engineering advancing
programming languages and their dependencies. With the advent of the Internet, their importance became more
evident. With the Internet came many new uses, products, and services that could be accessed remotely prioritizing
speed of access and user's comfort and convenience, as these services could be accessed from different locations and
from different devices with access to a computer network. In this new era, the only limit became mankind's
imagination. However, to materialize these ideas there has been a need to harness the control of the machines that
perform these tasks and that provide access to these services. Such is done through computer programming.
Nevertheless, some of the fundamental characteristics and goals of programming languages have not changed
over time, such as their prevalent use of the English language. This enabled natural language
interoperability, i.e., programmers with different native languages could learn and write programs in the same
natural language, thus ensuring that a program written by one programmer could be understood by another and that they
were accessing the same resources.
English as a lingua franca
Due to a range of factors such as globalization rate, economic and political power, geographical diffusion,
media communication, and others, English has, to this day, steadily spread worldwide in terms of number of its
users (Kuo, 2006; Seidlhofer, 2004). In time, it has gained the title of the main language of communication among
people whose native languages are different – also known as lingua franca. Apart from not having a mutual first
language, users of English as a lingua franca (ELF) often do not share one religion or culture, choosing English as
a means of bridging those differences among themselves (Jenkins, 2006; Seidlhofer, 2004; Seidlhofer, 2005).
There is no doubt that, reaching global dimensions, English is constantly being changed by its non-native,
as much as its native speakers (Jenkins, 2006; Kuo, 2006; Seidlhofer, 2004; Seidlhofer, 2005). As one of its
fundamental characteristics, ELF is not to be observed as a monolithic structure; instead, it encourages its users to
employ their own language varieties in local communication contexts (Jenkins, 2006). ELF users are now
considered to ‘own’ their varieties, being able to shape this new language to suit their needs without being looked
down on or judged, thus forming the future of English (Jenkins, 2006; Kuo, 2006; Seidlhofer, 2004; Seidlhofer,
2005). English Language Learners (ELLs) possess diverse linguistic and cultural backgrounds which can serve for
constructing new understandings. Therefore, ELLs can be effectively taught while considering their languages and
cultures in relation to pedagogical aims (Kuo, 2006; Lee, 2005; Seidlhofer, 2004). Furthermore, taking into
consideration the ever-changing nature of ELF in its forms and uses, it is reasonable to expect that the way it is
taught will change, as well. The learners should be prepared for achieving and maintaining mutual comprehension,
rather than developing language competencies based on the first-language speaker model (Kuo, 2006; Seidlhofer,
2004). However, learners are urged to be aware of the commonly used and intelligible language forms typical for
English speakers from first language backgrounds (Lee, 2005; Seidlhofer, 2005). Similarly, in the cases of
instruction in English in general, teaching strategies have typically failed to help ELLs learn in ways that are
meaningful and relevant to them. One way of understanding and getting engaged with new information is by being
able to express the thinking process, share ideas in groups, or generate new ideas. When this process requires a
certain level of competence in English that ELLs do not possess, they face difficulties (Lee, 2005).
Knowing English has become a valuable asset, very often related to better or more job opportunities. With
the establishment of English as the global language for communication, the focus is often changed from the overall
context of the course taught to general English literacy. Nevertheless, unlike native English speakers, ELLs need to
develop English language and literacy skills in the context of the learning material (Janzen, 2008; Lee, 2005). Some
students may prove to need more explicit guidance in making the connection between their linguistic and cultural
experiences with scientific knowledge and practices, a connection that is undoubtedly beneficial to them. This fact is
applicable to a wide range of subjects, with the Computer Science field included (Lee, 2005).
Comparison between natural and programming languages
Structure: Shared structural elements (syntactical, semantic and morphological).
Acquisition: Programming languages: no established and validated learning model. English: described by various models (e.g. Haynes (2007), Gardner (2007)).
Social elements: Both languages thrive through communities. However, since the means of language preservation differ for both, they are subject to different language extinction conditions.
Table 1: Summary of the Comparison between Programming Languages and English
The first component of comparison that we contemplate is the purpose of each language. Although they
both serve to enable communication, they differ in terms of their participants. In programming languages, the
communication involves a human being who writes the program, i.e. a programmer, and a computer. Natural
languages, on the other hand, enable communication between human beings.
One prominent distinction between a programming language and a natural language such as English
resides in the openness to interpretation and uncertainty of meaning. Natural languages have a degree of ambiguity
that often requires contextualization of expressions in order for their meaning to be understood. As a branch of
linguistics, pragmatics looks at the various ways in which context can contribute to meaning, as well as
presupposition and implicature. Aside from the spoken language, meaning can be transmitted by means of
non-verbal communication, which can include intonation, pitch, speed and volume of voice, body posture, gestures, as well
as facial expressions of the speaker.
Programming languages, however, are designed to be unambiguous. Although computing devices have
high computational power, they are not designed to implicitly contextualize instructions, unless programmed to do
so. Moreover, a computer programmed to contextualize instructions is a product of computer
programming (software) rather than one of the steps involved in creating that product. This is the case with recent
advances in Artificial Intelligence (AI), which allow a computer to generate tasks autonomously based on the analysis
and contextualization of information. Although this process resembles a computer disambiguating meaning, it is
not, in essence, subject to ambiguity, because the programs that perform these operations are created with
specific and unambiguous instructions.
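The contrast can be made concrete with a minimal sketch. An English sentence such as "I saw the man with the telescope" (our own illustrative example) admits two readings that only context can separate, whereas the grammar of a programming language fixes a single reading for every valid expression. The sketch below assumes Python and its standard `ast` module; the expression is arbitrary.

```python
import ast

expression = "2 + 3 * 4"

# Parse the text into a syntax tree; the grammar admits a single reading.
tree = ast.parse(expression, mode="eval")
root = tree.body

# The top-level operation is the addition, whose right operand is the product,
# i.e. the expression is always read as 2 + (3 * 4), never as (2 + 3) * 4.
top_is_addition = isinstance(root, ast.BinOp) and isinstance(root.op, ast.Add)
right_is_product = (
    isinstance(root.right, ast.BinOp) and isinstance(root.right.op, ast.Mult)
)

value = eval(compile(tree, "<example>", "eval"))
print(top_is_addition, right_is_product, value)  # True True 14
```

No contextual knowledge is needed to reach this reading: the precedence rules that select it are part of the language definition itself.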
In this sub-section, we relate the morphological, syntactical, and semantic elements of both natural and
programming languages. Since a programming language does not have an oral component, phonetic and
phonological aspects will not be considered as they look at the study and classification of speech sounds, and
systems of relationships of speech sound, respectively.
As stated before, a programming language is a formal, constructed language. This means that it is
artificially created, in contrast to natural languages, which evolved among human beings through use and repetition. A
programming language has a predetermined set of rules and constructs which are specific to its purpose.
Nevertheless, natural and programming languages share some language components. Both adhere to
morphological rules, which define the forms of the words used; syntactical rules, which define the
combinations of words and symbols that are considered to be correctly structured; and semantic rules, concerned with
the meanings of these combinations.
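These three levels can be loosely mirrored in code. The sketch below, which assumes Python and its standard `tokenize` and `ast` modules, treats tokenization as a rough analogue of morphology (well-formed "words"), parsing as syntax (well-formed combinations), and evaluation as semantics (meaning). The mapping is an analogy of ours rather than an established equivalence, and the function names are hypothetical.

```python
import ast
import io
import tokenize


def token_strings(source):
    """Lexical level: split source text into its individual tokens ("words")."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.string.strip()  # drop empty NEWLINE/ENDMARKER tokens
    ]


def is_well_formed(source):
    """Syntactic level: check whether the tokens combine into a valid program."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def meaning(expression):
    """Semantic level: evaluate an expression to obtain the value it denotes."""
    return eval(expression, {"__builtins__": {}})


print(token_strings("total = price * 2"))  # ['total', '=', 'price', '*', '2']
print(is_well_formed("total = price * 2"))  # True
print(is_well_formed("total = * price 2"))  # False: same words, invalid order
print(meaning("3 * 2"))  # 6
```

The second and third calls use the same "words" in a different order, which shows that correct word forms alone do not make a correctly structured statement.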
The most prominent model for second language acquisition was proposed by Krashen (1982) and consists
of 5 stages, namely pre-production, early production, speech emergence, intermediate fluency, and lastly, advanced
fluency. This model places a greater emphasis on learners’ comprehension and vocabulary, as described by Haynes
(2007). In pre-production, learners do not yet speak or they just repeat what they hear and have a limited vocabulary,
of about 500 words. Early production is characterized by a vocabulary of about 1000 words, and learners are able to
use short memorized language chunks, speaking in one or two-word phrases. Speech emergence occurs when
learners have developed a vocabulary of about 3000 words and can communicate using simple phrases and
sentences. Later on, when considered to be at the intermediate fluency stage, learners start to communicate using
complex sentences and attempt to express their opinions and share their thoughts. Lastly, advanced fluency is
characterized by a near-native ability to communicate in a language.
Another model, proposed by Gardner (2007), focuses on the motivational aspects of second
language acquisition. Gardner proposes two types of motivation: language learning and classroom learning
motivation. Language learning motivation relates to the motivation to learn (and acquire) a second language. It
encompasses both intrinsic and extrinsic factors that affect learners’ attitudes towards the language learning process.
In contrast, classroom learning motivation refers to the motivation that derives from the environment in which
learning takes place, such as teacher, class atmosphere, course content, and materials and facilities. Gardner’s
language acquisition process is described in four stages: elemental, consolidation, conscious expression, and
automaticity and thought. At the elemental stage, students learn the basics of the language (vocabulary, grammar,
pronunciation, etc.). Subsequently, at the consolidation stage, students put the elements of the language together and
achieve some degree of familiarity with the language. In conscious expression, students can communicate thoughts
and ideas, while making a lot of conscious effort. At the fourth and last stage, automaticity and thought, students
merge language and thought. At this stage students “no longer think about the language, but think in the language”
(Gardner, 2007, p.13).
Gardner (2007) places emphasis on the relationship between the learners’ first and second language,
arguing that they often rely on the first language to aid them in expressing an idea in the second language (p.13).
Conversely, programming languages do not have an established and validated learning model. Moreover,
approaches to programming language pedagogy relate more to teaching than to learning. The surveyed literature
covers aspects related to computer programming teaching, learners’ competencies as well as teachers’ perceptions of
the teaching process (Robins et al., 2003; Rolandsson, 2009). A popular example of this perspective is the Chain of
cognitive accomplishments model, suggested by Linn & Dalbey (1989). In this model, instructors organize the
learning material according to three stages, namely features of the language, design skills, and problem-solving
skills. During the first stage, features of the language, the learner is introduced to the elements of the programming
language, such as syntactic and semantic elements, valid keywords, and programming paradigm (procedural, object-
oriented, function-oriented, etc.). Subsequently, during the design skills stage, the instructor exposes the learner to
program creation abilities. This stage involves being familiarized with program templates as well as planning,
testing, and program reformulating phases. Finally, learners are exposed to problem-solving skills. These skills
represent knowledge and strategies abstracted from the specific language that can be applied to new languages and
situations (Rolandsson, 2009).
Another example is the Pedagogical content knowledge (PCK) based approach. Saeli (2012) argues that
PCK represents "the knowledge that allows teachers to transform their knowledge of the subject into something
accessible for their students" (p. 16). Saeli (2012) writes that there is a difference between knowing how to program
and being able to teach programming, with the latter concerning the ability of teachers to represent and formulate the
content so that learner comprehension can occur (p. 18).
English and programming languages thrive through communities of speakers and programmers,
respectively. However, the two differ in how they fare in the absence of a social context. On the one hand, a natural
language needs a community to allow language transmission and evolution. This community is central to the
existence and development of the language. Therefore, a single individual, although capable of expressing
himself/herself in a natural language, cannot achieve communication if there is no one to communicate
with. This can be attributed to the fact that the purpose of natural languages encompasses a social component: to
establish communication between individuals. Thus, an important factor for the survival of a language is its means
of conservation. Natural languages are known to die or go extinct when there are no more native speakers of the
language left, or no one can speak it at all. This might happen for various reasons, among which are speakers of a
language getting absorbed by a dominant language or simply dying out.
On the other hand, a programming language is created to enable communication between a human and a
machine. Therefore, only these two are necessary in order for the language to achieve its purpose. A community
supports the evolution of a programming language in the same way as it does for a natural language. However, any single
programmer can make use of the language, and even if all members of a certain community of
programmers cease to use it, the language can continue to exist as long as it is properly documented. With this in
mind, programming languages have an advantage in not having a spoken component. Consequently, the
extinction of a programming language is attributed to lack of use or to the nonexistence of compatible devices
on which to use it, among other factors.
English and programming languages are inextricably interwoven. Even though they differ in some aspects
such as purpose and ambiguity, they do share a number of structural elements - language components, namely
morphology, syntax, and semantics. We find a language interaction between English and programming languages
plausible due to English being a lingua franca in and outside the computer science field as well as the extensive use
of English in programming languages, as described in the introduction of this paper.
The prevalence of English in the computer programming field presents a multifaceted learning process for
NNES beginner programmers. It is beneficial, as English proficiency is a desirable skill in and outside the computer
science field, but also adds learning material as learners would still need, in varying degrees, to comprehend the
meaning of the words used to describe the concepts being taught, as a step to understanding these concepts.
Another consideration we take into account is the implicit and explicit learning of English that takes place
during the computer programming learning process. Here, we adhere to the premise proposed by Berry and Dienes
(1993) that implicit learning happens unconsciously, when learners acquire information without intending to, all the
while being unaware of the learning. In contrast, explicit learning happens when learners are aware that they are
learning and intend to do so. Translated to introductory computer programming learning and teaching, these two
concepts can be viewed through three factors: keywords and terminologies of constructs in programming languages,
programming environment (computer software such as the operating system, text editor, and compiler, among
others) and learning resources. We propose that due to the language used in keywords and terminologies of
constructs in a programming language as well as in the programming environment, learners are exposed to implicit
learning of English. Words associated with certain tasks or actions can be learned or memorized by the
programmers. Nevertheless, this type of learning might not lead to a working proficiency or fluency, but it does
introduce a language learning component to computer programming learning. Learning resources for programming
languages, such as their documentation, books, tutorials, video lectures, blog posts, and online communities, are
predominantly originally written in English. In this case, for an NNES programmer to be able to effectively grasp
these resources, he/she will need to make a conscious effort to understand the language in which they are expressed.
This is the case in which we propose that explicit learning of English takes place.
Analyzing the effects and influences of English on programming languages can be approached from
various, closely related viewpoints, amongst which we will discuss the sociocultural and psychological perspectives.
From the sociocultural perspective, behavioral and cognitive factors influence learners' effective
performance. These factors affect the process learners follow to fit in a targeted group by the means of adopting its
social and cultural rules. When attempting to fit in a group with members with varying levels of English proficiency,
NNES learners are exposed to a situation in which they either follow suit, emulating the behaviors of already
existing members, or maintain their behavior unchanged which can benefit the group with diversity but can also
hinder the learners’ integration. Here, we can distinguish among factors such as language ability (level of English
language competency), prior knowledge of programming languages (acquired from previous education and
experiences, which aids comprehension and application of new knowledge), and contact with other community
members (which can significantly improve adaptation to the community).
The psychological perspective pertains to self-esteem, as well as to professional satisfaction and
fulfillment, and can be described by factors such as belongingness and motivation. The overall sense of
belongingness can be understood as the need to feel a part of the community as opposed to being marginalized or
excluded by peers. Motivation, as a vital factor for effective learning, can be observed from two viewpoints: as
intrinsic (learning English for personal satisfaction or accomplishment as a complement to learning a programming
language) and extrinsic (learning English due to an outside factor, such as a necessity, obligation, reward, or an
In this paper, we examined the relationship between a natural language and a programming language, from
the perspective of NNES computer programming learners. We presented a comparative analysis of the languages in
order to analyze the extent to which learning a programming language that employs English adds a burden to NNES learners.
In computer programming learning, we propose that both languages interact. This interaction can be
observed mostly from the role of English in programming languages rather than the other way round. We describe
this interaction according to sociocultural and psychological factors, considering that the two are closely related, and
affect as well as reflect each other.
The role English plays in computer programming learning was also a subject of our analysis. On the one
hand, English potentially impacts learning computer programming for NNES learners as it becomes an overload to
the content of learning, i.e. students need to understand both the concepts and the language in which these concepts
are being explained. On the other hand, English is an already established language in the computer science field,
among other fields, and thus unavoidable and, in many cases, desirable to know.
Hence, in answering the question proposed in the title of this paper, we conclude that learning English is
required, desired, or both, depending on the concrete factors and characteristics of the learners’ environment. These
reflections led us to consider if, at an introductory stage, learning a programming language with keywords and
supporting resources in learners’ native languages would be beneficial to them. This step could potentially allow
learners to understand basic concepts in a language context they are familiar with, creating a foundation on which
they could later build when advancing to higher levels of programming competency. Another consideration we
make is whether the existing second language acquisition tools, theories, and frameworks for natural languages
could be used to shed light on the computer programming learning process.
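To make the idea of native-language keywords more tangible, the following is a minimal sketch, under our own assumptions, of how keywords written in a learner's language could be mapped onto an existing programming language before execution. The Portuguese keyword table, the function `translate`, and the sample program are hypothetical illustrations; this is not an existing tool nor a method proposed in the surveyed literature.

```python
import io
import tokenize

# Hypothetical keyword table (Portuguese -> Python), for illustration only.
NATIVE_KEYWORDS = {
    "defina": "def",
    "retorne": "return",
    "se": "if",
    "enquanto": "while",
}


def translate(source):
    """Replace native-language keywords with their Python equivalents.

    Working on tokens rather than raw text leaves string literals and
    comments untouched, since only NAME tokens are ever replaced.
    """
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in NATIVE_KEYWORDS:
            result.append((tok.type, NATIVE_KEYWORDS[tok.string]))
        else:
            result.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize rebuilds valid source text,
    # although the spacing may differ from the original.
    return tokenize.untokenize(result)


native_program = "defina dobro(x):\n    retorne x * 2\n"

namespace = {}
exec(translate(native_program), namespace)
print(namespace["dobro"](4))  # 8
```

Such a mapping only addresses keywords. Error messages, library names, and documentation would remain in English, so the sketch covers just one of the three factors discussed earlier (keywords, programming environment, and learning resources).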
We conclude that teaching introductory computer programming to NNES learners requires further study in
order to find optimal instructional models. Ideally, these models would embed beneficial language characteristics in
the language of instruction, but more importantly, in the content being learned. We consider that including the
opinions of learners would be a very important addition to the discussion proposed in this paper. However, the scope
and complexity of such an inclusion go beyond the analysis we intended to present.
Aktunc, O. (2013). A teaching methodology for introductory programming courses using Alice. International
Journal of Modern Engineering Research, 3, 350-353. Retrieved from
Ali, A., & Smith, D. (2014). Teaching an introductory programming language in a general education course. Journal
of Information Technology Education: Innovations in Practice, 13, 57-67. Retrieved from
Bajarin, T. (2014, June 15). Why basic coding should be a mandatory class in junior high. Time Inc. Retrieved from
Berry, D. C., & Dienes, Z. (1993). Implicit learning: Theoretical and empirical issues. Hove, UK: Lawrence
CBC News (2015, August 31). Back to school: Canada lagging push to teach kids computer coding. CBC/Radio-
Canada. Retrieved from http://www.cbc.ca
Crystal, D. (2012). English as a global language (2nd ed.). Cambridge, England: Cambridge University Press
Dredge, S. (2014, September 4). Coding at school: A parent’s guide to England’s new computing curriculum. The
Guardian. Retrieved from https://www.theguardian.com
Gardner, R. C. (2007). Motivation and second language acquisition. Porta Linguarum, 8, 9-22. Retrieved from
Goldenberg, C. (2008). Teaching English language learners: What the research does - and does not - say. ESED
5234 - Master List. 27. Retrieved from http://digitalcommons.georgiasouthern.edu/esed5234-master/27
Harper, C., & Jong, E. (2004). Misconceptions about teaching English language learners. Journal of Adolescent &
Adult Literacy, 48(2), 152-162. https://doi.org/10.1598/JAAL.48.2.6
Haynes, J. (2007). Getting started with English language learners: How educators can meet the challenge.
Alexandria, VA: Association for Supervision and Curriculum Development
Janzen, J. (2008). Teaching English language learners in the content areas. Review of Educational Research, 78(4),
Jenkins, J. (2006). Current perspectives on teaching world Englishes and English as a lingua franca. TESOL
Quarterly, 40(1): 157-181.
Knuth, D. E. (1984). Literate programming. The Computer Journal. British Computer Society, 27(2): 97–111.
Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Kuo, I. C. (2006). Addressing the issue of teaching English as a lingua franca. ELT Journal, 60(3): 213-221.
Lee, O. (2005). Science education with English language learners: Synthesis and research agenda. Review of
Educational Research, 75(4), 491-530. https://doi.org/10.3102/00346543075004491
Linn, M. C., & Dalbey, J. (1989). Cognitive consequences of programming instruction. In: Soloway, E., Spohrer, J.C.
(Eds.), Studying the Novice Programmer. London, Lawrence Erlbaum Associates, 58–62
Pane, J. F., & Myers, B. A. (2001). Studying the language and structure in non-programmers' solutions to
programming problems. International Journal of Human-Computer Studies, 54(2), 237-264.
Rauch, G. (2009, February 24). The four stages of programming competence [Blog post]. Retrieved from
Robins, A., Rountree, J., & Rountree, N. (2003). Learning and teaching programming: A review and discussion.
Computer Science Education, 13(2), 137-172. http://dx.doi.org/10.1076/csed.18.104.22.16800
Rolandsson, L. (2009). Teachers’ perceptions about learning programming. In Proceedings PATT-22 Conference.
Strengthening the Position of Technology Education in the Curriculum, 24-28 August (p. 361).
Ruby, I., & David, S. (2016, July). Natural-Language Neutrality in Programming Languages: Bridging the
Knowledge Divide in Software Engineering. In International Conference on Learning and Collaboration
Technologies (pp. 628-638). Springer International Publishing.
Saeli, M. (2012). Teaching programming for secondary school: A pedagogical content knowledge based approach
(Doctoral dissertation, Technische Universiteit Eindhoven). Retrieved from
Seidlhofer, B. (2004). Research perspectives on teaching English as a lingua franca. Annual Review of Applied
Linguistics, 24: 209-239. https://doi.org/10.1017/S0267190504000145
Seidlhofer, B. (2005). English as a lingua franca. ELT Journal, 59(4): 339-341. https://doi.org/10.1093/elt/cci064
Silcoff, S. (2016, January 17). B.C. to add computer coding to school curriculum. The Globe and Mail. Retrieved
Simons, G. F., & Fennig, C. D. (2017). Ethnologue: Languages of the world (20th ed.). Dallas, TX: SIL
TIOBE Software BV. (n.d.). TIOBE index for March 2017. Retrieved March 15, 2017, from