
The Language of Programming: A Cognitive Perspective



Computer programming is becoming essential across fields. Traditionally grouped with science, technology, engineering, and mathematics (STEM) disciplines, programming also bears parallels to natural languages. These parallels may translate into overlapping processing mechanisms. Investigating the cognitive basis of programming is important for understanding the human mind and could transform education practices.
Evelina Fedorenko1,2,3, Anna Ivanova3, Riva Dhamala4, & Marina Umaschi Bers4
1 Psychiatry Department, Massachusetts General Hospital, USA
2 Psychiatry Department, Harvard Medical School, USA
3 Brain & Cognitive Sciences Department, Massachusetts Institute of Technology, USA
4 Eliot-Pearson Department of Child Study and Human Development, Tufts University, USA
*Correspondence: (@ev_fedorenko) or (@marinabers)
The growing importance of computer programming
In the automated economy, computer programming is becoming an essential skill across diverse fields and
disciplines. As a result, countries all over the world are exploring the inclusion of computer science (CS)
as mandatory in the school curriculum. For example, the US government launched the “Computer Science
for All” initiative in 2016 to introduce programming at all educational levels. Similar initiatives are taking
place across Europe, Asia, and South America.
The growing importance of CS education underscores the urgency of characterizing the cognitive
mechanisms and the corresponding brain circuits that support the acquisition and use of computer
programming skills (Box 1). This basic knowledge is needed to guide the design of curricula, assessments,
and educational policies regarding when and how to introduce computer science in schools, and inform the
implementation of teaching strategies and disciplinary integration.
Furthermore, an understanding of the cognitive and neural basis of programming can contribute to the
general enterprise of deciphering the architecture of the human mind. Computer programming is a cognitive
invention, like arithmetic and writing. How are such emergent skills acquired? Presumably, they rely on
phylogenetically older mechanisms, many of which we share with other animals. But which mechanisms?
And how do these mechanisms and new domains of knowledge interact with the evolutionarily older and
ontogenetically earlier-emerging ones?
Programming as problem solving
Traditionally, many researchers have construed programming in terms of problem solving, dividing it into
distinct steps: problem comprehension, design, coding, and debugging/maintenance [1]. As a result, when
describing the cognitive underpinnings of programming, researchers have often focused on the early stages
of program planning and the ability to break down a problem into discrete units (later dubbed
computational thinking [2]). Studies that have probed the process of coding itself have often resorted to
evaluating overall cognitive load [3]. Thus, empirical research has lagged behind in exploring the
relationship between mechanisms that underlie programming and other cognitive skills, in particular,
language ability. Despite an abundance of metaphoric descriptions linking computer and natural languages,
such as the use of the terms “syntax” and “semantics” [4], the problem-solving approach has continued to
dominate the discourse in the field of computer science education and sometimes eclipsed research
exploring other cognitive mechanisms potentially involved (see Supplementary Materials for a more
detailed overview).
Beyond STEM: an alternative construal of CS
In spite of the lack of a rich and detailed characterization of the cognitive bases of computer programming,
educators have long made assumptions about the relationship between programming and other cognitive
skills. These assumptions have shaped the treatment of CS in schools across the world as
mathematically/problem-solving oriented, and, when integrated in the curricula, CS has been grouped with
STEM disciplines [5]. However, some have argued for an alternative construal of programming, an
approach that has become known as “coding as literacy” [6]. The key idea is this: when you learn a
programming language, you acquire a symbolic system that can be used to creatively express yourself and
communicate with others. The process of teaching programming can therefore be informed by pedagogies
for developing linguistic fluency.
The term “programming languages” already implies parallels to natural language. However, to rigorously
evaluate the nature and extent of potential overlap in the cognitive and neural mechanisms that support
computer vs. natural language processing, it is critical to delineate the core components of each process and
formulate specific hypotheses about representations and computations that underlie them. Here, we propose
a framework for generating such hypotheses.
With respect to knowledge representations, both computer and natural languages rely on a set of “building
blocks” (words and phrases in natural language, functions and variables in computer languages) and a set
of constraints for how these building blocks can combine to create new complex meanings. Studies dating
back to the 1970s have noted this parallel, as evidenced by the occasional reference to the “semantics” and
“syntax” of programming languages [4], but few have investigated this distinction experimentally (see
Supplementary Materials). Whereas the technical meanings of these terms in linguistics and CS differ,
their usage highlights that both natural and programming languages rely on meaningful and structured
representations.
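The parallel between building blocks and combinatorial constraints can be made concrete with a minimal Python sketch. It is an illustration only, not part of the proposed framework: the helper names building_blocks and is_well_formed are hypothetical, and the example statement is invented. The sketch uses the standard tokenize and ast modules to list the identifiers and operators in a statement and to check whether a given arrangement of them satisfies the language's combination rules.

```python
import ast
import io
import tokenize


def building_blocks(code):
    """Return the identifiers/keywords and operators that make up a code string."""
    tokens = tokenize.generate_tokens(io.StringIO(code).readline)
    return [tok.string for tok in tokens
            if tok.type in (tokenize.NAME, tokenize.OP)]


def is_well_formed(code):
    """Return True if the building blocks are combined according to Python's grammar."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


statement = "total = price * quantity"      # invented example statement
scrambled = "total price = * quantity"      # same blocks, illegal arrangement

print(building_blocks(statement))
print(is_well_formed(statement), is_well_formed(scrambled))
```

The scrambled statement contains exactly the same building blocks as the original, yet it is rejected, much as a scrambled sentence contains familiar words but violates the grammar of its language.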
With respect to computations, a multi-step processing pipeline appears to underlie both comprehension and
generation of linguistic/code units (Fig. 1). In comprehension, we start with perceptual input and are trying
to infer the intended meaning of an utterance or to decipher what a piece of code is trying to achieve. In
doing so, we initially engage in some basic perceptual processing (auditory/visual in natural languages and
typically visual in computer languages), and then attempt to recognize the building blocks and the
relationships among them. For longer narratives and extended pieces of code, we need not only to
understand each utterance/line, but also to infer the overall higher-order structure of the text/program.
In generating linguistic utterances or code, we start with an idea. This idea can be a simple one and require
a single sentence or line of code, or it can be highly complex and require a whole extended narrative or
multi-part program. For simpler ideas or sub-components of complex ideas, we need to figure out the
specific building blocks to use and to organize them in a particular way to express the target idea. For more
complex ideas, we first need to determine the overall structure of the narrative or program. Once we have
a specific plan for what we want to say, or what a piece of code would look like, we engage in actual motor
implementation by saying/writing an utterance or typing up code statements. It is worth noting, however,
that in both generating linguistic texts and computer code, top-down planning may be supplemented with
more bottom-up strategies where certain fragments of the text/code are produced first or borrowed from
previously generated text/code, and then the overall structure is built around those.
During these comprehension/generation processes, we also engage in other, plausibly similar, mental
computations. For example, we can recognize errors (others’ or our own) and figure out how to fix them
[1]. When processing sequences of words and commands, we plausibly engage in predictive processing: as
we get more input, we construct an increasingly richer representation of the unfolding idea, which, in turn,
constrains what might come next. And during generation of utterances or code, creative thinking comes
into play, affecting the very nature of the ideas one is trying to express, as well as how those ideas are
converted into sentences/code. Finally, we may need to consider the intent of the producer or the state of
mind of our target audience, abilities that draw on our mentalizing (Theory of Mind) capacities (although
mentalizing about the computer itself might be maladaptive; see Supplementary Materials).
Figure 1. Hypothesized parallels between natural language (red) and computer programming (blue) at
different processing stages. Both sets of cognitive processes rely on the combinatorial nature of their inputs
and outputs, which consist of phonemes/letters that make up words/identifiers, which combine into
sentences/statements, giving rise to paragraphs and functions, and finally yield texts/utterances/programs
(cf. Supplementary Materials for possible differences).
It is also worth noting that the vast majority of programming languages directly rely on programmers’
knowledge of natural languages (specifically, English). Keywords, variable names, function names, and
application programming interfaces follow naming conventions that indicate their function; it has been
shown that “unintuitive” naming increases cognitive load [7] and hinders program comprehension [8]. The
importance of natural language semantics is further highlighted by the fact that non-native English speakers
often struggle to learn English-based programming languages [9]. Further, computer code is usually
accompanied by comments and, for larger pieces of software, documentation, which serve to scaffold
program comprehension. As such, the process of working with code necessarily involves tight integration
of computer and natural language knowledge.
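To illustrate the role of naming, consider the following hypothetical pair of Python functions. They are invented for this illustration and are not taken from the cited studies [7, 8]. Both compute the same result; only the identifiers differ.

```python
def f(a, b):
    # Opaque names: the reader must reconstruct the purpose from the logic alone.
    c = 0
    for d in a:
        if d > b:
            c += 1
    return c


def count_scores_above(scores, threshold):
    # Descriptive names: English semantics conveys the purpose before the logic is read.
    count = 0
    for score in scores:
        if score > threshold:
            count += 1
    return count


print(f([55, 72, 90], 60), count_scores_above([55, 72, 90], 60))
```

The machine treats the two definitions identically; the difference lies entirely in how much natural language knowledge the human reader can bring to bear.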
The machine learning community has already begun to exploit structural similarities between code and
natural language by applying natural language processing (NLP) techniques to analyze code [10].
Developmental psychology researchers have also begun to explore those parallels. A case study [11]
demonstrated that knowing a programming language can facilitate acquisition of reading ability. Pilot
studies with preschoolers and kindergarteners have also shown that programming can facilitate language
processing, as young children’s sequencing abilities significantly improved after they received coding
interventions [12]. Neuroimaging studies of programming, although in their infancy, hint at potential
overlap between language- and code-processing brain regions [13]. Such findings, along with the theoretical
framework presented above, call for direct investigations of cognitive and neural overlap between language
and code processing.
Concluding remarks
The growing importance of computer programming underscores a need to conduct rigorous research
probing the cognitive architecture that underlies programming abilities. We have highlighted potential
parallels between programming and natural languages. Although the comparison is not perfect and some
mental computations likely differ (see Supplementary Materials), future empirical studies should consider
the hypothesis that programming draws on some of the same resources as natural language processing, in
addition to the traditional proposal whereby programming shares computations with math, logic, and
problem solving. If this hypothesis finds empirical support, we need to re-conceptualize the way CS is
taught, especially in early childhood, when children are learning to read and write. CS learners might also
benefit from techniques employed in foreign language classrooms, such as constant exposure to the
language and learning by doing. Making progress in deciphering the cognitive and neural bases of computer
programming may therefore yield fundamental insights about how to optimally design curricula, policy,
and educational interventions, as well as new programming languages for children that might draw on
pictorial and not only textual interfaces [15].
BOX 1: Why now? The urgency of understanding the cognitive underpinnings of computer programming.
Given the growing demand for programming skills, educators have been increasing efforts to make CS
classes available to students. The question of whether CS should be grouped with STEM or with languages
has sparked countless debates, with some citing Dijkstra who claimed that “mastery of one’s native tongue”
is key to competent programming [14] and Papert’s early vision of “learning to program as a second
language” [15], and others pointing to the decades-long tradition of viewing programming as principled
problem-solving. Lawmakers have also weighed in on the issue. In 33 states, CS fulfills a math or science
requirement, with Texas and Oklahoma providing an option to count it as a foreign language
(https:/ At the federal level, legislators have proposed an initiative to award grants to
schools that count CS toward either a math/science or a foreign language requirement.
There is a marked lack of scientific research that would support any of those initiatives. Although
programming may, to some extent, recruit both STEM and language skills, no studies of programming have
so far examined the exact division of labor between these cognitive domains. As new educational policies
are introduced, understanding the cognitive underpinnings of programming can help guide those decisions.
Acknowledgements. This work was supported by an NSF EAGER award (FAIN 1744809, “The cognitive
and neural mechanisms of computer programming in young children: storytelling or solving puzzles?”) to
Marina Bers and Evelina Fedorenko. We also thank three anonymous reviewers for their helpful comments
and suggestions.
[1] Dalbey, J. and Linn, M. C. (1985) The demands and requirements of computer programming: A
literature review. J Educ Comput Res 1, 253-274
[2] Wing, J. M. (2006) Computational thinking. Communications of the ACM, 49, 33-35
[3] Nakagawa, T. et al. (2014) Quantifying programmers' mental workload during program comprehension
based on cerebral blood flow measurement: a controlled experiment. In Comp Proc of the 36th ICSE, pp.
448-451, ACM
[4] Shneiderman, B. and Mayer, R. E. (1975) Towards a cognitive model of programmer behavior.
Computer Science Department, Indiana University
[5] Guzdial, M. and Morrison, B. (2016) Growing computer science education into a STEM education
discipline. Communications of the ACM, 59, 31-33
[6] Bers, M. U. (2019) Coding as Another Language: Why Computer Science in Early Childhood Should
Not Be STEM. In Exploring Key Issues in Early Childhood and Technology: Evolving Perspectives and
Innovative Approaches (Donohue, C., ed), Routledge
[7] Fakhoury, S. et al. (2018) The Effect of Poor Source Code Lexicon and Readability on Developers'
Cognitive Load. In Proc Int'l Conf Program Comprehension (ICPC), 286-296
[8] Lawrie, D. et al. (2006) What’s in a Name? A Study of Identifiers. In Proc Int'l Conf Program
Comprehension (ICPC), 3-12
[9] Guo, P. J. (2018) Non-Native English Speakers Learning Computer Programming: Barriers, Desires,
and Design Opportunities. In Proc CHI, 396. ACM
[10] Allamanis, M. et al. (2018). A survey of machine learning for big code and naturalness. ACM
Computing Surveys (CSUR), 51, 81
[11] Peppler, K. A. and Warschauer, M. (2011) Uncovering Literacies, Disrupting Stereotypes: Examining
the (Dis)Abilities of a Child Learning to Computer Program and Read. International Journal of Learning
and Media, 3, 15-41
[12] Kazakoff, E. R. and Bers, M. U. (2014) Put your robot in, put your robot out: Sequencing through
programming robots in early childhood. J Educ Comput Res, 50, 553-573
[13] Siegmund, J. et al. (2014) Understanding source code with functional magnetic resonance imaging. In
Comp Proc of the 36th ICSE, pp. 378-389, ACM
[14] Dijkstra, E. W. (1982) How do we tell truths that might hurt? In Selected Writings on Computing: A
Personal Perspective, pp. 129-131, Springer
[15] Papert, S. (1980) Mindstorms: Children, computers and powerful ideas. Basic Books, Inc.
Supplementary Information
I. Prior work on the relationship between programming and different cognitive functions
Attempts to develop cognitive models of computer programming date back to the 1970s (e.g., Brooks,
1977; Weinberg, 1971). Most researchers have construed programming in terms of problem solving (e.g.,
Dalbey & Linn, 1985; Ormerod, 1990; Pea & Kurland, 1984; Pennington & Grabowski, 1990). Of
those, studies that have probed the process of coding itself tended to compare specific programming
constructions (e.g., Sime, Green, & Guest, 1977; Gannon & Horning, 1975) or evaluate overall cognitive load
(Bergersen & Gustafsson, 2011; Ikutani & Uwano, 2014; Nakagawa et al., 2014; Nakamura et al., 2003).
Empirical research has lagged behind in exploring the relationship between mechanisms that underlie
programming and other cognitive skills, in particular, language ability. Early research by Sime, Green and
Guest (1973) used concepts from Chomskyan linguistics (specifically, recursion depth) to test whether
similar cognitive constraints apply to computer language processing. However, the problem-solving
approach still dominated the field, despite an abundance of metaphoric descriptions linking computer and
natural languages, such as the use of the terms “syntax” and “semantics” (Shneiderman & Mayer, 1975)
or parallels drawn between learning to program and learning to read (Pea & Kurland, 1984). Murnane
(1993) identified a strong need for programming to be examined in conjunction with natural language
processing, but few studies have followed through.
A number of researchers have examined the link between learning to program and other cognitive skills,
such as metacognition and creativity (e.g., Clements, 1986; 1995; Liao & Bright, 1991), as well as
abstract reasoning (e.g., Lye & Koh, 2014; Rich et al., 2014). Others have examined effectiveness of CS
concept learning by analogy (Hoc & Nguyen-Xuan, 1990) and the role of context (Chao, Feldon &
Cohoon, 2018). However, rigorous empirical investigations of computer programming abilities are few
and far between; most existing studies are outdated, and thus do not take into account the current
understanding of human cognitive architecture.
II. A need for controlled rigorous experimentation to characterize the cognitive basis of programming
Multiple researchers have highlighted the fact that the field of programming language research has very
few rigorous empirical studies (Hanenberg, 2010; Tichy, 1998; Stefik & Hanenberg, 2017). Kaijanaho
(2015) found that out of 156 analyzed articles in programming language design, only 22 used a
randomized controlled design. Uesbeck et al. (2016) report that not a single paper from the
International Conference on Functional Programming (ICFP), which started in 1996, met the standards of
an empirical scientific study (as defined by the WWC Handbook by the Institute of Education Sciences).
Many software researchers shy away from conducting human studies entirely, viewing them as “too
difficult to design, too time consuming to conduct, too difficult to recruit for, and with too high a risk of
inconclusive result” (Buse, Sadowski & Weimer, 2011; as cited in Ko, LaToza & Burnett, 2015). As a
result, programming language choices tend to rely on mathematical reasoning rather than empirical
evidence about cognitive processing (Stefik & Hanenberg, 2017), making it impossible to evaluate real-
time code processing in the mind and brain.
III. Differences between computer and natural language understanding
The main aim of the article has been to highlight possible parallels in cognitive mechanisms underlying
comprehension/generation of code and natural language. However, we recognize that not all parallels
will hold at all processing stages. Below, we outline several domains where computer and natural
languages diverge.
Parsing. Visual structure plays a substantially more prominent role in code than in text, which means
that its processing is less sequential (Busjahn et al., 2015). In addition, program structure is much more
predictable (has lower entropy; Hindle et al., 2012), allowing programmers to get the gist of the
program by skimming it instead of reading word by word (although natural language readers also make
use of predictability by skipping low-entropy words; Rayner & Well, 1996). Finally, the order in which a
program is read is often determined by its control flow, so that understanding one chunk of code
determines which part of the code the programmer turns to next.
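The notion of entropy invoked above can be illustrated with a minimal Python sketch. This is a deliberate simplification and an assumption on our part: Hindle et al. (2012) estimated cross-entropy with n-gram language models trained on large corpora, whereas the hypothetical function shannon_entropy below only computes the entropy of the token frequency distribution within a single short sequence. The two token lists are invented examples.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Entropy (in bits) of the empirical token distribution in a sequence."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Invented examples: a repetitive code fragment and an English sentence.
code_tokens = "for i in range ( n ) : x [ i ] = x [ i ] + 1".split()
text_tokens = "the quick brown fox jumps over the lazy dog near a quiet river".split()

print(round(shannon_entropy(code_tokens), 2))
print(round(shannon_entropy(text_tokens), 2))
```

A sequence in which a few tokens recur frequently yields a lower value than one in which every token is different. No conclusion about code versus prose in general should be drawn from two toy sequences; the sketch only shows what quantity is being compared.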
Semantics. When learning to program, a novice inevitably needs to learn a new set of concepts (Hoc &
Nguyen-Xuan, 1990). Those concepts allow programmers to simulate execution flow within the machine
and remove ambiguity present in natural language (e.g., inclusive vs. exclusive “or”). When the
semantics of newly learned concepts overlaps with natural language semantics, learning is enhanced,
and vice versa (Stefik & Siebert, 2013). Thus, computer and natural languages rely on a partially
overlapping set of conceptual primitives; the fact that the nature of this overlap affects program
learning makes it even more important to study it.
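The inclusive vs. exclusive “or” example can be spelled out in a few lines of Python. The function names below are hypothetical and chosen for illustration. In English, “or” can be read either way depending on context; in Python, the `or` operator is always inclusive, and the exclusive reading has to be expressed differently (here, as inequality of two Boolean values).

```python
def inclusive_or(a: bool, b: bool) -> bool:
    # True if at least one operand is true (Python's built-in `or`).
    return a or b


def exclusive_or(a: bool, b: bool) -> bool:
    # True if exactly one operand is true.
    return a != b


for a in (False, True):
    for b in (False, True):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))
```

The two functions differ only when both operands are true, which is exactly the case where the natural language word is ambiguous.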
Pragmatics and Theory of Mind. Perhaps the most interesting set of differences between the two
language types is the nature of the message recipient: natural language utterances are directed toward
another human, while programs are mainly intended for machines. Both represent a form of
communication; however, assumptions about natural language communication do not apply to coding,
leading novice programmers to commit errors through improper knowledge transfer (Bonar & Soloway,
1983). Pea (1986) describes this misconception as “the idea that there is a hidden mind somewhere in
the programming language that has intelligent, interpretive powers”. Although not explicit, this
assumption may affect the process of program learning by leading programmers to transfer their
“common ground” assumptions (Clark, Schreuder & Buttrick, 1983) to a computer that may not share
them (Brennan, 1998). That said, the very occurrence of such transfer mistakes suggests that
novice programmers often resort to their natural language knowledge when learning to code.
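A hypothetical Python example of such a transfer error is sketched below. It is invented for illustration and is not drawn from the cited studies. The first function mirrors the English phrasing “if the day is Saturday or Sunday”, on the implicit assumption that the interpreter will fill in the elided comparison as a human listener would.

```python
def is_weekend_naive(day):
    # Reads like English, but Python parses it as (day == "Saturday") or "Sunday".
    # The non-empty string "Sunday" is always truthy, so the test always succeeds.
    return bool(day == "Saturday" or "Sunday")


def is_weekend(day):
    # The comparison must be stated in full for each alternative.
    return day == "Saturday" or day == "Sunday"


print(is_weekend_naive("Monday"), is_weekend("Monday"))
```

The naive version runs without any error message, which is part of what makes this kind of mistake hard for a novice to detect: the machine does not share the speaker's assumptions, but it does not signal the mismatch either.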
References for the Supplementary Information
Bonar, J., & Soloway, E. (1983). Uncovering principles of novice programming. In Proceedings of the 10th
ACM SIGACT-SIGPLAN symposium on Principles of programming languages (pp. 10-13). ACM.
Brooks, R. (1977). Towards a theory of the cognitive processes in computer programming. International
Journal of Man-Machine Studies, 9(6), 737-751.
Brennan, S. E. (1998). The grounding problem in conversations with and through computers. Social and
cognitive approaches to interpersonal communication, 201-225.
Buse, R. P., Sadowski, C., & Weimer, W. (2011). Benefits and barriers of user evaluation in software
engineering research. ACM SIGPLAN Notices, 46(10), 643-656.
Busjahn, T., Bednarik, R., Begel, A., Crosby, M., Paterson, J. H., Schulte, C., ... & Tamm, S. (2015). Eye
movements in code reading: Relaxing the linear order. In 2015 IEEE 23rd International Conference on
Program Comprehension (pp. 255-265). IEEE.
Chao, J., Feldon, D. F., & Cohoon, J. P. (2018). Dynamic Mental Model Construction: A Knowledge in
Pieces-Based Explanation for Computing Students’ Erratic Performance on Recursion. Journal of the
Learning Sciences, 27(3), 431-473.
Clark, H. H., Schreuder, R., & Buttrick, S. (1983). Common ground at the understanding of demonstrative
reference. Journal of verbal learning and verbal behavior, 22(2), 245-258.
Clements, D. H. (1986). Effects of Logo and CAI environments on cognition and creativity. Journal of
Educational Psychology, 78(4), 309-318.
Clements, D. H. (1995). Teaching creativity with computers. Educational Psychology Review, 7(2), 141-
Dalbey, J., & Linn, M. C. (1985). The demands and requirements of computer programming: A literature
review. Journal of Educational Computing Research, 1(3), 253-274.
Hanenberg, S. (2010). Faith, hope, and love: an essay on software science's neglect of human factors. In
Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages
and Applications (OOPSLA '10). ACM, New York, NY, USA, 933-946.
Hindle, A., Barr, E. T., Su, Z., Gabel, M., & Devanbu, P. (2012). On the naturalness of software. In 2012
34th International Conference on Software Engineering (ICSE) (pp. 837-847). IEEE.
Hoc, J. M., & Nguyen-Xuan, A. (1990). Language semantics, mental models and analogy. In Psychology of
programming (pp. 139-156). Academic Press.
Kaijanaho, A. J. (2015). Evidence-based programming language design: a philosophical and
methodological exploration. Jyväskylä Studies in Computing, (222).
Ko, A. J., LaToza, T. D., & Burnett, M. M. (2015). A practical guide to controlled experiments of software
engineering tools with human participants. Empirical Software Engineering, 20(1), 110-141.
Liao, Y. K. C., & Bright, G. W. (1991). Effects of computer programming on cognitive outcomes: A meta-
analysis. Journal of Educational Computing Research, 7(3), 251-268.
Lye, S. Y., & Koh, J. H. L. (2014). Review on teaching and learning of computational thinking through
programming: What is next for K-12?. Computers in Human Behavior, 41, 51-61.
Ormerod, T. (1990). Human cognition and programming. In Psychology of programming (pp. 63-82).
Academic Press.
Pea, R. D. (1986). Language-independent conceptual “bugs” in novice programming. Journal of
educational computing research, 2(1), 25-36.
Pea, R. D., & Kurland, D. M. (1984). On the cognitive effects of learning computer programming. New
ideas in psychology, 2(2), 137-168.
Pennington, N., & Grabowski, B. (1990). The tasks of programming. In Psychology of programming (pp.
45-62). Academic Press.
Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further
examination. Psychonomic Bulletin & Review, 3(4), 504-509.
Rich, P. J., Bly, N., & Leatham, K. R. (2014). Beyond cognitive increase: investigating the influence of
computer programming on perception and application of mathematical skills. Journal of Computers in
Mathematics and Science Teaching, 33(1), 103-128.
Sime, M. E., Green, T. R. G., & Guest, D. J. (1973). Psychological evaluation of two conditional
constructions used in computer languages. International Journal of Man-Machine Studies, 5(1), 105-113.
Sime, M. E., Green, T. R. G., & Guest, D. J. (1977). Scope marking in computer conditionals: a
psychological evaluation. International Journal of Man-Machine Studies, 9(1), 107-118.
Stefik, A., & Hanenberg, S. (2017). Methodological irregularities in programming-language research.
Computer, 50(8), 60-63.
Stefik, A., & Siebert, S. (2013). An empirical investigation into programming language syntax. ACM
Transactions on Computing Education (TOCE), 13(4), 19.
Tichy, W. F. (1998). Should computer scientists experiment more?. Computer, 31(5), 32-40.
Weinberg, G.M. (1971). The Psychology of Computer Programming. Van Nostrand Reinhold: New York.
References on CS initiatives mentioned in the main body
Balanskat, A., & Engelhardt, K. (2014). Computing our future: Computer programming and coding-
Priorities, school curricula and initiatives across Europe. European Schoolnet.
Bocconi, S., Chioccariello, A., Dettori, G., Ferrari, A., Engelhardt, K., Kampylis, P., & Punie, Y. (2016).
Developing computational thinking in compulsory education. European Commission, JRC Science for
Policy Report.
Burning Glass Technologies (2016). Beyond Point and Click: The Expanding Demand for Coding Skills.
Retrieved at: (Nov 27, 2018)
(2018). 2018 State of Computer Science Education. Retrieved at:
Accessed on 2018-11-27.
Fayer, S., Lacey, A., & Watson, A. (2017). STEM occupations: Past, present, and future. Spotlight on
Livingstone, S. (2012). Critical reflections on the benefits of ICT in education. Oxford review of
education, 38(1), 9-24.
... Software programming is a complex and phylogenetically very recent human activity (100 years old), even more than reading/literacy (which started around 5000 BC) or complex mathematics (3000 BC) [1]. Importantly, in a century where computers dominate, there is an increasing interest in understanding the neural correlates of program comprehension [2]. ...
... In neuroscientific terms, there is the debate [2] if programming requires the expert integration of mathematical and language skills, including logical thinking and symbol manipulation. Programming may require a large set of skills beyond mathematical calculations using numbers and might require integration with the language/reading skills at an abstract level. ...
... These are, to our knowledge, the only studies available in the literature about the neuronal correlates of program understanding. The one from Castelhano et al. uniquely reported functional and effective connectivity, but the amplitude findings of the available articles provide relevant insights on the relative weight of each network in programming tasks [2,30,32,33]. The work from Siegmund et al. used detection of syntax errors as contrast condition to investigate the cognitive process of programming/source-code comprehension. ...
Full-text available
Software programming is a modern activity that poses strong challenges to the human brain. The neural mechanisms that support this novel cognitive faculty are still unknown. On the other hand, reading and calculation abilities represent slightly less recent human activities, in which neural correlates are relatively well understood. We hypothesize that calculus and reading brain networks provide joint underpinnings with distinctly weighted contributions which concern programming tasks, in particular concerning error identification. Based on a meta-analysis of the core regions involved in both reading and math and recent experimental evidence on the neural basis of programming tasks, we provide a theoretical account that integrates the role of these networks in program understanding. In this connectivity-based framework, error-monitoring processing regions in the frontal cortex influence the insula, which is a pivotal hub within the salience network, leading into efficient causal modulation of parietal networks involved in reading and mathematical operations. The core role of the anterior insula and anterior midcingulate cortex is illuminated by their relation to performance in error processing and novelty. The larger similarity that we observed between the networks underlying calculus and programming skills does not exclude a more limited but clear overlap with the reading network, albeit with differences in hemispheric lateralization when compared with prose reading. Future work should further elucidate whether other features of computer program understanding also use distinct weights of phylogenetically “older systems” for this recent human activity, based on the adjusting influence of fronto-insular networks. By unraveling the neural correlates of program understanding and bug detection, this work provides a framework to understand error monitoring in this novel complex faculty.
... High dropout rates in introductory computer programming courses remain a major concern for higher education educators across the world [1,20,22]. Computer programming is a complex and multi-faceted task as it requires not only conceptual and procedural knowledge, but also skills to create, modify and comprehend computer code [13,35,38] in order to solve programming problems. Numerous studies [2,19,34] have identified a lack of problem-solving skills as one of the biggest challenges that novice programmers experienced. ...
... By focusing on Step 1 of the seven-step DtDs framework (identifying places in a course where many students consistently fail to master crucial material), this study aimed to (1) explore the problem-solving strategies utilised by novice programmers during SCC; (2) relate these strategies to Polya's four basic problem-solving steps; and (3) utilise a SWOT analysis of these strategies for the identification of problem-solving bottlenecks experienced by novice programmers. Thematic analysis of data collected by means of asking questions, 13 observations, and artefact analysis revealed that the novice programmers in this study did not necessarily follow a well-defined problem-solving process. We were, however, able to link the specific strategies they employed to all four of Polya's basic problem-solving steps. ...
... Computer programming has been widely studied in computer science education, which can be considered as a sub-branch of science, technology, engineering and mathematics (STEM) education (Fedorenko et al., 2019;Guzdial & Morrison, 2016). Collaborative programming, as a CSCL mode, supports two or more learners working together at one workstation or remotely online to solve the same programming problems (Zheng, 2021). ...
Background: Instructor scaffolding has proven to be an effective means of improving collaborative learning quality, but empirical research reports discrepant findings about the effect of instructor scaffoldings on collaborative programming. Few studies have used multimodal learning analytics (MMLA) to comprehensively analyze collaborative programming from a process-oriented perspective. This research conducts an MMLA study to examine the immediate and delayed effects of instructor scaffoldings on small groups’ collaborative programming in a K-12 education context, with the aim of providing research, analytics, and pedagogical implications. Results: The results indicated that the instructor provided five types of scaffoldings from the social, cognitive, and metacognitive dimensions, and groups had seven types of responses (i.e., immediate uptake and delayed use) to the five instructor scaffoldings, ranging from the low-to-medium and high level of cognitive engagement. After the scaffolding was faded, groups used the content from the high-control cognitive scaffolding frequently to solve problems in a delayed way, but groups did not use the instructor’s scaffolding content from the social and low-control cognitive scaffoldings. From the pedagogical perspective, instructors should consider scaffolding types, group states and characteristics, as well as the timing of scaffolding to better design and facilitate collaborative programming. From an analytical perspective, MMLA proved to be conducive to understanding collaborative learning from social, cognitive, behavioral, and micro-level dimensions, such that instructors can better understand and reflect on the process of collaborative learning, and use scaffoldings more skillfully to support collaborative learning. Conclusions: Collaborative programming is encouraged to be integrated in STEM education to transform education from instructor-directed lecturing to learner-centered learning.
Using MMLA methods, this research provided a deep understanding of the immediate and delayed effects of instructor scaffoldings on small groups’ collaborative programming in K-12 STEM education from a process-oriented perspective. The results showed that various instructor scaffoldings have been used to promote groups’ social and cognitive engagement. Instructor scaffoldings have delayed effects on promoting collaborative programming qualities. It is highly suggested that instructors integrate scaffoldings to facilitate computer programming education and that relevant research apply MMLA to reveal details of the process of collaboration.
... Software programming, and in particular the task of code reviewing, is a complex and relatively recent human activity, involving the integration of mathematical skills, recursive thinking, language processing, and error-monitoring (Fedorenko et al., 2019). The study of these skills from a neuroscientific perspective has received increasing interest. ...
The neural correlates of software programming skills have been the target of an increasing number of studies in the past few years. Those studies focused on error-monitoring during software code inspection. Others have studied task-related cognitive load as measured by distinct neurophysiological measures. Most studies addressed only syntax errors (shallow level of code monitoring). However, a recent functional MRI (fMRI) study suggested a pivotal role of the insula during error-monitoring when challenging deep-level analysis of code inspection was required. This raised the hypothesis that the insula is causally involved in deep error-monitoring. To confirm this hypothesis, we carried out a new fMRI study where participants performed a deep source-code comprehension task that included error-monitoring to detect bugs in the code. The generality of our paradigm was enhanced by comparison with a variety of tasks related to text reading and bugless source-code understanding. Healthy adult programmers (N = 21) participated in this 3T fMRI experiment. The activation maps evoked by error-related events confirmed significant activations in the insula [p(Bonferroni) < 0.05]. Importantly, a posterior-to-anterior causality shift was observed concerning the role of the insula: in the absence of errors, causal directions were mainly bottom-up, whereas, in their presence, strong causal top-down effects from frontal regions, in particular the anterior cingulate cortex, were observed.
... Given these advancements, perhaps task-oriented bots can be equipped with more capabilities to assist humans in cognitively demanding tasks, including programming by professionals or novices. By building models and tools that can generate both language and code, we could potentially better understand the cognitive basis of programming which can have key impacts on computer science education practices (Fedorenko et al., 2019). This survey is structured as follows: Section §2 explores general deep learning techniques that have been used to model language and source code over the last 35 years. ...
In this survey paper, we overview major deep learning methods used in Natural Language Processing (NLP) and source code over the last 35 years. Next, we present a survey of the applications of Artificial Intelligence (AI) for source code, also known as Code Intelligence (CI) and Programming Language Processing (PLP). We survey over 287 publications and present a software-engineering centered taxonomy for CI placing each of the works into one category describing how it best assists the software development cycle. Then, we overview the field of conversational assistants and their applications in software engineering and education. Lastly, we highlight research opportunities at the intersection of AI for code and conversational assistants and provide future directions for researching conversational assistants with CI capabilities.
... There are significant differences between using programming languages and natural languages for expressing ourselves (Fedorenko et al., 2019). CAL doesn't ignore these. ...
This paper describes a pedagogical approach, Coding as Another Language (CAL), to teach programming and computational thinking in early childhood. The CAL curriculum connects powerful ideas from the discipline of computer science with ideas from literacy in a way that is developmentally appropriate for children 4-8 years of age. CAL is free and can be used with two widely available programming environments for young children: the free on-screen ScratchJr app and the KIBO robotics kit, which doesn't require keyboards or screens. Through 24 lessons centered on books, CAL emphasizes creative play and self-expression by positioning the learning of programming as the mastering of a new symbolic language. In addition, CAL provides opportunities for socio-emotional development in the context of a collaborative play-based learning environment, a coding playground, in which there is purposeful exploration of ethical and moral values and intentional promotion of positive behaviors and character strengths.
There is growing interest in teaching computer science and programming skills in schools. Here we investigated the efficacy of peer tutoring, which is known to be a useful educational resource in other domains but never before has been examined in such a core aspect of applied logical thinking in children. We compared (a) how children (N = 42, age range = 7 years 1 month to 8 years 4 months) learn computer programming from an adult versus learning from a peer and (b) the effect of teaching a peer versus simply revising what has been learned. Our results indicate that children taught by a peer showed comparable overall performance—a combination of accuracy and response times—to their classmates taught by an adult. However, there was a speed–accuracy trade-off, and peer-taught children showed more exploratory behavior, with shorter response times at the expense of lower accuracy. In contrast, no tutor effects (i.e., resulting from teaching a peer) were found. Thus, our results provide empirical evidence in support of peer tutoring as a way to help teach computer programming to children. This could contribute to the promotion of a widespread understanding of how computers operate and how to shape them, which is essential to our values of democracy, plurality, and freedom.
Recent advances in artificial intelligence have brought attention to computational thinking (CT) in school education worldwide. However, little is known about the development of the literacy of CT in children, mainly because of the lack of proper psychometric assessments. We developed the first psychometrically validated assessment on the literacy of CT of children in Chinese elementary schools, coined as the Computational Thinking Assessment for Chinese Elementary Students (CTA-CES). Items were constructed to reflect key aspects of CT such as abstraction, algorithm thinking, decomposition, evaluation, and pattern recognition. To examine the test reliability and validity, we recruited two samples of 280 third- to sixth-grade students in total. Cronbach’s alpha provided evidence for the reliability of the test scores, item response theory analyses demonstrated psychometric appropriateness, whereas construct validity was verified by convergent validity, and criterion-related validity was confirmed by correlations between the CTA-CES and measures related to CT, namely reasoning, spatial ability, and verbal ability. In addition, an fMRI study further demonstrated similar neural activation patterns when students conducted the CTA-CES and programming tasks. Taken together, the CTA-CES is the first reliable and valid instrument for measuring the literacy of CT for Chinese children, and may be applicable to children worldwide.
Computer programming is becoming an essential skill in the 21st century, and in order to best prepare future generations, the promotion of computational thinking and literacy must begin in early childhood education. Computational thinking can be defined in many ways. The broad definition offered in this chapter is that computational thinking practices refer to techniques applied by humans to express themselves by designing and constructing computation. This chapter claims that one of the fundamental ways in which computational thinking can be supported and augmented is by providing children with opportunities to code and to create their own interactive computational media. Thus, computational literacy will allow children to become producers and not only consumers of digital artifacts and systems.
Conference Paper
It has been well documented that a large portion of the cost of any software lies in the time spent by developers in understanding a program’s source code before any changes can be undertaken. One of the main contributors to software comprehension, by subsequent developers or by the authors themselves, has to do with the quality of the lexicon (i.e., the identifiers and comments) that is used by developers to embed domain concepts and to communicate with their teammates. In fact, previous research shows that there is a positive correlation between the quality of identifiers and the quality of a software project. Results suggest that a poor-quality lexicon impairs program comprehension and consequently increases the effort that developers must spend to maintain the software. However, we do not yet have any empirical evidence of the relationship between the quality of the lexicon and the cognitive load that developers experience in trying to understand a program. Given the associated costs, there is a critical need to empirically characterize the impact of the quality of the lexicon on developers’ ability to comprehend a program. In this study, we explore the effect of poor source code lexicon and readability on developers’ cognitive load as measured using a cutting-edge and minimally invasive functional brain imaging technique called functional Near Infrared Spectroscopy (fNIRS). Additionally, we map cognitive load data to identifiers in the source code using an eye tracking device while developers perform software comprehension tasks. Our results show that the presence of linguistic antipatterns in source code significantly increases the developers’ cognitive load.
The Knowledge in Pieces (KiP) framework offers a powerful perspective to understand the dynamic nature of knowledge and knowledge development. However, there is still much we do not know about the mechanisms underlying the moment-to-moment knowledge construction process. The purpose of this study was to explore the ways in which contextual knowledge elements support that process. According to KiP, a knowledge element, once activated, becomes part of the context that shapes the activation, configuration, and operation of other elements. Based on this assumption, we hypothesized that when the structural and operational characteristics of these contextual knowledge elements are compatible with those of the target concept, the contextual knowledge elements can support the knowledge construction process. This hypothesis was investigated in the domain of recursion, a fundamental but challenging programming concept. Sixty undergraduate computer science students completed four recursion evaluation tasks with varying likelihood to induce compatible contextual elements. Results showed that students performed better on high-compatibility tasks than on low-compatibility tasks at both the group level and individual level. Further qualitative analysis identified the knowledge elements involved and elaborated and refined the explanatory model for the contextual support effect.
In the past decade, Computational Thinking (CT) and related concepts (e.g. coding, programming, algorithmic thinking) have received increasing attention in the educational field. This has given rise to a large amount of academic and grey literature, and also numerous public and private implementation initiatives. Despite this widespread interest, successful CT integration in compulsory education still faces unresolved issues and challenges. This report provides a comprehensive overview of CT skills for schoolchildren, encompassing recent research findings and initiatives at grassroots and policy levels. It also offers a better understanding of the core concepts and attributes of CT and its potential for compulsory education. The study adopts a mostly qualitative approach that comprises extensive desk research, a survey of Ministries of Education and semi-structured interviews, which provide insights from experts, practitioners and policy makers. The report discusses the most significant CT developments for compulsory education in Europe and provides a comprehensive synthesis of evidence, including implications for policy and practice.
Conference Paper
People from nearly every country are now learning computer programming, yet the majority of programming languages, libraries, documentation, and instructional materials are in English. What barriers do non-native English speakers face when learning from English-based resources? What desires do they have for improving instructional materials? We investigate these questions by deploying a survey to a programming education website and analyzing 840 responses spanning 86 countries and 74 native languages. We found that non-native English speakers faced barriers with reading instructional materials, technical communication, reading and writing code, and simultaneously learning English and programming. They wanted instructional materials to use simplified English without culturally-specific slang, to use more visuals and multimedia, to use more culturally-agnostic code examples, and to embed inline dictionaries. Programming also motivated some to learn English better and helped clarify logical thinking about natural languages. Based on these findings, we recommend learner-centered design improvements to programming-related instructional resources and tools to make them more accessible to people around the world.
Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.
Substantial industry and government investments in software are at risk due to changes in the underlying programming languages, despite the fact that such changes have no empirically verified benefits. One way to address this problem is to establish rigorous evidence standards like those in medicine and other sciences.
Seeking to make computing education as available as mathematics or science education.
Computer programming and other design tasks have often been characterized as a set of non-interacting subtasks. In principle, it may be possible to separate these subtasks, but in practice there are substantial interactions between them. We argue that this is a fundamental feature of programming deriving from the cognitive characteristics of the subtasks, the high uncertainty in programming environments, and the social nature of the environments in which complex software development takes place.