Bug Pattern Analysis of Codes Produced By Beginner Programmers Using Association Rule Mining Techniques

UIJSLICTR Vol. 4 No. 1 March 2020 10
University of Ibadan
Journal of Science and Logics in ICT Research
Kazeem, O. N. Abiola, O. A. Akinola, S. O.
nasirudeenkazeem@yahoo.com oladimejiarowolo@yahoo.co.uk solom202@yahoo.co.uk
University of Ibadan, Ibadan, Nigeria
Abstract
Bugs are errors in computer programs that produce unexpected results or cause programs to behave in unintended ways.
Bug patterns are erroneous coding practices that arise mainly from poor programming design patterns and
misunderstanding of language features. Beginner programmers often approach programming with fear, in the form of
low self-confidence, a low level of comfort and a high level of anxiety. They find programming
boring and spend many hours debugging a single bug in a program. To discover the most prevalent and
detectable bug patterns, we propose an association rule mining technique to predict and analyze the bugs commonly
incurred by young programmers. The study generated a set of rules that was used to predict the
bug patterns in beginner programmers' codes. Further statistical analysis shows that, for
prospective programmers, semantic, syntax and logic errors, in that order, are the bugs most
commonly incurred. As the students progress in their studies, the order reverses to
logic, syntax and semantic errors. Our findings can improve both
the learning process of students and the tutoring process of programming tutors if they are included in
the learning curriculum. The results of this project will not only create awareness among novice programmers of the
bugs commonly incurred, but also serve as a milestone for software professionals.
Keywords: Computer Programming, Bug Patterns, Data Mining, Association Rule Mining
1. INTRODUCTION
Software defect detection is an important
factor in the software development lifecycle
because it helps improve the
overall quality of software outputs.
A study commissioned by the US Department of
Commerce concluded that “Software bugs are so
prevalent and so detrimental that it cost the US
economy an estimated $59 Billion annually or
about 0.6 percent of the Gross Domestic
Product” [1].
Most bugs arise from mistakes made
by programmers in a program's source
code, design or analysis; a few are caused
by compilers producing incorrect code.
Software engineering as a discipline still does not
have general consensus on which kinds of
software bugs are most common and whether bug
types have similar frequency distributions across
multiple systems. The reason is not a deficit of
research, but a lack of uniformity [2]. A single
loophole left in program code can provide
an entry point for hackers, who can exploit the
vulnerability and put computer security at risk.
Previous research has shown that Computer
Science students at lower levels in Universities,
Polytechnics and Colleges of Education show
little or no interest in programming because of
the multitude of bugs that emerge during program
execution. To create a better
understanding of the bugs that young programmers
are likely to produce, this study focuses on
determining bug patterns in beginner
programmers' codes, which will enable both
tutors and students to predict bug patterns and
guard against them in a program ahead of the tutoring
period.
Kazeem, O. N., Abiola, O. A. and Akinola, S. O. (2020).
Bug Pattern Analysis of Codes Produced By Beginner
Programmers Using Association Rule Mining Techniques
University of Ibadan Journal of Science and Logics in ICT
Research (UIJSLICTR), Vol. 4 No. 1, pp. 10 - 24
© UIJSLICTR Vol. 4, No. 1, March 2020
To the best of our knowledge, previous
studies have not determined the pattern
of bugs that emanate from programs written by young
programmers. This study focuses on the analysis of
bugs in young programmers' codes with the aim
of encouraging novice programmers to develop
more interest in programming and to stop
perceiving programming as very boring and
difficult. The study also provides guidance for
software inspectors who review software
defects, which is an
important tool for software quality assurance.
1.1 Bugs in Programming
No programmer can be one
hundred percent perfect in programming; errors
are bound to occur, and getting rid of them is
necessary in order to obtain the required outputs. The
following common errors exist in programming
[3]:
1. Syntax error: This occurs when the
grammatical rules of the programming
language being used are not followed, for
instance when a keyword is misspelt or a
semicolon is omitted where one is
required.
2. Semantic error: Semantics has to do
with attaching meaning to program
statements. These errors occur when the
compiler cannot attach meaning to a
program statement, for instance when
a floating-point value is assigned to an
integer location or a value is divided
by zero.
3. Logic error: This error lies in the
programmer's reasoning. The program
compiles and runs, but the output
from it is erroneous, for instance when
the relational operator <= is supposed to be
used in an expression but >= or < is used.
4. Compile-time errors: These are the
errors reported during the compilation
stage of program development. They
can be syntax or semantic errors.
5. Run-time errors: These are errors
reported during the execution stage of
program development.
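The error categories above can be illustrated with a short Java sketch. The class name `ErrorKinds` and its methods are hypothetical and chosen only for illustration. The compile-time errors appear as comments because a file containing them would not compile. Division by zero, given above as an example of a semantic error, surfaces in Java only when the program runs, so it is shown here as a run-time error.

```java
// Hypothetical illustration of the error categories; not taken from the study.
class ErrorKinds {

    // Syntax error (compile-time): the statement below lacks its semicolon,
    // so javac reports "';' expected".
    //     int total = 0

    // Semantic error (compile-time): a floating-point value is assigned to an
    // integer location, so javac reports "incompatible types".
    //     int average = 7.5;

    // Logic error: compiles and runs, but '<' was used where '<=' was
    // intended, so n itself is never added to the sum.
    static int sumUpToBuggy(int n) {
        int sum = 0;
        for (int i = 1; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Corrected version using '<='.
    static int sumUpTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Run-time error: integer division by zero compiles, but throws
    // ArithmeticException during execution.
    static String divide(int a, int b) {
        try {
            return String.valueOf(a / b);
        } catch (ArithmeticException e) {
            return "run-time error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumUpToBuggy(5)); // 10 (wrong: 5 is left out)
        System.out.println(sumUpTo(5));      // 15
        System.out.println(divide(1, 0));    // run-time error: / by zero
    }
}
```

Only the logic error goes unreported by the compiler and the runtime, so it has to be found by comparing the output with the expected result.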
1.2 Debugging Errors in Programming
Debugging is the process of correcting errors in
programs. This can be achieved in the following ways:
• Testing during the software
development life cycle helps to
correct errors before full-scale
deployment of the software program.
• Pre-planning and care during the
program coding phase help to avoid
errors.
• Practice, discipline and
rigorous debugging
procedures help to correct errors
in programming; most errors can
be rectified during the software
development phases.
Making mistakes in programming is part of
learning, and mistakes can never be entirely avoided.
However, it is suggested that a programmer
should avoid repeating earlier mistakes, so that
any mistakes made are new ones [4].
1.3 Data Mining
Data mining is a step-by-step process of “mining”,
or extracting, valuable information from a large
data set (i.e. big data) in order to perform concrete analysis
and predict future trends [4]. Many people treat
data mining as a synonym for another popularly
used term, Knowledge Discovery from Data
(KDD), while others view data mining as merely
an essential step in the process of knowledge
discovery. The knowledge discovery process is
an iterative sequence of the following steps [5]:
1. Data cleaning (to remove noise and
inconsistent data)
2. Data integration (where multiple data
sources may be combined)
3. Data selection (where data relevant to
the analysis task are retrieved from
the database)
4. Data transformation (where data are
transformed and consolidated into
forms appropriate for mining by
performing summary or aggregation
operations)
5. Data mining (an essential process
where intelligent methods are applied
to extract data patterns)
6. Pattern evaluation (to identify the
truly interesting patterns representing
knowledge based on interestingness
measures)
7. Knowledge representation (where
visualization and knowledge
representation techniques are used to
present mined knowledge to users).
Data mining is the process of discovering
interesting patterns and knowledge from large
amounts of data. The data sources can include
databases, data warehouses, the Web, other
information repositories, or data that are streamed
into the system dynamically [6].
1.4 Data Mining Techniques
Several data mining, query model,
processing model and data collection techniques
are available. Any of them can be used to mine data,
typically in combination with existing software
and installations on which data mining and analytics
solutions can be built [6].
In this study, relationship mining was
used to identify relationships among the bugs generated in
the codes of young programmers and to diagnose
the patterns in order to find errors that frequently
occur together [6].
The goal of relationship
mining is to identify relationships between
variables and, normally, to encode them in rules
for later use. There are different types of
relationship mining techniques, such as
association rule mining (any relationship between
variables), sequential pattern mining (temporal
associations between variables), correlation
mining (linear correlations between variables),
and causal data mining (causal relationships
between variables).
1.5 Association Rule Technique
Itemset: A collection of one or more items; an itemset
containing k items is called a k-itemset.
Support count (σ): Frequency of occurrence of an
itemset.
Support (s): Fraction of transactions that contain
an itemset.
Frequent itemset: An itemset whose support is
greater than or equal to a minsup (minimum
support) threshold.
Given a dataset D of transactions, the goal of association rule
mining is to find all rules having:
support ≥ minsup threshold
confidence ≥ minconf threshold
An association rule is an expression of the
form X → Y, where X and Y are itemsets.
Rule Evaluation Metrics
Support (s) measures the fraction of transactions
that contain both X and Y. For the rule {Level, Group} → {Bug pattern}:
$$ s = \frac{\sigma(\{\text{Level}, \text{Group}, \text{Bug pattern}\})}{|D|} $$
Confidence (c) measures how often the items in Y
appear in transactions that contain X:
$$ c = \frac{\sigma(\{\text{Level}, \text{Group}, \text{Bug pattern}\})}{\sigma(\{\text{Level}, \text{Group}\})} $$
Here |D| is the number of transactions in D.
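The two metrics can be made concrete with a minimal Java sketch. The class `RuleMetrics`, the `attribute=value` item encoding and the four sample transactions are assumptions made for illustration; they are not the study's data or tooling.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the support and confidence metrics for a rule X -> Y.
class RuleMetrics {

    // Hypothetical transactions; these are NOT the study's data.
    static List<Set<String>> sampleTransactions() {
        return List.of(
            Set.of("Level=200", "Group=solo", "Bug=syntax"),
            Set.of("Level=200", "Group=solo", "Bug=semantic"),
            Set.of("Level=200", "Group=pair", "Bug=semantic"),
            Set.of("Level=300", "Group=pair", "Bug=logic"));
    }

    // Support count sigma: number of transactions containing every item.
    static long supportCount(List<Set<String>> d, Set<String> itemset) {
        return d.stream().filter(t -> t.containsAll(itemset)).count();
    }

    // s = sigma(X union Y) / |D|
    static double support(List<Set<String>> d, Set<String> x, Set<String> y) {
        return (double) supportCount(d, union(x, y)) / d.size();
    }

    // c = sigma(X union Y) / sigma(X); returns 0 when X never occurs.
    static double confidence(List<Set<String>> d, Set<String> x, Set<String> y) {
        long countX = supportCount(d, x);
        return countX == 0 ? 0.0 : (double) supportCount(d, union(x, y)) / countX;
    }

    private static Set<String> union(Set<String> x, Set<String> y) {
        Set<String> all = new HashSet<>(x);
        all.addAll(y);
        return all;
    }

    public static void main(String[] args) {
        List<Set<String>> d = sampleTransactions();
        Set<String> x = Set.of("Level=200", "Group=solo");
        Set<String> y = Set.of("Bug=syntax");
        System.out.println("support = " + support(d, x, y));       // 0.25
        System.out.println("confidence = " + confidence(d, x, y)); // 0.5
    }
}
```

In this toy dataset one of the four transactions contains all three items, giving a support of 0.25, and two transactions contain the premise, giving a confidence of 0.5.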
2. RELATED WORKS
Dhyan and Saurabh [7] published an influential
paper on software bug detection using
data mining. The authors argued that software
bugs cannot easily be detected by
software engineers without the help of data
mining classification. The work classified and
detected software bugs using the J48, ID3 and Naïve
Bayes data mining algorithms, and compared
the algorithms in terms of detection accuracy and the time
taken to build the model.
Yuan et al. [8] studied software defect
detection with ROCUS. The authors aimed to
identify defective software
modules automatically, for efficient software testing, in order to
improve the quality of a software system. The
learning algorithm, named ROCUS,
was used to predict the defects that
may be contained in a given software module.
The method exploits the abundant unlabeled
examples to improve detection accuracy and
employs under-sampling to tackle the
class-imbalance problem in the learning process.
Experimental results on real-world software
defect detection tasks show that ROCUS is
effective for software defect prediction. Its
performance is better than that of a semi-supervised
learning method that ignores
the class-imbalance nature of the task, and better than that of a
class-imbalance learning method that does not make
effective use of unlabeled data. Adopting
ROCUS for other software engineering tasks, or
even beyond software engineering, where the data
distributions are essentially imbalanced and the
labels are difficult to obtain, is an
interesting direction for future research.
Deepak and Shukla [9] focused on developing a defect
prediction mechanism using the
software metric data of KC1. A subtractive
clustering approach was used to generate a Fuzzy
Inference System (FIS). The FIS rules were
generated at different radii of influence of the input
attribute vectors, and the resulting rules were
further modified by the Adaptive Neuro Fuzzy
Inference System (ANFIS) technique to
predict the number of defects in a software
project using a fuzzy logic system. ANFIS
was used to
predict the defective modules in a software
system prior to project deployment, a
crucial activity since it leads to a decrease in the
total cost of the project and an increase in the overall
project success rate. The study did not use
NASA's Metrics Data Program (MDP) data
containing software metric data and error data at
the function/method level.
Promila and Rajiv [10] sought to enhance bug
detection using data mining techniques. Their approach
clusters code with the K-Means algorithm and predicts
bugs using a decision tree combined with K-Means. The
work provides an overview
of some important data mining techniques and
their applicability to large databases. The authors
sought to maintain the sustainability of code in
object-oriented languages, to help
testers classify the number of bugs present in
source code, to improve the response/execution
time of the proposed algorithm using a decision tree, and to
attain an accuracy that supports the future growth
of the testing phase of various IT applications and
smartphone applications. As future work, the
study suggests the use of a better clustering
algorithm and the analysis of more languages,
such as .NET, C++ and Python.
Mohsin et al. [11] proposed a software defect
detection model that can be used to identify faulty
components in big software metric data. The
proposed approach identifies
significant metrics using a combination of
different filter and wrapper techniques.
Its contribution is the
design and evaluation of a parallel framework for a
hybrid software defect predictor that can deal
with big software metric data in a
computationally efficient way in a cloud
environment. Two different hybrids were
developed using Fisher and Maximum Relevance
(MR) filters with an Artificial Neural Network
(ANN) based wrapper in the parallel framework.
Previous work by the same authors combined
the filter and wrapper approaches in a
hybrid classification model to take
advantage of both approaches; it was evaluated
on multivariate process monitoring
and the detection of sources of out-of-control signals
in manufacturing systems. However, that previous
work was limited to a single filter and a single
search strategy. The study did not extend the
proposed approach to object-oriented metrics.
Moreover, more filters and wrappers could be
integrated to analyze their combined
performance.
Saiqa et al. [12] used predictive models
constructed with machine learning
approaches to classify software modules as defective or
non-defective. Machine learning
techniques help developers to retrieve useful
information after the classification and enable
them to analyze data from different perspectives,
and they have proven
useful for software bug prediction. The
authors used publicly available data sets of
software modules and provided a comparative
performance analysis of different machine
learning techniques for software bug prediction.
The results showed that most of the machine
learning methods performed well on software
bug datasets.
3. METHODOLOGY
Figure 1 shows the architecture of the proposed system.
Figure 1: Architecture of Proposed System
3.1 Participants
The subjects of this project were 200 and
300 Level students of the Department of Computer
Science, University of Ibadan, Ibadan,
Nigeria, in the 2017/2018 practical classes.
3.2 Experimental Design
Students were tested solo and in groups in a Java
programming class. Six different programming
tasks were given to students working solo and in pair
groups. The students were instructed to solve each
problem by writing code in the Java
programming language. Two hours were given to
each group to write the code, after which the students
were instructed to compile the program. The bugs
generated were captured and saved in Microsoft
Word for analysis.
The experiments involved 325 groups,
501 beginner programmers and 2,892 bugs
in total. Experiments 1 to 4 were
conducted with 200 Level students, categorized
as Novice Programmers, while Experiments 5 and
6 were conducted with 300 Level students,
categorized as Experienced Programmers. The
three types of bugs incurred by the participants were
syntax, semantic and logic errors.
3.3 Data Gathering Process
Computer Program Codes produced from the Six
(6) different programming tasks and bugs
incurred by students as reported by compilers
were gathered for analysis.
3.4 Procedures Used for Analysis in
RapidMiner
The data obtained from the analysis were loaded
into the RapidMiner software. These data were
used to predict the bugs in the experiments, and
association rules were obtained.
Steps
1. Launch the RapidMiner software after
installation. It displays the welcome (home)
page shown in Figure 2.
2. Choose New Process to start a new
process.
(Figure 1 box labels: Programming Task; Programming Compilation; Bugs Gathering; Bugs Analysis; Pattern Analysis using association rule mining techniques; Experiment Analysis; Results Evaluation)
Figure 2: Home perspective Interface
3. Load data into the data repository. Figure 3 shows the process of loading data from
Microsoft Excel into RapidMiner.
Figure 3: Home perspective Interface
4. Data Discretization
Association rule mining requires data
in binominal (binary) form.
The bug pattern attribute, which was not in
binominal format, was discretized by coding syntax,
semantic and logic errors as 1, 2 and 3
respectively. Click and drag the required
operators into the design perspective. Figure 4
shows the operators used for processing, as follows:
Figure 4: Data processing in design perspective
• Data Retrieve Operator reads and loads
data from the data repository.
• Select Attributes Operator lets one
select the required attributes.
• Numerical to Binominal Operator changes
the numeric attributes to binominal attributes,
that is, a binary form of 0 and 1.
• FP (Frequent Pattern) Growth
Operator calculates all frequent itemsets
from the given ExampleSet using the FP-tree
data structure.
• Create Association Rules Operator
generates a set of association rules from
the given set of frequent itemsets.
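The operator chain above (binominal encoding, frequent itemsets, then rules) can be sketched outside RapidMiner. The Java sketch below is only an illustration under stated assumptions: it swaps FP-Growth for a brute-force enumeration of candidate itemsets, which is practical only for a handful of distinct items, and the class `MiniRuleMiner`, its methods and the sample records are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Brute-force sketch of "frequent itemsets, then rules". NOT FP-Growth.
class MiniRuleMiner {

    // A mined rule premise -> conclusion with its support and confidence.
    static final class Rule {
        final Set<String> premise;
        final String conclusion;
        final double support;
        final double confidence;

        Rule(Set<String> premise, String conclusion, double support, double confidence) {
            this.premise = premise;
            this.conclusion = conclusion;
            this.support = support;
            this.confidence = confidence;
        }
    }

    // Binominal encoding: each attribute=value pair becomes an item that is
    // either present (1) or absent (0) in a transaction.
    static Set<String> encode(String level, String group, String bug) {
        return new TreeSet<>(List.of("Level=" + level, "Group=" + group, "Bug=" + bug));
    }

    // Hypothetical records; these are NOT the study's data.
    static List<Set<String>> sample() {
        return List.of(encode("200", "solo", "syntax"), encode("200", "solo", "semantic"),
                       encode("200", "pair", "semantic"), encode("300", "pair", "logic"));
    }

    // Enumerate every non-empty itemset and keep those with support >= minSup.
    static Map<Set<String>, Double> frequentItemsets(List<Set<String>> d, double minSup) {
        TreeSet<String> universe = new TreeSet<>();
        d.forEach(universe::addAll);
        List<String> items = new ArrayList<>(universe);
        Map<Set<String>, Double> frequent = new LinkedHashMap<>();
        for (int mask = 1; mask < (1 << items.size()); mask++) {
            Set<String> candidate = new TreeSet<>();
            for (int i = 0; i < items.size(); i++) {
                if ((mask & (1 << i)) != 0) candidate.add(items.get(i));
            }
            long count = d.stream().filter(t -> t.containsAll(candidate)).count();
            double support = (double) count / d.size();
            if (support >= minSup) frequent.put(candidate, support);
        }
        return frequent;
    }

    // Emit single-conclusion rules whose confidence is at least minConf.
    static List<Rule> rules(List<Set<String>> d, double minSup, double minConf) {
        Map<Set<String>, Double> frequent = frequentItemsets(d, minSup);
        List<Rule> out = new ArrayList<>();
        for (Map.Entry<Set<String>, Double> e : frequent.entrySet()) {
            if (e.getKey().size() < 2) continue;
            for (String y : e.getKey()) {
                Set<String> premise = new TreeSet<>(e.getKey());
                premise.remove(y);
                // Every subset of a frequent itemset is frequent, so the lookup succeeds.
                double confidence = e.getValue() / frequent.get(premise);
                if (confidence >= minConf) out.add(new Rule(premise, y, e.getValue(), confidence));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Rule r : rules(sample(), 0.5, 0.8)) {
            System.out.println(r.premise + " -> " + r.conclusion
                + " (support " + r.support + ", confidence " + r.confidence + ")");
        }
    }
}
```

FP-Growth reaches the same frequent itemsets without enumerating every candidate, which is why it is preferred for real datasets.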
3.5 Analysis and Classification of Bugs
A Java program was written to classify the
errors into three bug types: syntax, semantic and
logic errors. The output of the program was
saved in Microsoft Excel.
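The authors' classification program is not reproduced in the paper. The following is a hedged sketch of how such a classifier might look, assuming simple keyword matching on the reported message. The class `BugClassifier` is hypothetical, and its keyword-to-category pairs are taken from labelled rows in Tables 1 to 6 of Section 4.1 rather than from the authors' code.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical keyword-based classifier; not the authors' program.
class BugClassifier {

    // Keyword -> category pairs as labelled in Tables 1 to 6.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("cannot find symbol", "syntax error");
        RULES.put("';' expected", "syntax error");
        RULES.put("unclosed string literal", "syntax error");
        RULES.put("is already defined", "syntax error");
        RULES.put("not a statement", "semantic error");
        RULES.put("class, interface, or enum expected", "semantic error");
        RULES.put("incompatible types", "semantic error");
        RULES.put("duplicate case label", "semantic error");
    }

    // Returns the category of the first matching keyword, else "unclassified".
    static String classify(String message) {
        String normalized = message.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (normalized.contains(rule.getKey())) {
                return rule.getValue();
            }
        }
        return "unclassified";
    }

    public static void main(String[] args) {
        System.out.println(classify("error: cannot find symbol")); // syntax error
        System.out.println(classify("error: not a statement"));    // semantic error
        System.out.println(classify("output was not arranged"));   // unclassified
    }
}
```

Logic errors produce no compiler message, so a keyword approach of this kind cannot detect them. In Table 2 the logic errors appear as the students' own descriptions of what went wrong.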
4. RESULT PRESENTATION AND
DISCUSSION
Out of the 11 experiments conducted, results are
presented for the six in which meaningful and useful
results were obtained from the participants.
4.1 Results of the Analysis and
Classification of Bugs as Reported by the
Java Program
Table 1: First Experiment (with 200 Level
Students):
The table shows the data analysis of the first
experiment. The study groups included 29 and 46
programmers per group; total bugs reported was
209.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 1 | 7 | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| 2 | 1 | 2 | cannot find symbol | syntax error |
Table 2: Second Experiment (with 200 Level Students):
The table shows the data analysis of the second experiment. The study groups included 31 and 60
programmers per group; total bugs reported was 298.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 2 | 2 | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| 2 | 2 | 6 | did not convert the string to array | semantic error |
| | | | We used a wrong variable name, using strdivision instead of strdivision | semantic error |
| | | | The fact that we have not converted the string to an array impacted on the lines | semantic error |
| | | | had logical errors while swapping the two string arrays as a result of mistake in the array indexing. | logic error |
| | | | We printed the reversed array in the order newAnewB instead of newBnewA | logic error |
| | | | Our output was not properly arranged, the proper print statements and \n escape character got that sorted | logic error |
Table 3: Third Experiment (with 200 Level Students):
The table shows the data analysis of the third experiment. The study group included 44 and 58
programmers per group; total bugs reported was 568.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 1 | 3 | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| | | | cannot find symbol | syntax error |
| 2 | 2 | 33 | <identifier> expected | syntax error |
| | | | '{' expected | syntax error |
| | | | <identifier> expected | syntax error |
| | | | invalid method declaration; return type required | semantic error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | ';' expected | syntax error |
| | | | ';' expected | syntax error |
| | | | ';' expected | syntax error |
| | | | <identifier> expected | semantic error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | ';' expected | syntax error |
| | | | ';' expected | syntax error |
| | | | ';' expected | syntax error |
Table 4: Fourth Experiment (with 200 Level Students):
The table shows the data analysis of the fourth experiment. The study group included 54 and 55
programmers per group; total bugs reported was 627.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 1 | 1 | '[' expected | syntax error |
| 2 | 1 | 16 | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | class, interface, or enum expected | semantic error |
| | | | illegal escape character | semantic error |
| | | | 'try' without 'catch' or 'finally' | semantic error |
| | | | class, interface, or enum expected | semantic error |
Table 5: Fifth Experiment (with 300 Level Students):
The table shows the data analysis of the fifth experiment. The study group included 17 and 61
programmers per group; total bugs reported was 152.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 3 | 3 | '[' expected | syntax error |
| | | | '[' expected | syntax error |
| | | | '[' expected | syntax error |
| 2 | 4 | 10 | invalid method declaration; return type required | semantic error |
| | | | 'else' without 'if' | syntax error |
| | | | ')' expected | syntax error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | not a statement | semantic error |
| | | | cannot be applied to given types; | semantic error |
| | | | method discount1 in class practical5 cannot be applied to given types; | semantic error |
| | | | reason: actual and formal argument lists differ in length | semantic error |
| 3 | 1 | 10 | unclosed string literal | syntax error |
| | | | ';' expected | syntax error |
| | | | unclosed string literal | syntax error |
Table 6: Sixth Experiment (with 300 Level Students):
The table shows the data analysis of the sixth experiment. The study group included 24 and 43
programmers; total bugs reported was 71.
| Group No | Number of students in the Group | Number of bugs | Bug Reported | Bug Type |
|---|---|---|---|---|
| 1 | 2 | 6 | variable a is already defined in method main | syntax error |
| | | | variable b is already defined in method main | syntax error |
| | | | variable c is already defined in method main | syntax error |
| | | | variable d is already defined in method main | syntax error |
| | | | incompatible types: possible lossy conversion from double to float | semantic error |
| | | | incompatible types: possible lossy conversion from double to float | semantic error |
| 2 | 2 | 2 | cannot find symbol switch(option){ | syntax error |
| | | | illegal start of type switch(option){ | semantic error |
| 3 | 1 | 1 | ';' expected float ins location: class new | syntax error |
| 4 | 1 | 1 | symbol: variable JOptionPane | syntax error |
| 5 | 1 | 1 | ; expected | syntax error |
| 6 | 2 | 2 | ; expected | syntax error |
| | | | duplicate case label | semantic error |
| 7 | 1 | 1 | could not find or load main class | semantic error |
4.2 Results of Association Rule Tasks
4.2.1 Sample Association Rules Generated
For the First Experiment
The Text view in Figure 5 shows the
rules that can occur based on the relationships between
one attribute and the others. The most commonly
occurring attribute is Level. The number of students in
the group determines the bug patterns generated.
The graphical view of the rules is shown in Figure
6.
Figure 5: Text view of the first experiment
Figure 6: Graph view of the first experiment result
4.2.2 Sample Association Rules Generated
For the Second Experiment
The Text view in Figure 7 shows the
rules that can occur based on the relationships between
one attribute and the others. The most commonly
occurring attribute is Number of Students in the Group.
The number of students in the group determines
the bug patterns generated. The graphical view
of the rules is shown in Figure 8.
Figure 7: Text view of the second experiment
Figure 8: Graphical view of the second Experiment result
Text and Graphical Views were generated for the
other experiments. From the experiments, it was
deduced that Number of Students per Group and
Number of Bugs Reported determine the bug
patterns generated.
4.3 Further Statistical Analysis with
RapidMiner
Table 7 summarizes the further statistical
results obtained from the experiments with the RapidMiner
software.
Table 8 shows the mean percentage of each bug pattern
incurred by the student participants, while Figure
9 shows a chart of the distribution.
Table 7: Summary of further statistical results
| Experiment No. | Level | Prevalence of Errors Incurred |
|---|---|---|
| 1 | 200 | Syntax errors (59.52%) were more prevalent in the programming codes of the students, followed by semantic errors (39.88%). Very few logic errors (0.59%) were recorded. |
| 2 | 200 | Semantic errors (49.82%) were more prevalent in the programming codes of the students, followed by syntax errors (49.15%). Few logic errors (1.02%) were recorded. |
| 3 | 200 | Syntax errors (52.47%) were more prevalent in the programming codes of the students, followed by semantic errors (47.53%). No logic errors were recorded. |
| 4 | 200 | Semantic errors (78.33%) were more prevalent in the programming codes of the students, followed by syntax errors (21.67%). No logic errors were recorded. |
| 5 | 300 | Logic errors (51.86%) were more prevalent in the programming codes of the students, followed by syntax errors (32.28%). Fewer semantic errors (16.12%) were recorded. |
| 6 | 300 | Logic errors (52.63%) were more prevalent in the programming codes of the students, followed by syntax errors (31.57%). Also, fewer semantic errors (15.78%) were recorded. |
Table 8: Mean Percentage Bug Patterns
| LEVELS | SYNTAX | SEMANTIC | LOGIC |
|---|---|---|---|
| 200 | 45.68% | 53.88% | 0.4% |
| 300 | 31.7% | 15.95% | 52.24% |
Figure 9: Chart of the Mean Distribution of Bug Patterns (x-axis: Bug Types; y-axis: Mean Percentage Bug; series: 200 Level and 300 Level)
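The paper does not state how the mean percentages in Table 8 were derived from the per-experiment figures in Table 7. As a rough check, the Java sketch below assumes a simple unweighted mean across the experiments at each level; the class `MeanBugPercentages` is hypothetical. Under that assumption the results come close to Table 8 without matching it exactly: about 45.70 against 45.68 for 200 Level syntax errors, and about 31.9 against 31.7 for 300 Level syntax errors, which is the largest gap.

```java
import java.util.Locale;

// Unweighted means of the Table 7 percentages (an assumption, see lead-in).
class MeanBugPercentages {

    // Arithmetic mean of the given values.
    static double mean(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // 200 Level: Experiments 1 to 4, percentages transcribed from Table 7.
        double syntax200 = mean(59.52, 49.15, 52.47, 21.67);
        double semantic200 = mean(39.88, 49.82, 47.53, 78.33);
        double logic200 = mean(0.59, 1.02, 0.0, 0.0);

        // 300 Level: Experiments 5 and 6.
        double syntax300 = mean(32.28, 31.57);
        double semantic300 = mean(16.12, 15.78);
        double logic300 = mean(51.86, 52.63);

        System.out.printf(Locale.ROOT, "200 Level: syntax %.2f, semantic %.2f, logic %.2f%n",
            syntax200, semantic200, logic200);
        System.out.printf(Locale.ROOT, "300 Level: syntax %.2f, semantic %.2f, logic %.2f%n",
            syntax300, semantic300, logic300);
    }
}
```

The small differences may come from rounding or from weighting by the number of bugs per experiment, which this sketch does not attempt.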
4.4 Discussion of Results
From the results obtained in the
experiments, one can infer that semantic and
syntax errors were more prevalent in the Novice
Programmers' (200 Level) codes. Logic errors
incurred by this category of students were
minimal. This is attributed to the fact that these
students were just at the beginning stage of
learning computer programming. It was observed
that the Novice Programmers paid
much attention to the logic of their programs, which
improves the accuracy of the programs they
produce. This is a useful result that should guide
tutors of computer programming at
introductory levels: they should
concentrate on the syntax and semantic constructs of
computer programming with learners at the
introductory level.
As expected, the learning curve is
progressive for the 300 Level (Experienced)
students, since they had previous computer
programming experience from 200 Level. The
results from the experiments showed that fewer
syntax and semantic errors were recorded for
this category of students, who mainly incurred
logic errors in their programs. Tutors of
computer programming at higher levels in
institutions should therefore pay close attention
to tutoring students on how
to produce accurate programs.
5. CONCLUSION
This study focused mainly on bug patterns that
occur with beginner programmers among 200 and
300 Level Computer Science students of the
University of Ibadan, Ibadan, Nigeria. The
study concludes that both syntax
and semantic errors are commonly committed
by new programmers and that, as the learning of
programming continues, the errors committed
tend towards logic errors, in which correct (accurate)
outputs are compromised. Computer
programming tutors should therefore provide
strong support for novice programmers
during lectures and practical classes.
6. ACKNOWLEDGMENTS
Our appreciation goes to all the participants in
this experiment especially the Computer
Programming Tutors, Student Volunteers and
Laboratory Personnel in the Department of
Computer Science, University of Ibadan,
Ibadan, Nigeria, for their cooperation and
support during the course of the experiments.
REFERENCES
[1] Wikipedia, Software Bugs [online]: https://
Wikipedia's%20Definition%20of%20a%0Soft
ware%20Bug%2 0Is%20Wrong.html. Visited
in June 2019.
[2] Kai P., Sunghun K., James Whitehead Jr E.,
(2008), Toward an Understanding of Bug Fix
Patterns. @Springer Science + Business Media,
LLC 2008 Editors: A. Hassan, S. Diehl and H
Gall, Empir Software Eng, DOI
10.1007/s10664-008-9077-5.
[3] Akinola. S.O. (2012). Java companion for
beginners (2nd Ed.). ISBN: 978-978-92
[4] Akshay V. B, (2015), A Comparative Study On
Data Mining Tools, a project presented to the
Department of Computer Science California
state University, Sacramento.
[5] Omkar Phatak (2011), Types of programming
Errors, Reprint Permission of computer
programming article.[online]: http://msc bug
pattern project\bug defects paper\Types of
Programming Errors.mht. Visited in June 2019.
[6] Martin, (2012), Data Mining Techniques, VP of
Technical Publications, Couchbase.
[7] Dhyan Chandra Yadav and Saurabh Pal (2015),
Software Bug Detection Using Data Mining,
International Journal of Computer
Applications, Volume 115.
[8] Yuan J., Ming L., and Zhi-Hua Z. (2011).
Software Defect Detection with Rocus, Journal
of Computer Science and Technology, 1, 1-21.
[9] Deepak K. V. and Shukla H. S. (2015). A Defect
Prediction Model for Software Product based
on ANFIS. International Journal for Scientific
Research and Development, Vol. 3, Issue 10, |
ISSN (online): 2321-0613. Page 1
[10] Promila D. and Rajiv R. (2014). Enhanced Bug
Detection by Data Mining Techniques.
International Journal of Computational
Engineering Research (IJCER), Vol, 04, Issue
7, ISSN (e): 2250 3005. Page 22.
[11] Md Mohsin A., Shamsul H., Jemal A., Sultan
A., Hmood A., John Y., (2017). A parallel
Framework for Software Defect detection and
Metric Selection on Cloud Computing, Cluster
Computing, 20:22672281,17-0892-6.
[12] Saiqa A., Luiz F. C., and Faheem A.(2015),
Benchmarking Machine Learning Techniques
For Software Defect Detection, International
Journal of Software Engineering and
Applications (IJSEA), Vol.6, No.3, page 1.
[13] Jiawei Han, Micheline Kamber, Jian Pei,
(2011), Data Mining: Concepts and
Techniques, ISBN 978-0-12-381479-11.