Morteza Zakeri

Verified
Morteza verified their affiliation via an institutional email.
  • Doctor of Philosophy
  • Professor (Assistant) at Amirkabir University of Technology

Software engineering

About

43
Publications
6,203
Reads
230
Citations
Introduction
Morteza does research in Software Engineering (SE) and Artificial Intelligence (AI). His main research interest is automated and intelligent software engineering (AISE) with a focus on program analysis and transformation, transpilers, software refactoring, testing, repair, and applying AI in system design (software II and compiler II).
Current institution
Amirkabir University of Technology
Current position
  • Professor (Assistant)
Additional affiliations
April 2024 - present
Institute for Research in Fundamental Sciences
Position
  • Postdoctoral Researcher
September 2017 - September 2019
Iran University of Science and Technology
Position
  • Teaching Assistant
Description
  • I was a teaching assistant for the Compiler Design and Construction B.Sc. course taught by Dr. Saeed Parsa for four semesters (two years). Our teaching materials from these two years are available to view and download: http://parsa.iust.ac.ir/courses/compilers/
September 2017 - present
Iran University of Science and Technology
Position
  • Researcher
Description
  • I am a Computer Engineering Ph.D. student at Iran University of Science and Technology. My research interests are in empirical and automated software engineering, especially software analysis, refactoring, testing, and debugging with the help of AI.
Education
September 2018 - September 2021
Iran University of Science and Technology
Field of study
  • Computer Engineering
September 2016 - September 2018
Iran University of Science and Technology
Field of study
  • Computer Engineering
September 2011 - September 2015
Arak University
Field of study
  • Computer Engineering

Publications (43)
Preprint
Full-text available
Malware attacks targeting widely used non-executable formats, namely Microsoft Office and PDF files, have become a prevalent threat. These files, which encompass a broad spectrum of data types, are classified as complex files. Existing malware detection models currently lack transparency, providing only binary labels without confidence scores. Incor...
Article
Ion exchange typically involves replacing smaller ions with larger ions below the glass transition temperature. This study introduces an innovative computational approach for the inverse design of ion-exchangeable glasses. It aims to simultaneously achieve high depth of layer (DOL) and surface compressive stress (CS), a task complicated by the inhe...
Article
Full-text available
Path testing is one of the most efficient approaches for covering a program during the test. However, executing a path with a single or limited number of test data does not guarantee that the path is fault-free, specifically in the fault-prone paths. A common solution in these cases is to extract the corresponding domain of the path constraint that...
Article
Full-text available
Requirements form the basis for defining software systems’ obligations and tasks. Testable requirements help prevent failures, reduce maintenance costs, and make it easier to perform acceptance tests. However, despite the importance of measuring and quantifying requirements testability, no automatic approach for measuring requirements testability h...
Article
The process of ion exchange includes the substitution of a larger ion for a smaller ion, which frequently takes place at temperatures lower than the glass transition temperature. The crucial variables in this particular procedure are the depth of layer (DOL) and the surface compressive stress (CS). Determining the best composition for ion-exchangea...
Conference Paper
Full-text available
Plant diseases pose significant challenges to global crop production, impacting the economy. Innovative agricultural solutions that integrate the Internet of Things and machine learning have emerged to address this issue for early discovery of plant pathogens. While convolutional neural networks (CNNs) have been widely used for plant disease detect...
Article
Full-text available
The responsibility of a method/function is to perform some desired computations and disseminate the results to its caller through various deliverables, including object fields and variables in output instructions. Based on this definition of responsibility, this paper offers a new algorithm to refactor long methods to those with a single responsibi...
Preprint
Full-text available
Test-first development (TFD) is a software development approach involving automated tests before writing the actual code. TFD offers many benefits, such as improving code quality, reducing debugging time, and enabling easier refactoring. However, TFD also poses challenges and limitations, requiring more effort and time to write and maintain test ca...
Article
Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell detection. This paper proposes a systematic literature review and meta-analysis on code similarity measurement an...
Preprint
Full-text available
Measuring and evaluating source code similarity is a fundamental software engineering activity that embraces a broad range of applications, including but not limited to code recommendation, duplicate code, plagiarism, malware, and smell detection. This paper proposes a systematic literature review and meta-analysis on code similarity measurement an...
Preprint
Full-text available
The accuracy reported for code smell-detecting tools varies depending on the dataset used to evaluate the tools. Our survey of 45 existing datasets reveals that the adequacy of a dataset for detecting smells highly depends on relevant properties such as the size, severity level, project types, number of each type of smell, number of smells, and the...
Article
Full-text available
The accuracy reported for code smell-detecting tools varies depending on the dataset used to evaluate the tools. Our survey of 45 existing datasets reveals that the adequacy of a dataset for detecting smells highly depends on relevant properties such as the size, severity level, project types, number of each type of smell, number of smells, and the...
Preprint
Full-text available
The responsibility of a method/function is to perform some desired computations and disseminate the results to its caller through various deliverables, including object fields and variables in output instructions. Based on this definition of responsibility, this paper offers a new algorithm to refactor long methods to those with a single responsibi...
Preprint
Full-text available
The use of mobile phones for medical applications is proliferating due to high-quality embedded sensors. Jaundice, a yellow discoloration of the skin caused by excess bilirubin, is a prevalent physiological problem in newborns. While moderate amounts of bilirubin are safe in healthy newborns, extreme levels are fatal and cause devastating and irreversi...
Article
Method naming is a critical factor in program comprehension, affecting software quality. State-of-the-art naming techniques use deep learning to compute the methods’ similarity considering their textual contents. They highly depend on identifiers’ names and do not compute semantical interrelations among methods’ instructions. Source code metrics co...
Preprint
Full-text available
Unlike most other software quality attributes, testability cannot be evaluated solely based on the characteristics of the source code. The effectiveness of the test suite and the budget assigned to the test highly impact the testability of the code under test. The size of a test suite determines the test effort and cost, while the coverage measure...
Preprint
Full-text available
The high cost of the test can be dramatically reduced, provided that the coverability as an inherent feature of the code under test is predictable. This article offers a machine learning model to predict the extent to which the test could cover a class in terms of a new metric called Coverageability. The prediction model consists of an ensemble of...
Article
Full-text available
Front Cover Caption: The cover image is based on the Research Article Learning to predict test effectiveness by Morteza Zakeri‐Nasrabadi and Saeed Parsa https://doi.org/10.1002/int.22722.
Article
Full-text available
Unlike most other software quality attributes, testability cannot be evaluated solely based on the characteristics of the source code. The effectiveness of the test suite and the budget assigned to the test highly impact the testability of the code under test. The size of a test suite determines the test effort and cost, while the coverage measure...
Article
Long Method is amongst the most common code smells in software systems. Despite various attempts to detect the long method code smell, few automated approaches are presented to refactor this smell. Extract Method refactoring is mainly applied to eliminate the Long Method smell. However, current approaches still face serious problems such as insuffi...
Article
Full-text available
The high cost of the test can be dramatically reduced, provided that the coverability as an inherent feature of the code under test is predictable. This article offers a machine learning model to predict the extent to which the test could cover a class in terms of a new metric called Coverageability. The prediction model consists of an ensemble of...
Article
Measuring the volume of urine in the bladder is a significant issue for patients who lack the sensation of bladder fullness or have problems getting to the restroom in time, such as spinal cord injury patients and some elderly people. Real-time monitoring of the bladder, therefore, can be highly helpful for urinary incontinence...
Article
Full-text available
Appropriate test data are crucial to success in fuzz testing. Most real-world applications, however, accept complex structured inputs containing data surrounded by metadata, which is processed in several stages comprising parsing and rendering (execution). The complex structure of some input files makes it difficult to generat...
Preprint
Appropriate test data are crucial to success in dynamic software testing, e.g., fuzzing. Most real-world applications, however, accept complex structured inputs containing data surrounded by metadata, which is processed in several stages comprising parsing and rendering (execution). It makes the automatically generating e...

Questions (2)
Question
Consider a machine learning model trained on a dataset with some incorrectly labeled samples. The F1 score (just one example of an evaluation measure) of the trained model is near one, even on the test set. However, the model predicts wrong labels because it was trained on wrong data samples. We have a high-quality model that generates wrong answers!
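A minimal sketch of this scenario in plain Python, under stated assumptions: the one-feature data, the ground-truth rule (positive when x >= 5), the systematically wrong annotation rule (positive when x >= 8), and all function names are hypothetical and chosen only for illustration. The "model" is a simple threshold classifier fitted to the annotated labels; it is then scored once against test labels produced by the same faulty annotation and once against the ground truth.

```python
def f1_score(labels, predictions, positive=1):
    """Plain-Python F1 score for a single positive class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def true_label(x):
    # Hypothetical ground truth: positive when x >= 5.
    return 1 if x >= 5 else 0


def annotated_label(x):
    # Hypothetical faulty annotation: positive only when x >= 8.
    return 1 if x >= 8 else 0


def predict(threshold, xs):
    return [1 if x >= threshold else 0 for x in xs]


def fit_threshold(xs, labels):
    # "Training": choose the threshold with the best F1 on the given labels.
    candidates = sorted(set(xs))
    return max(candidates, key=lambda t: f1_score(labels, predict(t, xs)))


train_x = list(range(10))
test_x = [1, 2, 3, 5, 6, 7, 8, 9]

# The model only ever sees the faulty annotations.
threshold = fit_threshold(train_x, [annotated_label(x) for x in train_x])
test_pred = predict(threshold, test_x)

# Same predictions, scored against two different label sources.
f1_vs_annotations = f1_score([annotated_label(x) for x in test_x], test_pred)
f1_vs_truth = f1_score([true_label(x) for x in test_x], test_pred)

print("learned threshold:", threshold)
print("F1 against annotated test labels:", f1_vs_annotations)
print("F1 against ground truth:", round(f1_vs_truth, 3))
```

Because the test labels come from the same faulty annotation process as the training labels, the reported score cannot reveal the error; only an independently verified label set exposes the gap.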
Such a situation can occur in the unsupervised and self-supervised learning strategies mainly used to build large language models (LLMs), such as GPT, PaLM, and BERT. LLMs are mostly trained on data from the World Wide Web, which contains much incorrect data. A biased Wikipedia article, a wrong Stack Exchange answer, and faulty source code are just a few examples of incorrect data on the web.
We recently published a systematic literature review on code smell datasets that supports our claims [1]. The results indicate that most code smell datasets contain many incorrectly labeled samples. Therefore, even the most accurate code smell detection models are not reliable (!)
Here, the main question is how can we ensure that machine learning learns what we need?
I am eager to know the Artificial Intelligence community's comments on the following questions.
(Q1) What are examples of low-quality datasets used in machine learning projects?
(Q2) How can we evaluate the datasets used to train and test the machine learning model?
(Q3) How can we test the correctness and rationality of large language models (LLMs) on different tasks? Most data we feed to an LLM has already been observed by the model during the training phase. In other words, how can we find a good test set for evaluating LLMs on downstream tasks?
(Q4) How can we measure the rationality of AI agents? More specifically, how can we ensure that machine learning learns what we need?
Please also help me refine the topic and questions to make something useful for the AI and science community.
[1] Morteza Zakeri-Nasrabadi, Saeed Parsa, Ehsan Esmaili, and Fabio Palomba. 2023. A Systematic Literature Review on the Code Smells Datasets and Validation Mechanisms. ACM Comput. Surv. 55, 13s, Article 298 (December 2023), 48 pages. https://doi.org/10.1145/3596908
Question
We all know about the advantages of agile software development methodologies, DevOps, and CI/CD. But what are the negative impacts and the dark sides/dark corners of such highly accepted and recommended software development practices and mindsets on software engineers’ lives?
I am writing a critical white paper on agile software development and need evidence to support my ideas. I am eager to hear any related comments in response to the following questions:
(Q1) What are the negative impacts and the dark side of agile development on software engineers’ lives?
(Q2) Does agile software development put too much pressure on software engineers and developers?
(Q3) What are the equivalents of agile methods in other engineering and science disciplines, such as civil engineering, chemical engineering, and medical science?
(Q4) If there is an equivalent, how popular and accepted is it among the experts in that field?
(Q5) Should software developers refuse to work for employers that enforce agile methodologies?
Please also help me refine the topic and questions to make something useful for the software engineers’ community.
