About
167
Publications
20,207
Reads
1,695
Citations
Publications (167)
Large language models (LLMs) based on transformer architecture have revolutionized natural language processing (NLP), demonstrating excellent capabilities in understanding and generating human-like text. In Software Engineering, LLMs have been applied in code generation, documentation, and report writing tasks, to support the developer and reduce t...
Moving testing to the Cloud overcomes time/resource constraints by leveraging an unlimited and elastic infrastructure, especially for testing levels like End-to-End (E2E) that require a high number of resources and/or execution time. However, it introduces new challenges to those already faced on-premises, like selecting the most suitable Cloud inf...
Continuous integration practices have transformed software development, but executing the test suites of modern software poses new challenges due to their complexity and huge number of test cases. Certain test levels, like end-to-end testing, are even more challenging due to long execution times and resource-intensive requirements, m...
Among the current technologies to analyse large data, the MapReduce processing model stands out in Big Data. MapReduce is implemented in frameworks such as Hadoop, Spark or Flink that are able to manage the program executions according to the resources available at runtime. The developer should design the program in order to support all possible no...
Software testing is an essential knowledge area required by industry for software engineers. However, software engineering students often consider testing less appealing than designing or coding. Consequently, it is difficult to engage students to create effective tests. To encourage students, we explored the use of gamification and investigated wh...
In recent years, software applications have been working with NoSQL databases as they have emerged to handle big data more efficiently than traditional databases. The data models of these databases are designed to satisfy the requirements of the software application, which means that the models must evolve when the requirements of the software appl...
Testing a database application is a challenging process where both the database and the user interaction have to be considered in the design of test cases. This paper describes a specification-based approach to guide the design of test inputs (both the test database and the user inputs) for a database application and to automatically evaluate the t...
End-to-end (E2E) test suite execution is expensive due to the number of complex resources required. When E2E test suites are executed frequently in a continuous integration system, the total amount of resources required may be prohibitive, especially when the tests are run in the Cloud with different billing strategies. Traditional techniques to o...
Current information technologies generate large amounts of data for management or further analysis, storing it in NoSQL databases which provide horizontal scaling and high performance, supporting many read/write operations per second. NoSQL column-oriented databases, such as Cassandra and HBase, are usually modelled following a query-driven approac...
Schema design for NoSQL column-oriented database applications follows a query-driven strategy where each table satisfies a query that will be executed by the client application. This strategy usually implies that the schema is denormalized as the same information can be queried several times in different ways, leading to data duplication in the dat...
Database schemas evolve over time to satisfy changing application requirements. If this evolution is not performed correctly, some quality attributes are at risk such as data integrity, functional correctness, or maintainability. To help developer teams in the design of database schemas, several design methodologies for NoSQL databases have propose...
Continuous integration requires continuously introducing incremental changes into the code, but doing so may also introduce new faults. These faults could be detected automatically through regression testing, but this practice becomes prohibitive as the cost of executing the tests grows. This problem is especially prominent in end-to-end testing where...
Artificial intelligence (AI) is a broad field whose prevalence in the health sector has increased during recent years. Clinical data are the basic staple that feeds intelligent healthcare applications, but due to their sensitive nature, their sharing and usage by third parties require compliance with both confidentiality agreements and security meas...
Web application testing is a great challenge due to the management of complex asynchronous communications, the concurrency between the clients-servers, and the heterogeneity of resources employed. It is difficult to ensure that a test case is re-running in the same conditions because it can be executed in undesirable ways according to several envir...
This work analyses the relationship between the level of cooperation achieved within teams and the performance they obtain during group work in several courses. The ultimate goal is the real-time monitoring of the level of cooperation achieved by student teams in order to correct in time...
In this work we apply several Poisson and zero-inflated models for software defect prediction. We apply different functions from several R packages such as pscl, MASS, R2Jags and the recent glmmTMB. We test the functions using the Equinox dataset. The results show that Zero-inflated models, fitted with either maximum likelihood estimation or with B...
Nowadays, large amounts of data are shared between companies and third parties for use, processing, or analysis. These data often have to be protected with different privacy-preserving techniques to comply with laws and regulations. One of the most common techniques is anonymization which, although it...
Continuous integration practices introduce incremental changes in the code to both improve the quality and add new functionality. These changes can introduce faults that can be timely detected through continuous testing by automating the test cases and re-executing them at each code change. However, re-executing all test cases at each change may no...
Project-based learning enables the development of highly important competencies, including the necessary transversal competencies, which cover aspects such as problem-solving ability, decision-making ability, communication skills, work organization, and time management. In this project...
Traditional SQL and NoSQL big data systems are the backbone for managing data in cloud, fog and edge computing. This paper develops a new system and adopts the TPC-DS industry standard benchmark in order to evaluate three key properties, availability, consistency and efficiency (ACE) of SQL and NoSQL systems. The contributions of this work are mani...
In recent years, data published and shared with third parties to develop artificial intelligence (AI) tools and services has significantly increased. When there are regulatory or internal requirements regarding privacy of data, anonymization techniques are used to maintain privacy by transforming the data. The side-effect is that the anonymization...
The use of NoSQL databases for cloud environments has been increasing due to their performance advantages when working with big data. One of the most popular NoSQL databases used for cloud services is Cassandra, in which each table is created to satisfy one query. This means that as the same data could be retrieved by several queries, these data ma...
Context: MapReduce is a processing model used in Big Data to facilitate the analysis of large data under a distributed architecture. Objective: The aim of this study is to identify and categorize the state of the art of software testing in MapReduce applications, determining trends and gaps. Method: Systematic mapping study to discuss and classify a...
The Master in Computer Science offers the students an integral education that develops technological, methodological and also management skills. It is therefore fundamental that, during the degree, students are faced with the resolution of complex real-world projects that allow them to acquire the skills and competencies necessary for their profess...
Entity reconciliation (ER) aims to combine data from different sources for a unified vision. The management of large volumes of data has given rise to significant challenges to the ER problem due to facts such as data becoming more unstructured, unclean, and incomplete or the existence of many datasets that store information about the same topic. T...
New processing models are being adopted in Big Data engineering to overcome the limitations of traditional technology. Among them, MapReduce stands out by allowing for the processing of large volumes of data over a distributed infrastructure that can change during runtime. The developer only designs the functionality of the program and its executio...
Testing database applications is a complex task since it involves designing test databases with meaningful test data in order to reveal faults and, at the same time, with a small size in order to carry out the testing process in an efficient way. This paper presents an automated approach to generating test data (test relational databases and test i...
NoSQL databases are capable of storing and processing big data, which is characterized by various properties such as volume, variety and velocity. Such databases are used in a variety of user applications that need large volumes of data that are highly available and efficiently accessible. But they do not enforce or require strong data consistency no...
Big Data programs are those that process large data exceeding the capabilities of traditional technologies. Among newly proposed processing models, MapReduce stands out as it allows the analysis of schema-less data in large distributed environments with frequent infrastructure failures. Functional faults in MapReduce are hard to detect in a testing...
Transactional services guarantee the consistency of shared data during the concurrent execution of multiple applications. They have been used in various domains ranging from classical databases through to service-oriented computing systems to NoSQL databases and cloud. Though transactional services aim to ensure data consistency, NoSQL databases pr...
Programs implemented in the MapReduce processing model focus on the analysis of large volumes of data in a distributed and parallel architecture. This architecture is automatically managed by the framework, so the developer can focus on the program functionality regardless of infrastructure failures or resource allocation. However,...
Since NoSQL databases use data models, access modes and query languages that differ from those of relational databases, testing NoSQL database applications raises new challenges. Much attention has been paid to testing relational database applications, but testing NoSQL database applications is an area that has hardly been explored. This paper...
The management of large volumes of data has given rise to significant challenges to the entity reconciliation problem (which refers to combining data from different sources for a unified vision) due to the fact that the data are becoming more unstructured, unclean and incomplete, need to be more linked, etc. Testing the applications that implement...
Functional testing of applications that process the information stored in databases often requires a careful design of the test database. The larger the test database, the more difficult it is to develop and maintain tests as well as to load and reset the test data. This paper presents an approach to reduce a database with respect to a set of SQL q...
NoSQL databases provide high availability and efficiency in data processing but at the expense of weaker consistency. In this paper, we propose a new approach in order to test NoSQL key/value databases in general and their CRUD operations in particular. We design a new context-aware model that takes into account the contextual requirements of clien...
NoSQL databases have given rise to new testing challenges due to the fact that they use data models and access modes to the data that differ from the relational databases. Testing relational database applications has attracted the interest of many researchers; but this is still not the case with NoSQL database applications. The approach presented i...
MapReduce is a parallel data processing paradigm oriented to process large volumes of information in data-intensive applications, such as Big Data environments. A characteristic of these applications is that they can have different data sources and data formats. For these reasons, the inputs could contain some poor quality data that could produce a...
This work summarizes the main topics that have been researched in the area of software testing under the umbrella of "Bayesian approaches" since 2010. There is a growing trend in the use of so-called Bayesian statistics and Bayesian concepts in general and in software testing in particular. Following a Systematic Literature Review protocol using...
Journal: Revista de Investigación en Docencia Universitaria de la Informática (ReVisión)
This paper provides an overview of the Informatics and Computing Engineering Master Degree at the Gijon Polytechnic School of Engineering (University of Oviedo). This is one of the earliest Informatics and Computing Engineering master's in Spain designed accord...
In the scope of the applications developed under the service-based paradigm, Service Level Agreements (SLAs) are a standard mechanism used to flexibly specify the Quality of Service (QoS) that must be delivered. These agreements contain the conditions negotiated between the service provider and consumers as well as the potential penalties derived f...
MapReduce is a paradigm that allows parallel processing of large amounts of data. MapReduce programs combined with their underlying run-time framework have distinctive features that are prone to include unexpected behaviors not present in other types of programs. This paper describes an approach to functional testing of MapReduce programs based on...
In the scope of Services Science, Management and Engineering (SSME), Service Level Agreements (SLAs) are technical documents that contain the conditions that must be fulfilled during the provision and consumption of services. Typically, the violation of the terms specified in the SLA leads to consequences for the stakeholders involved in the agreem...
Service level agreements (SLAs) are typically used to specify rules regarding the consumption of services that are agreed between the providers of the service-based applications (SBAs) and their consumers. An SLA includes a list of terms that contain the guarantees that must be fulfilled during the provisioning and consumption of the services. Sinc...
This chapter focuses on web services transactions, which support creating robust web services applications by guaranteeing that their execution is correct and the data sources are consistent. More specifically, it investigates the testing of such transactions, which has not received proper attention from current research. It presents a gener...
Web services (WS) transactions are important in order to reliably compose distributed and autonomous services into composite web services and to ensure that their execution is consistent and correct. But such transactions are generally complex and they require longer processing time, and manipulate critical data. Thus various techniques have been d...
Web Services (WS) transactions are used to build efficient and reliable web applications which are distributed across the Internet and are accessed by multiple simultaneous users. Current research has developed various models and protocols in order to improve the performance and reliability of WS transactions. However, there is little research on t...
Service Level Agreements (SLAs) are used to specify the negotiated conditions between the provider and the consumer of services. In this paper we present a stepwise method to identify and categorize a set of test requirements that represent the potential situations that can be exercised regarding the specification of each isolated guarantee term of...
Web services transactions are used to build efficient and reliable Web applications that are distributed across the Internet and are accessed by multiple simultaneous users. Current research develops various models and protocols to improve the performance and reliability of Web services transactions. However, there is little research on testing the...
Transactions are a key issue to develop reliable web service based applications. The advanced models used to manage this kind of transactions rely on the dependencies between the involved activities (subtransactions). Dependencies are constraints on the processing produced by the concurrent execution of interdependent activities. Existing work uses...
In the scope of Service Oriented Architectures (SOA), Service Level Agreements (SLAs) are used to specify the conditions that have to be fulfilled by both service provider and consumer. These stakeholders need to check whether the executions of the services fulfill the conditions or not, so the evaluation is an important and non-trivial task within...
Transactions are a key issue to develop reliable web service based applications. The advanced models used to manage this kind of transactions rely on the dependencies between the involved activities (subtransactions). Dependencies are constraints on the processing produced by the concurrent execution of interdependent subtransactions. Existing work...
Transactions are a fundamental technology for building efficient and reliable web service based applications. Various models and protocols have been developed by the academic and industrial research communities in order to effectively manage web services transactions. We propose a novel abstract model for dynamically modeling distinct web services transa...
Transactions are a key issue in the reliability of distributed applications because they ensure all the participants achieve a mutually agreed outcome. However, current research has given little attention to testing transactions in web services. This paper presents a conceptual framework, inspired by risk-based methodologies, to address this gap. I...
Service Oriented Architectures (SOA) have emerged as a new paradigm to develop interoperable and highly dynamic applications. Objective: This paper aims to identify the state of the art in the research on testing in Service Oriented Architectures with dynamic binding. Method: A mapping study has been performed employing both manual and automatic search i...
In the field of database applications a considerable part of the business logic is implemented using a semi-declarative language: the Structured Query Language (SQL). Because of the different semantics of SQL compared with other procedural languages, the conventional coverage criteria for testing are not directly applicable. This paper presents a c...
Service Oriented Architectures (SOA) have emerged as a promising solution to develop interoperable and highly dynamic applications. In the domain of SOA, Service Level Agreements (SLAs) are used to specify the stipulated terms between the service provider and the consumer. Due to the unique features of this paradigm such as SLA management, testing...
XML processing programs play an important role in the achievement of XML data querying, manipulation, and construction operations to compose XML data structures for very diverse purposes regarding information representation, storing and exchange on XML-based systems. Testing of XML processing programs is a challenging task since the test input and...
Transactions are crucial to ensuring the quality (such as recovery and reliability) of web services applications by constraining them to a mutually agreed outcome. This paper addresses the issue of testing long-lived web services transactions, which has been given little attention by current research. It proposes a risk-based approach and al...
Populating test databases with meaningful test data is a difficult task as it involves generating data for many joined tables that must be diverse enough to be able to reveal faults and small enough to make the testing process efficient. This paper proposes an approach for the automatic generation of a test database for a set of SQL queries using a...
An approach to automatically generate test data for SQL queries is described in this paper. Both the database schema and queries to be tested are used to guide the generation. The different test situations on the queries are identified using a condition coverage criterion and they are represented with a set of constraints that the information in th...
A partial test oracle is proposed to verify the actual outputs in access testing on XML data. The considered software under test is a query program which receives as input an XML document obtained from an XML repository of any kind, and produces XML data as output. To deal with the actual outputs from this testing process, the partial oracle evalua...
Keeping the test databases as small as possible leads to faster execution of tests and facilitates the task of completing the test cases and evaluating the actual outputs against the expected. In this paper we present an automated approach to database reduction that considers an initial database that may be a copy of a production database and the s...
A challenging part of software testing entails the generation of test cases, whose cost can be reduced by using techniques for automating this task. In this paper we present an approach based on the metaheuristic technique scatter search for the automatic test case generation of the BPEL business process. A transition coverage criteri...
The techniques for the automatic generation of test cases try to efficiently find a small set of cases that allow a given adequacy criterion to be fulfilled, thus contributing to a reduction in the cost of software testing. In this paper we present and analyze two versions of an approach based on the scatter search metaheuristic technique for the a...
Adequacy criteria provide an objective measurement of test quality. Although these criteria are a major research issue in software testing, little work has been specifically targeted towards the testing of database-driven applications. In this paper, two structural coverage criteria are provided for evaluating the adequacy of a test suite for SQL q...
XML queries are broadly used in Web environments, but the existing approaches towards software quality based on testing have not deeply addressed them. Although there are some works oriented to generate test inputs for testing XML queries, the evaluation of expected outputs against the actual outputs resulting from the tests has not been tackled as...
This proposal studies the characteristics of web transactions in Service Oriented Architectures (SOA) from a testing point of view. Our idea is to define adequacy criteria to test web transactions. This will be accomplished using risk analysis techniques in order to take into account the possible failures of the transactions. An adaptation of risk-bas...
This article shows the benefits of developing a software project using TSPi in a "Very Small Enterprise" based on quality and productivity measures. A process adapted from the current one and based on the TSPi was defined and the team was trained in it. The work began by gathering historical data from previous projects in order to get a meas...
This paper presents a tabu search metaheuristic algorithm for the automatic generation of structural software tests. It is a novel work since tabu search is applied to the automation of the test generation task, whereas previous works have used other techniques such as genetic algorithms. The developed test generator has a cost function for intensi...
Controlled experiments are a powerful way to assess and compare the effectiveness of different techniques. In this paper we present the experimental results of the evaluation of the effectiveness of a structural test coverage criterion developed for SQL queries when used by a tester to guide the selection of database test cases. We describe a contr...
Modern software technologies use the XML language as a preferred means of data interchange and representation, which can be accessed using XML queries. Testing these queries can be difficult because they use complex data as input in the form of XML documents. This work addresses this problem by defining a query-aware procedure for design and generation...
Software testing is an expensive and difficult process, which requires much time. The use of techniques for automating the generation of software test cases is very important as it can reduce the time and cost of this process. The newest methods for automatic generation of tests use metaheuristic search techniques, i.e. Genetic Algorithms, Simulate...
The XML language is becoming the preferred means of data interchange and representation in web based applications. Usually, XML data is stored in XML repositories, which can be accessed efficiently using the standard XPath as query language. However, the specific techniques for testing these queries often ignore functional testing. This work ad...
A set of mutation operators for SQL queries that retrieve information from a database is developed and tested against a set of queries drawn from the NIST SQL Conformance Test Suite. The mutation operators cover a wide spectrum of SQL features, including the handling of null values. Additional experiments are performed to explore whether the cost o...
Analysis and description of quantitative techniques applicable to the field of Software Engineering, contributed by researchers in the area.