# Joaquín Pérez Ortega

PhD · Centro Nacional de Investigación y Desarrollo Tecnológico · Department of Computer Sciences

## About

Publications: 113

Reads: 23,997

**How we measure 'reads'**

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full text.

Citations: 491

## Publications

A hybrid variant of the Fuzzy C-Means and K-Means algorithms is proposed to solve large datasets such as those presented in Big Data. The Fuzzy C-Means algorithm is sensitive to the initial values of the membership matrix. Therefore, a special configuration of the matrix can accelerate the convergence of the algorithm. In this sense, a new approach...

Predicting the values of a financial time series is mainly a function of its price history, which depends on several factors, internal and external. With this history, it is possible to build an ϵ-machine for predicting the financial time series. This work proposes considering the influence of a financial series through the transfer of entropy when...

Mexico is among the five countries with the largest number of reported deaths from COVID-19, and the mortality rates associated with infections are heterogeneous across the country due to structural factors concerning the population. This study aims to analyze clusters related to the COVID-19 mortality rate at the municipal level in Mexico f...

One of the ways of acquiring new knowledge or underlying patterns in data is by means of clustering algorithms or techniques for creating groups of objects or individuals with similar characteristics in each group and at the same time different from the other groups. There is a consensus in the scientific community that the most widely used cluster...

In most big cities, public transport vehicles are enclosed and crowded spaces. Therefore, they are considered one of the most important triggers of COVID-19 spread. Most existing research on the mobility of people and COVID-19 spread focuses on investigating highly frequented paths by analyzing data collected from mobile devices, which...

Around the world, diabetes is a disease that in recent years has had a significant increase in mortality rates. Currently, several countries consider diabetes an important public health problem, particularly Mexico. In this research work, the problem of mortality forecasting for the next 5 years in Mexico City was addressed by applying Data Science...

Different research problems (optimization, classification, ordering) have shown that some problem instances are solved better by a certain algorithm than by any other. A literature review indicated implicitly that this phenomenon has been identified, formulated, and analyzed at the descriptive and predictive levels of understanding witho...

Systematic Review of Methodologies in Data Science. This paper addresses the problem of finding outstanding research works on Data Science methodologies. Although there are survey-type publications on this science, these do not explicitly highlight the use of a state-of-the-art review methodology that would help in...

The relation between problem and solution algorithm presents a similar phenomenon in different research problems (optimization, decision, classification, ordering): the algorithm performs very well on some instances of the problem and very badly on others. Most related works have focused on predicting the most adequate algorithm to solve a n...

SARS-CoV-2 causes the new COVID-19 disease, which is one of the most dangerous pandemics of modern times. This pandemic has critical health and economic consequences, and even the health services of large, powerful nations may be saturated. Thus, forecasting the number of infected persons in any country is essential for controllin...

Entropy is a key concept in characterizing the uncertainty of any given signal, and its extensions, such as Spectral Entropy and Permutation Entropy, have been used to measure the complexity of time series. However, these measures are subject to the discretization employed to study the states of the system, and identifying the relationshi...
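As an illustration of how Permutation Entropy depends on the chosen discretization, the following is a minimal sketch of the standard ordinal-pattern definition. It is not taken from the paper; the function name and the `order` and `delay` parameters are assumptions chosen for illustration.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (in bits) of the ordinal patterns found in `series`.

    `order` is the length of each ordinal pattern and `delay` the gap between
    sampled points; both are illustrative names, not the paper's notation.
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the given order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1

    entropy = 0.0
    for count in patterns.values():
        p = count / n_windows
        entropy -= p * math.log2(p)

    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy
```

Changing `order` or `delay` changes which states the series is mapped to, which is one concrete way the measure depends on the discretization.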

This research addresses the problem of the short-term prediction of the closing prices of financial series. As a method of solution, an ϵ-machine model is proposed, which is constructed with a probabilistic finite state machine with an input function and an output function. The definition of the alphabet of the ϵ-machine is generated by transformin...

A review of the state of the art reveals that the characterization and analysis of the problem-algorithm relation has focused only on problem features, only on algorithm features, or in some situations on both, but the logic of the algorithm is not considered in the analysis. The aim of such work is to select the algorithm that will give the best solution. However th...

The Hurst exponent is a metric used to evaluate whether a time series exhibits long-term memory, and it is used to identify its complexity. Besides, forecasting methods are tested using time series from Makridakis competition. Additionally, Exponential Smoothing is among the best forecasting methods of this competition, and ARIMA is one of the most...

Robots are designed to help human beings with difficult tasks, and they currently carry out activities that used to be distinctly human, such as performing surgeries, giving injections, and playing musical instruments, among others. This paper aims to highlight musical artificial intelligence and the history of musician robots.

In this paper we propose a criterion to balance the processing time and the solution quality of k-means cluster algorithms when applied to instances where the number n of objects is big. The majority of the known strategies aimed to improve the performance of k-means algorithms are related to the initialization or classification steps. In contrast,...

With the increasing presence of Big Data there arises the need to group large instances. These instances present a number of objects with multidimensional features, which require to be grouped in hundreds or thousands of clusters. This article presents a new improvement to the K-means algorithm, which is oriented to the efficient solution of instan...

In this paper, we propose a heuristic algorithm that obtains the optimal solution for 5 instances of the Hard28 set for the one-dimensional bin packing problem (1DBPP). This algorithm is based on storage patterns of objects in containers. To detect how objects are stored in containers, the HGGA-BP algorithm [8] was...

This paper proposes a new criterion for reducing the processing time of the assignment of data points to clusters for algorithms of the k-means family, when they are applied to instances where the number n of points is large. Our criterion allows a point to be classified in an early stage, excluding it from distance calculations to cluster centroid...

This paper aims at being a guide to understanding the different types of optimization problems of the Water Distribution Network by presenting a survey of mathematical models and algorithms used to solve the variants of the Water Distribution Networks. Some problems are Water Resource Planning, Water Quality Management, Water Supply Networks, Water...

This work presents an improved version of the K-Means algorithm. The improvement consists of a simple heuristic in which objects that remain in the same group between the current and the previous iteration are identified and excluded from calculations in the classification phase of subsequent iterations. In order to evaluate the improved version versus th...

This work presents an approach for enhancing the K-means algorithm in the classification phase. The approach consists of a heuristic in which, each time an object remains in the same group between the current and the previous iteration, it is identified as stable and removed from computations in the classification phase in the current a...
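The stable-object heuristic described in this abstract can be sketched as follows. This is a minimal illustration based only on the summary above, not the authors' implementation; the function names, the explicit initial centroids, and the stopping rule are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (first one wins ties)."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans_stable_exclusion(points, initial_centroids, max_iter=100):
    """K-means in which objects that keep their group are no longer reclassified.

    Assumption: an object is marked stable the first time it keeps its group
    from one iteration to the next, and it stays excluded afterwards.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [nearest(p, centroids) for p in points]
    active = set(range(len(points)))  # objects still being reclassified

    for _ in range(max_iter):
        # Update phase: centroids are recomputed from all objects.
        for j in range(len(centroids)):
            members = [points[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]

        # Classification phase: distances are computed only for active objects.
        changed = False
        for i in sorted(active):
            new_label = nearest(points[i], centroids)
            if new_label == labels[i]:
                active.discard(i)  # stable: skip it in later iterations
            else:
                labels[i] = new_label
                changed = True
        if not changed:
            break
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans_stable_exclusion(data, [(0, 0), (0, 1)])
    print(labels)
```

Excluding stable objects reduces the number of distance calculations per iteration, at the cost that an excluded object is never reassigned even if the centroids later drift.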

The conventional K-medoids algorithm is one of the most used clustering algorithms; however, one of its limitations is its sensitivity to the initial medoids. This work proposes generating optimized initial medoids, which increases the efficiency and effectiveness of K-medoids. The initial medoids are obtained in two steps. In the first one, the data ar...

The mechanisms to communicate emotions have dramatically changed in the last 10 years with social networks, where users massively communicate their emotional states by using the Internet. However, people with socialization problems have difficulty expressing their emotions verbally or interpreting the environment and providing an appropriate emotio...

Diagnosis and treatment of cerebrovascular diseases are modeled with partially observable Markov decision processes. The model is intended to help physicians who lack adequate experience in treating such diseases. Patients with cerebrovascular diseases need immediate treatment to prevent disability from irreversible neu...

This article presents a solution to the one-dimensional Bin Packing problem through a weighted finite automaton. The construction of the automaton and its application to three different instances are presented: one synthetic instance and two benchmarks, N1C1W1_A.BPP from the data set Set_1 and BPP13.BPP from hard28. The optimal sol...

This paper addresses the problem of clustering instances with a high number of dimensions. In particular, a new heuristic for reducing the complexity of the K-means algorithm is proposed. Traditionally, there are two approaches that deal with the clustering of instances with high dimensionality. The first executes a preprocessing step to remove tho...

The Bin Packing problem (BPP) is NP-hard; solving BPP instances with exact methods requires a high number of variables and therefore a high computational cost. This paper proposes a new heuristic strategy for solving BPP instances that guarantees optimal solutions. The proposed strategy includes the use of a new model b...

The Bin Packing problem (BPP) is NP-hard, so an exact method for solving BPP instances requires a large number of variables and too much execution time. This work proposes a new heuristic strategy for solving BPP instances in which the optimal solution is guaranteed. The proposed strategy includes the use of...

The Nursing Process Problem (NPP) has the main objective of defining the nursing diagnosis based on the initial patient diagnosis, his/her medical history and the classifications of NANDA, NIC and NOC nomenclatures. The main objective is to propose the characterization of NPP, where the health initial conditions and the medical history of patients...

Early diagnosis of social isolation in older adults can prevent physical and cognitive impairment or further impoverishment of their social network. This diagnosis is usually performed by personal and periodic application of psychological assessment instruments. This situation encourages the development of novel approaches able to monitor risk situ...

It is known that the data preparation phase is the most time-consuming phase in the data mining process, taking between 50% and 70% of the total project time, and the results of data preparation directly affect the quality of the project. Currently, data mining methodologies hold a general purpose; one of the limitations being that they do not provide a guide...

It is known that the data preparation phase is the most time-consuming in the data mining process, using between 50% and 70% of the total project time. Currently, data mining methodologies are general-purpose, and one of their limitations is that they do not provide guidance on which particular tasks to carry out in a specific domain. This pap...

We address the multidimensional characterization of difficult instances of the Bin Packing Problem, well known in the combinatorial optimization realm. In search for efficient procedures to solve hard combinatorial optimization problems previous investigations have attempted to identify the instances' characteristics with the greatest impact in the...

This work presents a methodology for characterizing difficult instances of the Bin Packing Problem using Data Mining. Characteristics of such instances help to provide ideas for developing new strategies to find optimal solutions by improving the current solution algorithms or developing new ones. According to related work, in general, instance cha...

The matrix multiplication algorithm of Dekel, Nassimi and Sahni, also known as the Hypercube algorithm, is analysed, modified and implemented on a multicore processor cluster, where the number of processors used is less than the n³ required by the algorithm. 2³, 4³ and 8³ processing units are used to multiply matrices of order 10×10, 10²×10² and 10³×10³. The...

This paper proposes a hybrid algorithm to find the optimal solution for any instance of the one-dimensional bin packing problem (1D-BPP). The hybrid algorithm combines a heuristic method with a mathematical model based on the arc-flow technique to find the optimal solution for an instance of 1D-BPP. The hybrid algorithm makes use of the lower...

The K-means clustering algorithm is widely used in several domains because of its simplicity of implementation and interpretation. However, one of its limitations is its high computational complexity. This work addresses the problem of reducing the complexity of the K-means algorithm, in order to make it possible to solve large scale da...

This article presents a new NP problem called the Nursing Process Problem (NPP), known in Spanish as the Problema del Proceso Enfermero. The main objective is to develop a mathematical model for the NPP and to characterize its instances. Five sets of instances with 30 cases each were randomly generated with the...

Currently, the use of technologies in organizations is a key feature in automating business processes. On the other hand, the conceptual modeling of information systems is about describing the semantics of software applications at a high level of abstraction. However, at present, the modeling of business technologies has not been given at a concep...

The calculation of variance is defined as an objective and increasing function; this definition allows hypotheses to be established for calculating diversified investment portfolios from the domain of the function. In order to apply these hypotheses, our multi-objective linear mathematical model is modified. Diversified portfolios are selected from the st...

The main purpose of this paper is to show the advantage of using a model proposed by us, which minimizes roundtrip response time versus traditional models that minimize query transmission and processing costs for the design of a distributed database with vertical fragmentation. To this end, an experiment was conducted to compare the roundtrip respo...

Theoretical lower bounds have been proposed for the Bin Packing problem, yet how closely they approximate the optimal number of bins is limited. In this sense, this paper presents a metaheuristic method to select a lower bound at a low cost. Obtaining a more precise lower bound improves the convergence of solution methods to the optimal solution; it helps to impro...
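To make the role of a lower bound concrete, here is a minimal sketch that pairs the simplest classical bound (total weight divided by capacity, rounded up) with the well-known first-fit decreasing heuristic. Neither is the metaheuristic selection method of the paper; the function names are assumptions, and integer weights are assumed.

```python
def lower_bound_l1(weights, capacity):
    """Continuous lower bound: ceil(sum(weights) / capacity), integer weights assumed."""
    return -(-sum(weights) // capacity)


def first_fit_decreasing(weights, capacity):
    """Pack items with first-fit decreasing; returns the bins as lists of weights."""
    bins = []  # contents of each open bin
    free = []  # remaining capacity of each open bin
    for w in sorted(weights, reverse=True):
        if w > capacity:
            raise ValueError("item does not fit in an empty bin")
        for b in range(len(bins)):
            if free[b] >= w:
                bins[b].append(w)
                free[b] -= w
                break
        else:
            # No open bin has room, so a new bin is opened.
            bins.append([w])
            free.append(capacity - w)
    return bins


if __name__ == "__main__":
    items = [4, 8, 1, 4, 2, 1]
    packing = first_fit_decreasing(items, capacity=10)
    bound = lower_bound_l1(items, capacity=10)
    # When the heuristic matches the lower bound, the packing is provably optimal.
    print(len(packing), bound, len(packing) == bound)
```

A tighter lower bound makes this kind of optimality check succeed more often, which is why the precision of the bound affects how quickly a solution method can stop.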

At present, there exist many modeling techniques for capturing business semantics from different perspectives: transactional, goal-oriented, aspect-oriented, value-oriented, etc. The results of these modeling techniques serve as natural input for the software system generation process. However, none of these current modeling proposals takes into ac...

We propose the usage of formal languages for expressing instances of NP-complete problems for their application in polynomial transformations. The proposed approach, which consists of using formal language theory for polynomial transformations, is more robust, more practical, and faster to apply to real problems than the theory of polynomial transf...

Cluster analysis is the study of algorithms and techniques for grouping objects according to their intrinsic characteristics and similarity. A widely studied and popular clustering algorithm is K-Means, which is characterized by its ease of implementation and high computational cost. Although various performance improvements have been proposed for...

The problem of clustering objects according to their similarity measures can be formulated as a combinatorial optimization problem. The K-Means algorithm has been widely used for solving this problem; however, its computational cost is very high. In this work a new heuristic is proposed for reducing the computational complexity in the classification...

This paper proposes a tool for supporting visual analysis of the behavior of metaheuristic algorithms designed to solve the Bin Packing Problem. Traditionally, metaheuristics have been analyzed monolithically, by means of solving a set of input instances and analyzing the output solutions. However, due to the stochastic features of metaheurist...

This work addresses the problem of finding the mortality distribution for lung cancer in Mexican districts, through clustering patterns discovery. A data mining system was developed which consists of a pattern generator and a visualization subsystem. Such an approach may contribute to biomarker discovery by means of identifying risk regions for a g...

This paper aims to be a guide to understanding polynomial transformations and polynomial reductions between NP-complete problems by presenting the methodologies for polynomial reductions/transformations and the differences between reductions and transformations. To this end the article shows examples of polynomial reductions/transformations and the...

This work presents a linear multi-objective model in which a linear programming problem is solved at each iteration, where the magnitude of the restrictions is obtained from a neighbourhood of the convex search space of feasible solutions. The selection of an investment portfolio through a multiobjective mathematical model indicates that it has se...

We studied the photon spectrum in coated microspheres with alternating quasiperiodic layers having left-handed (LH) materials included. It is found that the band gap (spectral zone of nearly zero transmittancy) in such a system is strongly enhanced. At an increase of the quasiperiodicity parameter γ, the boundaries of the spectral band gaps acquire...

One of the challenges of applications of distributed database (DDB) systems is the possibility of expanding through the use of the Internet, so widespread nowadays. One of the most difficult problems in DDB systems deployment is distribution design. Additionally, existing models for optimizing the data distribution design have only aimed at optimiz...

This paper describes the results of data mining system developed ad hoc to address the problem of discovering patterns of interest in population databases for cancer. In particular, the experimental results obtained by the system are shown. The architecture of the system is innovative since it integrates a visual cartographic, a data warehouse and...