International Journal of Interactive Multimedia and Artificial Intelligence

Online ISSN: 1989-1660
Templates of (a) jogging, (b) running, (c) walking, and (d) jumping for human activities. 
Human activity recognition based on computer vision is the process of labelling image sequences with action labels. Accurate systems for this problem are applied in areas such as visual surveillance, human-computer interaction and video retrieval. The challenges stem from variations in motion, recording settings and gait differences. Here we propose an approach to recognizing human activities through gait. Activity recognition through gait is the process of identifying an activity by the manner in which a person walks. Identifying human activities in a video, such as a person walking, running, jumping or jogging, is important in video surveillance. We contribute a model-based approach to activity recognition that uses the movement of the legs only. Experimental results suggest that our method is able to recognize human activities with a good accuracy rate and is robust to shadows present in the videos.
Data fitting: the trajectories of the model against real-time data for the countries most affected by COVID-19.
Sensitivity analysis of the basic reproduction number R_0.
Time series: the trajectories of the model with extended time to forecast the future of the epidemic. The parameter values are the same as in Fig. 1 for the countries most affected by COVID-19.
The wide spread of coronavirus (COVID-19) has threatened millions of lives and damaged the economy worldwide. Due to the severity and damage caused by the disease, it is very important to foretell the epidemic lifetime in order to take timely actions. Unfortunately, the lack of accurate information and the unavailability of large amounts of data at this stage make the task more difficult. In this paper, we used the available data from the countries most affected by COVID-19 (China, Iran, South Korea and Italy) and fitted it with an SEIR-type model in order to estimate the basic reproduction number R_0. We also discussed the development trend of the disease. Our model is quite accurate in predicting the current pattern of the infected population. We also performed sensitivity analysis on all the parameters affecting the value of R_0.
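The SEIR fitting described above can be sketched in a few lines. The transition rates below (beta, sigma, gamma) and the initial conditions are illustrative assumptions, not the paper's fitted values; for the classic SEIR model the basic reproduction number is R_0 = beta/gamma.

```python
# Minimal SEIR sketch (forward-Euler integration); all parameter values
# here are illustrative assumptions, not fitted to any country's data.

def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the SEIR equations and return the final compartments."""
    n = s0 + e0 + i0 + r0                    # total population (conserved)
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n * dt  # S -> E
        new_infectious = sigma * e * dt      # E -> I
        new_removed = gamma * i * dt         # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
    return s, e, i, r

def basic_reproduction_number(beta, gamma):
    # For the classic SEIR model, R0 = beta / gamma.
    return beta / gamma
```

With R_0 above one, the simulated infectious compartment grows, which is the qualitative pattern the paper fits against the reported case counts.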
Proposed two-layer framework diagram.
COVID-19-CheXNet AUC Test. Blue curve represents test AUC for our first layer CNN model predictions. Red dashed line represents a model with an AUC of 0.5 and is used as reference.
COVID-19-CheXNet heatmap for a deceased patient.
The pandemic caused by coronavirus COVID-19 has already had a massive impact on our societies in terms of health, economy, and social distress. One of the most common symptoms caused by COVID-19 is lung problems like pneumonia, which can be detected using X-ray images. On the other hand, the popularity of Machine Learning models has grown exponentially in recent years, and Deep Learning techniques have become the state of the art for image classification tasks and are widely used in the healthcare sector nowadays as support for clinical decisions. This research aims to build a prediction model based on Machine Learning (including Deep Learning) techniques to predict the mortality risk of a particular patient given an X-ray and some basic demographic data. Keeping this in mind, this paper has three goals. First, we use Deep Learning models to predict the mortality risk of a patient based on that patient's X-ray images. For this purpose, we apply Convolutional Neural Networks as well as Transfer Learning techniques to mitigate the effect of the reduced amount of COVID-19 data available. Second, we propose to combine the prediction of this Convolutional Neural Network with other patient data, like gender and age, as input features of a final Machine Learning model that will act as the second and final layer. This second model layer aims to improve the goodness of fit and prediction power of the first layer. Finally, in accordance with the principle of reproducible research, the data used for the experiments is publicly available and we make the implementations developed easily accessible via public repositories. Experiments on a real dataset of COVID-19 patients yield high AUROC values and show that our two-layer framework obtains better results than a single Convolutional Neural Network (CNN) model, achieving close to perfect classification.
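The two-layer idea, feeding the CNN's risk score together with demographics into a final model, can be sketched as follows. The second layer here is a plain logistic regression trained on synthetic data; the feature names (CNN risk score, age, gender) follow the abstract, but the data and training setup are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch of the second-layer model: a logistic regression combining a
# first-layer CNN risk score with demographic features. Synthetic data.
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=1000):
    """Plain stochastic-gradient logistic regression (weights + bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)

# Each row: [cnn_risk_score, normalized_age, gender(0/1)] -- synthetic.
random.seed(0)
X, y = [], []
for _ in range(200):
    score = random.random()              # stand-in for the CNN output
    age = random.random()
    gender = random.randint(0, 1)
    # Synthetic ground truth: risk driven mostly by CNN score and age.
    y.append(1 if score + 0.5 * age > 0.9 else 0)
    X.append([score, age, gender])

w, b = train_logreg(X, y)
acc = sum((predict(w, b, xi) > 0.5) == bool(yi)
          for xi, yi in zip(X, y)) / len(X)
```

Because the synthetic label is a linear function of the features, the second layer separates it almost perfectly; in the paper this role is played by a model fit on real patient outcomes.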
The novel coronavirus-2019 (Covid-19), a contagious disease, became a pandemic and has had overwhelming effects on human lives and the world economy. Detection of the contagious disease is vital to avert further spread and to promptly treat infected people. The need for automated diagnostic methods to assist in identifying Covid-19 in infected people has increased, since the available automated diagnostic methods are less accurate. Recent studies based on radiology imaging suggest that the imaging patterns in X-ray images and Computed Tomography (CT) scans contain leading information about Covid-19 and are considered a potential basis for automated diagnosis. Machine learning and deep learning techniques combined with radiology imaging can be helpful for accurate detection of the disease. A deep learning approach based on a multilayer spatial Convolutional Neural Network for automatic detection of Covid-19 using chest X-ray images and CT scans is proposed in this paper. The proposed model, named the Multilayer Spatial Covid Convolutional Neural Network (MSCovCNN), provides accurate automated diagnostics for Covid-19 detection. The proposed model showed 93.63% detection accuracy and 97.88% AUC (Area Under Curve) for chest X-ray images, and 91.44% detection accuracy and 95.92% AUC for chest CT scans. We have used a 5-tiered 2D-CNN framework followed by an Artificial Neural Network (ANN) and a softmax classifier. In the CNN, each convolution layer is followed by an activation function and a max-pooling layer. The proposed model can be used to assist radiologists in detecting Covid-19 and confirming their initial screening.
Data scientists aim to provide techniques and tools to clinicians to manage the new coronavirus disease. Nowadays, new emerging tools based on Artificial Intelligence (AI), Image Processing (IP) and Machine Learning (ML) are contributing to the improvement of healthcare and the treatment of different diseases. This paper reviews the most recent research efforts and approaches related to these new data-driven techniques and tools in combination with the exploitation of the already available COVID-19 datasets. The tools can assist clinicians and nurses in efficient decision making with complex and heavily heterogeneous data, even in hectic and overburdened Intensive Care Unit (ICU) scenarios. The datasets and techniques underlying these tools can help find a more accurate diagnosis. The paper also describes how these innovative AI+IP+ML-based methods (e.g., conventional X-ray imaging, clinical laboratory data, respiratory monitoring and automatic adjustments, etc.) can assist in the process of easing both the care of infected patients in ICUs and Emergency Rooms and the discovery of new treatments (drugs).
Created bounding boxes of normal regions in a chest X-Ray image.
Detection boxes of SSD model inference output detecting normal lungs.
Comparison between original image and CLAHE applied images.
The Corona Virus Disease (COVID-19) is an infectious disease caused by a new virus that had not been detected in humans before. The virus causes a respiratory illness like the flu, with various symptoms such as cough or fever that, in severe cases, may cause pneumonia. COVID-19 spreads so quickly between people that it had affected 1,200,000 people worldwide at the time of writing this paper (April 2020). Because the numbers of infections and deaths are continually growing day by day, the aim of this study is to develop a quick method to detect COVID-19 in chest X-ray images using deep learning techniques. For this purpose, an object detection architecture is proposed, trained and tested with a publicly available dataset composed of 1500 images of patients not infected and patients infected with COVID-19 and pneumonia. The main goal of our method is to classify the patient status as either a negative or positive COVID-19 case. In our experiments using the SSD300 model we achieve a sensitivity of 94.92% and a specificity of 92.00% in COVID-19 detection, demonstrating the usefulness of deep learning models to classify COVID-19 in X-ray images.
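The reported sensitivity and specificity follow directly from the confusion matrix of the positive/negative decision; a minimal sketch (with toy labels, not the paper's dataset) is:

```python
# Sensitivity (true positive rate) and specificity (true negative rate)
# from binary labels and predictions. Toy data, for illustration only.

def sensitivity_specificity(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # fraction of COVID-19 cases detected
    specificity = tn / (tn + fp)   # fraction of negatives correctly cleared
    return sensitivity, specificity

# Toy example: 8 patients, 4 truly positive.
sens, spec = sensitivity_specificity([1, 1, 1, 1, 0, 0, 0, 0],
                                     [1, 1, 1, 0, 0, 0, 0, 1])
```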
Our article “Recognizing human activities user-independently on smartphones based on accelerometer data” was published in the International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI) in 2012. In 2018, it was selected as the most outstanding article published in the first 10 years of IJIMAI's life. To celebrate the 10th anniversary of IJIMAI, in this article we review what has happened in the field of human activity recognition and wearable sensor-based recognition since 2012 and, especially, concentrate on introducing our own work since then.
Two-Stage Classification of human single-limb and multi-limb activities.
Flow diagram of the proposed framework.
There is a huge requirement for continuous intelligent monitoring systems for human activity recognition in various domains like public places, automated teller machines or the healthcare sector. The increasing demand for automatic recognition of human activity in these sectors and the need to reduce the cost involved in manual surveillance have motivated the research community towards deep learning techniques, so that a smart monitoring system for the recognition of human activities can be designed and developed. Because of the low cost, high resolution and ease of availability of surveillance cameras, the authors developed a new two-stage intelligent framework for detection and recognition of human activity types inside the premises. This paper introduces a novel framework to recognize single-limb and multi-limb human activities using a Convolutional Neural Network. In the first phase, single-limb and multi-limb activities are separated. Next, these separated single-limb and multi-limb activities are recognized using sequence classification. For training and validation of our framework we have used the UTKinect-Action dataset, comprising 199 action sequences performed by 10 users. We have achieved an overall accuracy of 97.88% in real-time recognition of the activity sequences.
The 2nd configuration G2.
Visualization of the 2nd iteration.
Final iteration.
Visualization of the last iteration.
Textual rules list.
In the present paper we aim to study visual decision support based on the cellular machine CASI (Cellular Automata for Symbolic Induction). The purpose is to improve the visualization of large sets of association rules in order to build a clinical decision support system and decrease doctors' cognitive load. One of the major problems in processing association rules is the exponential growth of the volume of generated rules, which hampers doctors' adoption. To clarify this set of rules, many approaches meant to represent association rules in a visual context have been suggested. In this article we suggest jointly using the CASI cellular machine and colored 2D matrices to improve the visualization of association rules. Our approach is divided into four important phases: (1) data preparation, (2) extraction of association rules, (3) Boolean modeling of the rule base, and (4) 2D visualization colored by Boolean inferences.
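Phase (2), extracting association rules, can be illustrated with plain support/confidence counting over toy transactions; the items and thresholds below are illustrative assumptions for the sketch, not the CASI machinery itself.

```python
# Pairwise association rules by support and confidence. Toy transactions;
# min_support and min_confidence thresholds are illustrative assumptions.
from itertools import combinations

def association_rules(transactions, min_support=0.4, min_confidence=0.7):
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    rules = []
    for a, b in combinations(items, 2):
        for ant, cons in ((a, b), (b, a)):        # both directions
            support_ab = sum(1 for t in transactions
                             if ant in t and cons in t) / n
            support_ant = sum(1 for t in transactions if ant in t) / n
            if support_ant == 0:
                continue
            confidence = support_ab / support_ant  # P(cons | ant)
            if support_ab >= min_support and confidence >= min_confidence:
                rules.append((ant, cons,
                              round(support_ab, 2), round(confidence, 2)))
    return rules
```

Each surviving tuple (antecedent, consequent, support, confidence) is one rule; the volume of such tuples is exactly what the colored 2D matrices are meant to make digestible.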
Nowadays e-commerce websites offer users such a huge number of products that, far from facilitating the buying process, they actually make it more difficult. Hence, recommenders, which learn from users' preferences, are consolidating as valuable instruments to enhance the buying process in the 2D Web. Meanwhile, 3D virtual environments are an alternative interface for recommenders. They provide the user with an immersive 3D social experience, enabling richer visualisation and increasing the interaction possibilities with other users and with the recommender. In this paper, we focus on a novel framework to tightly integrate interactive recommendation systems in a 3D virtual environment. Specifically, we propose to integrate a Collaborative Conversational Recommender (CCR) in a 3D social virtual world. Our CCR Framework defines three layers: the user interaction layer (3D Collaborative Space Client), the communication layer (3D Collaborative Space Server), and the recommendation layer (Collaborative Conversational Recommender). Additionally, we evaluate the framework based on several usability criteria such as learnability, perceived efficiency and effectiveness. Results demonstrate that users positively valued the experience.
We construct a convolutional neural network to classify pulmonary nodules as malignant or benign in the context of lung cancer. To construct and train our model, we use our novel extension of the fastai deep learning framework to 3D medical imaging tasks, combined with the MONAI deep learning library. We train and evaluate the model using a large, openly available data set of annotated thoracic CT scans. Our model achieves a nodule classification accuracy of 92.4% and a ROC AUC of 97% when compared to a “ground truth” based on multiple human raters' subjective assessments of malignancy. We further evaluate our approach by predicting patient-level diagnoses of cancer, achieving a test set accuracy of 75%. This is higher than the 70% obtained by aggregating the human raters' assessments. Class activation maps are applied to investigate the features used by our classifier, enabling a rudimentary level of explainability for what are otherwise close to “black box” predictions. As the classification of structures in chest CT scans is useful across a variety of diagnostic and prognostic tasks in radiology, our approach has broad applicability. As we aimed to construct a fully reproducible system that can be compared to newly proposed methods and easily be adapted and extended, the full source code of our work is available at
This article offers a thorough analysis of machine learning approaches for the collected Received Signal Strength Indicator (RSSI) samples, which can be applied to predicting propagation loss and used for network planning to achieve maximum coverage. We estimated the RMSE of machine learning models on multivariate RSSI data collected from a cluster of 6 Base Transceiver Stations (BTS) across the hilly terrain of Uttarakhand, India. Variable attributes comprise topology, environment, and forest canopy. Four machine learning models have been investigated to identify the one with the least RMSE: Gaussian Process, Ensemble Boosted Tree, SVM, and Linear Regression. Gaussian Process performed best, with an RMSE, R-squared, MSE, and MAE of 1.96, 0.98, 3.8774, and 1.3202, respectively, as compared to the other models.
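The four error metrics compared in the study can be computed as follows; the toy RSSI values in the test are illustrative, not the collected samples.

```python
# RMSE, MSE, MAE and R-squared for a regression fit, computed from
# paired true/predicted values.
import math

def regression_metrics(y_true, y_pred):
    """Return (rmse, mse, mae, r2) for equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)   # total variance
    r2 = 1 - (mse * n) / ss_tot if ss_tot else float("nan")
    return rmse, mse, mae, r2
```

Note that a lower RMSE, MSE and MAE is better, while R-squared is better when closer to one, which is why the Gaussian Process's 0.98 accompanies its lowest error figures.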
Over the past years, the mobile Health approach has motivated research projects to develop mood monitoring systems for bipolar disorder. Whereas mobile-based approaches have examined self-assessment or sensor data, so far, potentially important emotional aspects of this disease have been neglected. Thus, we developed an emotion-sensitive system that analyzes the verbal and facial expressions of bipolar patients in regard to their emotional cues. In this article, preliminary findings of a pilot study with five bipolar patients with respect to the acceptability and feasibility of the new approach are presented and discussed. There were individual differences in the usage frequency of the participants, and improvements regarding its handling were suggested. From the technical point of view, the video analysis was less dependable than the audio analysis and recognized almost exclusively the facial expressions of happiness. However, the system was feasible and well-accepted. The results indicate that further developments could facilitate the long-term analysis of expressed emotions in bipolar or other disorders without invading the privacy of patients.
A basic form of an Extreme learning machine.
a). Retrieval results from Corel-1K dataset.
b). Retrieval results from Corel-5K dataset.
Comparison plot of the proposed technique with the related techniques on Corel-1K, Corel-5K and Corel-10K dataset.
Comparison plot of the proposed technique with the related techniques on GHIM-10 dataset.
The process of searching, indexing and retrieving images from a massive database is a challenging task, and the solution to these problems is an efficient image retrieval system. In this paper, a unique hybrid content-based image retrieval system is proposed in which different attributes of an image, such as texture, color and shape, are extracted using the gray-level co-occurrence matrix (GLCM), color moments and various region-properties procedures, respectively. A hybrid feature vector (HFV) is formed by integrating the feature vectors belonging to the three individual visual attributes. This HFV is given as input to an Extreme Learning Machine (ELM) classifier, a type of feed-forward neural network based on a single hidden layer of neurons. The ELM performs efficient class prediction of the query image based on the pre-trained data. Lastly, to capture high-level human semantic information, relevance feedback (RF) is utilized to retrain or reformulate the training of the ELM. The advantage of the proposed system is that the combined ELM-RF framework leads to a modified learning and intelligent classification system. To measure the efficiency of the proposed system, parameters like precision, recall and accuracy are evaluated. Average precision of 93.05%, 81.03%, 75.8% and 90.14% is obtained on the Corel-1K, Corel-5K, Corel-10K and GHIM-10 benchmark datasets, respectively. The experimental analysis shows that the implemented technique outmatches many state-of-the-art approaches for varied hybrid CBIR systems.
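A minimal ELM can be sketched in a few lines: the hidden layer is random and fixed, and only the output weights are solved, in closed form, with the Moore-Penrose pseudo-inverse. The layer sizes and toy data below are illustrative assumptions (the sketch assumes NumPy is available) and omit the RF retraining loop.

```python
# Minimal Extreme Learning Machine: one random hidden layer, output
# weights solved via the pseudo-inverse. Toy data, illustrative sizes.
import numpy as np

class ELM:
    def __init__(self, n_inputs, n_hidden, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_inputs, n_hidden))  # fixed random weights
        self.b = rng.normal(size=n_hidden)              # fixed random biases
        self.beta = np.zeros((n_hidden, n_classes))

    def _hidden(self, X):
        return np.tanh(X @ self.w + self.b)             # hidden activations H

    def fit(self, X, y):
        H = self._hidden(X)
        T = np.eye(self.beta.shape[1])[y]               # one-hot targets
        self.beta = np.linalg.pinv(H) @ T               # beta = H^+ T
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)
```

Because training is a single least-squares solve rather than gradient descent, re-training the ELM after each round of relevance feedback is cheap, which is what makes the ELM-RF combination practical.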
Recently, an increasing amount of research has focused on methods to assess and account for fairness criteria when predicting ground truth targets in supervised learning. However, recent literature has shown that prediction unfairness can potentially arise due to measurement error when target labels are error prone. In this study we demonstrate that existing methods to assess and calibrate fairness criteria do not extend to the true target variable of interest, when an error-prone proxy target is used. As a solution to this problem, we suggest a framework that combines two existing fields of research: fair ML methods, such as those found in the counterfactual fairness literature and measurement models found in the statistical literature. Firstly, we discuss these approaches and how they can be combined to form our framework. We also show that, in a healthcare decision problem, a latent variable model to account for measurement error removes the unfairness detected previously.
Many researchers have used sound sensors to record audio data from insects, and used these data as inputs of machine learning algorithms to classify insect species. In image classification, the convolutional neural network (CNN), a well-known deep learning algorithm, achieves better performance than any other machine learning algorithm. This performance is affected by the characteristics of the convolution filter (ConvFilter) learned inside the network. Furthermore, CNN performs well in sound classification. Unlike image classification, however, there is little research on suitable ConvFilters for sound classification. Therefore, we compare the performances of three convolution filters, 1D-ConvFilter, 3×1 2D-ConvFilter, and 3×3 2D-ConvFilter, in two different network configurations, when classifying mosquitoes using audio data. In insect sound classification, most machine learning researchers use only audio data as input. However, a classification model, which combines other information such as activity circadian rhythm, should intuitively yield improved classification results. To utilize such relevant additional information, we propose a method that defines this information as a priori probabilities and combines them with CNN outputs. Of the networks, VGG13 with 3×3 2D-ConvFilter showed the best performance in classifying mosquito species, with an accuracy of 80.8%. Moreover, adding activity circadian rhythm information to the networks showed an average performance improvement of 5.5%. The VGG13 network with 1D-ConvFilter achieved the highest accuracy of 85.7% with the additional activity circadian rhythm information.
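Combining CNN outputs with activity circadian rhythm information as a priori probabilities amounts to Bayes' rule: multiply the network's softmax output by the prior for the recording time and renormalize. The species names and numbers below are illustrative assumptions.

```python
# Posterior ∝ prior × CNN softmax output, renormalized. Toy numbers.

def combine_with_priors(cnn_probs, priors):
    """Element-wise product of CNN outputs and priors, renormalized."""
    weighted = [p * q for p, q in zip(cnn_probs, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

# The CNN slightly favours species A, but at the recording hour
# species B is far more active (circadian prior).
cnn_probs = [0.55, 0.45]   # [species_A, species_B] softmax output
priors    = [0.20, 0.80]   # activity circadian rhythm at recording time
posterior = combine_with_priors(cnn_probs, priors)
```

Here the circadian prior flips the decision from species A to species B, which is the mechanism behind the reported accuracy gain when the additional information is added.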
Students’ acquisition of teamwork competence has become a priority for educational institutions. The development of teamwork competence in education generally relies on project-based learning methodologies and challenges. The assessment of teamwork in project-based learning involves, among others, assessing students’ participation and the interactions between team members. Project-based learning can easily be handled in small courses, but course management and teamwork assessment become a burdensome task for instructors as the size of the class increases. Additionally, when project-based learning happens in a virtual space, such as online learning, interactions occur in a less natural way. This study explores the use of instant messaging apps (more precisely, Telegram) as the team communication space in project-based learning, using a learning analytics tool to extract and analyze student interactions. Further, the study compares student interactions (e.g., number of messages exchanged) and individual teamwork competence acquisition between traditional asynchronous (e.g., LMS message boards) and synchronous instant messaging communication environments. The results show a preference among students for IM tools and increased participation in the course. However, the analysis does not find significant improvement in the acquisition of individual teamwork competence.
The transformation to the Digital Society presents a challenge to engineer ever more complex socio-technical systems in order to address wicked societal problems. Therefore, it is essential that these systems should be engineered with respect not just to conventional functional and non-functional requirements, but also with respect to satisfying qualitative human values, and assessing their impact on global challenges, such as those expressed by the UN Sustainable Development Goals (SDGs). In this paper, we present a set of design principles and an associated meta-platform, which focus the design of socio-technical systems on the potential interaction of human and artificial intelligence with respect to three aspects: firstly, decision support with respect to the codification of deep social knowledge; secondly, visualisation of community contribution to successful collective action; and thirdly, systemic improvement with respect to the SDGs through impact assessment and measurement. This methodology, SDG-Sensitive Design, is illustrated through the design of two collective action apps, one for encouraging plastic re-use and reducing plastic waste, and the other for addressing the redistribution of surplus food. However, as with the inter-connectedness of the SDGs, we conclude by arguing that the inter-connectedness of the Digital Society implies that system development cannot be undertaken in isolation from other systems.
Box and whiskers plot per declaration.
Frequencies of responses per declaration.
Box and whiskers plot for times spent by users per profile.
Box and whiskers plot for total score in Likert survey per profile.
In recent years, many investigations have appeared that combine the Internet of Things and Social Networks. Some of them addressed the interconnection of objects in the way Social Networks interconnect people, and others addressed the connection between objects and people. However, they usually used interfaces created for that purpose instead of interfaces already familiar to users. Why not integrate Smart Objects into traditional Social Networks? Why not control Smart Objects through natural interactions in Social Networks? The goal of this paper is to make it easier to create applications that allow non-expert users to control Smart Object actuators through Social Networks, by proposing a novel approach to connect objects and people using Social Networks. This proposal addresses how to use Twitter so that objects can perform actions based on Twitter users’ posts. Moreover, a Domain-Specific Language is presented that can help in the task of defining the actions that objects perform when people publish specific content on Twitter.
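The flavor of such a Domain-Specific Language can be sketched with a toy grammar that maps a hashtag (optionally restricted to a user) to a Smart Object action. The rule syntax, object names and actions below are our own illustrative assumptions, not the paper's DSL.

```python
# Toy DSL: "when #hashtag [from @user] do object.action". Illustrative only.
import re

RULE = re.compile(r"when\s+(#\w+)(?:\s+from\s+(@\w+))?\s+do\s+(\w+)\.(\w+)")

def parse_rules(dsl_text):
    """Parse rules like: when #lightson from @alice do lamp.turn_on"""
    rules = []
    for line in dsl_text.strip().splitlines():
        m = RULE.match(line.strip())
        if m:
            hashtag, user, obj, action = m.groups()
            rules.append({"hashtag": hashtag, "user": user,
                          "object": obj, "action": action})
    return rules

def actions_for_tweet(rules, text, author):
    """Return the (object, action) pairs a tweet triggers."""
    return [(r["object"], r["action"]) for r in rules
            if r["hashtag"] in text and r["user"] in (None, author)]
```

A rule without a `from @user` clause fires for any author, which mirrors the paper's idea of objects reacting to specific published content rather than to purpose-built interfaces.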
Today, people are increasingly capable of creating and sharing documents (which are generally multimedia oriented) via the internet. These multimedia documents can be accessed at any time and anywhere (city, home, etc.) on a wide variety of devices, such as laptops, tablets and smartphones. The heterogeneity of devices and user preferences has raised a serious issue for multimedia content adaptation. Our research focuses on multimedia document adaptation with a strong focus on interaction with users and exploration of multimodality. We propose a multimodal framework for adapting multimedia documents based on a distributed implementation of the W3C’s Multimodal Architecture and Interfaces applied to ubiquitous computing. The core of our proposed architecture is the presence of a smart interaction manager that accepts context-related information from sensors in the environment as well as from other sources, including information available on the web and multimodal user inputs. The interaction manager integrates and reasons over this information to predict the user’s situation and service use. A key to realizing this framework is the use of an ontology that undergirds the communication and representation, and the use of the cloud to ensure service continuity on heterogeneous mobile devices. A smart city is assumed as the reference scenario.
In this paper, we propose the global architecture of a recommender tool which forms part of an existing collaborative platform. This tool provides diagnostic documents for industrial operators. The recommendation process considered here is composed of three steps: collecting and filtering information; prediction or recommendation; and evaluation and improvement. In this work, we focus on the collecting and filtering step. We mainly use information resulting from collaborative sessions and documents describing solutions attributed to complex diagnostic problems. The developed tool is based on collaborative filtering that operates on users' preferences and similar responses.
Severe weather conditions such as rain and snow often reduce the visual perception quality of video imaging systems, and traditional methods of deraining and desnowing rarely consider adaptive parameters. In order to enhance the effect of video deraining and desnowing, this paper proposes a video deraining and desnowing algorithm based on adaptive tolerance and the dual-tree complex wavelet. This algorithm can be widely used in security surveillance, military defense, biological monitoring, remote sensing and other fields. First, this paper introduces the main work of the adaptive tolerance method for videos of dynamic scenes. Second, the dual-tree complex wavelet fusion algorithm is analyzed and introduced: principal component analysis fusion rules are used to process the low-frequency sub-bands, and the fusion rule of local energy matching is used to process the high-frequency sub-bands. Finally, this paper uses various rain and snow videos to verify the validity and superiority of the image reconstruction. Experimental results show that the algorithm achieves good results in improving image clarity and restoring image details obscured by raindrops and snow.
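The paper's dual-tree complex wavelet fusion is too involved for a short sketch, but the intuition it exploits (rain and snow are transient while the background is stable) can be shown with a simple per-pixel temporal median baseline, which is our simplification, not the authors' algorithm:

```python
# Per-pixel temporal median over a stack of grayscale frames (lists of
# lists). Transient bright pixels such as raindrops are suppressed
# because they appear in only a minority of frames.

def temporal_median(frames):
    """Median of each pixel across frames; frames share one size."""
    h, w = len(frames[0]), len(frames[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = sorted(f[y][x] for f in frames)
            row.append(values[len(values) // 2])
        out.append(row)
    return out

# Three frames of a flat gray scene; one bright transient "raindrop".
frames = [[[100, 100], [100, 100]],
          [[100, 255], [100, 100]],   # raindrop at (0, 1) in frame 2
          [[100, 100], [100, 100]]]
clean = temporal_median(frames)
```

The adaptive tolerance and wavelet fusion in the paper refine this idea so that genuinely moving scene content is not smeared away, which a plain median would do.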
Adjustment factors found by processing Weibull clutter with a 64-cell CA-CFAR.
Adjustment factors extracted from the response of a 32-cell CA-CFAR to K and Log-Normal clutter.
Adjustment factors found by processing Weibull clutter samples with different window sizes.
Curve fittings for a 64-cell CA-CFAR operating at P_f = 10^-3 and facing Log-Normal clutter.
The operation of oceanic and coastal radars is affected because target information is received mixed with an undesired contribution called sea clutter. Specifically, the popular CA-CFAR processor is incapable of maintaining its design false alarm probability when facing clutter with statistical variations. In opposition to the classic alternative of using a fixed adjustment factor, the authors propose a modification of the CA-CFAR scheme in which the factor is constantly corrected according to the statistical changes of the background signal. Mathematically translated as a variation in the shape parameter of the clutter distribution, the background signal changes were simulated through the Weibull, Log-Normal and K distributions, deriving expressions which allow choosing an appropriate factor for each possible statistical state. The investigation contributes to the improvement of radar detection by suggesting the application of an adaptive scheme which assumes the clutter shape parameter is known a priori. The offered mathematical expressions are valid for three false alarm probabilities and several window sizes, also covering a wide range of clutter conditions.
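The CA-CFAR scheme itself is compact: the detection threshold is the reference-cell average scaled by an adjustment factor. For exponentially distributed clutter power the classic factor is alpha = N * (Pf^(-1/N) - 1); the adaptive scheme proposed above re-computes the factor as the clutter shape parameter changes. The sketch below uses the exponential-clutter factor and a toy signal as assumptions, not the paper's Weibull/Log-Normal/K expressions.

```python
# Minimal CA-CFAR: for each cell under test (CUT), average the reference
# cells on both sides (skipping guard cells), scale by the adjustment
# factor, and declare a detection if the CUT exceeds the threshold.

def ca_cfar(signal, n_ref=8, n_guard=2, pf=1e-3):
    n = n_ref                               # total reference cells
    alpha = n * (pf ** (-1.0 / n) - 1)      # factor for exponential clutter
    half = n_ref // 2
    detections = []
    for i in range(half + n_guard, len(signal) - half - n_guard):
        lead = signal[i - n_guard - half : i - n_guard]       # left refs
        lag = signal[i + n_guard + 1 : i + n_guard + 1 + half]  # right refs
        noise = sum(lead + lag) / n         # cell-averaged clutter power
        if signal[i] > alpha * noise:
            detections.append(i)
    return detections
```

Replacing the fixed `alpha` with one looked up from the estimated shape parameter of the current clutter distribution is precisely the modification the abstract proposes.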
The study of belief is expanding and involves a growing set of disciplines and research areas. These research programs attempt to shed light on the process of believing, understood as a central human cognitive function. Computational systems and, in particular, what we commonly understand as Artificial Intelligence (AI), can provide some insights on how beliefs work as either a linear process or as a complex system. However, the computational approach has undergone some scrutiny, in particular about the differences between what is distinctively human and what can be inferred from AI systems. The present article investigates to what extent recent developments in AI provide new elements to the debate and clarify the process of belief acquisition, consolidation, and recalibration. The article analyses and debates current issues and topics of investigation such as: different models to understand belief, the exploration of belief in an automated reasoning environment, the case of religious beliefs, and future directions of research.
Thermal imagery advantages under night and spoofing-with-disguise scenarios. (Top) LWIR thermal, (bottom) visible: (a, f) normal, (b, g) dark, (c, h) disguise with goggles, (d, i) disguise with mask, (e, j) disguise with wig.
Qualitative comparative study regarding NVIE database for the visible face synthesis from LWIR face images: (a) Plain Thermal, (b) Pix2Pix [55], (c) TV-GAN [36], (d) CycleGAN [56], (e) TV-CycleGAN (Ours), (f) Target Visible.
Enlarged regions of facial attributes, eyes and eyebrows, nose and mouth from Fig. 7 to compare TV-CycleGAN against its main competitor CycleGAN: (a) Plain Thermal. (b) CycleGAN (c) TV-CycleGAN, (d) Target Visible.
LWIR to visible translation using TV-CycleGAN for scenarios including extreme poses ((a) and (b)), facial expression (c) and glasses (d). First row: raw thermal; second row: TV-CycleGAN transformation; third row: target visible.
Security is a sensitive area that concerns all authorities around the world due to the emerging terrorism phenomenon. Contactless biometric technologies such as face recognition have grown in interest for their capacity to identify probe subjects without any human interaction. Since traditional face recognition systems use visible spectrum sensors, their performance decreases rapidly when some visible imaging phenomena occur, mainly illumination changes. Unlike the visible spectrum, infrared spectra are invariant to light changes, which makes them an alternative solution for face recognition. However, in infrared, textural information is lost. We aim, in this paper, to benefit from the visible and thermal spectra by proposing a new heterogeneous face recognition approach. This approach includes four scientific contributions. The first one is the annotation of a thermal face database, which has been shared via GitHub with the whole scientific community. The second is the proposal of a multi-sensor face detector model based on the latest YOLO v3 architecture, able to simultaneously detect faces captured in visible and thermal images. The third contribution takes up the challenge of modality gap reduction between the visible and thermal spectra by applying a new structure of CycleGAN, called TV-CycleGAN, which aims to synthesize visible-like face images from thermal face images. This new thermal-visible synthesis method covers all extreme poses and facial expressions in color space. To show the efficacy and robustness of the proposed TV-CycleGAN, experiments have been applied on three challenging benchmark databases, covering different real-world scenarios: TUFTS and its aligned version, NVIE and PUJ. The qualitative evaluation shows that our method generates more realistic faces. The quantitative one demonstrates that the proposed TV-CycleGAN gives the best improvement in face recognition rates.
Thus, whereas direct matching from thermal to visible images yields a recognition rate of 47.06% on the TUFTS database, the proposed TV-CycleGAN achieves an accuracy of 57.56% on the same database. It contributes rate enhancements of 29.16% and 15.71% for the NVIE and PUJ databases, respectively, and an accuracy enhancement of 18.5% for the aligned TUFTS database. It also outperforms some recent state-of-the-art methods in terms of F1-score, AUC/EER, and other evaluation metrics. Furthermore, the visible synthesized face images obtained with the TV-CycleGAN method are very promising for thermal facial landmark detection, which constitutes the fourth contribution of this paper.
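A minimal sketch of the cycle-consistency objective that underlies CycleGAN-style thermal-to-visible translation, as described above. The toy "generators" below are hypothetical linear stand-ins, not the paper's networks, and images are flattened pixel lists for simplicity.

```python
# Hedged sketch: the cycle-consistency idea behind CycleGAN-style
# thermal-to-visible translation. G maps thermal -> visible, F maps
# visible -> thermal; the cycle loss penalises F(G(x)) drifting from x.
# The toy "generators" below are hypothetical stand-ins, not the
# paper's trained networks.

def l1(a, b):
    """Mean absolute error between two equal-length pixel vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(G, F, thermal, visible, lam=10.0):
    """lam * (||F(G(t)) - t||_1 + ||G(F(v)) - v||_1), as in CycleGAN."""
    return lam * (l1(F(G(thermal)), thermal) + l1(G(F(visible)), visible))

# Toy inverse pair: G doubles intensities, F halves them, so the
# cycle loss is exactly zero.
G = lambda img: [2.0 * p for p in img]
F = lambda img: [0.5 * p for p in img]

loss = cycle_consistency_loss(G, F, [0.1, 0.4, 0.9], [0.2, 0.6, 0.8])
print(loss)  # 0.0 for a perfectly invertible pair
```

In the full model this term is added to the adversarial losses of the two discriminators; the sketch isolates only the cycle term.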
In view of the shortcomings of traditional thinking in computer graphic advertising design, this paper introduces TRIZ innovative thinking into computer advertising design. First, combined with specific cases of computer creative print advertising, the paper takes the innovation principles of TRIZ theory as its origin to analyze creative methods that stimulate divergent thinking, aggregation thinking, and transformation thinking, and applies them to the creative mechanism and application program of print advertising. The whole process is led by rational principles of perceptual thinking and driven by specific principles of abstract imagination, exploring the source of thinking at the essence of creative print advertising design. The theory and its application mechanism thus become a new thinking method and a new application attempt in the creative field of print advertising. Then, based on TRIZ innovation theory, a business model of advertising content arrangement is constructed, and a mathematical model for planning business media resources is built on top of it, realizing multi-objective optimization of efficient order use and precise delivery timing. Finally, a parallel genetic algorithm is designed to solve the multi-objective optimization model for advertisement content arrangement. Experiments verify the innovative thinking of TRIZ and the application of the genetic algorithm to content arrangement in computer graphic advertising design.
Dependency between aesthetics and usability considering the impact of visual clarity.
Dependency between usability and aesthetics considering the impact of visual clarity.
Search result pages for the versions with high visual clarity (top) and low visual clarity (bottom). Manipulation was done by changing alignment, adding and removing elements and using structuring elements like boxes.
Dependency between usability and aesthetics considering the impact of visual clarity (regression coefficients all significantly > 0, p < 0.01).
Several studies have reported a dependency between the perceived beauty and perceived usability of a user interface, but it is still not fully clear which psychological mechanism is responsible for this dependency. We suggest a new explanation based on the concept of visual clarity, which describes the perception of order, alignment, and visual complexity. High visual clarity supports fast orientation on an interface and creates an impression of simplicity. Thus, visual clarity will impact usability dimensions such as efficiency and learnability. Visual clarity is also related to classical aesthetics and the fluency effect, so an impact on the perception of aesthetics is plausible. We present two large studies that show a strong mediator effect of visual clarity on the dependency between perceived aesthetics and perceived usability. These results support the proposed explanation. In addition, we show how the visual clarity of a user interface can be evaluated with a new scale embedded in the UEQ+ framework, and describe the construction and first evaluation results of this new scale.
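A mediator effect of this kind is commonly checked with a Baron-Kenny style pair of regressions: if the aesthetics-to-usability coefficient shrinks once visual clarity enters the model, clarity mediates the relation. The sketch below illustrates that check with a tiny least-squares solver; the rating data are synthetic, not the study's, and the study itself may use a different mediation procedure.

```python
# Hedged sketch of a Baron-Kenny style mediation check: compare the
# aesthetics -> usability coefficient with and without visual clarity
# in the regression. Synthetic data, illustrative only.

def ols(y, *xs):
    """Least-squares coefficients for y ~ intercept + xs (normal equations)."""
    n = len(y)
    cols = [[1.0] * n] + [list(x) for x in xs]
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]; b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            A[r] = [a - f * c for a, c in zip(A[r], A[i])]
            b[r] -= f * b[i]
    coef = [0.0] * k
    for i in reversed(range(k)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, k))) / A[i][i]
    return coef

# Synthetic example where clarity fully drives both ratings.
clarity    = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
aesthetics = [c + d for c, d in zip(clarity, [0.1, -0.1, 0.2, -0.2, 0.1, -0.1])]
usability  = [2.0 * c for c in clarity]

c_total  = ols(usability, aesthetics)[1]          # total effect
c_direct = ols(usability, aesthetics, clarity)[1] # effect controlling clarity
print(c_total, c_direct)  # the direct effect shrinks toward zero
```

A full mediation analysis would also test the significance of the indirect path, which this toy omits.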
Facial expression is an essential part of communication. For this reason, evaluating human emotions with a computer is a very interesting topic that has gained more and more attention in recent years. It is mainly driven by the possibility of applying facial expression recognition in many fields, such as HCI, video games, virtual reality, and analysis of customer satisfaction. Emotion recognition is typically performed in three basic phases: face detection, facial feature extraction, and, as the last stage, expression classification. The most common scheme is Ekman's classification of six emotional expressions (or seven, including the neutral expression); other classifications include the Russell circumplex model, which contains up to 24 emotions, and Plutchik's Wheel of Emotions. The methods used in the three phases of the recognition process have not only improved over the last 60 years; new methods and algorithms have also emerged that surpass the Viola-Jones detector in accuracy with lower computational demands. Consequently, various solutions are currently available in the form of Software Development Kits (SDKs). In this publication, we present the design and creation of our system for real-time emotion classification. Our intention was to create a system that covers all three phases of the recognition process and works fast and stably in real time, which is why we decided to take advantage of the existing Affectiva SDK. Using a standard webcam, we detect facial landmarks in the image automatically with the Affectiva SDK. A geometric-feature-based approach is used for feature extraction: the distances between landmarks serve as features, and a brute-force method selects an optimal feature set. The proposed system uses a neural network algorithm for classification.
The proposed system recognizes six (respectively seven) facial expressions, namely anger, disgust, fear, happiness, sadness, surprise, and neutral. We do not want to report only the success rate of our solution; we also want to describe how these measurements were determined, the results we achieved, and how these results have significantly influenced our future research direction.
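The geometric feature extraction described above can be sketched as the set of pairwise Euclidean distances between detected landmarks. The landmark coordinates below are made up for illustration; in the system they come from the Affectiva SDK, and the brute-force step would then search over subsets of these distances.

```python
# Hedged sketch of the geometric-feature step: pairwise Euclidean
# distances between facial landmarks serve as features. The landmark
# coordinates here are illustrative, not SDK output.
from itertools import combinations
from math import dist  # Python 3.8+

def landmark_distances(landmarks):
    """All pairwise distances between (x, y) landmark points."""
    return [dist(p, q) for p, q in combinations(landmarks, 2)]

# Four toy landmarks (corners of a unit square).
features = landmark_distances([(0, 0), (1, 0), (0, 1), (1, 1)])
print(len(features))  # n*(n-1)/2 = 6 distances for 4 landmarks
```

With n landmarks this yields n(n-1)/2 features, which is why a selection step such as the brute-force search mentioned above becomes necessary.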
Performance creativity can be evaluated through affective data, and the use of affective features to evaluate performance creativity is a new research trend. This paper proposes a "Performance Creative-Multimodal Affective (PC-MulAff)" model based on multimodal affective features for performance creativity evaluation. Multimedia data acquisition equipment is used to collect physiological data from the audience, including multimodal affective data such as facial expression, heart rate, and eye movement. Affective features of the multimodal data are calculated and combined with director annotations, and "Performance Creative-Affective Acceptance (PC-Acc)", based on the multimodal affective features, is defined to evaluate the quality of performance creativity. The PC-MulAff model is verified on different performance data sets. The experimental results show that the PC-MulAff model achieves high evaluation quality across different performance forms. In the creative evaluation of dance performance, the accuracy of the model is 7.44% and 13.95% higher than that of single-textual and single-video evaluation, respectively.
Utilizing biomedical signals as a basis for computing human affective states is an essential issue in affective computing (AC). With in-depth research on affective signals, the combination of multi-modal cognition and physiological indicators, the establishment of dynamic and complete databases, and the addition of high-tech innovative products have become recent trends in AC. This research aims to develop a deep gradient convolutional neural network (DGCNN) for classifying affect using eye-tracking signals. General signal processing tools and pre-processing methods were applied first, such as the Kalman filter, windowing with a Hamming window, the short-time Fourier transform (STFT), and the fast Fourier transform (FFT). Second, the eye-movement and tracking signals were converted into images, and a convolutional neural network-based training structure was subsequently applied; the experimental dataset was acquired with an eye-tracking device by presenting four affective stimuli (nervous, calm, happy, and sad) to 16 participants. Finally, the performance of the DGCNN was compared with a decision tree (DT), a Bayesian Gaussian model (BGM), and k-nearest neighbors (KNN) using the true positive rate (TPR) and false positive rate (FPR) as indices. Customized mini-batch, loss, learning-rate, and gradient definitions for the training structure of the deep neural network were also deployed. The predictive classification matrix showed the effectiveness of the proposed method for eye-movement and tracking signals, which achieves an accuracy of more than 87.2%. This research provides a feasible way to achieve more natural human-computer interaction through eye-movement and tracking signals, with potential application in the affective product design process.
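The windowing-plus-STFT stage of the pre-processing chain above can be sketched as follows. The frame length, hop size, and the toy sinusoidal "eye-movement" trace are illustrative choices, not the paper's settings, and a naive DFT stands in for an FFT to keep the sketch dependency-free.

```python
# Hedged sketch of the Hamming-windowed short-time Fourier transform
# used in the pre-processing chain. Frame length and hop size are
# illustrative, and a direct DFT stands in for the FFT.
import cmath, math

def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def stft(signal, frame_len=8, hop=4):
    """Magnitude spectrogram: one DFT per Hamming-windowed frame."""
    win = hamming(frame_len)
    frames = [signal[s:s + frame_len]
              for s in range(0, len(signal) - frame_len + 1, hop)]
    return [[abs(x) for x in dft([w * s for w, s in zip(win, f)])]
            for f in frames]

# A toy "eye-movement" trace: one cycle per 8 samples.
sig = [math.sin(2 * math.pi * t / 8) for t in range(32)]
spec = stft(sig)
print(len(spec), len(spec[0]))  # (number of frames, frequency bins)
```

Each row of `spec` is one time frame; in the paper these frames are subsequently rendered as images for the CNN.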
Steps of the vote.  
Finding solutions algorithm.  
Sample of source cases.  
Number of relevant and irrelevant solutions found by each measure.  
In the spunlace nonwovens industry, the maintenance task is very complex and requires collaboration between experts and operators. In this paper, we propose a new approach integrating agent-based modelling with case-based reasoning that utilizes similarity measures and a preferences module. The main purpose of our study is to compare the candidate similarity measures and evaluate which is the most suitable for our case. Furthermore, operators, who are usually geographically dispersed, have to collaborate and negotiate to achieve mutual agreements, especially when their proposed diagnoses lead to a conflicting situation. The experimentation shows that the suggested agent-based approach is very interesting and efficient for the operators and experts who collaborate in the INOTIS enterprise.
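One family of similarity measures that such a case-based reasoner can compare is the weighted nearest-neighbour score over case attributes, sketched below. The attribute names, values, and weights are illustrative placeholders, not INOTIS maintenance data.

```python
# Hedged sketch of a weighted nearest-neighbour similarity measure for
# case retrieval in case-based reasoning. Attributes and weights are
# hypothetical, chosen only to illustrate the mechanism.

def local_sim(a, b):
    """Similarity of two attribute values: exact match for symbols,
    1 - normalised distance for numbers."""
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0
    return 1.0 - abs(a - b) / max(abs(a), abs(b), 1.0)

def case_similarity(query, case, weights):
    """Weighted average of per-attribute local similarities."""
    total = sum(weights.values())
    return sum(w * local_sim(query[k], case[k])
               for k, w in weights.items()) / total

query = {"symptom": "vibration", "temperature": 80}
case  = {"symptom": "vibration", "temperature": 60}
score = case_similarity(query, case, {"symptom": 2.0, "temperature": 1.0})
print(round(score, 3))
```

Swapping `local_sim` for other local measures (e.g. Manhattan- or correlation-based ones) is what a comparison of similarity measures like the one above would evaluate.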
Collect the task-relevant data.
Initial contrasting class relation and initial target class relation.
In the context of a data-driven approach aimed at detecting the real factors responsible for the transmission of diseases and explaining their emergence or re-emergence, we suggest SOLAM (Spatial On-Line Analytical Mining), an extension of Spatial On-Line Analytical Processing (SOLAP) with Spatial Data Mining (SDM) techniques. Our approach consists of integrating the EPISOLAP system, tailored for epidemiological surveillance, with a spatial generalization method allowing predictive evaluation of health risk in the presence of hazards and awareness of the vulnerability of the exposed population. The proposed architecture is a single integrated decision-making platform for knowledge discovery from spatial databases. Spatial generalization methods allow exploring the data at different semantic and spatial scales while reducing unnecessary dimensions. The principle of the method is to select and delete attributes of low importance in data characterization, thus producing zones of homogeneous characteristics that will be merged.
During recent years, electrical systems around the world, and the Spanish electric sector in particular, have undergone great changes aimed at turning them into more liberalized and competitive markets. As a result, electricity markets have appeared in many countries, such as Spain, in which producers sell and electricity retailers buy the power we consume. All agents involved in these markets need predictions of generation, demand, and especially prices in order to participate in them more efficiently and obtain greater profits. The present work is set in the context of developing a tool that predicts the price of electricity for the next day as precisely as possible. To this end, this document analyzes the electricity market to understand how prices are calculated and which agents can make prices vary. Traditional proposals in the literature range from game theory to machine learning, time series analysis, and simulation models. In this work, we analyze a normalization of the target variable, motivated by its strong hourly and daily seasonal components, and then benchmark several machine learning models: ridge regression, k-nearest neighbors, support vector machines, neural networks, and random forest. After observing that the best model is random forest, we discuss the appropriateness of the normalization for this algorithm. This analysis shows that the best-performing model is random forest without the normalization function, because normalization loses the close relationship between the target variable and electricity demand; it obtains an average absolute error of €3.92 over the whole of 2016.
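The seasonal normalization discussed above can be sketched as dividing each price by the mean of its hour-of-day slot, removing the hourly component before model fitting. The price series below is synthetic, and the study also handles a daily component that this sketch omits.

```python
# Hedged sketch of hourly seasonal normalisation: divide each price by
# the mean price of its hour-of-day slot. Synthetic prices; the study
# normalises both hourly and daily components.

def hourly_normalise(prices):
    """prices: list of (hour, price) pairs. Returns each price divided
    by the mean price of its hour slot."""
    sums, counts = {}, {}
    for h, p in prices:
        sums[h] = sums.get(h, 0.0) + p
        counts[h] = counts.get(h, 0) + 1
    means = {h: sums[h] / counts[h] for h in sums}
    return [p / means[h] for h, p in prices]

series = [(0, 40.0), (1, 60.0), (0, 44.0), (1, 54.0)]
print(hourly_normalise(series))  # ratios around 1.0 within each hour slot
```

Because the normalized target no longer tracks absolute demand levels, a tree ensemble such as random forest can lose predictive signal, which is consistent with the finding reported above.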
Examining AI spirituality can illuminate problematic assumptions about human spirituality and AI cognition, suggest possible directions for AI development, reduce uncertainty about future AI, and yield a methodological lens sufficient to investigate human-AI sociotechnical interaction and morality. Incompatible philosophical assumptions about human spirituality and AI limit investigations of both and suggest a vast gulf between them. An emergentist approach can replace dualist assumptions about human spirituality and identify emergent behavior in AI computation to overcome overly reductionist assumptions about computation. Using general systems theory to organize models of human experience yields insight into human morality and spirituality, upon which AI modeling can also draw. In this context, the pragmatist Josiah Royce’s semiotic philosophy of spirituality identifies unanticipated overlap between symbolic AI and spirituality and suggests criteria for a human-AI community focused on modeling morality that would result in an emergent Interpreter-Spirit sufficient to influence the ongoing development of human and AI morality and spirituality.
(a) Original mammogram with abnormal region (mass); (b) five-level Gauss pyramid; (c) reduced filtered image.
FCM clustered images on original reduced mammograms (c=2..10).
IMCAD Receiver Operating Characteristic Curve.
Examples of IMCAD segmentation images (b) and detection results (c) on original images (a). Comparison by superposition on annotated masks of Inbreast Database (d).
IMCAD results on normal mammograms.
Computer Aided Detection (CAD) systems are very important tools that help radiologists, as a second reader, to detect early breast cancer in an efficient way, especially on screening mammograms. One of the challenging problems is the detection of masses, which are powerful signs of cancer, because of their poor appearance on mammograms. This paper investigates an automatic CAD system for the detection of breast masses in screening mammograms based on fuzzy segmentation and a bio-inspired pattern recognition method: the Artificial Immune Recognition System. The proposed approach is applied to real clinical images from the full-field digital mammographic database INbreast. To validate our proposition, we use the Receiver Operating Characteristic curve to analyze our IMCAD classifier, which achieves a good area under the curve, with a sensitivity of 100% and a specificity of 95%. The recognition system based on artificial immunity has shown its efficiency in recognizing masses from a very restricted set of training regions.
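The fuzzy segmentation step above is based on fuzzy c-means (FCM) clustering, whose core alternation of membership and centroid updates can be sketched on plain 1-D pixel intensities. The intensity values, c = 2, and the min/max initialisation are illustrative simplifications of the paper's setting.

```python
# Hedged sketch of fuzzy c-means (FCM) clustering on pixel intensities:
# alternate fuzzy membership updates and weighted centroid updates.
# 1-D intensities and c = 2 keep the toy small; real segmentation runs
# on full mammogram images.

def fcm(values, c=2, m=2.0, iters=50):
    centers = [min(values), max(values)]  # simple spread-out initialisation
    for _ in range(iters):
        # Membership u[t][i] of value t in cluster i (fuzzifier m).
        u = []
        for v in values:
            d = [abs(v - ck) + 1e-12 for ck in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1))
                                for j in range(c)) for i in range(c)])
        # Membership-weighted centroid update.
        centers = [sum((u[t][i] ** m) * values[t] for t in range(len(values)))
                   / sum(u[t][i] ** m for t in range(len(values)))
                   for i in range(c)]
    return sorted(centers)

# Two intensity populations: dark background vs. bright mass pixels.
pixels = [0.1, 0.12, 0.09, 0.11, 0.85, 0.88, 0.9, 0.87]
print(fcm(pixels))  # centers settle near the two intensity groups
```

In the CAD pipeline the resulting fuzzy regions are then passed to the immune-system-based classifier rather than thresholded directly.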
The point-to-point architecture of air routes.
The Hub-and-spoke architecture of air routes.
The convergence plot of the proposed model.
The non-constrained-domination model (variation of the number of airways vs. the number of flight changes vs. the average travel length in km) at the 500th generation.
Air route network optimization, one of the challenges of airspace planning, effectively manages airspace resources to increase airspace capacity and reduce air traffic congestion. In this paper, the structure of the air transport flight network is analyzed with a multi-objective genetic algorithm supported by a Geographic Information System (GIS), which is used to optimize the Iran airline topology in order to reduce the number of airways and the aggregation of passengers, as well as the number of airway changes and the travel time for travelers. The proposed model combines two topologies, point-to-point and hub-and-spoke, with the multiple goals of decreasing the number of airways and the travel length per passenger while reaching the minimum number of air stops per passenger. The proposed Multi-objective Genetic Algorithm (MOGA) is tested and assessed on data from the Iranian airline industry in 2018, as an example of a real-world application, to design the Iran airline topology. MOGA proves effective in general for solving network-wide flight trajectory planning, and combining the point-to-point and hub-and-spoke topologies improves its performance. Based on Iranian airline traffic patterns in 2018, the proposed model successfully decreased the number of air routes by 50.8% (184 air routes) compared to the current situation, while the average travel length and the average number of route changes increased by up to 13.8% (about 100 kilometers) and up to 18%, respectively. The proposed algorithm also suggests that the current air routes of Iran can be decreased by up to 24.7% (89 airways) if the travel length and the number of changes increase by up to 4.5% (32 kilometers) and 5%, respectively. Two intermediate airports were assumed for these experiments. The computational results show the potential benefits of the proposed model and the advantages of the algorithm.
Optimizing the structure of the air transport flight network can significantly reduce operational cost while ensuring operational safety. According to the results, this intelligent multi-objective optimization model can be successfully used for the precise design and efficient optimization of existing and new airline topologies.
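At the core of a multi-objective GA such as the one above is the non-domination test over the objective vectors (here: number of airways, flight changes, and average travel length, all minimised). The candidate plans below are illustrative numbers, not the Iran 2018 results.

```python
# Hedged sketch of the Pareto non-domination test used by
# multi-objective genetic algorithms. A route plan dominates another if
# it is no worse on every objective and strictly better on at least one.
# Objective vectors are illustrative, not the study's data.

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(population):
    """Members not dominated by any other member."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]

# (airways, avg changes, avg length in km) for four candidate networks.
plans = [(180, 1.2, 900), (200, 1.0, 860), (190, 1.3, 950), (185, 1.1, 880)]
print(pareto_front(plans))  # (190, 1.3, 950) is dominated and dropped
```

The surviving non-dominated set is what a plot like the 500th-generation domination model above visualises.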
Wine is an exciting and complex product with distinctive qualities that make it different from other manufactured products, so the testing approach used to determine wine quality is correspondingly complex and diverse. Several elements influence wine quality, but the views of experts have the most considerable influence on how people perceive it. However, expert views on quality are very subjective and may not match consumer taste, and experts may not always be available for wine testing. To overcome this issue, many approaches based on machine learning techniques have been proposed and have attracted the attention of the wine industry; however, they focused only on using a particular classifier with a specific wine dataset. In this paper, we therefore first propose a generalized wine quality prediction framework that provides a mechanism for finding a useful hybrid model for wine quality prediction. Second, based on the framework, we propose a generalized wine quality prediction algorithm using genetic algorithms. It first encodes the classifiers, as well as their hyperparameters, into a chromosome. The fitness of a chromosome is then evaluated as the average accuracy of the employed classifiers. Genetic operations are performed to generate new offspring, and the evolution process continues until the stop criteria are reached. As a result, the proposed approach can automatically find an appropriate hybrid set of classifiers and their hyperparameters that optimizes the prediction result, independently of the dataset. Finally, experiments on wine datasets were conducted to show the merits and effectiveness of the proposed approach.
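The chromosome encoding described above can be sketched as a list of (classifier, hyperparameter) genes evolved with selection and crossover. The classifier names, parameter ranges, and the stub fitness below are hypothetical stand-ins; in the real framework, fitness is the mean accuracy of the trained classifiers.

```python
# Hedged sketch of a GA chromosome that encodes classifiers and their
# hyperparameters. The classifier set, parameter ranges, and stub
# fitness are illustrative; real fitness would train and cross-validate
# the decoded classifiers.
import random

CLASSIFIERS = ["svm", "knn", "tree"]
PARAM_RANGE = {"svm": [0.1, 1.0, 10.0], "knn": [3, 5, 7], "tree": [2, 4, 8]}

def random_chromosome(n_genes=3):
    genes = []
    for _ in range(n_genes):
        clf = random.choice(CLASSIFIERS)
        genes.append((clf, random.choice(PARAM_RANGE[clf])))
    return genes

def crossover(a, b):
    """One-point crossover of two equal-length chromosomes."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def fitness(chromosome):
    # Stub: stands in for the mean cross-validated accuracy of the
    # employed classifiers; here it simply rewards "svm" genes.
    return sum(1.0 for clf, _ in chromosome if clf == "svm") / len(chromosome)

random.seed(0)
pop = [random_chromosome() for _ in range(10)]
for _ in range(20):  # evolve until the (toy) stop criterion
    pop.sort(key=fitness, reverse=True)
    pop = pop[:5] + [crossover(random.choice(pop[:5]), random.choice(pop[:5]))
                     for _ in range(5)]
best = max(pop, key=fitness)
print(best)
```

A mutation operator, omitted here for brevity, would normally also perturb individual genes to keep the search from stalling.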
This article presents the design and application of a hybrid algorithm capable of classifying people into risk groups using data such as grip strength, body mass index, and fat percentage. The implementation was done in Python and provides a tool to support medical decisions regarding the cardiovascular health of patients. The data were collected in a systematic way; the k-means and c-means algorithms were used to classify the data, and for the prediction of new data two support vector machines were used, one for the k-means labels and the other for the c-means labels, obtaining as a result a precision of 100% for the support vector machine with c-means and 92% for the one with k-means.
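The hybrid pipeline above can be sketched as: cluster patients into risk groups, then train a supervised model on the resulting labels to place new patients. In this sketch a nearest-centroid rule stands in for the SVM, and a single synthetic "grip strength" feature stands in for the paper's three measurements.

```python
# Hedged sketch of the cluster-then-classify pipeline: k-means assigns
# risk-group labels, and a simple classifier (nearest centroid here,
# standing in for the paper's SVM) predicts the group of new patients.
# The grip-strength values are synthetic.

def kmeans_1d(values, k=2, iters=20):
    """Plain 1-D k-means with min/max initialisation."""
    centers = [min(values), max(values)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[min(range(k), key=lambda i: abs(v - centers[i]))].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def predict_risk(value, centers):
    """Label of the nearest cluster centre (0 = low group here)."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

grip_strength = [18.0, 20.0, 22.0, 41.0, 44.0, 46.0]  # synthetic kg values
centers = sorted(kmeans_1d(grip_strength))
print(predict_risk(19.0, centers), predict_risk(45.0, centers))
```

Replacing `predict_risk` with an SVM trained on the cluster labels, and k-means with fuzzy c-means, recovers the two variants compared in the article.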
Software maintainability is an indispensable factor in judging the quality of a piece of software: it describes the ease of performing the maintenance activities needed to adapt the software to a modified environment. The availability and growing popularity of a wide range of Machine Learning (ML) algorithms for data analysis further motivate predicting this maintainability. However, an extensive analysis and comparison of various ML-based Boosting Algorithms (BAs) for Software Maintainability Prediction (SMP) has not been made yet. Therefore, the current study analyzes and compares five different BAs, i.e., AdaBoost, GBM, XGB, LightGBM, and CatBoost, for SMP using open-source datasets. The performance of the proposed prediction models has been evaluated using Root Mean Square Error (RMSE), Mean Magnitude of Relative Error (MMRE), Pred(0.25), Pred(0.30), and Pred(0.75) as prediction accuracy measures, followed by a non-parametric statistical test and a post hoc analysis to account for the differences in the performance of the various BAs. Based on the residual errors obtained, GBM is the best performer for RMSE, followed by LightGBM, whereas for MMRE, XGB performed best on six of the seven datasets (85.71% of the datasets), providing minimum MMRE values ranging from 0.90 to 3.82. Further, the statistical test and the post hoc analysis showed that significant differences exist in the performance of the different BAs, and that XGB and CatBoost outperformed all other BAs for MMRE. Lastly, a comparison of the BAs with four other ML algorithms has also been made to bring out the superiority of BAs over those algorithms. This study should enable software developers to carry out comparatively more precise predictions well in time and hence reduce overall maintenance costs.
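The MMRE and Pred(q) measures used above have simple closed forms: MMRE is the mean of the per-sample magnitudes of relative error, and Pred(q) is the fraction of predictions whose relative error is at most q. The actual/predicted values below are synthetic maintainability scores for illustration.

```python
# Hedged sketch of the MMRE and Pred(q) accuracy measures used in the
# study. Actual/predicted values are synthetic.

def mre(actual, predicted):
    """Magnitude of relative error for one sample."""
    return abs(actual - predicted) / abs(actual)

def mmre(actuals, predictions):
    """Mean magnitude of relative error over all samples."""
    return sum(mre(a, p) for a, p in zip(actuals, predictions)) / len(actuals)

def pred(actuals, predictions, q=0.25):
    """Fraction of samples with relative error at most q."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= q)
    return hits / len(actuals)

actuals     = [10.0, 20.0, 40.0, 80.0]
predictions = [11.0, 18.0, 50.0, 79.0]
print(round(mmre(actuals, predictions), 4), pred(actuals, predictions, 0.25))
```

Lower MMRE and higher Pred(q) indicate a better model, which is the direction in which the boosting algorithms above are ranked.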
Top-cited authors
Jörg Thomaschewski
  • Hochschule Emden/Leer
Martin Schrepp
  • SAP Research
Andreas Hinderks
  • Hochschule Emden/Leer
Ruben Gonzalez Crespo
  • Universidad Internacional de La Rioja
B. Cristina Pelayo García-Bustelo