Journal of Intelligent Information Systems

Published by Springer Nature
Online ISSN: 1573-7675
Recent publications
Article
  • Osho Sharma
  • Akashdeep Sharma
  • Arvind Kalia
Context Technological advances have led to a tremendous increase in the complexity and volume of specialized malware affecting computational devices across the globe. Along with malware targeting Windows devices, IoT devices with less computational power have also been hit by malware attacks in the recent past. Due to a scarcity of updated malware datasets, malware recognition and classification have become harder, particularly in IoT environments where malware samples are scarce. Identifying a malware family can reveal the underlying intent of malware, and traditional machine learning algorithms have performed well in this area. However, since such methods require extensive feature engineering, deep learning algorithms for malware recognition and classification have been developed. In particular, malware visualization-based approaches, which have shown decent success in the past, still leave room for improvement, which the current study exploits. Objectives The current work aims at utilizing malware images (grayscale, RGB, Markov) and deep CNNs for effective Windows and IoT malware recognition and classification using traditional learning and transfer learning approaches. Methods and Design First, grayscale, RGB and Markov images were created from malware binaries. In particular, the idea behind Markov image generation using a Markov probability matrix is to retain the global statistics of malware bytes, which are generally lost during image transformation operations. A Gabor filter-based approach is used to extract textures, and then a custom-built deep CNN and a pretrained Xception CNN, trained on 1.5 million images from the ImageNet dataset and fine-tuned for malware images, are employed to classify malware images into families. Results and Conclusions To assess the effectiveness of the suggested framework, two public benchmark Windows malware image datasets, one custom-built Windows malware image dataset and one custom-built IoT malware image dataset were utilized. In particular, the methods demonstrate excellent classification results on the 500 GB Microsoft Malware Challenge dataset. A comparison of the suggested solutions with state-of-the-art methods clearly indicates the effectiveness and low computational cost of our malware recognition and classification solution.
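The Markov-image idea described in the abstract can be illustrated with a short sketch: treat the binary as a byte stream, count byte-bigram transitions into a 256×256 matrix, row-normalize it into transition probabilities, and rescale the probabilities to 0–255 pixel intensities. This is a hedged illustration of the general technique rather than the authors' exact implementation; the file-reading and scaling choices are assumptions.

```python
import numpy as np

def markov_image(path):
    """Build a 256x256 Markov image from a binary's byte-transition statistics."""
    data = np.fromfile(path, dtype=np.uint8)
    counts = np.zeros((256, 256), dtype=np.float64)
    # Count transitions between consecutive bytes (byte bigrams).
    np.add.at(counts, (data[:-1], data[1:]), 1)
    # Row-normalize into transition probabilities; avoid division by zero.
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # Rescale probabilities to 8-bit pixel intensities.
    if probs.max() > 0:
        return (255 * probs / probs.max()).astype(np.uint8)
    return probs.astype(np.uint8)
```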
 
Article
Nowadays, multimedia-based digital ecosystems (e.g., social media sites) have become a great source of user-contributed multimedia documents for many types of real-world events. Very often, social media posts about events are multimedia (e.g., image, video, text, and others), multi-feature (e.g., 5W1H), and multi-source. Such events might be related to each other, implying that they can be linked by different kinds of relationships (e.g., spatial, temporal, and causal relations). Additionally, a single event may consist of two or more sub-events, and some event relationships may involve more than two events. The traditional pairwise graphical representation method is insufficient to capture this complicated event structure. Thus, the problem of how to represent such complex events in an inclusive representation remains unsolved. To address this, we propose a comprehensive event representation model based on advanced hypergraph notation. Our event hypergraph contains different kinds of nodes and links between them. Specifically, we first detect real-world events, including their elements, based on event-only descriptive features from social media documents. Using these detected events, we identify temporal, spatial, and causal relationships between events by comparing their associated dimensions. Finally, each node type is linked together using different kinds of relationships in the form of a hierarchical hypergraph structure. Experimental results demonstrate the potential of our method.
 
Article
Nowadays, vast volumes of fake news are continuously posted by malicious users with fraudulent goals, leading to very negative social effects on individuals and society and causing continuous threats to democracy, justice, and public trust. This is particularly relevant on social media platforms (e.g., Facebook, Twitter, Snapchat), due to their intrinsically uncontrolled publishing mechanisms. This problem has significantly driven the efforts of both academia and industry towards developing more accurate fake news detection strategies: early detection of fake news is crucial. Unfortunately, the availability of information about news propagation is limited. In this paper, we provide a benchmark framework to analyze and discuss the most widely used and promising machine/deep learning techniques for fake news detection, also exploiting different feature combinations w.r.t. the ones proposed in the literature. Experiments conducted on well-known and widely used real-world datasets show the advantages and drawbacks, in terms of accuracy and efficiency, of the considered approaches, even in the case of limited content information.
 
Article
For e-commerce platforms, high-quality product titles are a vital element in facilitating transactions. A concise, accurate, and informative product title can not only stimulate consumers’ desire to buy the product, but also provide them with precise shopping guidance. However, previous work is mainly based on manual rules and templates, which not only limits the generalization ability of the model, but also omits dominant product aspects from the generated titles. In this paper, we propose a Transformer-based Multimodal Aspect-Aware Product Title Generation model, denoted as MAA-PTG, which can effectively integrate the visual and textual information of the product to generate a valuable title. Specifically, on the decoder side, we construct an image cross-attention layer to incorporate local image features. We then explore various strategies to fuse product aspects and global image features. During training, we also adopt an aspect-based reward augmented maximum likelihood (RAML) training strategy to encourage our model to generate a product title covering the key product aspects. We elaborately construct an e-commerce product dataset consisting of product-title pairs. The experimental results on this dataset demonstrate that, compared with competitive methods, our MAA-PTG model has significant advantages in ROUGE score and human evaluation.
 
Article
E-commerce giants like Amazon rely on consumer reviews to allow buyers to inform other potential buyers about a product’s pros and cons. While these reviews can be useful, they are less so when the number of reviews is large; no consumer can be expected to read hundreds or thousands of reviews in order to gain a better understanding of a product. In an effort to provide an aggregate representation of reviews, Amazon offers an average user rating represented by a 1- to 5-star score. This score only represents how reviewers feel about a product, without providing insight into why they feel that way. In this work, we propose an AI technique that generates an easy-to-read, concise summary of a product based on its reviews. It provides an overview of the different aspects reviewers emphasize in their reviews and, crucially, how they feel about those aspects. Our methodology generates a list of the topics mentioned most often by reviewers, conveys reviewer sentiment for each topic, and calculates an overall summary score that reflects reviewers’ overall sentiment about the product. These sentiment scores adopt the same 1- to 5-star scoring scale in order to remain familiar to Amazon users.
 
Article
The detection of fake news has become essential in recent years. This paper presents a new technique that is highly effective in identifying fake news articles. We assume a scenario where the relationship between a news article and a statement has already been classified as agreeing with the statement, disagreeing with it, being uncertain about it, or being unrelated to it. Using this information, we focus on selecting the news articles that are most likely to be fake. We propose two models: the first uses only the agree and disagree classifications; the second uses a model based on subjective opinions that can also handle the uncertain cases. Our experiments on a real-world dataset (the Fake News Challenge 1 dataset) and a simulated dataset validate that both proposed models achieve state-of-the-art performance. Furthermore, we show which model to use in different scenarios to obtain the best performance.
 
Article
Nowadays, data scientists prefer “easy” high-level languages like R and Python, which accomplish complex mathematical tasks with a few lines of code but suffer from memory and speed limitations. Data summarization has been a fundamental technique in data mining that holds promise for more demanding data science applications. Unfortunately, most summarization approaches require reading the entire data set before computing any machine learning (ML) model, the old-fashioned way. It is also hard to update models when data samples are added or removed. With these motivations in mind, we present incremental algorithms that smartly compute a summarization matrix, previously used in parallel DBMSs, to compute ML models incrementally in data science languages. Compared to previous approaches, our new smart algorithms interleave model computation periodically as the data set is being summarized. A salient feature is scalability to large data sets, provided the summarization matrix fits in RAM, a reasonable assumption in most cases. We show our incremental approach is intelligent and works for a wide spectrum of ML models. Our experimental evaluation shows that models become increasingly accurate, reaching full accuracy when the data set is fully scanned. On the other hand, we show our incremental algorithms are as fast as a Python ML library, and much faster than R built-in routines.
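The incremental summarization idea can be sketched as follows: maintain the count n, the linear sum L, and the quadratic sum Q of the rows seen so far, update them per batch, and derive model statistics (here, per-feature means and the covariance matrix) from the summary at any point during the scan. This is a minimal sketch under assumed names; the paper's actual algorithms and matrix layout may differ.

```python
import numpy as np

class Summarizer:
    """Incrementally maintains n, L (linear sum) and Q (sum of cross-products)."""
    def __init__(self, d):
        self.n = 0
        self.L = np.zeros(d)
        self.Q = np.zeros((d, d))

    def update(self, batch):
        # batch: (m, d) array of new rows; the summaries are additive,
        # so removing rows is the symmetric (subtractive) operation.
        self.n += batch.shape[0]
        self.L += batch.sum(axis=0)
        self.Q += batch.T @ batch

    def mean_and_cov(self):
        # Derive a model from the summary at any point during the scan.
        mu = self.L / self.n
        cov = self.Q / self.n - np.outer(mu, mu)
        return mu, cov
```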
 
Article
Tsetlin machines (TMs) are a pattern recognition approach that uses finite state machines for learning and propositional logic to represent patterns. In addition to being natively interpretable, they have provided competitive accuracy for various tasks. In this paper, we increase the computing power of TMs by proposing a first-order logic-based framework with Herbrand semantics. The resulting TM is relational and can take advantage of logical structures appearing in natural language, to learn rules that represent how actions and consequences are related in the real world. The outcome is a logic program of Horn clauses, bringing in a structured view of unstructured data. In closed-domain question-answering, the first-order representation produces 10× more compact KBs, along with an increase in answering accuracy from 94.83% to 99.48%. The approach is further robust towards erroneous, missing, and superfluous information, distilling the aspects of a text that are important for real-world understanding.
 
Article
Collaborative recommender systems (CRSs) have become an essential component in a wide range of e-commerce systems. However, CRSs are also vulnerable to malicious attacks due to the fundamental vulnerability of recommender systems. Faced with the limited representativeness of rating behavior and the unbalanced distribution of rating profiles, how to further improve detection performance and deal with unlabeled real-world data is a long-standing but unresolved issue. This paper develops a new detection approach to defend recommender systems against anomalous threats. First, eliminating the influence of disturbed rating profiles on abnormality detection is analyzed in order to reduce the unbalanced distribution as far as possible. Second, based on the remaining rating profiles, rating behaviors that belong to the same dense region under standard distance measures are further partitioned by exploiting a probability mass-based dissimilarity mechanism. Third, to reduce the scope of determining suspicious items while keeping the advantage of target item analysis (TIA), suspected items captured by TIA are empirically converted into an associated item-item graph according to frequent patterns of rating distributions. Finally, the attackers of concern can be detected based on the determined suspicious items. Extensive experiments on synthetic data demonstrate the effectiveness of the proposed detection approach compared with benchmarks. In addition, interesting findings, such as suspected items or ratings, on four different real-world datasets are also analyzed and discussed.
 
Article
In the last few years, there has been significant growth in the amount of data published in RDF and in the adoption of Linked Data principles. Every day, a large number of people and communities contribute to the publication of datasets as Linked Data on the Linked Open Data (LOD) cloud. Due to the large size of the LOD cloud on the Web and the RDF representation of linked datasets, searching and retrieving relevant data on the Web is a major challenge. Because the data is published in RDF triple format, i.e. an interlinked structure, traditional search engines are unable to perform searches over Linked Data. This article introduces the LOD search engine, a novel semantic search engine that searches Semantic Web documents (such as Linked Data or triples) to retrieve a set of relevant information based on user queries. For searching over triples, we propose two semantic search methods: Forward Search and Backward Search. To improve search results, two new ranking methods have also been introduced: Domain Ranking and Triple Ranking. The proposed LOD search engine produced remarkable results and outperformed other semantic search engines. In the best-case scenario, the proposed LOD search engine outperforms Swoogle and Falcons by 22.35%, 43.38% and 33.18% in terms of precision, recall, and F-measure, respectively. Full-text access: https://rdcu.be/cBFKu
 
Article
We propose a local feature selection method for the Multiple Instance Learning (MIL) framework. Unlike conventional feature selection algorithms that assign a global set of features to the whole data set, our algorithm, called Multiple Instance Local Salient Feature Selection (MI-LSFS), searches the feature space to find the relevant features within each bag. We also propose a new multiple instance classification algorithm, called Multiple Instance Learning via Embedded Structures with Local Feature Selection (MILES-LFS), by integrating the information learned by MI-LSFS during the feature selection process. In MILES-LFS, we use information learned by MI-LSFS to identify a reduced subset of representative bags. For each representative bag, we identify its most representative instances. Using the instance prototypes of all representative bags and their relevant features, we project and map the MIL data to standard feature-vector data. Finally, we train a 1-Norm support vector machine (1-Norm SVM) to learn the classifier. We investigate the performance of MI-LSFS in selecting the local relevant features using synthetic and benchmark data sets. The results confirm that MI-LSFS can identify the relevant features for each bag. We also investigate the performance of the proposed MILES-LFS algorithm on several synthetic and real benchmark data sets. The results confirm that MILES-LFS has a robust classification performance comparable to the well-known MILES algorithm. More importantly, our results confirm that using the reduced set of prototypes to project the MIL data reduces the computational time significantly without affecting the classification accuracy.
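The bag-to-vector projection described above can be illustrated with a MILES-style mapping: each bag is represented by its maximal similarity to every instance prototype, computed only over the features selected as relevant for that prototype. The prototype set, kernel width, and feature masks below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def embed_bag(bag, prototypes, masks, sigma=1.0):
    """Map a bag (n_instances x d) to a vector of max similarities to prototypes.

    prototypes: (k, d) instance prototypes; masks: (k, d) 0/1 relevant-feature masks.
    """
    emb = np.empty(len(prototypes))
    for j, (p, m) in enumerate(zip(prototypes, masks)):
        diffs = (bag - p) * m                      # restrict to locally selected features
        d2 = (diffs ** 2).sum(axis=1)              # squared distance of each instance to prototype
        emb[j] = np.exp(-d2 / sigma ** 2).max()    # most-similar instance defines the bag's score
    return emb
```

The resulting fixed-length vectors can then be fed to any standard classifier, such as the 1-Norm SVM mentioned in the abstract.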
 
Article
Lexicons have risen as alternative resources to common supervised methods for classification or regression in different domains (e.g., Sentiment Analysis). These resources (especially lexical ones) lack important domain context, and it is not possible to tune, edit or improve them for new domains and data. With the exponential production of data and annotations witnessed today in several domains, leveraging such resources to improve existing lexicons becomes a must. In this work, we provide a novel framework to build lexicons independently of the target domain and of the input categories into which each text needs to be classified. It employs state-of-the-art Natural Language Processing and Word Sense Disambiguation tools and techniques to make the method as general as possible. The framework takes as input a heterogeneous collection of texts annotated towards a fixed number of categories. Its output is a list of WordNet word senses with weights for each category. We prove the effectiveness of the framework taking the Emotion Detection task as a case study, employing the generated lexicons within that domain. Additionally, the paper presents a use case on human-robot interaction within the Emotion Detection task. Furthermore, we applied our methodology in several other domains and compared our approach against common supervised methods (regressors), showing the effectiveness of the generated lexicons. By freely providing the framework, we aim at encouraging and disseminating the production of context-aware and domain-specific lexicons in other domains as well.
 
Article
Temporary deals such as flash sales are nowadays popular strategies in retail business for clearing out excess inventory. It is known that the success of a temporary deal is related to product quality, promotion, and discount rates. In this paper, we look at another, more obscure factor, namely the timing in the market, and we argue that such timing can be learned from social media. For example, the trending of the words “summer” and “ice cream” in social media may indicate successful sales of air conditioners. We propose an approach to detect emerging words in social media as timing signals and associate them with successful temporary deals. More specifically, words that tend to emerge just before successful deals are considered effective timing signals. We obtain a real-world temporary deal dataset from an industry partner and collect a social media dataset from Twitter for experiments. Through experimental evaluation, we show and discuss the discovered timing signals. Furthermore, we propose a prediction framework and show that using social media timing signals achieves better accuracy for predicting temporary deal success compared to using internal deal information alone.
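The timing-signal idea can be sketched simply: a word is "emerging" in the window just before a deal if its relative frequency spikes over its historical baseline. The tokenization, window split, and thresholds below are illustrative assumptions rather than the paper's exact detection algorithm.

```python
from collections import Counter

def emerging_words(history_docs, recent_docs, min_ratio=3.0, min_count=5):
    """Return words whose recent relative frequency spikes over their historical baseline."""
    hist, recent = Counter(), Counter()
    for doc in history_docs:
        hist.update(doc.lower().split())
    for doc in recent_docs:
        recent.update(doc.lower().split())
    h_total = sum(hist.values()) or 1
    r_total = sum(recent.values()) or 1
    signals = {}
    for w, c in recent.items():
        if c < min_count:
            continue
        # Add-one smoothing so words unseen in the history do not divide by zero.
        ratio = (c / r_total) / ((hist[w] + 1) / h_total)
        if ratio >= min_ratio:
            signals[w] = ratio
    return sorted(signals.items(), key=lambda kv: -kv[1])
```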
 
Article
Graph Convolutional Networks (GCNs) for aspect-based sentiment classification have attracted a lot of attention recently due to their promising performance in handling complex structure information. However, previous GCN-based methods focused mainly on examining the structure of syntactic dependency relationships, which is subject to noise and sparsity problems. Furthermore, these methods tend to focus on one kind of structural information (namely syntactic dependency) while ignoring many other kinds of rich structure between words. To tackle these problems, we propose a novel GCN-based model, named Structure-Enhanced Dual-Channel Graph Convolutional Network (SEDC-GCN). Specifically, we first exploit the rich structure information by constructing a text sequence graph and an enhanced dependency graph, then design a dual-channel graph encoder to model the structure information from the two graphs. After that, we propose two kinds of aspect-specific attention, i.e., aspect-specific semantic attention and aspect-specific structure attention, to learn sentence representations from two different perspectives, i.e., the semantic perspective based on the text encoder, and the structure perspective based on the dual-channel graph encoder. Finally, we merge the sentence representations from the above two perspectives and obtain the final sentence representation. We experimentally validate our proposed model SEDC-GCN by comparing it with seven strong baseline methods. In terms of accuracy, SEDC-GCN achieves 74.42%, 77.74%, 83.30%, 81.73% and 90.75% on TWITTER, LAPTOP, REST14, REST15, and REST16, respectively, which is 0.35%, 4.22%, 1.62%, 0.70% and 2.01% better than the best-performing baseline BiGCN. Similar performance improvements are also observed in terms of the macro-averaged F1 score. The ablation study further demonstrates the effectiveness of each component of SEDC-GCN.
 
Article
Given the recent availability of large volumes of social media discussions, finding temporally unusual phenomena, which can be called events, in such data is of great interest. Previous works on social media event detection either assume a specific type of event or assume certain behavior of observed variables. In this paper, we propose a general method for event detection on social media that makes few assumptions. The main assumption we make is that when an event occurs, the affected semantic aspects will behave differently from their usual behavior for a sustained period. We generalize the representation of time units based on word embeddings of social media text, and propose an algorithm to detect durative events in time series in a general sense. In addition, we also provide an incremental version of the algorithm for the purpose of real-time detection. We test our approaches on synthetic data and two real-world tasks. With the synthetic dataset, we compare the performance of the retrospective and incremental versions of the algorithm. In the first real-world task, we use a novel setting to test whether our method and baseline methods can exhaustively catch all real-world news in the test period. The evaluation results show that when an event is quite unusual with regard to the underlying social media discussion, it can be captured more effectively with our method. In the second real-world task, we use the captured events to help improve the accuracy of stock market movement prediction. We show that our event-based approach has a clear advantage compared to other ways of adding social media information.
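A minimal sketch of the detection logic: represent each time unit by an embedding (e.g., the average word vector of that unit's posts), measure how far each unit drifts from a trailing baseline, and flag a durative event when the deviation stays above a threshold for several consecutive units. The window length, z-score threshold, and minimum duration are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def detect_durative_events(unit_vecs, window=24, z_thresh=3.0, min_len=3):
    """Flag runs of time units whose embedding deviates persistently from a trailing baseline."""
    unit_vecs = np.asarray(unit_vecs)
    flags = []
    for t in range(window, len(unit_vecs)):
        base = unit_vecs[t - window:t]
        mu = base.mean(axis=0)
        dists = np.linalg.norm(base - mu, axis=1)
        dev = np.linalg.norm(unit_vecs[t] - mu)
        z = (dev - dists.mean()) / (dists.std() + 1e-9)
        flags.append(z > z_thresh)
    # An event is a run of at least min_len consecutive flagged units.
    events, start = [], None
    for i, f in enumerate(flags):
        if f and start is None:
            start = i + window
        elif not f and start is not None:
            if i + window - start >= min_len:
                events.append((start, i + window))
            start = None
    if start is not None and len(unit_vecs) - start >= min_len:
        events.append((start, len(unit_vecs)))
    return events
```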
 
Article
Complex system development and maintenance face the challenge of dealing with different types of models, due to language affordances, preferences, sizes, and so forth, and involve interaction between users with different levels of proficiency. Current conceptual data modelling tools do not fully support these modes of working. They require that the interaction between multiple models in multiple languages be clearly specified to ensure the models keep their intended semantics, which is lacking in extant tools. The key objective is to devise a mechanism to support semantic interoperability in hybrid tools for multi-modal modelling in a plurality of paradigms, all within one system. We propose FaCIL, a framework for such hybrid modelling tools. We design and realise FaCIL, which maps UML, ER and ORM2 into a common metamodel with rules that provide the central point of management among the models and that link to the formalisation and logic-based automated reasoning. FaCIL supports the ability to represent models in different formats while preserving their semantics, and several editing workflows are supported within the framework. It has a clear separation of concerns for typical conceptual modelling activities in an interoperable and extensible way. FaCIL structures and facilitates the interaction between visual and textual conceptual models, their formal specifications, and abstractions, as well as tracking and propagating updates across all the representations. FaCIL is compared against the requirements, implemented in crowd 2.0, and assessed with a use case. The proof-of-concept implementation in the web-based modelling tool crowd 2.0 demonstrates its viability. The framework also meets the requirements and fully supports the use case.
 
Article
Word embedding is the process of converting words into vectors of real numbers, and is of great interest in natural language processing. Recently, the performance of word embedding models has been the subject of several studies in emotion analysis. These mainly try to embed affective aspects of words into their vector representations by utilizing external sentiment/emotion lexica. The underlying emotion models in the existing studies follow basic emotion theories in psychology, such as Plutchik or VAD. However, none of them investigate the Mixed Emotions (ME) model, the most precise theory of emotions raised in recent psychological studies. According to ME, feelings can be the consequence of multiple emotion categories at the same time, with different intensities. Relying on the ME model, this article embeds mixed-emotion features into existing word vectors and performs extensive experiments on various English datasets. The analyses, in both intrinsic and extrinsic evaluations, prove the improvement of the presented model over existing emotion-aware embeddings such as SAWE and EWE.
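The embedding idea can be illustrated by concatenating each word's pre-trained vector with a mixed-emotion intensity vector drawn from an emotion lexicon (zeros for out-of-lexicon words). The emotion categories, lexicon format, and lack of scaling are illustrative assumptions; the paper's actual construction may differ.

```python
import numpy as np

EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise"]  # assumed category set

def emotion_augmented(word_vecs, emo_lexicon):
    """Concatenate each word vector with its mixed-emotion intensities (0 if unlisted).

    word_vecs: dict word -> np.ndarray; emo_lexicon: dict word -> {emotion: intensity}.
    """
    out = {}
    for word, vec in word_vecs.items():
        intensities = emo_lexicon.get(word, {})
        emo = np.array([intensities.get(e, 0.0) for e in EMOTIONS])
        out[word] = np.concatenate([vec, emo])
    return out
```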
 
Article
The rough set theory developed by Prof. Z. Pawlak is one of the tools used in intelligent systems for data analysis and processing. In modern systems, the amount of collected data is increasing quickly, so computation speed becomes the critical factor. One of the solutions to this problem is data reduction. In rough set theory, redundancy can be removed by computing a reduct. Most algorithms for reduct generation are software-only implementations, which suffer from limitations stemming from the fixed word length as well as from the time consumed fetching and processing instructions and data. These limitations make software-based implementations relatively slow. Unlike software-based systems, hardware systems can process data much faster. This paper presents an FPGA- and softcore-CPU-based device for calculating reducts of large datasets using rough set methods. The presented architecture has been tested on two real datasets by downloading and running the presented solutions inside the FPGA. The tested datasets contained 1,000 to 1,000,000 objects. For research purposes, the algorithm was also implemented in C and run on a PC. The reduct calculation times in hardware and software were compared. The obtained results show an increase in the speed of data processing.
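The software counterpart of the hardware pipeline can be sketched in a few lines: build the discernibility matrix (for each pair of objects with different decisions, the set of condition attributes on which they differ) and greedily pick the attribute covering the most non-empty entries until all are covered. This greedy set-cover heuristic is an illustrative stand-in; the paper's exact reduct algorithm may differ.

```python
def discernibility_matrix(objects, decisions):
    """objects: list of attribute-value tuples; decisions: list of class labels."""
    entries = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if decisions[i] != decisions[j]:
                diff = {a for a, (x, y) in enumerate(zip(objects[i], objects[j])) if x != y}
                if diff:
                    entries.append(diff)
    return entries

def greedy_reduct(entries):
    """Greedy set cover over the discernibility entries (Johnson-style heuristic)."""
    reduct, remaining = set(), list(entries)
    while remaining:
        counts = {}
        for e in remaining:
            for a in e:
                counts[a] = counts.get(a, 0) + 1
        best = max(counts, key=counts.get)       # attribute discerning the most remaining pairs
        reduct.add(best)
        remaining = [e for e in remaining if best not in e]
    return reduct
```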
 
Article
Graph convolutional networks are a recently developed artificial neural network method commonly used in recommendation system research. This paper points out three shortcomings of existing recommendation systems based on graph convolutional networks: (1) existing models that take as input one-hot encodings based on node ordinal numbers in the graph, or encodings based on original entity attributes, may not fully utilize the information carried by attribute interactions; (2) previous models update the node embeddings only from first-order neighbors in the graph convolution layer, which is easily affected by noise; (3) existing models do not take into account differences in user opinions. We propose an improved graph convolutional network-based collaborative filtering model to address these drawbacks. We identify inner and cross interactions between user attributes and item attributes, and then take the vector representations of the aggregated attribute graph as input. In the convolutional layer, we aggregate second-order collaborative signals and incorporate the different user opinions. Experiments on three public datasets show that our model outperforms state-of-the-art models.
 
Article
Answer Sentence Selection (AS2) is the task of ranking candidate answers for each question based on a matching score, where the matching score is the probability of being a correct answer to a given question. Detecting the question class and matching it with the named entities of the answer sentence to narrow down the search space was used in early question answering systems. We use this idea with state-of-the-art text matching models, namely Transformer-based language models. In this paper, we propose two different architectures, Ent-match and Ent-add, with two different question classifiers: Convolutional Neural Network-based (CNN-based) and rule-based. The proposed models outperform the state-of-the-art AS2 models, namely TANDA and RoBERTa-base, on both the TREC-QA and Wiki-QA datasets. On Wiki-QA, the Ent-add (CNN-based) model outperforms the TANDA model with 2.1% and 1.9% improvements in Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR), respectively. On the TREC-QA dataset, the Ent-match (CNN-based) model outperforms the TANDA model with 1.5% and 1.4% improvements in MAP and MRR, respectively.
 
Article
In this paper, we propose a generative inpainting-based method to detect anomalous images in human monitoring via self-supervised multi-task learning. Our previous methods, in which a deep captioning model is employed to find salient regions in an image and exploit the caption information of each region, detect anomalies in human monitoring at the region level by considering the relations of overlapping regions. Here, we focus on image-level detection, which is preferable when users want an immediate alert and wish to handle it themselves. However, in such a setting, those methods show their deficiencies due to their reliance on salient regions and their neglect of non-overlapping regions. Moreover, they treat all regions as equally important, which causes the performance to be easily influenced by unimportant regions. To alleviate these problems in image-level detection, we first employ inpainting techniques with a designed local and global loss to better capture the relation between a region and its surrounding area in an image. Then, we propose an attention-based Gaussian weighting anomaly score that combines all the regions by considering their importance, to mitigate the influence of unimportant regions. The attention mechanism exploits multi-task learning for higher accuracy. Extensive experiments on two real-world datasets demonstrate the superiority of our method in terms of AUROC, precision, and recall over the baselines. The AUROC improves from 0.933 to 0.989 and from 0.911 to 0.953 compared with the best baseline on the two datasets.
 
Article
In the era of the Internet of Things (IoT), it is vital for smart environments to be able to efficiently provide effective predictions of users’ situations and take actions in a proactive manner to achieve the highest performance. However, there are two main challenges. First, the sensor environment is equipped with a heterogeneous set of data sources, including hardware and software sensors, and oftentimes humans as complex sensors, too. These sensors generate a huge amount of raw data. In order to extract knowledge and perform predictive analysis, the raw sensor data must be cleaned, understood, analyzed, and interpreted. The second challenge concerns predictive modeling. Traditional predictive models predict situations that are likely to happen in the near future by keeping and analyzing the history of past user situations. Such approaches have become less effective because the massive amount of continuously streamed data both affects data processing efficiency and complicates the data semantics. To address the above challenges, we propose a data-driven, situation-aware framework for predictive analysis in smart environments. First, to effectively analyze the sensor data, we employ the Situ-Morphism method to transfer sensor-enabled situation information into vector information. Then we introduce new similarity metrics and implement similarity prediction based on Locality Sensitive Hashing to improve data processing efficiency and effectively handle the data semantics. Experimental results show that the predictive analysis method proposed in this paper is effective.
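The similarity-prediction step can be illustrated with random-hyperplane LSH: hash each situation vector into a short signed bit signature so that vectors with high cosine similarity tend to collide in the same bucket, making nearest-situation lookup sub-linear. The signature length and class structure below are illustrative assumptions, not the paper's exact metrics.

```python
import numpy as np
from collections import defaultdict

class CosineLSH:
    """Random-hyperplane LSH: cosine-similar vectors land in the same bucket with high probability."""
    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))
        self.buckets = defaultdict(list)

    def _key(self, v):
        return tuple(bool(b) for b in (self.planes @ v) > 0)

    def add(self, label, v):
        self.buckets[self._key(v)].append((label, v))

    def query(self, v):
        # Candidates share the same signature; rank them by exact cosine similarity.
        cands = self.buckets.get(self._key(v), [])
        return sorted(cands, key=lambda lv: -np.dot(lv[1], v) /
                      (np.linalg.norm(lv[1]) * np.linalg.norm(v) + 1e-12))
```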
 
Article
Recommender Systems (RS) provide an effective way to deal with the problem of information overload by suggesting relevant items that users may prefer. However, many online social platforms, such as online dating and online recruitment, recommend users to each other, where both users have preferences that should be considered for generating successful recommendations. Reciprocal Recommender Systems (RRS) are user-to-user recommender systems that recommend a list of users to a user by considering the preferences of both parties involved. Generating successful recommendations inherently faces the exploitation-exploration dilemma, which requires either predicting the best recommendation from the current information or gathering more information about the environment. To address this, we formulate the reciprocal recommendation generation task as a contextual bandit problem, a principled approach in which the agent chooses an action from a set of actions based on contextual information and receives a reward for the chosen action. We propose the SiameseNN-UCB algorithm: a deep neural network-based strategy that follows a Siamese architecture to transform raw features and learn the reward for the chosen action. Upper confidence bound type exploration is used to solve the exploitation-exploration trade-off. In this algorithm, we attempt to generate reciprocal recommendations by utilizing multiple aspects such as a user’s multi-criteria ratings, popularity awareness, demographic information, and user availability. Experimental studies conducted with a speed dating dataset demonstrate the effectiveness of the proposed approach.
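The exploitation-exploration mechanism can be illustrated with a LinUCB-style rule: score each candidate recommendation by its predicted reward plus an upper-confidence bonus, choose the highest, and update the model with the observed reward. This linear sketch is a simplified stand-in for the paper's Siamese-network reward model; the class name and the α parameter are assumptions.

```python
import numpy as np

class LinUCB:
    """Linear contextual bandit with an upper-confidence-bound exploration bonus."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)          # ridge-regularized design matrix
        self.b = np.zeros(dim)
        self.alpha = alpha

    def choose(self, contexts):
        # contexts: list of feature vectors, one per candidate user to recommend.
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        scores = [x @ theta + self.alpha * np.sqrt(x @ A_inv @ x) for x in contexts]
        return int(np.argmax(scores))

    def update(self, x, reward):
        # reward could encode reciprocal interest (e.g., 1 if both parties respond positively).
        self.A += np.outer(x, x)
        self.b += reward * x
```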
 
Article
Sequential recommender systems aim to model users’ changing interests based on their historical behavior and predict what they will be interested in at the next moment. In recent years, approaches to modeling users’ long-term/short-term preferences have achieved promising results. Previous works typically model historical interactions through an end-to-end neural network incorporating rich side information, which relies on a final loss function to optimize all parameters. However, they tend to concatenate side information and item IDs into a single vector representation, leading to irreversible fusion. We propose a two-stage sequence recommendation framework to address this problem. The first stage aims to enhance the representation ability of the sequence through a non-invasive bidirectional self-attentive item embedding. In the second stage, we use time-interval-aware Gated Recurrent Units with attention to capture the user’s latest intents, while predicting long-term preferences based on the first stage. To integrate the long-term/short-term preferences, we generate the final preference representation using an attention-based adaptive fusion module. We conduct extensive experiments on four benchmark datasets, and the results demonstrate the effectiveness of our proposed model.
 
Article
Massive teaching resources cause serious efficiency problems for online teaching, and traditional online teaching models can even be inferior to traditional classroom teaching in terms of teaching effect. Based on this, this paper analyzes massive educational resources and builds a scalable computer interactive education system based on large-scale multimedia data analysis. Moreover, this paper defines the system roles according to the actual teaching situation and constructs the functional modules of the system structure. In addition, this paper uses computer simulation technology to analyze and technically improve interactive technology, making it the core technology of the computer interactive education system, and obtains an extensible interactive education system based on the characteristics of network teaching, which helps to monitor and assess the performance of an interactive educational system. Furthermore, this paper designs an experiment to evaluate the performance of the computer interactive education system, carried out mainly from two aspects: interactive evaluation and teaching evaluation. The experimental results show that this system can effectively improve the quality of teaching.
 
Article
In this paper, we propose a lightweight multi-stage network for monaural vocal and accompaniment separation. We design a dual-branch attention (DBA) module to obtain the correlation of each position pair and the correlation among the channels of feature maps, respectively. A square CNN (i.e., one whose filter size is k × k) shares weights across square areas of the feature maps, which limits its feature extraction ability. To address this, we propose a hybrid convolution (HC) block based on a hybrid convolutional mechanism, instead of a square CNN, to capture the dependencies along the time dimension and the frequency dimension, respectively. The ablation experiments demonstrate that the DBA module and HC block help improve the separation performance. Experimental results show that our proposed network achieves outstanding performance on the MIR-1K dataset with fewer parameters, and competitive performance compared with the state of the art on the DSD100 and MUSDB18 datasets.
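The hybrid convolution idea can be sketched as two parallel strip convolutions, one spanning the time axis and one spanning the frequency axis, whose outputs are fused. The filter length, fusion by summation, and activation are illustrative assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class HybridConvBlock(nn.Module):
    """Parallel 1xk (time) and kx1 (frequency) convolutions replacing a square kxk filter."""
    def __init__(self, channels, k=7):
        super().__init__()
        self.time_conv = nn.Conv2d(channels, channels, kernel_size=(1, k), padding=(0, k // 2))
        self.freq_conv = nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(k // 2, 0))
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, freq_bins, time_frames) spectrogram features.
        return self.act(self.time_conv(x) + self.freq_conv(x))
```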
 
Article
Aspect-based sentiment analysis (ABSA) of patients’ opinions expressed in drug reviews can extract valuable information about specific aspects of a particular drug, such as effectiveness, side effects and patient conditions. One of the most important and challenging tasks of ABSA is to extract the implicit and explicit aspects from a text and to classify the extracted aspects into predetermined classes. Supervised learning algorithms achieve high accuracy in extracting and classifying aspects; however, they require annotated datasets whose manual construction is time-consuming and costly. In this paper, we first introduce a new method for identifying expressions that indicate an aspect in English user reviews about drugs. Then, distant supervision is adopted to automate the construction of a training set using sentences and phrases that are annotated with aspect classes in the drug domain. The results of the experiments show that the proposed method is able to identify various aspects of the test set with 74.4% F-measure and outperforms existing aspect extraction methods. Also, training a random forest classifier on the dataset constructed via distant supervision achieved an F-measure of 73.96%, and employing this dataset to fine-tune BERT for aspect classification yielded a better F-measure (78.05%) than an existing method in which the random forest classifier was trained on an accurate, manually constructed dataset.
 
Article
Aspect sentiment triplet extraction is the most recent subtask of aspect-based sentiment analysis, which aims to extract triplet information from a review sentence: an aspect term, the corresponding sentiment polarity, and the associated opinion expression. Although existing researchers adopt end-to-end methods to avoid the error propagation caused by pipeline approaches, they cannot effectively establish the semantic association between aspects and opinions when extracting triplets. Furthermore, utilizing sequence tagging methods in extraction and classification tasks leads to problems such as an increased model search space and sentiment inconsistency for multi-word entities. To tackle the above issues, we propose a framework that enhances the semantic relation between aspect and opinion terms to extract triplets more accurately by fully capturing interactive information. Specifically, dual convolutional neural networks are used to construct aspect-oriented and opinion-oriented features, respectively; the semantic relation is modelled through an attention mechanism and then fed back to each extraction task. We also employ a span-based tagging scheme to extract multiple entities directly under the supervision of span boundary detection, and accurately predict sentiment polarity based on span distance. We conduct extensive experiments on four benchmark datasets, and the experimental results demonstrate that our model significantly outperforms all baseline methods.
 
Article
Existing End-to-End neural models for dialogue generation tend to generate generic and uninformative responses. Recently, knowledge-based dialogue models have been developed to generate more informative responses by leveraging external knowledge. However, it is still challenging for the models to select appropriate knowledge from an external knowledge base and generate responses coherent with the context and knowledge. In this paper, we propose a new method that uses two-stage knowledge selection to get proper knowledge for response generation without the guidance of ground-truth knowledge. Specifically, in the first stage the model selects knowledge according to the relevance between the context and candidate knowledge from a global perspective. During response generation, dynamic knowledge attention is performed to capture the knowledge relevant to the current decoding state, which is the second stage. Furthermore, we incorporate a knowledge selection-guided pointer network into the decoder to copy words from the captured knowledge. Experimental results on DuConv and Wizard-of-Wikipedia datasets demonstrate that our model can generate more coherent and informative responses than baselines do.
 
Article
In low-resource domains, it is challenging to achieve good performance using existing machine learning methods due to a lack of training data and mixed data types (numeric and categorical). In particular, categorical variables with high cardinality pose a challenge to machine learning tasks such as classification and regression because training requires sufficiently many data points for the possible values of each variable. Since interpolation is not possible, nothing can be learned for values not seen in the training set. This paper presents a method that uses prior knowledge of the application domain to support machine learning in cases with insufficient data. We propose to address this challenge by using embeddings for categorical variables that are based on an explicit representation of domain knowledge (KR), namely a hierarchy of concepts. Our approach is to 1. define a semantic similarity measure between categories, based on the hierarchy—we propose a purely hierarchy-based measure, but other similarity measures from the literature can be used—and 2. use that similarity measure to define a modified one-hot encoding. We propose two embedding schemes for single-valued and multi-valued categorical data. We perform experiments on three different use cases. We first compare existing similarity approaches with our approach on a word pair similarity use case. This is followed by creating word embeddings using different similarity approaches. A comparison with existing methods such as Google, Word2Vec and GloVe embeddings on several benchmarks shows better performance on concept categorisation tasks when using knowledge-based embeddings. The third use case uses a medical dataset to compare the performance of semantic-based embeddings and standard binary encodings. Significant improvement in performance of the downstream classification tasks is achieved by using semantic information.
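The modified one-hot idea can be illustrated as follows: instead of a single 1 at the category's own position, each position holds the semantic similarity between the encoded category and the category that position represents, so hierarchically related categories receive nearby vectors. The toy hierarchy, the Wu-Palmer-style path similarity, and the function names below are illustrative assumptions, not the paper's exact measure.

```python
def depth(node, parent):
    d = 0
    while node in parent:
        node = parent[node]
        d += 1
    return d

def ancestors(node, parent):
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def wu_palmer(a, b, parent):
    """Hierarchy-based similarity: 2*depth(LCS) / (depth(a) + depth(b))."""
    anc_b = set(ancestors(b, parent))
    lcs = next(n for n in ancestors(a, parent) if n in anc_b)
    da, db, dl = depth(a, parent), depth(b, parent), depth(lcs, parent)
    return 2 * dl / (da + db) if (da + db) else 1.0

def semantic_one_hot(value, categories, parent):
    """Modified one-hot: each slot carries the similarity of `value` to that slot's category."""
    return [wu_palmer(value, c, parent) for c in categories]

# Toy hierarchy: each node maps to its parent; the root has no entry.
parent = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "sparrow": "bird", "bird": "animal", "animal": "thing"}
print(semantic_one_hot("dog", ["dog", "cat", "sparrow"], parent))  # [1.0, 0.667, 0.333]
```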
 
Article
The amazing progress achieved in both data collection and transfer has confronted us with a vast volume of stored and transient data. Analyzing such data can yield valuable knowledge, providing a competitive advantage in support of decision-making. However, the high volume of the data makes it impossible to analyze it manually. Data mining methods have been developed to automate this process. These methods extract useful knowledge from a massive amount of data. The vast majority of available methods focus on finding different types of patterns from various kinds of data, whereas few of them pay enough attention to the usability of the mined patterns. Consequently, there is a noticeable gap between delivered patterns and business expectations. Actionable Knowledge Discovery (AKD) is motivated to narrow this gap. Up to now, many AKD methods have been proposed. However, there is no comprehensive survey summarizing the different aspects of such methods. Moreover, the lack of clear definitions and boundaries in this area makes it challenging to detect comparable methods. This paper aims at clarifying the definitions and boundaries of the AKD area. In this regard, some viewpoints are defined, and AKD methods are categorized according to them. In addition, AKD methods are reviewed, and finally, a characterization table is presented that concludes the survey and can be used for studying different methods in the AKD area.
 
Article
Entity resolution (ER) is a process that identifies duplicate records referring to a real-world entity and links them together in one or more datasets. As a first step toward reducing the number of required record comparisons, blocking methods attempt to group records that are likely to match. A proper evaluation of blocking methods for selecting the best one has a direct effect on the ultimate ER performance. Currently, the available metrics for evaluating blocking techniques exclusively assess their actual potential. However, it is possible to deduce new pairs from the identified ones in dirty datasets due to transitive closure between matching record pairs. In the present study, a modification of current metrics is proposed to obtain a more accurate evaluation of blocking methods taking into account transitive closure and the potential of blocking methods. Comparing the existing and proposed metrics for ten available blocking algorithms on two dirty datasets demonstrates that the proposed metrics correlate significantly with ER final performance.
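The transitive-closure adjustment can be illustrated with a union-find sketch: expand the known matching pairs into full duplicate clusters, enumerate every pair implied by those clusters, and measure how many of them the blocking output covers (pair completeness). The function names and the choice of metric here are illustrative assumptions, not the paper's exact modified metrics.

```python
from itertools import combinations

def closure_pairs(matching_pairs):
    """Expand matched pairs by transitive closure and return all implied duplicate pairs."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for a, b in matching_pairs:
        parent[find(a)] = find(b)           # union the two clusters
    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    return {frozenset(p) for c in clusters.values() for p in combinations(sorted(c), 2)}

def pair_completeness(blocked_pairs, matching_pairs):
    """Fraction of closure-implied duplicate pairs that the blocking output retains."""
    truth = closure_pairs(matching_pairs)
    found = {frozenset(p) for p in blocked_pairs} & truth
    return len(found) / len(truth) if truth else 1.0
```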
 
Article
Modern information systems have to support the user in managing, understanding and interacting with more and more data. Visualization can help users comprehend information more easily and reach conclusions in a relatively shorter time. However, the bigger the data is, the harder the problem of visualizing it becomes. In this paper we focus on the problem of placing a set of values in 2D (or 3D) space. We present a novel family of algorithms that produces spiral-like layouts where the biggest values are placed in the centre of the spiral and the smaller ones in the peripheral area, while respecting the relative sizes. The derived layout is suitable not only for the visualization of medium-sized collections of values, but also for collections of values whose sizes follow a power-law distribution, because it makes the bigger values (and their relative size) evident and does not leave empty spaces in the peripheral area, which is occupied by the majority of the values, which are small. Therefore, the produced drawings are both informative and compact. The algorithm has linear time complexity (assuming the values are sorted), very limited main memory requirements, and produces drawings of bounded space, making it appropriate for interactive visualizations and visual interfaces in general. We showcase the application of the algorithms in various domains and interactive interfaces.
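The layout idea can be sketched by walking an Archimedean spiral outward from the origin and placing the values, sorted in decreasing order, at successive positions whose angular spacing roughly respects each value's radius, so the largest disks occupy the centre and the small ones fill the periphery. The step sizes, the disk interpretation of values, and the overlap approximation are illustrative assumptions, not the paper's algorithm.

```python
import math

def spiral_layout(values, spacing=1.0):
    """Place values (interpreted as disk areas) on an outward Archimedean spiral, largest first."""
    radii = [math.sqrt(v / math.pi) for v in sorted(values, reverse=True)]
    placements, theta = [], 0.0
    for i, r in enumerate(radii):
        rho = spacing * theta / (2 * math.pi)            # Archimedean spiral: rho grows with angle
        placements.append((rho * math.cos(theta), rho * math.sin(theta), r))
        # Advance the angle so consecutive disks roughly clear each other along the arc.
        step = (radii[i] + radii[i + 1]) if i + 1 < len(radii) else r
        theta += step / max(rho, spacing)
    return placements  # list of (x, y, radius), biggest values nearest the centre
```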
 
Article
Knowledge graph completion aims to perform link prediction between entities. Reasoning over paths in incomplete knowledge graphs has become a hot research topic. However, most of the existing path reasoning methods ignore both the overlapping phenomenon of paths between similar relations and the order information of relations in paths, and they only consider the obvious paths between entities. To address the problems of knowledge graph reasoning, a new path-based reasoning method with K-Nearest Neighbor and position embedding is proposed in this paper. The method first projects entities and relations to continuous vector space, and then utilizes the idea of the K-Nearest Neighbor algorithm to find the K nearest neighbors of each relation. After that, the paths of similar relations are merged. Then, paths are modeled through the combination operations on relations. The position information of the relations is considered during the combination, that is, the position embedding is added to the relation vector in the path. A series of experiments are conducted on real datasets to prove the effectiveness of the proposed method. The experimental results show that the proposed method significantly outperforms all baselines on entity prediction and relation prediction tasks.
 
Article
Machine learning systems offer the key capability to learn about their operating environment from the data that they are supplied. They can learn via supervised and unsupervised training, from system results during operations, or both. However, while machine learning systems can identify solutions to problems and questions, in many cases they cannot explain how they arrived at them. Moreover, they cannot guarantee that they have not relied upon confounding variables and other non-causal relationships. In some circumstances, learned behaviors may violate legal or ethical principles such as rules regarding non-discrimination. In these and other cases, learned associations that are true in many – but not all – cases may result in critical system failures when processing exceptions to the learned behaviors. A machine learning system that applies gradient descent to expert system networks has been proposed as a solution to this. The expert system foundation means that the system can only learn across valid pathways, while the machine learning capabilities facilitate optimization via training and operational learning. While the initial results of this approach are promising, cases were noted where networks were optimized into high-error states (and for which continued optimization further increased the error level). This paper proposes and evaluates multiple techniques to handle these high-error networks and improve system performance in these cases.
 
Article
Question Answering (QA) is a field of study devoted to developing automatic methods for answering questions expressed in natural language. Recently, the emergence of a new generation of intelligent assistants, such as Siri, Alexa, and Google Assistant, has intensified the importance of effective and efficient QA systems able to handle questions of different complexities. Regarding the type of question to be answered, QA systems have been divided into two sub-areas: (i) factoid questions that require a single fact, e.g., the name of a person or a date, and (ii) non-factoid questions that need a more complex answer, e.g., descriptions, opinions, or explanations. While factoid QA systems have surpassed human performance on some benchmarks, automatic systems for answering non-factoid questions remain a challenge and an open research problem. This work provides an overview of recent research addressing non-factoid questions. It focuses on which methods have been applied to each task, the data sets available, challenges and limitations, and possible research directions. From a total of 455 recent studies, we selected 75 papers for in-depth analysis, based on our quality control system and exclusion criteria. This systematic review helped to answer which tasks and methods are involved in non-factoid QA, which data sets are available, what the limitations are, and what the recommendations for future research are.
 
Article
Various applications in remote sensing demand automatic detection of changes in optical satellite images of the same scene acquired over time. This paper investigates how to leverage autoencoders in change vector analysis in order to better delineate possible changes in a pair of co-registered optical satellite images. Consider a primary image and a secondary image acquired over time in the same scene. First, an autoencoder artificial neural network is trained on the primary image. Then, the reconstruction of both images is obtained via the trained autoencoder, so that the spectral angle distance can be computed pixelwise on the reconstructed data vectors. Finally, a threshold algorithm is used to automatically separate the foreground changed pixels from the unchanged background. The assessment of the proposed method is performed on three pairs of benchmark hyperspectral images using different criteria, such as overall accuracy, missed alarms and false alarms. In addition, the method yields promising results in the analysis of a pair of multispectral images of the burned area in the Majella National Park (Italy).
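The change-detection step can be illustrated directly: compute the spectral angle between the reconstructed primary and secondary pixel vectors, then threshold the resulting distance map to separate changed pixels from the background. The mean-plus-k-standard-deviations threshold used here is an illustrative stand-in for the paper's thresholding algorithm.

```python
import numpy as np

def spectral_angle_map(img1, img2, eps=1e-12):
    """Pixel-wise spectral angle (radians) between two co-registered (H, W, bands) images."""
    dot = (img1 * img2).sum(axis=-1)
    norms = np.linalg.norm(img1, axis=-1) * np.linalg.norm(img2, axis=-1)
    return np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))

def change_mask(recon_primary, recon_secondary, k=2.0):
    """Flag pixels whose spectral angle exceeds mean + k*std of the distance map."""
    sad = spectral_angle_map(recon_primary, recon_secondary)
    return sad > sad.mean() + k * sad.std()
```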
 
Article
Cross-border trade barriers introduced by national authorities to protect local business and the labor force cause substantial damage to international economic actors. Therefore, identifying such barriers beyond regulators' audit reporting is of paramount importance. This paper contributes towards this goal by proposing a novel approach that uses natural language processing and deep learning methods for uncovering Finnish-Russian trade barriers in the fish industry from selected business discussion forums. In particular, the approach makes use of (i) a three-leg ontology for data collection, (ii) a BERT architecture for mapping the Onkivisit-Shaw-Kananen trade barrier ontology to negative-polarity posts and (iii) a new reverse-engineering clustering approach to identify the causes of individual trade-barrier types. A comparison with official statistical reports has been carried out to identify the salient aspects of trade barriers that hold regardless of the time difference. The findings reveal the dominance of the Time-length barrier type in the Finnish discussion forum dataset, and of the import-vs-export tariff discrepancy and product requirement barrier types in the Russian forum dataset. The developed framework can serve as a tool to assist companies or regulators in providing business-related recommendations to overcome the detected trade barriers.
 
Figures: the ProdSpot architecture; example listing with titles, seed surface forms, and generated sentences; product mention characterization for each category
Article
As online purchasing becomes more popular, users trust information published on social media more than advertisement content. Opinion mining is often applied to social media, and opinion target extraction is one of its main sub-tasks. In this paper, we focus on recognizing target entities related to electronic products. We propose a method called ProdSpot for training a named entity extractor that identifies product mentions in user text based on the distant supervision paradigm. ProdSpot relies only on an unlabeled set of product offer titles and a list of product brand names. Initially, surface forms are identified from product titles. Given a collection of user posts, our method selects sentences that contain at least one surface form to be automatically labeled. A cluster-based filtering strategy is applied to detect and filter out possibly mislabeled sentences. Finally, data augmentation is used to produce more general and diverse training data. The set of augmented sentences constitutes the training set used to train a recognition model. Experiments demonstrate that the automatically generated training data yields results similar to those achieved by a supervised model. Our best precision is only 9% lower than that of a supervised model, while our recall is higher by approximately 7% in two distinct product categories. Compared to a state-of-the-art supervised method specifically designed to recognize mobile phone names, our method achieved competitive results, with F1 values only 4% lower, while not requiring user supervision. Our filtering and data augmentation steps directly influence these results.
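A simplified Python sketch of the distant-supervision labeling idea described above, under stated assumptions: the surface-form heuristic (brand followed by the next title token) and the BIO tags are illustrative simplifications, not ProdSpot's actual procedure.

    # Sketch: auto-label user sentences with product mentions derived from offer titles.
    def surface_forms(offer_titles, brands):
        """Assumed heuristic: a surface form is a brand name followed by the next title token."""
        forms = set()
        for title in offer_titles:
            tokens = title.split()
            for i, tok in enumerate(tokens[:-1]):
                if tok.lower() in brands:
                    forms.add(f"{tok} {tokens[i + 1]}".lower())
        return forms

    def auto_label(sentence, forms):
        tokens = sentence.split()
        labels = ["O"] * len(tokens)
        for i in range(len(tokens) - 1):
            if f"{tokens[i]} {tokens[i + 1]}".lower() in forms:
                labels[i], labels[i + 1] = "B-PROD", "I-PROD"
        return list(zip(tokens, labels))

    forms = surface_forms(["Samsung Galaxy S21 128GB"], {"samsung"})
    print(auto_label("Just bought a Samsung Galaxy and love it", forms))

Sentences labeled this way would then pass through the cluster-based filtering and augmentation steps the abstract mentions before training the recognizer.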
 
Article
The main objective of multilingual sentiment analysis is to analyze reviews regardless of the language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is challenging since each language differs in terms of syntax, grammar, etc. This paper presents SentiCode, a new language-independent representation approach for sentiment analysis. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between languages. Instead, it exploits features common across languages, such as the part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine/deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated using an ablation study. The results highlight SentiCode's accuracy of 70% and a good trade-off between accuracy and computing time (about 0.67 seconds in total for training and testing), which is convenient for real-time applications.
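A toy Python sketch of what a language-independent encoding of this kind might look like, under stated assumptions: the code vocabulary and the tiny polarity lexicon below are invented for illustration and are not SentiCode's actual vocabulary.

    # Sketch: replace tokens by coarse codes built from Universal Dependencies POS tags plus polarity.
    POLARITY = {"good": "POS", "great": "POS", "bad": "NEG", "terrible": "NEG",
                "bon": "POS", "mauvais": "NEG"}          # toy multilingual lexicon (assumed)

    def encode(tagged_tokens):
        """tagged_tokens: list of (token, UD POS tag) pairs from any UD-compatible tagger."""
        codes = []
        for tok, upos in tagged_tokens:
            pol = POLARITY.get(tok.lower())
            if pol:
                codes.append(f"{upos}_{pol}")            # e.g. ADJ_POS
            elif upos in {"ADJ", "ADV", "VERB", "NOUN", "PART"}:
                codes.append(upos)                       # keep sentiment-bearing POS classes only
        return " ".join(codes)

    print(encode([("The", "DET"), ("camera", "NOUN"), ("is", "AUX"), ("great", "ADJ")]))
    # -> "NOUN ADJ_POS": the same code sequence regardless of the source language

Because the codes are shared across languages, a single classifier can then be trained on the encoded sequences from several languages and domains at once.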
 
Article
Cloud hosting is a kind of storage that enables users to access, save, and manage their data in a secure and private cloud environment. As a result, users no longer need to build and maintain storage infrastructure on their own computers or servers. Many businesses are nevertheless hesitant to embrace cloud storage because of the complexities of data privacy and security. This study proposes an easy-to-use and secure method for cloud storage sharing and data access that can be implemented quickly and easily. The solution requires users to hold a secure password and biometric data in order to function properly. The capacity of attackers to deceive consumers into disclosing critical information to their service providers is a primary cause of this problem; cloud storage systems must therefore have a secure framework in place for users to connect and interact with one another. Cloud storage offers many benefits, including enabling users to store and manage their data in a safe environment in which they can regulate and manage their data security. This article addresses the different elements that must be taken into consideration when implementing a secure and authenticated data storage model. Several protocols have been established to deal with this problem; unfortunately, they are not sufficiently secure to prevent a wide variety of security intrusions, and they contain protocol vulnerabilities that make them susceptible to both user-side and server-side attacks. In the proposed model, a multikey Fully Homomorphic Encryption (FHE) algorithm is used to encrypt stored cloud data, providing a reliable protocol for remote access to cloud data and for data sharing between geographically dispersed devices.
 
Article
Emergency Departments (EDs) are among the most overcrowded places in public hospitals. Machine learning can support decisions on effective ED resource management by accurately forecasting the number of ED visits. In addition, Explainable Artificial Intelligence (XAI) techniques can help explain the decisions of forecasting models and address challenges such as a lack of trust in machine learning results. The objective of this paper is to use machine learning and XAI to forecast and explain the ED visits on the next on-duty day. Towards this end, a case study is presented that uses the XGBoost algorithm to create a model that forecasts the number of patient visits to the ED of the University Hospital of Ioannina in Greece, based on historical data from patient visits, time-based data, dates of holidays and special events, and weather data. The SHapley Additive exPlanations (SHAP) framework is used to explain the model. The evaluation of the forecasting model resulted in an MAE of 18.37, compared with an MAE of 29.38 for the baseline, revealing a more accurate model. The number of patient visits is mostly affected by the day of the week of the on-duty day, the mean number of visits over the previous four on-duty days, and the maximum daily temperature. The results of this work can help policy makers in healthcare make more accurate and transparent decisions that increase the trust of the people affected by them (e.g., medical staff).
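A minimal Python sketch of the XGBoost-plus-SHAP pattern the abstract describes, under stated assumptions: the feature names and synthetic data below are illustrative stand-ins, not the study's real dataset, and the MAE is computed in-sample for brevity rather than on held-out on-duty days.

    # Sketch: gradient-boosted forecaster of daily ED visits explained with SHAP values.
    import numpy as np
    import pandas as pd
    import xgboost as xgb
    import shap

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "day_of_week": rng.integers(0, 7, 500),
        "mean_visits_last_4_duties": rng.normal(120, 15, 500),
        "max_temperature": rng.normal(20, 8, 500),
        "is_holiday": rng.integers(0, 2, 500),
    })
    y = (df["mean_visits_last_4_duties"] + 5 * df["is_holiday"]
         + 0.5 * df["max_temperature"] + rng.normal(0, 10, 500))

    model = xgb.XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
    model.fit(df, y)

    mae = np.mean(np.abs(model.predict(df) - y))
    print(f"in-sample MAE: {mae:.2f}")

    explainer = shap.TreeExplainer(model)                  # per-feature contribution to each forecast
    shap_values = explainer.shap_values(df)
    print(dict(zip(df.columns, np.abs(shap_values).mean(axis=0))))

The mean absolute SHAP values give the global feature ranking (e.g., day of week, recent visit averages, temperature), while the per-row values explain individual daily forecasts.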
 
Article
Hierarchical clustering of multivariate data usually provides useful information on the similarity among elements. Unfortunately, the clustering does not immediately reveal the structure governing the data. Moreover, the amount of information retrieved by the clustering can sometimes be so large that the results become hard to interpret. This work presents two tools for deriving relevant information from a large number of quantitative multivariate data simply by post-processing the dendrograms resulting from hierarchical clustering. The first tool helps gain insight into the physical relevance of the obtained clusters, i.e., whether the detected families of elements result from true similarities or from spurious ones due to, e.g., experimental uncertainty. The second tool provides deeper knowledge of the factors governing the distribution of the elements in the multivariate space, that is, the determination of the most relevant parameters affecting the similarities among the configurations. These tools are particularly suitable for processing experimental results in order to cope with the related uncertainties, or for analysing multivariate data resulting from the study of complex or chaotic systems.
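A short Python sketch of the kind of dendrogram post-processing this abstract alludes to, under stated assumptions: the "uncertainty" cut and the synthetic data are illustrative; the paper's two tools are more elaborate than a single distance cut.

    # Sketch: hierarchical clustering, then cut the dendrogram at a distance comparable to the
    # assumed experimental uncertainty so that families closer than the uncertainty are merged.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.2, (20, 4)), rng.normal(1, 0.2, (20, 4))])  # two synthetic families

    Z = linkage(X, method="ward")                  # linkage matrix encoding the dendrogram

    uncertainty = 0.5                              # assumed scale of the experimental uncertainty
    labels = fcluster(Z, t=uncertainty, criterion="distance")
    print("families separated beyond the uncertainty scale:", len(set(labels)))

Clusters that only appear below the uncertainty threshold would be treated as spurious, while those that survive the cut are candidates for physically meaningful families.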
 
Figures: example of a product graph for a typical catalog of the Smartphone category; example reviews for the products in Fig. 1, annotated with opinion, product, and aspect expressions; enriched product graph combining the graph of Fig. 1 with the opinions of Fig. 2; architecture of the proposed neural network model; class-imbalance proportions for the five product categories, obtained for each threshold 𝜖 used for building training examples
Article
Product Graphs (PGs) are knowledge graphs that structure the relationships between products and their characteristics. They have become very popular lately due to their potential to enable AI-related tasks in e-commerce. With the rise of social media, a lot of dynamic and subjective information on products and their characteristics has become widely available, creating an opportunity to aggregate such information into PGs. In this paper, we propose a method called PGOpi (Product Graph enriched with Opinions), whose goal is to enrich existing PGs with subjective information extracted from reviews written by customers. PGOpi uses a deep learning model to map opinions extracted from user reviews to the nodes in the PG that correspond to the targets of these opinions. To alleviate the dependency on manual labeling for training the model, we devise a distant supervision strategy based on word embeddings. We have performed an extensive experimental evaluation on five product categories from two representative real-world datasets. The proposed unsupervised approach achieves a superior micro F1 score over more complex unsupervised models. It also presents results comparable to those of a fully supervised model.
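A toy Python sketch of the embedding-based distant-supervision idea mentioned above, under stated assumptions: the three-dimensional vectors and node names are invented for illustration, whereas PGOpi uses pre-trained word embeddings and a learned model rather than a plain nearest-neighbour match.

    # Sketch: map an extracted opinion target to the most similar product-graph node by cosine similarity.
    import numpy as np

    toy_vectors = {"battery": np.array([1.0, 0.1, 0.0]),
                   "screen": np.array([0.0, 1.0, 0.1]),
                   "display": np.array([0.1, 0.9, 0.2])}    # assumed embeddings

    pg_nodes = ["battery", "screen"]                          # attribute nodes of the product graph

    def embed(text):
        vecs = [toy_vectors[w] for w in text.lower().split() if w in toy_vectors]
        return np.mean(vecs, axis=0) if vecs else np.zeros(3)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def map_opinion_to_node(target_phrase):
        t = embed(target_phrase)
        return max(pg_nodes, key=lambda n: cosine(t, embed(n)))

    print(map_opinion_to_node("display"))   # -> "screen": a review about the display enriches the screen node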
 
Article
Time series forecasting is one of the most active research topics. Machine learning methods have been increasingly adopted to solve these predictive tasks. However, a recent work presented evidence that these approaches systematically show lower predictive performance than simple statistical methods. In this work, we counter those results: we show that they hold only for extremely small sample sizes. Using a learning curve method, our results suggest that machine learning methods improve their relative predictive performance as the sample size grows. The R code to reproduce all of our experiments is available at https://github.com/vcerqueira/MLforForecasting.
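The authors' code is in R; the following is a Python sketch of the learning-curve idea only, under stated assumptions: a random forest on lagged features stands in for the paper's machine learning methods, a last-value (naive) forecast stands in for the statistical baselines, and the series is synthetic.

    # Sketch: compare an ML forecaster with a naive baseline as the training sample size grows.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(2)
    n, lags = 3000, 5
    series = np.sin(np.arange(n) / 20) + rng.normal(0, 0.2, n)

    X = np.column_stack([series[i:n - lags + i] for i in range(lags)])   # lagged feature matrix
    y = series[lags:]
    X_train, y_train, X_test, y_test = X[:-500], y[:-500], X[-500:], y[-500:]

    for size in (50, 200, 1000, len(y_train)):
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X_train[-size:], y_train[-size:])
        mae_ml = np.mean(np.abs(model.predict(X_test) - y_test))
        mae_naive = np.mean(np.abs(X_test[:, -1] - y_test))              # last observed value as forecast
        print(f"train size {size:5d}: ML MAE {mae_ml:.3f} vs naive MAE {mae_naive:.3f}")

On such a curve, the ML model typically starts behind the naive baseline at the smallest training sizes and overtakes it as more history becomes available, which is the qualitative pattern the abstract argues for.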
 
Article
Functional requirements on a software system are traditionally captured as text that describes the expected functionality in the domain of a real-world system. Natural language processing methods allow us to extract the knowledge from such requirements and transform it, e.g., into a model. Moreover, these methods can improve the quality of the requirements, which usually suffer from ambiguity, incompleteness, and inconsistency. This paper presents a novel approach to using natural language processing for this purpose. We use the method of grammatical inspection to find specific patterns in the descriptions of functional requirement specifications (written in English). Then, we transform the requirements into a model of Normalized Systems elements. This may realize a possible component of the eagerly awaited text-to-software pipeline. The input of this method is the textual requirements; its output is a running prototype of an information system created using Normalized Systems (NS) techniques. The system is therefore ready to accept further enhancements, e.g., custom code fragments, in an evolvable manner ensured by compliance with the NS principles. A demonstration of the pipeline implementation is also included in this paper. The text-processing part of our pipeline extends the existing pipeline implemented in our system TEMOS, in which we propose and implement methods for checking the quality of textual requirements with respect to ambiguity, incompleteness, and inconsistency.
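A small Python sketch of the pattern-finding step, under stated assumptions: a single regular expression over the common "X shall <verb> <object>" phrasing stands in for the paper's grammatical inspection of parsed sentences, and the field names are invented for illustration.

    # Sketch: extract actor, modal, action and object from a typical functional-requirement sentence.
    import re

    REQ_PATTERN = re.compile(
        r"^(?P<actor>[\w ]+?) (?P<modal>shall|must|should) (?P<action>\w+) (?P<object>.+)\.$",
        re.IGNORECASE)

    def extract(requirement):
        m = REQ_PATTERN.match(requirement.strip())
        return m.groupdict() if m else None          # None signals a sentence the pattern cannot handle

    print(extract("The system shall store customer orders."))
    # {'actor': 'The system', 'modal': 'shall', 'action': 'store', 'object': 'customer orders'}

Structured tuples of this kind are what a downstream step could translate into model elements (here, hypothetically, NS data and task elements), while sentences that match no pattern are candidates for ambiguity or incompleteness warnings.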
 
Figures: graphical representation of the proposed approaches for dealing with negative preferences; a sample restaurant recommendation; content of the four recommendation lists
Article
Negative information plays an important role in the way we express our preferences and desires. However, it has not received the same attention as positive feedback in recommender systems. Here we show how negative user preferences can be exploited to generate recommendations. We rely on a logical semantics for the recommendation process, introduced in a previous paper, which allows us to single out three main conceptual approaches, as well as a set of variations, for dealing with negative user preferences. The formal framework provides a common ground for analysis and comparison. In addition, we show how existing approaches to recommendation correspond to alternatives in our framework.
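A toy Python sketch of one conceptual way to use negative preferences, under stated assumptions: negative preferences are treated as hard filters applied before positive preferences rank the remaining items; this is an illustration of one possible approach, not the paper's formal semantics, and the restaurant data is invented.

    # Sketch: filter items violating negative preferences, then rank by positive preferences and rating.
    restaurants = [
        {"name": "A", "cuisine": "italian", "price": "high", "rating": 4.6},
        {"name": "B", "cuisine": "sushi", "price": "low", "rating": 4.2},
        {"name": "C", "cuisine": "italian", "price": "low", "rating": 3.9},
    ]

    negative = {"price": {"high"}}            # "I do not want expensive places"
    positive = {"cuisine": {"italian"}}       # "I like Italian food"

    def recommend(items, positive, negative, k=2):
        allowed = [it for it in items
                   if all(it[attr] not in bad for attr, bad in negative.items())]
        ranked = sorted(allowed,
                        key=lambda it: (sum(it[a] in good for a, good in positive.items()),
                                        it["rating"]),
                        reverse=True)
        return [it["name"] for it in ranked[:k]]

    print(recommend(restaurants, positive, negative))   # -> ['C', 'B']

Softer alternatives (e.g., penalizing rather than excluding disliked items) would correspond to other branches of the framework sketched in the paper.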
 
Figures: article shortlisting procedure for the systematic literature review; distribution of journal articles by publication year
Article
Cold start problems in recommender systems pose various challenges to the adoption and use of recommender systems, especially for new item uptake and new user engagement. This restricts organizations from realizing the business value of recommender systems, as they have to incur marketing and operations costs to engage new users and promote new items. Owing to this, several studies have been carried out by recommender systems researchers to address cold start problems. However, there has been very limited recent research on collating these approaches and algorithms. To address this gap, this paper conducts a systematic literature review of the strategies and approaches proposed by researchers in the last decade, from January 2010 to December 2021, and synthesizes them into two categories: data-driven strategies and approach-driven strategies. Furthermore, the approach-driven strategies are categorized into five main clusters based on deep learning, matrix factorization, hybrid approaches, and other novel approaches in collaborative filtering and content-based algorithms. The scope of this study is limited to a systematic literature review; it does not include an experimental study to benchmark and recommend the best approaches and their contexts of use in cold start scenarios.
 
Article
The COVID-19 pandemic has fueled interest in artificial intelligence tools for quick diagnosis to limit virus spreading. Over 60% of infected people complain of a dry cough, and cough and other respiratory sounds have been used to build diagnosis models in much recent research. In this work, we propose an augmentation pipeline that is applied to the pre-filtered data and uses (i) a pitch-shifting technique to augment the raw signal and (ii) the spectral data augmentation technique SpecAugment to augment the computed mel-spectrograms. A deep learning-based architecture that hybridizes convolutional neural networks and long short-term memory with an attention mechanism is proposed for building the classification model. The feasibility of the proposed approach is demonstrated through a set of testing scenarios using the large-scale COUGHVID cough dataset and through a comparison with three baseline models. We show that our classification model achieved a testing accuracy of 91.13%, a sensitivity of 90.93%, and an area under the receiver operating characteristic curve of 91.13%.
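A minimal Python sketch of the two augmentation steps named in the abstract, under stated assumptions: librosa is used for pitch shifting and mel-spectrogram computation, the SpecAugment step is reduced to one random frequency mask and one random time mask, and the parameter values and the random "cough" signal are illustrative.

    # Sketch: pitch-shift the raw cough signal, then apply SpecAugment-style masking to its mel-spectrogram.
    import numpy as np
    import librosa

    def augment(signal, sr, n_steps=2, n_freq_mask=8, n_time_mask=16):
        # (i) pitch-shift the raw waveform by n_steps semitones
        shifted = librosa.effects.pitch_shift(signal, sr=sr, n_steps=n_steps)

        # mel-spectrogram of the shifted signal
        mel = librosa.feature.melspectrogram(y=shifted, sr=sr)

        # (ii) SpecAugment-style masking: zero out one frequency band and one time span
        rng = np.random.default_rng()
        f0 = rng.integers(0, mel.shape[0] - n_freq_mask)
        t0 = rng.integers(0, mel.shape[1] - n_time_mask)
        mel[f0:f0 + n_freq_mask, :] = 0.0
        mel[:, t0:t0 + n_time_mask] = 0.0
        return mel

    sr = 16000
    cough = np.random.randn(sr).astype(np.float32)    # stand-in for one second of pre-filtered cough audio
    print(augment(cough, sr).shape)                   # augmented mel-spectrogram fed to the CNN-LSTM model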
 
Article
Depression is a common mental disorder that may lead to suicide when the condition is severe. With the advancement of technology, billions of people share their thoughts and feelings on social media at any time and from any location. Social media data has therefore become a valuable resource for studying and detecting depression in users. In our work, we use Instagram as the platform to study depression detection. We use hashtags to find users and label them as depressive or non-depressive according to their self-statements. Text, images, and posting times are used jointly to detect depression. Furthermore, the time interval between posts is important information when studying medical-related data. In this paper, we use a time-aware LSTM to handle the irregularity of time intervals in social media data and use an attention mechanism to pay more attention to the posts that are important for detecting depression. Experimental results show that our model outperforms previous work with an F1-score of 95.6%. In addition to the good performance on Instagram, our model also outperforms state-of-the-art methods in detecting depression on Twitter, with an F1-score of 90.8%. This indicates the potential of our model to serve as a reference for psychiatrists assessing patients, or for users who want to know more about their mental health condition.
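A compact PyTorch (Python) sketch of the time-aware gating idea behind time-aware LSTMs, under stated assumptions: the decomposition into short- and long-term memory with a logarithmic decay follows the general T-LSTM formulation rather than this paper's exact model, and the dimensions and inputs are illustrative.

    # Sketch: discount the short-term part of the previous cell memory by the elapsed time between posts.
    import math
    import torch
    import torch.nn as nn

    class TimeAwareLSTMCell(nn.Module):
        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.cell = nn.LSTMCell(input_size, hidden_size)
            self.decomp = nn.Linear(hidden_size, hidden_size)   # extracts the short-term memory component

        def forward(self, x, h, c, delta_t):
            c_short = torch.tanh(self.decomp(c))                # short-term component of the memory
            c_long = c - c_short                                # long-term component is kept untouched
            decay = 1.0 / torch.log(delta_t + math.e)           # larger gaps between posts decay more
            c_adj = c_long + decay.unsqueeze(-1) * c_short
            return self.cell(x, (h, c_adj))

    cell = TimeAwareLSTMCell(16, 32)
    h, c = torch.zeros(1, 32), torch.zeros(1, 32)
    for step in range(3):
        x = torch.randn(1, 16)                                  # stand-in post embedding (text/image features)
        delta_t = torch.tensor([float(step)])                   # days since the previous post
        h, c = cell(x, h, c, delta_t)
    print(h.shape)

In the full model, the hidden states produced for a user's post sequence would then be combined by the attention mechanism to weight the posts most indicative of depression.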
 
Top-cited authors
Manfred Reichert
  • Ulm University
Peter Dadam
  • Ulm University
Thi Ngoc Trang Tran
  • Graz University of Technology
Muesluem Atas
  • Graz University of Technology
Yannis Tzitzikas
  • University of Crete