Chapter

Enhancing Healthcare Insights Through Integration of AI and Covering-Based Rough Set Theory in Web Mining


Abstract

In today's knowledge-rich healthcare landscape, combining artificial intelligence (AI) approaches with covering-based rough set theory offers a promising method for identifying significant trends in web-mined healthcare information. Traditional data analysis tools frequently struggle with the inherent inconsistency and complexity of healthcare data, posing obstacles to decision-making and patient treatment. By combining the strength of AI with the solid foundation of covering-based rough set theory, however, healthcare organisations can open up new avenues to improved decision-making processes, better patient outcomes, and innovation in the delivery of healthcare. This study investigates the complementary nature of AI approaches and covering-based rough set theory in healthcare data analysis, highlighting their potential to revolutionise the delivery of health care by facilitating personalised therapy, optimising the deployment of resources, and improving the overall effectiveness of care provided to patients.


ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
This article explores the transformative impact of Artificial Intelligence (AI) in healthcare, with a specific focus on how predictive analytics and decision support systems are revolutionizing patient care. Predictive analytics enable early disease prevention and diagnosis by identifying patterns and risk factors, contributing to improved patient outcomes and cost-effective healthcare. Machine learning facilitates personalized treatment plans, leveraging individual patient data for tailored interventions that enhance efficacy and minimize adverse effects. AI-driven algorithms in medical imaging enhance diagnostic accuracy, providing rapid and precise assessments. Decision support systems, powered by AI, streamline healthcare workflows by offering real-time insights based on patient data and clinical guidelines, facilitating evidence-based decision-making. Remote patient monitoring, facilitated by AI, allows for proactive healthcare interventions by tracking vital signs and identifying potential health issues in real time. The article also discusses challenges and ethical considerations associated with AI integration in healthcare, emphasizing the importance of responsible deployment and regulatory frameworks. The comprehensive exploration underscores how AI is not only transforming patient care but also shaping the future of healthcare delivery.
Article
Full-text available
Three-way decision (T-WD) theory is about thinking, problem solving, and computing in threes. Behavioral decision making (BDM) focuses on affective, cognitive, and social processes employed by humans for choosing the optimal object, of which prospect theory and regret theory are two widely used tools. The hesitant fuzzy set (HFS) captures a series of uncertainties when it is difficult to specify precise fuzzy membership grades. Guided by the principles of three-way decisions as thinking in threes and integrating these three topics together, this paper reviews and examines advances in three-way behavioral decision making (TW-BDM) with hesitant fuzzy information systems (HFIS) from the perspective of the past, present, and future. First, we provide a brief historical account of the three topics and present basic formulations. Second, we summarize the latest development trends and examine a number of basic issues, such as one-sidedness of reference points and subjective randomness for result values, and then report the results of a comparative analysis of existing methods. Finally, we point out key challenges and future research directions.
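The three-way decision rule itself is simple to state: given an estimated conditional probability and a pair of thresholds, an object is accepted, rejected, or deferred for further evidence. A minimal sketch (the threshold values 0.7 and 0.3 are illustrative assumptions, not taken from the paper):

```python
def three_way_decide(p, alpha=0.7, beta=0.3):
    """Classify by estimated probability p into accept / reject / defer."""
    if p >= alpha:
        return "accept"
    if p <= beta:
        return "reject"
    return "defer"  # boundary region: gather more evidence before deciding

# Example: triaging hypothetical cases by estimated risk
for p in (0.85, 0.5, 0.1):
    print(p, "->", three_way_decide(p))
```

The deferred (boundary) region is what distinguishes three-way decisions from a binary accept/reject rule.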
Article
Full-text available
Background Long-term care facilities (LCFs) in South Korea have limited knowledge of and capability to care for patients with delirium. They also often lack an electronic medical record system. These barriers hinder systematic approaches to delirium monitoring and intervention. Therefore, this study aims to develop a web-based app for delirium prevention in LCFs and analyse its feasibility and usability. Methods The app was developed based on the validity of the AI prediction model algorithm. A total of 173 participants were selected from LCFs to participate in a study to determine the predictive risk factors for delirium. The app was developed in five phases: (1) the identification of risk factors and preventive intervention strategies from a review of evidence-based literature, (2) the iterative design of the app and components of delirium prevention, (3) the development of a delirium prediction algorithm and cloud platform, (4) a pilot test and validation conducted with 33 patients living in an LCF, and (5) an evaluation of the usability and feasibility of the app, completed by nurses (the main users). Results A web-based app was developed to predict high risk of delirium and apply preventive interventions accordingly. Moreover, its validity, usability, and feasibility were confirmed after app development. By employing machine learning, the app can predict the degree of delirium risk and issue a warning alarm. Therefore, it can be used to support clinical decision-making, help initiate the assessment of delirium, and assist in applying preventive interventions. Conclusions This web-based app is evidence-based and can be easily mobilised to support care for patients with delirium in LCFs. This app can improve the recognition of delirium and predict the degree of delirium risk, thereby helping develop initiatives for delirium prevention and providing interventions.
Moreover, this app can be extended to predict various risk factors of LCF and apply preventive interventions. Its use can ultimately improve patient safety and quality of care.
Article
Full-text available
The evolution of the coronavirus (COVID-19) disease took a toll on the social, healthcare, economic, and psychological prosperity of human beings. In the past couple of months, many organizations, individuals, and governments have adopted Twitter to convey their sentiments on COVID-19, the lockdown, the pandemic, and hashtags. This paper aims to analyze the psychological reactions and discourse of Twitter users related to COVID-19. In this experiment, Latent Dirichlet Allocation (LDA) has been used for topic modeling. In addition, a Bidirectional Long Short-Term Memory (BiLSTM) model and various classification techniques such as random forest, support vector machine, logistic regression, naive Bayes, decision tree, logistic regression with stochastic gradient descent optimizer, and majority voting classifier have been adapted for analyzing the polarity of sentiment. The effectiveness of the aforesaid approaches along with LDA modeling has been tested, validated, and compared with several benchmark datasets and on a newly generated dataset for analysis. To achieve better results, a dual dataset approach has been incorporated to determine the frequency of positive and negative tweets and word clouds, which helps to identify the most effective model for analyzing the corpora. The experimental result shows that the BiLSTM approach outperforms the other approaches with an accuracy of 96.7%.
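Naive Bayes is one of the polarity classifiers the study compares. A minimal, self-contained sketch of multinomial naive Bayes with add-one smoothing on a tiny hypothetical tweet set (the example texts and labels are invented for illustration):

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, c in zip(docs, labels):
        counts[c].update(doc.lower().split())
    vocab = set(w for cnt in counts.values() for w in cnt)
    return prior, counts, vocab

def predict_nb(model, doc):
    prior, counts, vocab = model
    def log_post(c):
        total = sum(counts[c].values())
        return math.log(prior[c]) + sum(
            math.log((counts[c][w] + 1) / (total + len(vocab)))
            for w in doc.lower().split() if w in vocab)
    return max(prior, key=log_post)

# Invented toy corpus; real work would use the labelled COVID-19 tweets.
docs = ["grateful for healthcare workers", "vaccines bring hope",
        "feeling hopeless in lockdown", "cases rising terrible news"]
labels = ["pos", "pos", "neg", "neg"]
model = train_nb(docs, labels)
print(predict_nb(model, "hope for healthcare"))  # -> pos
```

The BiLSTM and the other classifiers in the study follow the same train/predict pattern, only with learned dense representations instead of raw word counts.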
Article
Full-text available
The accurate and effective prediction of the traffic flow of vehicles plays a significant role in the construction and planning of signalized road intersections. The application of artificially intelligent predictive models in the prediction of the performance of traffic flow has yielded positive results. However, much uncertainty still exists in the determination of which artificial intelligence methods effectively resolve traffic congestion issues, especially from the perspective of the traffic flow of vehicles at a four-way signalized road intersection. A hybrid algorithm, an artificial neural network trained by a particle swarm optimization model (ANN-PSO), and a heuristic artificial neural network (ANN) model were compared in the prediction of the flow of traffic of vehicles using the South African transportation system as a case study. Two hundred and fifty-nine (259) traffic datasets were obtained from the South African road network using inductive loop detectors, video cameras, and GPS-controlled equipment. For the ANN-PSO model, 219 traffic datasets were used for training and 40 for testing, while the ANN used 160 datasets for training, 40 for testing, and 59 for validation. The ANN result presented a logistic sigmoid transfer function with a 13–6–1 model and a testing R2 of 0.99169, compared to the ANN-PSO result, which showed a testing performance of R2 = 0.99710. This result shows that the ANN-PSO model is more efficient and effective than the ANN model in the prediction of the traffic flow of vehicles at a four-way signalized road intersection. Furthermore, the ANN and ANN-PSO models are robust enough to predict traffic flow due to their better testing performance. The modelling approaches proposed in this study will assist transportation engineers and urban planners in designing a traffic control system for traffic lights at four-way signalized road intersections.
Finally, the results of this research will assist transportation engineers and traffic controllers in providing traffic flow information and travel guidance for motorists and pedestrians in the optimization of their travel time decision-making.
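The PSO half of the hybrid can be sketched compactly: each particle is a candidate weight vector, pulled toward its own best position and the swarm's global best. The sketch below trains a single linear neuron on invented data rather than the study's 13–6–1 traffic network, and the inertia and acceleration coefficients are common textbook values, not the study's settings:

```python
import random
random.seed(0)

# Invented toy data: target is y = 2x + 1
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def mse(w):  # the "network" here is one linear neuron: y = w[0]*x + w[1]
    return sum((w[0]*x + w[1] - y)**2 for x, y in data) / len(data)

def pso(fitness, dim=2, n_particles=20, iters=200):
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    gbest = min(pbest, key=fitness)             # swarm's best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (0.7 * vel[i][d]                      # inertia
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])  # cognitive pull
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))    # social pull
                pos[i][d] += vel[i][d]
            if fitness(pos[i]) < fitness(pbest[i]):
                pbest[i] = pos[i][:]
        gbest = min(pbest, key=fitness)
    return gbest

w = pso(mse)
print(round(w[0], 2), round(w[1], 2))  # close to 2 and 1
```

In the ANN-PSO hybrid, the same update loop searches over the full weight vector of the neural network instead of two linear coefficients.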
Article
Full-text available
Big data analytics to assist clinical decision-making has been explored in a variety of clinical fields, as it can make clinical decisions more promising and evidence-based. Due to the sheer size and availability of healthcare data, big data analytics has revolutionized this industry and promises us a world of opportunities. It promises us the power of early detection, prediction, and prevention, and helps us to improve the quality of life. Researchers and clinicians are working to enable big data to have a positive impact on health in the future. Different tools and techniques are being used to analyze, process, accumulate, assimilate, and manage large amounts of healthcare data in either structured or unstructured form. In this review, we address the need for big data analytics in healthcare: why and how can it help to improve life? We present the emerging landscape of big data and analytical techniques in the five sub-disciplines of healthcare, i.e., medical image analysis and imaging informatics, bioinformatics, clinical informatics, public health informatics, and medical signal analytics. We present different architectures, advantages, and repositories of each discipline, drawing an integrated depiction of how distinct healthcare activities are accomplished in the pipeline to facilitate individual patients from multiple perspectives. Finally, the paper ends with the notable applications of and challenges in the adoption of big data analytics in healthcare.
Article
Full-text available
This study examines the current state of artificial intelligence (AI)-based technology applications and their impact on the healthcare industry. In addition to a thorough review of the literature, this study analyzed several real-world examples of AI applications in healthcare. The results indicate that major hospitals are, at present, using AI-enabled systems to augment medical staff in patient diagnosis and treatment activities for a wide range of diseases. In addition, AI systems are making an impact on improving the efficiency of nursing and managerial activities of hospitals. While AI is being embraced positively by healthcare providers, its applications provide both the utopian perspective (new opportunities) and the dystopian view (challenges to overcome). We discuss the details of those opportunities and challenges to provide a balanced view of the value of AI applications in healthcare. It is clear that rapid advances of AI and related technologies will help care providers create new value for their patients and improve the efficiency of their operational processes. Nevertheless, effective applications of AI will require effective planning and strategies to transform the entire care service and operations to reap the benefits of what technologies offer.
Article
Full-text available
The growing healthcare industry is generating a large volume of useful data on patient demographics, treatment plans, payment, and insurance coverage—attracting the attention of clinicians and scientists alike. In recent years, a number of peer-reviewed articles have addressed different dimensions of data mining application in healthcare. However, the lack of a comprehensive and systematic narrative motivated us to construct a literature review on this topic. In this paper, we present a review of the literature on healthcare analytics using data mining and big data. Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted a database search between 2005 and 2016. Critical elements of the selected studies—healthcare sub-areas, data mining techniques, types of analytics, data, and data sources—were extracted to provide a systematic view of development in this field and possible future directions. We found that the existing literature mostly examines analytics in clinical and administrative decision-making. Use of human-generated data is predominant considering the wide adoption of Electronic Medical Record in clinical care. However, analytics based on website and social media data has been increasing in recent years. Lack of prescriptive analytics in practice and integration of domain expert knowledge in the decision-making process emphasizes the necessity of future research.
Article
Full-text available
Consumption of antimicrobial drugs, such as antibiotics, is linked with antimicrobial resistance. Surveillance of antimicrobial drug consumption is therefore an important element in dealing with antimicrobial resistance. Many countries lack sufficient surveillance systems. Usage of web mined data therefore has the potential to improve current surveillance methods. To this end, we study how well antimicrobial drug consumption can be predicted based on web search queries, compared to historical purchase data of antimicrobial drugs. We present two prediction models (linear Elastic Net, and non-linear Gaussian Processes), which we train and evaluate on almost 6 years of weekly antimicrobial drug consumption data from Denmark and web search data from Google Health Trends. We present a novel method of selecting web search queries by considering diseases and drugs linked to antimicrobials, as well as professional and layman descriptions of antimicrobial drugs, all of which we mine from the open web. We find that predictions based on web search data are marginally more erroneous but overall on a par with predictions based on purchases of antimicrobial drugs. This marginal difference corresponds to < 1% point mean absolute error in weekly usage. Best predictions are reported when combining both web search and purchase data. This study contributes a novel alternative solution to the real-life problem of predicting (and hence monitoring) antimicrobial drug consumption, which is particularly valuable in countries/states lacking centralised and timely surveillance systems.
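As a much-simplified stand-in for the paper's Elastic Net, ordinary least squares on a single search-volume feature already illustrates the prediction-from-queries idea and the mean-absolute-error evaluation (the weekly figures below are invented):

```python
# Invented weekly pairs: (search volume for drug-related queries, drug consumption)
weeks = [(120, 42), (150, 50), (90, 33), (200, 68), (170, 58)]

n = len(weeks)
mx = sum(x for x, _ in weeks) / n
my = sum(y for _, y in weeks) / n

# Closed-form simple linear regression (no regularisation, unlike Elastic Net)
slope = (sum((x - mx) * (y - my) for x, y in weeks)
         / sum((x - mx) ** 2 for x, _ in weeks))
intercept = my - slope * mx

# Mean absolute error of in-sample predictions
preds = [slope * x + intercept for x, _ in weeks]
mae = sum(abs(p - y) for p, (_, y) in zip(preds, weeks)) / n
print(round(slope, 3), round(intercept, 2), round(mae, 2))
```

The paper's models add L1/L2 penalties (Elastic Net) or a non-linear kernel (Gaussian Processes) and evaluate on held-out weeks rather than in-sample.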
Chapter
Full-text available
INTRODUCTION Cloud computing is all about the usage of resources – everything from Software-as-a-Service (SaaS) to Platform-as-a-Service (PaaS) to Infrastructure-as-a-Service (IaaS) – over the Internet on a pay-per-use basis. It is either a kind of revolutionary new paradigm for Information Technology (IT) service delivery of the products or a new name for a service delivering products as old as IT itself. When we deal with cloud computing, the very first thing comes to our mind is data sharing. A lot of data gets stored on the data servers from various regions of the world, which is stored over cloud running various applications. These applications consume small amount of data and gener-ate large data sets. The basic cloud computing services provided by organisations are often subdivided based on whether they provide SaaS, PaaS, and IaaS, but their needs for standards appear similar. The cloud is being used over a wide range of locations, for various purposes in different areas which include healthcare, education, government, communities, e-commerce, etc. The cloud computation involves a large network of servers, dumb terminals, data storages which all together work in real-time environment.
Article
Full-text available
Covering-based rough sets [3,15,16,17,18,19] have been introduced as an extension of the basic rough sets introduced by Pawlak [6]. There is only one way to define the lower approximation of a set using a covering. However, as many as four definitions have been proposed for the upper approximation. Accordingly, four types of covering-based rough sets have been defined. In this paper we introduce the general concept of kinds of covering-based rough sets. Also, some topological properties involving the union and intersection of such sets are established, which lead to the determination of the kinds of union and intersection of covering-based rough sets. These properties can be used in the representation of distributed knowledge base systems.
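The asymmetry described above is easy to see in code: the lower approximation of X is the union of cover blocks contained in X (the single definition), while several upper approximations exist. The sketch below uses the common "blocks meeting X" variant on a toy cover (universe, cover, and X are invented):

```python
U = {1, 2, 3, 4, 5, 6}
cover = [{1, 2}, {2, 3, 4}, {4, 5}, {5, 6}]  # blocks may overlap: a cover, not a partition
X = {2, 3, 4, 5}

# Lower approximation: union of cover blocks fully contained in X (unique definition)
lower = set().union(*(B for B in cover if B <= X))

# One of several upper-approximation variants: union of blocks that meet X
upper = set().union(*(B for B in cover if B & X))

print(sorted(lower), sorted(upper))
```

Here the overlap between blocks makes the upper approximation swallow the whole universe, which is exactly the kind of behaviour that differs across the four upper-approximation definitions.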
Article
Full-text available
In this paper we describe an approach to usage-based Web personalization taking into account the full spectrum of Web mining techniques and activities. Our approach is described by the architecture shown in Figure 1, which makes heavy use of data mining techniques, thus making the personalization process automatic, dynamic, and hence up-to-date. Specifically, we have developed techniques for preprocessing Web usage logs and grouping URL references into sets called user transactions [CMS99]. A user transaction is a unit of semantic activity, and performing data mining on such transactions is more meaningful. We describe and compare three different Web usage mining techniques, based on transaction clustering, usage clustering, and association rule discovery, to extract usage knowledge for the purpose of Web personalization. We also propose techniques for combining this knowledge with the current status of an ongoing Web activity to perform real-time personalization. Finally, we provide an experimental evaluation of the proposed techniques using real Web usage data.
Article
Full-text available
The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and of simply navigating through a Web site have increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis includes straightforward statistics, such as page access frequency, as well as more sophisticated forms of analysis, such as finding the common traversal paths through a Web site. Web Usage Mining is the application of data mining techniques to usage logs of large Web data repositories in order to produce results that can be used in the design tasks mentioned above. However, there are several preprocessing tasks that must be performed prior to applying data mining algorithms to the data collected from server logs. This paper presents several data preparation techniques in order to identify unique users and user sessions. Also, a method to divide user sessions into semantically meaningful transactions is defined and successfully tested against two other methods. Transactions identified by the proposed methods are used to discover association rules from real world data using the WEBMINER system [15].
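The session-identification step described above can be sketched with a simple idle-timeout heuristic, a standard preprocessing step before mining transactions (the 30-minute threshold and log entries are illustrative assumptions):

```python
from datetime import datetime, timedelta

# Invented server-log entries for one client: (timestamp, requested URL)
log = [("2024-05-01 09:00", "/home"), ("2024-05-01 09:05", "/products"),
       ("2024-05-01 09:10", "/cart"), ("2024-05-01 11:00", "/home"),
       ("2024-05-01 11:02", "/support")]

def sessionize(entries, timeout=timedelta(minutes=30)):
    """Split one user's requests into sessions wherever the idle gap
    between consecutive requests exceeds the timeout."""
    sessions, current, last = [], [], None
    for ts_str, url in entries:
        ts = datetime.strptime(ts_str, "%Y-%m-%d %H:%M")
        if last is not None and ts - last > timeout:
            sessions.append(current)
            current = []
        current.append(url)
        last = ts
    if current:
        sessions.append(current)
    return sessions

print(sessionize(log))
```

Real log preparation must first attribute requests to unique users (via IP, agent, or cookies), which is the harder part the paper addresses; splitting sessions into semantically meaningful transactions then refines this further.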
Article
Guest Editors Ivana Išgum, Bennett A. Landman, and Tomaž Vrtovec introduce the JMI Special Section on Advances in High-Dimensional Medical Image Processing.
Article
Factor selection is crucial for any enterprise to make a quick and accurate quotation decision. For the objects of quotation with missing values, traditional rough sets construct their relations (e.g. tolerance, cover) either with objects having known values, or only with those that also include the missing values of the attribute domains. Such classifications may not work well for reduction in many real-world problems. In this paper, by measuring the similarity of objects, an incomplete covering rough set is proposed to derive a cover for attribute reduction. Firstly, a similarity relation is defined by an approximation degree that tunes the relation in line with the semantics of objects, and then a cover is induced. Secondly, a reduct is derived by the relations of objects with respect to covers; the properties of reduction are proven. Finally, an approach is developed by discernibility matrix. The experimental results of the UCI (University of California Irvine) Repository and real-life quotation data sets show the incomplete covering rough set outperforms the compared rough set in the accuracy of factor selection within the comparable computation time. It is also demonstrated that the proposed quotation model is effective in quote prediction with various proportions of missing data.
Article
In real cases, missing values tend to contain meaningful information that should be acquired or analyzed before an incomplete dataset is used for machine learning tasks. In this work, two algorithms named jointly fuzzy C-Means and VQNN (Vaguely Quantified Nearest Neighbor) imputation (JFCM-VQNNI) and jointly fuzzy C-Means and fitted VQNN imputation (JFCM-FVQNNI) are proposed, based on the clustering concept and sufficient extraction of uncertain information. In the proposed JFCM-VQNNI and JFCM-FVQNNI algorithms, the missing value is regarded as a decision feature, and a prediction is then generated for objects containing at least one missing value. Specifically, the JFCM-VQNNI algorithm adopts indistinguishability matrices, tolerance relations, and fuzzy membership relations to identify the potential closest filled values based on corresponding similar objects and related clusters. Building on the JFCM-VQNNI algorithm, the JFCM-FVQNNI algorithm synthetically analyzes the fuzzy membership of the dependent features for instances within each cluster. In order to fill the missing values more accurately, the JFCM-FVQNNI algorithm performs fuzzy decision membership adjustment for each object with respect to the related clusters by considering highly relevant decision attributes. Experiments have been carried out on five datasets. Based on the analysis of RMSE, MAE, comparison of imputed values with actual values, and classification accuracy, we can conclude that the proposed JFCM-FVQNNI and JFCM-VQNNI algorithms yield sufficient and reasonable imputation performance compared with the fuzzy C-Means parameter-based imputation algorithm and the fuzzy C-Means rough parameter-based imputation algorithm.
Article
Rough Set Theory (RST) is a mathematical tool used to deal with vague, imprecise, inconsistent, and uncertain knowledge. RST-based research has been applied in machine learning, inductive reasoning, decision support systems, and knowledge discovery applications. Popular methods such as finding reducts and cores, feature selection, and reduction through the concepts of approximations have attracted researchers to apply RST in high-dimensional settings such as social networks, IoT applications, and Big Data analytics. In this article we attempt to summarize the basic concepts and characteristics of RST, some evolutionary extensions of RST, and applications limited to medical data analysis. With learners in view, RST-based software tools and packages are surveyed, with their functionalities outlined exhaustively. The article also identifies the importance of RST in the domain of medical or clinical data analytics and exhibits the strengths and limitations of the respective underlying approaches.
Article
Data analytics in granular computing framework is considered for several mining applications, such as in video analysis, bioinformatics and online social networks which have all the characteristics of Big data. The role of granulation, lower approximation and r–f information measure is exhibited. While the lower approximation over a video sequence signifies the object model for unsupervised tracking, it characterizes the probability (relative frequency) of definite regions in ranking miRNAs for normal and cancer classification. For neural learning, the information on definite region is used as the initial knowledge for encoding while generating the networks through evolution. Granules considered are of different sizes and dimensions with fuzzy and crisp boundaries. The tracking method is effective in handling different ambiguous situations, e.g., overlapping objects, newly appeared object(s), multiple objects in different directions and speeds, in unsupervised mode. The ranking algorithm could find only 1% miRNAs to result in significantly higher F-score than the entire set. Fuzzy–rough communities detected over the granular model of social networks are suitable in dealing with overlapping virtual communities in Big data. The knowledge encoding based on fuzzy–rough set provides superior performance than that of rough set. Future directions of research and challenges including the significance of z-numbers in precisiation of granules are stated. The article includes some of the results published elsewhere.
Conference Paper
Consumption of antimicrobial drugs, such as antibiotics, is linked with antimicrobial resistance. Surveillance of antimicrobial drug consumption is therefore an important element in dealing with antimicrobial resistance. Many countries lack sufficient surveillance systems. Usage of web mined data therefore has the potential to improve current surveillance methods. To this end, we study how well antimicrobial drug consumption can be predicted based on web search queries, compared to historical purchase data of antimicrobial drugs. We present two prediction models (linear Elastic Net, and nonlinear Gaussian Processes), which we train and evaluate on almost 6 years of weekly antimicrobial drug consumption data from Denmark and web search data from Google Health Trends. We present a novel method of selecting web search queries by considering diseases and drugs linked to antimicrobials, as well as professional and layman descriptions of antimicrobial drugs, all of which we mine from the open web. We find that predictions based on web search data are marginally more erroneous but overall on a par with predictions based on purchases of antimicrobial drugs. This marginal difference corresponds to < 1% point mean absolute error in weekly usage. Best predictions are reported when combining both web search and purchase data. This study contributes a novel alternative solution to the real-life problem of predicting (and hence monitoring) antimicrobial drug consumption, which is particularly valuable in countries/states lacking centralised and timely surveillance systems.
Article
Purpose: To our knowledge, integration of Web content mining of publicly available addresses with a geographic information system (GIS) has not been applied to the timely monitoring of medical technology adoption. Here, we explore the diffusion of a new breast imaging technology, digital breast tomosynthesis (DBT). Methods: We used natural language processing and machine learning to extract DBT facility location information using a set of potential sites for the New England region of the United States via a Google search application program interface. We assessed the accuracy of the algorithm using a validated set of publicly available addresses of locations that provide DBT from the DBT technology vendor, Hologic. We quantified precision, recall, and F1 score, aiming for an F1 score of ≥ 95% as the desirable performance. By reverse geocoding on the basis of the results of the Google Maps application program interface, we derived a spatial data set for use in an ArcGIS environment. Within the GIS, a host of spatiotemporal analyses and geovisualization techniques are possible. Results: We developed a semiautomated system that integrated DBT location information into a GIS that was feasible and of reasonable quality. Initial accuracy of the algorithm was poor using only a search term list for information retrieval (precision, 35%; recall, 44%; F1 score, 39%), but performance dramatically improved by leveraging natural language processing and simple machine learning techniques to isolate single, valid instances of DBT location information (precision, 92%; recall, 96%; F1 score, 94%). Reverse geocoding yielded reliable geographic coordinates for easy implementation into a GIS for mapping and planned monitoring. Conclusion: Our novel approach can be applicable to technologies beyond DBT, which may inform equitable access over time and space.
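The reported F1 scores follow directly from the harmonic-mean definition, which can be checked against the precision and recall figures quoted above:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reproducing the scores reported in the abstract above
print(round(f1(0.35, 0.44), 2))  # keyword-only retrieval -> 0.39
print(round(f1(0.92, 0.96), 2))  # after NLP + machine learning -> 0.94
```

The harmonic mean penalises imbalance between the two rates, which is why the keyword-only baseline scores so poorly despite non-trivial recall.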
Article
The deterioration in a corporation's profitability not only threatens its interests and sustainable development but also causes tremendous losses to other investors. Hence, constructing an effective pre-warning model for performance forecasting is an urgent requirement. Most previous studies only analyzed monetary-based ratios, but merely considering such ratios does not depict the full perspective of a corporation's business conditions. This study thus extends monetary-based ratios to non-monetary-based ratios and aggregates them through the analytic network process (ANP) with a risk-adjusted strategy to establish performance ranks of corporations. Analyzing a corporation's business relationships can help it to react to changes in the market and improve profit margins, as it draws upon such relationship networks for the transfer of scarce resources and knowledge. We believe that no current study adopts such a method to construct a forecasting model. To fill this gap in the literature, this study implements the social network (SN) technique to examine a corporation's competitive edge from seemingly noisy big media data, which are subsequently fed into an artificial intelligence (AI)-based technique to construct the model. The introduced model, examined through real-life cases under numerous conditions, offers a promising alternative for performance forecasting.
Chapter
In today’s world, companies compete not only on products or services but also on how well they can analyze and mine data to gain insights for competitive advantage and long-term growth. With the exponential growth of data, companies now face unprecedented challenges but are also presented with numerous opportunities for competitive growth. Advances in data-capturing devices and the existence of multi-generation systems in organizations have increased the number of data sources. Typically, data generated from different devices may not be compatible with each other, which calls for data integration. Although the ETL market offers a wide variety of tools for data integration, it is still common for companies to use SQL to manually produce in-house ETL tools. There are technological and managerial challenges in data integration, and data quality must be embedded in the integration process. Big data analytics delivers insights that can be used for effective business decisions. However, some of these insights may invade consumer privacy. With more and more data related to consumer behavior being collected, and with the advancement of big data analytics, privacy has become an increasing concern. Therefore, it is necessary to address issues related to privacy laws, consumer protections, and best practices to safeguard privacy. In this chapter, we discuss topics related to big data integration, big data quality, big data privacy, and big data analytics.
Article
We investigate in this paper approximate operations on sets, approximate equality of sets, and approximate inclusion of sets. The presented approach may be considered as an alternative to fuzzy sets theory and tolerance theory. Some applications are outlined.
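The approximate operations described above are the classical lower and upper approximations of rough set theory. As a minimal sketch (function names and the toy universe are illustrative, not from the source), given a partition of the universe induced by an indiscernibility relation, a target set is approximated from below by the blocks fully contained in it and from above by the blocks that intersect it:

```python
def lower_approximation(partition, X):
    """Union of equivalence classes fully contained in X."""
    result = set()
    for block in partition:
        if block <= X:          # block is a subset of X
            result |= block
    return result

def upper_approximation(partition, X):
    """Union of equivalence classes that intersect X."""
    result = set()
    for block in partition:
        if block & X:           # block shares at least one element with X
            result |= block
    return result

# Toy example: universe {1..6} partitioned by indiscernibility.
partition = [{1, 2}, {3, 4}, {5, 6}]
X = {1, 2, 3}                              # target concept
low = lower_approximation(partition, X)    # {1, 2}
up = upper_approximation(partition, X)     # {1, 2, 3, 4}
boundary = up - low                        # {3, 4}: the "rough" region
```

The boundary region (upper minus lower approximation) captures exactly the vagueness the abstract refers to: elements that can be neither certainly included in nor certainly excluded from the set.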
Article
A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
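The extended notions of union, intersection, and complement mentioned in the abstract reduce, over a finite universe, to pointwise max, min, and one-minus on membership grades. A minimal sketch (the dictionaries and names below are illustrative assumptions, not from the source):

```python
def fuzzy_union(a, b):
    """Pointwise max of membership grades; missing objects have grade 0."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_intersection(a, b):
    """Pointwise min of membership grades."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_complement(a):
    """Grade of the complement is 1 minus the original grade."""
    return {x: 1.0 - mu for x, mu in a.items()}

# Two fuzzy sets as membership functions over the same universe.
tall = {"ann": 0.9, "bob": 0.4}
fast = {"ann": 0.3, "bob": 0.7}
fuzzy_union(tall, fast)         # {'ann': 0.9, 'bob': 0.7}
fuzzy_intersection(tall, fast)  # {'ann': 0.3, 'bob': 0.4}
```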
Conference Paper
Say you are looking for information about a particular person. A search engine returns many pages for that person's name, but which pages are about the person you care about, and which are about other people who happen to have the same name? Furthermore, if we are looking for multiple people who are related in some way, how can we best leverage this social network? This paper presents two unsupervised frameworks for solving this problem: one based on the link structure of the Web pages, another using Agglomerative/Conglomerative Double Clustering (A/CDC), an application of a recently introduced multi-way distributional clustering method. To evaluate our methods, we collected and hand-labeled a dataset of over 1000 Web pages retrieved from Google queries on 12 personal names appearing together in someone's email folder. On this dataset our methods outperform traditional agglomerative clustering by more than 20%, achieving over 80% F-measure.
Conference Paper
Much research on mining the Web, especially consumer-generated media (CGM) such as Web logs, for knowledge about phenomena and events in the physical world has been carried out, and Web services built on Web-mined knowledge have begun to be offered to the public. However, there has been no detailed investigation of how accurately Web-mined data reflect real-world data, and it is problematic to use such data in public Web services without sufficiently ensuring their accuracy. This paper therefore defines a basic Web-log sensor with a neutral, positive, or negative description of a target phenomenon, together with linearly combined Web-log sensors, and validates the potential and reliability of these sensors' spatio-temporal data by measuring their correlation with real-world statistics from the Japan Meteorological Agency: weather (precipitation) and earthquake (maximum seismic intensity and number of felt quakes) data per day and region.
Book
The PDF contains a preliminary version of the book.
Conference Paper
Uncertainty and incompleteness of knowledge are widespread phenomena in information systems. Rough set theory is a tool for dealing with granularity and vagueness in data analysis, and rough set methods have already been applied to various fields such as process control, economics, medical diagnosis, biochemistry, environmental science, biology, chemistry, psychology, and conflict analysis. Covering-based rough set theory is an extension of classical rough set theory. In this paper we study the relationships between several basic concepts involved in covering-based rough sets, leading to a better understanding of covering-based rough sets.
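The extension the abstract describes replaces the disjoint partition of classical rough sets with a covering whose blocks may overlap. Several non-equivalent upper approximations exist in the covering-based literature; the sketch below (names and example are illustrative assumptions) uses one common pair of definitions: blocks contained in the target set for the lower approximation, and blocks intersecting it for the upper:

```python
def covering_lower(covering, X):
    """Union of covering blocks fully contained in X."""
    result = set()
    for block in covering:
        if block <= X:
            result |= block
    return result

def covering_upper(covering, X):
    """Union of covering blocks that intersect X (one of several
    upper approximations studied in covering-based rough sets)."""
    result = set()
    for block in covering:
        if block & X:
            result |= block
    return result

# Unlike a partition, blocks of a covering may overlap (2 and 3 repeat).
covering = [{1, 2}, {2, 3}, {3, 4, 5}]
X = {1, 2, 3}
covering_lower(covering, X)  # {1, 2, 3}
covering_upper(covering, X)  # {1, 2, 3, 4, 5}
```

Note that the overlap changes the result relative to the classical case: the block {2, 3} is contained in X even though no partition block would be, so the lower approximation here is larger than a partition-based one could be.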
AI-driven innovations in healthcare: improving diagnostics and patient care
  • B Y Kasula
Hybrid grid partition and rough set methods for generating fuzzy rules in supply chain
  • P M F Marsoit
  • P V Pernadate
  • J F Jérôme
Rough set based green cloud computing in emerging markets
  • P S Shivalkar
  • B K Tripathy
Approximation in space (U, π)
  • W Zakowski