Chapter

A Hybrid Supervised/Unsupervised Machine Learning Approach to Classify Web Services


Abstract

Reusing software is a promising way to reduce software development costs. Nowadays, applications compose available web services to build new software products. In this context, service composition faces the challenge of proper service selection. This paper presents a model for classifying web services. The service dataset has been collected from the well-known public service registry ProgrammableWeb. The results were obtained by breaking service classification into a two-step process. First, web service data pre-processed with Natural Language Processing (NLP) have been clustered by the agglomerative hierarchical clustering algorithm. Second, several supervised learning algorithms have been applied to determine service categories. The findings show that the hybrid approach combining hierarchical clustering and SVM provides acceptable results in comparison with other unsupervised/supervised combinations.
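The two-step process described in the abstract can be sketched in a few lines. The following is a dependency-free toy illustration, not the paper's implementation: it uses single-linkage agglomerative clustering, and it substitutes a nearest-centroid classifier for the SVM step to stay self-contained; the vectors and names are invented for the example.

```python
# Toy sketch of the two-step idea: (1) agglomerative clustering of
# service feature vectors, (2) a simple supervised classifier trained
# on the resulting cluster labels. A real pipeline would use TF-IDF
# features and an SVM instead of the nearest-centroid stand-in here.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerative(points, k):
    """Single-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between closest members.
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]   # merge the closest pair
        del clusters[j]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels

def centroid_classify(points, labels, query):
    """Nearest-centroid classifier (stand-in for the SVM step)."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    means = {l: [sum(col) / len(ps) for col in zip(*ps)]
             for l, ps in groups.items()}
    return min(means, key=lambda l: euclidean(means[l], query))

# Toy "service description" vectors forming two obvious groups.
services = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
labels = agglomerative(services, k=2)
category = centroid_classify(services, labels, [5.0, 5.0])
```

A query vector near the second group is assigned that group's label, mirroring how a classifier trained on cluster assignments categorizes an unseen service.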


... Such services need to be discovered, usually through a manual search by the developer. In SmartCLIDE, service discovery is assisted by an AI agent that receives as input the necessary functional information about the service that is going to be used, and goes through several sources (e.g., ProgrammableWeb) to identify fitting services [2], [3]. ...
... After the user sends a query with the service specification through a Natural Language Query Interface, the discovery will draw results from web pages, code repositories and service registries by invoking third-party search APIs. SmartCLIDE can rewrite the provided user input query on the basis of indexed popular search queries, leveraging AI-based techniques, and display the results of identified services to the user as a ranked list [2], [3]. ...
Conference Paper
Nowadays the majority of cloud applications are developed based on the Service-Oriented Architecture (SOA) paradigm. Large-scale applications are structured as a collection of well-integrated services that are deployed in public, private or hybrid cloud. Despite the inherent benefits that service-based cloud development provides, the process is far from trivial, in the sense that it requires the software engineer to be (at least) comfortable with the use of various technologies in the long cloud development toolchain: programming in various languages, testing tools, build / CI tools, repositories, deployment mechanisms, etc. In this paper, we propose an approach and corresponding toolkit (termed SmartCLIDE, as part of the results of an EU-funded research project) for facilitating SOA-based software development for the cloud, by extending a well-known cloud IDE from Eclipse. The approach aims at shortening the toolchain for cloud development, hiding the process complexity and lowering the required level of knowledge from software engineers. The approach and tool underwent an initial validation from professional cloud software developers. The results underline the potential of such an automation approach, as well as the usability of the research prototype, opening further research opportunities and providing benefits for practitioners.
Article
Full-text available
Over the last decades, web services have been used for performing specific tasks demanded by users. The most important task of a service classification system is to match an anonymous input service with the stored pre-classified web services. The most challenging issue is that web services are currently organized and classified according to syntax while the context of the requested service is ignored. Motivated by this, a Cloud-based Classification Methodology is proposed, presenting a new methodology based on semantic web service classification. Furthermore, cloud computing is used not only for storing but also for allocating the high scale of web services with both high availability and accessibility. Fog technology is employed to reduce latency and to speed up response time. The experimental results using the suggested methodology show better performance of the proposed system regarding both precision and accuracy in comparison with most of the methods discussed in the literature of the current study.
Article
Full-text available
One of the main assets of the Service Oriented Architecture (SOA) is composition, which consists of developing higher-level services by re-using well-known functionality provided by other services in a low-cost and rapid development process. In this paper, we present IDECSE, a new integrated approach for composite services engineering. By considering semantic Web services, IDECSE addresses the challenge of fully automating the classification, discovery and composition while reducing development time and cost. The classification and the discovery processes rely on adequate semantic similarity measures. Both semantic and syntactic descriptions are integrated through specific techniques for computing similarity measures between services. Formal Concept Analysis (FCA) is then used to classify Web services into concept lattices in order to facilitate relevant service identification. A graph-based semantic Web service composition process is proposed within the IDECSE framework. Using semantic similarities in grouping classes of services and in composing services shows a significant improvement compared to other approaches.
Article
Full-text available
Supervised machine learning studies have been gaining significance recently because of the increasing number of electronic documents available from different resources. Text classification can be defined as the task of automatically categorizing a group of documents into one or more predefined classes according to their subjects. The major objective of text classification is thereby to enable users to extract information from textual resources, bringing together processes such as retrieval, classification, and machine learning techniques in order to classify different patterns. In text classification, term weighting methods assign suitable weights to specific terms to enhance classification performance. This paper surveys text classification, the workings of different term weighting methods, and a comparison between different classification techniques.
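TF-IDF is one of the most common term weighting schemes a survey like this covers. Below is a minimal, stdlib-only sketch of it; the corpus, tokenization (whitespace split) and the particular smoothing (log(N/df) + 1) are illustrative assumptions, not the survey's prescription.

```python
# Minimal TF-IDF term weighting: raw term frequency multiplied by a
# smoothed inverse document frequency.
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]

docs = ["weather forecast service",
        "payment service api",
        "weather alert api"]
weights = tfidf(docs)
```

Terms that occur in fewer documents (e.g. "forecast") receive a higher weight than terms spread across the corpus (e.g. "service"), which is exactly the discriminative effect term weighting aims for.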
Conference Paper
Full-text available
How to classify and organize semantic Web services to help users find the services that meet their needs quickly and accurately is a key issue to be solved in the era of service-oriented software engineering. This paper makes full use of the solid mathematical foundation and stable classification efficiency of the naive Bayes classification method. It proposes a semantic Web service classification method based on naive Bayes theory and elaborates the concrete process of using the three stages of Bayesian classification to classify semantic Web services, taking service interface and execution capacity into consideration. Information gain theory is used to determine the classification influence of different features. Finally, experiments are used to validate the proposed method.
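The core of the Bayesian classification the paper builds on can be sketched compactly. This is a generic multinomial naive Bayes with Laplace smoothing over toy data, not the paper's system: the information-gain feature selection step is omitted, and the training examples are invented for illustration.

```python
# Multinomial naive Bayes with Laplace (add-one) smoothing, the kind
# of Bayesian text classifier applied to service descriptions.
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def fit(self, docs, labels):
        self.classes = set(labels)
        self.priors = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = defaultdict(Counter)   # per-class term counts
        self.vocab = set()
        for doc, c in zip(docs, labels):
            toks = doc.lower().split()
            self.counts[c].update(toks)
            self.vocab.update(toks)
        return self

    def predict(self, doc):
        def log_post(c):
            total = sum(self.counts[c].values())
            # log prior + sum of smoothed log likelihoods
            return math.log(self.priors[c]) + sum(
                math.log((self.counts[c][t] + 1) / (total + len(self.vocab)))
                for t in doc.lower().split())
        return max(self.classes, key=log_post)

nb = NaiveBayes().fit(
    ["currency exchange rates", "stock price quote",
     "weather forecast", "rain temperature forecast"],
    ["finance", "finance", "weather", "weather"])
```

Even on four tiny training descriptions, the smoothed likelihoods let the classifier assign an unseen description to the class whose vocabulary it best matches.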
Article
Full-text available
A Web service is a Web accessible software that can be published, located and invoked by using standard Web protocols. Automatically determining the category of a Web service, from several pre-defined categories, is an important problem with many applications such as service discovery, semantic annotation and service matching. This paper describes AWSC (Automatic Web Service Classification), an automatic classifier of Web service descriptions. AWSC exploits the connections between the category of a Web service and the information commonly found in standard descriptions. In addition, AWSC bridges different styles for describing services by combining text mining and machine learning techniques. Experimental evaluations show that this combination helps our classification system improve its precision. In addition, we report an experimental comparison of AWSC with a related work.
Article
Full-text available
The Web is gradually evolving as a provider of services along with its text and image processing functions. Web services markup is proposed in the Defense Advanced Research Projects Agency's agent markup language (DAML) family of semantic Web markup languages. The markup provides an agent-independent declarative API to capture the data and metadata associated with a service. Sharing, reuse, composition, mapping and succinct local Web service markup are facilitated by the markup's exploitation of ontologies. A wide variety of agent technologies for automated Web service discovery, execution, composition and interoperation is enabled by this markup.
Article
Full-text available
The rapid evolution and expansion of wireless-enabled environments have increased the need for sophisticated service discovery protocols (SDPs). Typically, service discovery involves a client, a service provider, and a lookup or directory server. The paper discusses Bluetooth (http://www.bluetooth.com) short-range wireless technology. The Bluetooth protocol stack includes specifications that define the SDP, RFCOMM (for cable replacement), the logical link control and adaptation protocol (L2CAP), a host controller interface (HCI), the link manager protocol (LMP), the baseband protocol, and a radio frequency (RF) protocol. The paper considers Bluetooth service discovery improvements with semantic matching.
Article
As per the global digital report, 52.9% of the world population is using the internet, and 42% of the world population is actively using e-commerce, banking, and other online applications. Web services are software components accessed using networked communications that provide services to end users. To meet user requirements, a developer must ensure quality architecture and quality of service, and service quality can be measured by ranking web services. In this paper, we analyzed the QWS dataset and found that the important parameters are best practices, successability, availability, response time, reliability, throughput, and compliance. We used various data mining techniques and conducted experiments to classify the QWS dataset into four categorical values, classes 1 through 4. The results are compared across several techniques: random forest, artificial neural network, J48 decision tree, extreme gradient boosting, K-nearest neighbor, and support vector machine. Of the classifiers analyzed, eXtreme gradient boosting achieved the maximum accuracy of 98.44%, and random forest achieved 98.13%. In future work, the quality-of-service analysis can be extended to mixed attributes.
Conference Paper
With the development of Web service technology, the quantity of web services published on the Internet is increasing rapidly. Recognizing each web service intelligently has become key to using the Internet efficiently, and the first step of recognition is to classify web services accurately. Classifying a huge number of web services is a difficult job; therefore, in order to support applications of web services more effectively, an automatic web service classification method is needed. In this paper, common WSDL files are regarded as the study object. Since a web service is described by WSDL, traditional document classification methods cannot be applied directly. A new method is proposed which applies automatic web service semantic annotation and uses three classification methods: Naive Bayes, SVM and REP Tree; furthermore, ensemble learning is applied. According to the experiment done on 951 WSDL files and 19 categories, the accuracy was 87.39%.
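The ensemble step above combines the votes of several base classifiers. A minimal majority-voting sketch follows; the paper's base learners are Naive Bayes, SVM and REP Tree, whereas the stub "classifiers" here are invented stand-ins just to show the voting mechanism.

```python
# Majority-vote ensemble: ask each base classifier for a label and
# return the most common answer.
from collections import Counter

def ensemble_predict(classifiers, sample):
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Stub classifiers that disagree on purpose (stand-ins for trained
# Naive Bayes / SVM / REP Tree models).
clf_a = lambda s: "mapping" if "geo" in s else "finance"
clf_b = lambda s: "mapping" if "map" in s else "finance"
clf_c = lambda s: "finance"

label = ensemble_predict([clf_a, clf_b, clf_c], "geo map service")
```

Because two of the three stubs vote "mapping" on the sample query, the ensemble outvotes the dissenting classifier, which is the robustness ensemble learning is used for.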
Article
With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.
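Among the remedies such a review covers, random oversampling of the minority class is the simplest to state. The sketch below is a generic illustration under invented toy data, not a method from the review; real systems would also consider synthetic sampling (e.g. SMOTE) or cost-sensitive learning.

```python
# Random oversampling: duplicate minority-class samples (with
# replacement) until every class matches the majority-class count.
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_s, out_l = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_s.append(rng.choice(pool))
            out_l.append(cls)
    return out_s, out_l

# Toy imbalanced set: five majority samples, one minority sample.
X = [[0], [1], [2], [3], [4], [9]]
y = ["majority"] * 5 + ["minority"]
Xb, yb = oversample(X, y)
```

After resampling, both classes contribute equally to training, which counteracts the class distribution skew at the cost of repeated minority examples.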