Fig. 5. Architecture of the proposed solution

Context in source publication

Context 1
... also provides a JSON-based web service for managing multiple Scrapy projects at the same time. We realized the service as a container-based Docker application, as depicted in Figure 5. A Flask Python server application is responsible for regularly starting Spider jobs (at a predefined two-hour interval) and for managing requests among (i) the Scrapy micro-service, where the crawling logic is defined, and (ii) the MongoDB database micro-service, where the gathered data are stored. Table I summarizes the REST endpoints of the crawler service, revealing its internal operations: we can list the currently running Scrapyd jobs, add further jobs, and query the data gathered by the jobs from the database. ...
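The excerpt does not include the service's source code, but the operations it describes map directly onto Scrapyd's JSON API (schedule.json, listjobs.json) and a standard PyMongo client. The following is a minimal sketch of how such a Flask service could look; the host names, project/spider names, and collection name are illustrative assumptions, not taken from the publication:

```python
# Minimal sketch of the described Flask service, assuming a Scrapyd
# micro-service and a MongoDB micro-service reachable under the
# (hypothetical) Docker service names "scrapyd" and "mongodb".
import threading
import time

import requests
from flask import Flask, jsonify
from pymongo import MongoClient

SCRAPYD_URL = "http://scrapyd:6800"          # Scrapy micro-service (assumed hostname)
MONGO_URL = "mongodb://mongodb:27017"        # MongoDB micro-service (assumed hostname)
PROJECT, SPIDER = "iot_crawler", "trace_spider"  # hypothetical project/spider names

app = Flask(__name__)
db = MongoClient(MONGO_URL)["crawler"]       # hypothetical database name


def schedule_spider_job():
    """Start a new Spider job via Scrapyd's schedule.json endpoint."""
    r = requests.post(f"{SCRAPYD_URL}/schedule.json",
                      data={"project": PROJECT, "spider": SPIDER})
    return r.json()


def periodic_scheduler(interval_s=2 * 60 * 60):
    """Re-schedule the Spider at the predefined two-hour interval."""
    while True:
        schedule_spider_job()
        time.sleep(interval_s)


@app.route("/jobs")
def list_jobs():
    """List the currently running Scrapyd jobs (listjobs.json)."""
    r = requests.get(f"{SCRAPYD_URL}/listjobs.json",
                     params={"project": PROJECT})
    return jsonify(r.json())


@app.route("/jobs", methods=["POST"])
def add_job():
    """Add a further crawling job on demand."""
    return jsonify(schedule_spider_job())


@app.route("/data")
def query_data():
    """Query data gathered by the jobs from the MongoDB database."""
    docs = list(db["traces"].find({}, {"_id": 0}).limit(100))
    return jsonify(docs)


if __name__ == "__main__":
    threading.Thread(target=periodic_scheduler, daemon=True).start()
    app.run(host="0.0.0.0", port=5000)
```

In a containerized deployment like the one in Figure 5, the "scrapyd" and "mongodb" hostnames would be resolved by Docker's internal DNS (e.g. docker-compose service names), which keeps the three micro-services loosely coupled.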

Similar publications

Article
Full-text available
Organizations often collect private data and release aggregate statistics for the public’s benefit. If no steps toward preserving privacy are taken, adversaries may use released statistics to deduce unauthorized information about the individuals described in the private dataset. Differentially private algorithms address this challenge by slightly p...
Preprint
Full-text available
With the recent advances in A.I. methodologies and their application to medical imaging, there has been an explosion of related research programs utilizing these techniques to produce state-of-the-art classification performance. Ultimately, these research programs culminate in submission of their work for consideration in peer-reviewed journals. To...
Article
Full-text available
Skin cancer is one of the most common human malignancies, which is generally diagnosed by screening and dermoscopic analysis followed by histopathological assessment and biopsy. Deep-learning-based methods have been proposed for skin lesion classification in the last few years. The major drawback of all methods is that they require a considerable a...
Article
Full-text available
Background Artificial intelligence (AI) persists as a focal subject within the realm of medical imaging, heralding a multitude of prospective applications that span the comprehensive imaging lifecycle. However, a key hurdle to the development and real-world application of AI algorithms is the necessity for large amounts of well-organized and carefu...
Article
Full-text available
Several technologies have emerged in recent decades that have enabled considerable growth in the volume and complexity of digital image repositories. In light of this development, the creation of new information-handling strategies has also become necessary, especially those related to organization and retrieval. Historically, the use of...

Citations

... An enhancement to SUMMON was described by Pflanzner et al. [38] (2019), which the authors referred to as the Automatic Web Crawling Service (AWCS). The objective of this service is to search publicly accessible websites in order to locate data pertaining to the Internet of Things (IoT) and sensors. ...
... They want to apply a strategy that achieves an accuracy of 90.4 percent, which is far better than their current methods. [38] The authors introduce a novel approach to web test generation, implemented in a program named DANTE (Dependency-Aware Crawling-Based Web Test Generator) ...
... Certain content providers try to inject unwanted content into the crawler's corpus. The motives for such operations include monetary gain, for example by misdirecting traffic to commercial websites [38]. ...
Article
Full-text available
Over the last several years, there has been a significant rise in the number of people getting online and using the internet. Hypertext links are available, and any one of them may be used to gain access to a resource. The growth in the number of internet users has facilitated the rise of crawlers, which in turn has made it feasible to construct new websites. Web crawlers, also known as spiders, are highly evolved components of search engines that make it simpler for users to find the information they are searching for on the internet. In a similar vein, these web crawlers have the potential to be used in further research endeavours in the months and years to come. Furthermore, the information that has been gathered may be used to detect and uncover missing connections, as well as to assess the potential for expansion inside complicated networks. The analysis of web crawlers is the primary topic of this study. Topics covered include the architecture of web crawlers, the many types of web crawlers, and the challenges that search engines face while using them.
... Pflanzner et al. [27] provide an indexing service to retrieve open traces of applications available on the internet. [20] presents a dataset from a test bench of solar-powered weather sensors. ...