Data & Cloud Research Group

About the lab

Innovating in Data Management & Analytics

With an extensive scientific background, the Data & Cloud Research Group delivers innovations through cutting-edge data management approaches across the data path and through advanced machine and deep learning techniques.

Enhancing Infrastructures Management

Specializing in cloud, edge/fog, IoT and 5G environments, the Data & Cloud Research Group focuses on novel AIOps techniques that facilitate service management across the complete service lifecycle, within and across heterogeneous infrastructures.

Featured projects (1)

The LeADS research and educational program trains a new interdisciplinary professional figure that we call the Legality Attentive Data Scientist, or LeADS: an expert in data science and law who is expected to work within and across the two disciplines, and a leader in bridging scientific skills with the ethico-legal constraints of their operating environment. LeADS will develop a data science capable of keeping its innovative solutions within the borders of law, by design and by default, and of helping expand the legal frontiers in line with innovation needs, for instance by preventing the enactment of legal rules that are technologically unattainable. LeADS research will set the theoretical framework and the practical implementation template of a common language for co-processing and joint-controlling key notions for both data scientists and jurists. Its outcomes will also produce a comparative and interdisciplinary lexicon that draws on experts from these fields to define important crossover concepts. Through a broad, interdisciplinary, inter-sectoral network of academic and non-academic partners, LeADS will provide cross-disciplinary training to early-stage researchers (ESRs), who will work as, for example, data scientists or researchers in general, sales managers, or project or general managers at private entities (tech companies, consultancies and legal advisories) and public entities (research centres, universities and administration). LeADS addresses the lack of programs blending experiential learning and research in the area and offers an innovative curriculum linked to pioneering research results. It includes 6 interdisciplinary modules; sectoral courses; joint mentoring on individual research projects; TILL modules and Discussion Games to help ESRs develop pioneering soft skills; and challenging secondments. Overall, LeADS research and training aim to change the regulatory and business approach to information while training the experts able to drive the process that every data-driven society needs to employ.

Featured research (19)

Big Data has proved too vast and complex to be managed efficiently through traditional architectures, while data analysis is considered crucial for both technical and non-technical stakeholders. Current analytics platforms are siloed in specific domains, while the demand to broaden their use and lower their technical barriers is continuously increasing. This paper describes Diastema, a domain-agnostic, single-access, autoscaling Big Data analytics platform built as a collection of efficient and scalable components that offers user-friendly analytics through graph data modelling and supports both technical and non-technical stakeholders. Diastema's applicability is evaluated in healthcare through a predictive classifier for a COVID-19 dataset, considering real-world constraints.
Over the past few years, increasing attention has been given to the health sector and the integration of new technologies into it. Cloud computing and storage clouds have become state-of-the-art solutions in other major areas and have rapidly started to establish a strong presence in the health sector as well. More and more companies are working toward a future that will allow healthcare professionals to engage more with such infrastructures, opening up a vast number of possibilities for them. While this is a very important step, less attention has been given to citizens. For this reason, this paper proposes a citizen-centered storage cloud solution that allows citizens to hold their health data in their own hands while also enabling the exchange of these data with healthcare professionals during emergency situations. In addition, to reduce the health data transmission delay, a novel context-aware prefetch engine enriched with deep learning capabilities is proposed. The proposed prefetch scheme, along with the proposed storage cloud, undergoes a two-fold evaluation in several deployment and usage scenarios in order to examine its performance with respect to data transmission times, and to compare its outcomes with other state-of-the-art solutions. The results show that the proposed solution significantly improves download speed when compared with the storage cloud, especially when large data are exchanged. In addition, the evaluation of the proposed scheme shows that it improves the overall predictions, considering the coefficient of determination (R² > 0.94) and the root mean square error (RMSE < 1), while also reducing the training data by 12%.
The tremendous growth, popularity, and usage of social media in modern societies have led to the production of an enormous real-time volume of social texts and posts, including Tweets produced by users. These collections of social data can be potentially useful, but the extent of meaningful data in these collections is still of high research and business interest. One of the main tasks in several application domains, such as policy making, is identifying and categorizing these texts into natural groups based on the topics to which they refer, in order to better understand and correlate them. This has recently been realized through Topic Modeling and Identification tasks, which identify and extract subjective information and topics from raw texts with the ultimate objective of enhancing their categorization. This paper introduces an end-to-end pipeline that primarily focuses on the phases of collection, text preprocessing, and the use of Natural Language Processing and Topic Modeling models, which are considered to be of major importance for the successful Topic Modeling and Identification of Tweets and their final interpretation.
The health industry has evolved significantly in recent years by adapting to new technologies and exploiting them in order to upgrade the services that it provides to people. In this context, a lot of effort has been focused on converting medical documents to electronic health records and storing them online. However, taking current innovations into consideration, there are undoubtedly many limitations when these proposals are applied in a real-life scenario. For this reason, this paper proposes a system that combines electronic data storage and health record exchange between individuals and authenticated medical staff in a secure way. The proposed system is evaluated through the corresponding applications and protocols that were developed, and the results show how it addresses existing gaps.
Healthcare Organizations need to share the health data of their patients with Research Centers in order to fulfill research purposes and improve the healthcare services provided to the patients. However, the information being processed by the Research Centers includes personal and/or sensitive data, which puts the privacy of the individuals at stake. To mitigate the risk of identity disclosure and privacy violation, a variety of privacy mechanisms, such as anonymization and pseudonymization, can be applied to the personal data of the data subjects. This paper presents a mobile library that either anonymizes or pseudonymizes individuals' personal information that follows the Fast Healthcare Interoperability Resources (FHIR) protocol. To evaluate the implementation and the functionalities of the library, two case studies are described, one for each privacy mechanism.
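As an illustration of the two privacy mechanisms named in the last abstract above, the following is a minimal Python sketch and not the library described in that paper. It assumes a simplified dictionary shaped like a FHIR Patient resource; the choice of direct identifiers, the keyed-hash (HMAC-SHA256) approach to pseudonymization, and the function names `pseudonymize` and `anonymize` are all assumptions made for this example.

```python
import copy
import hashlib
import hmac

# Assumption: the direct identifiers to strip. The actual library may handle a different set.
DIRECT_IDENTIFIERS = ("name", "telecom", "address")


def pseudonymize(patient: dict, secret_key: bytes) -> dict:
    """Replace the record id with a keyed hash and drop direct identifiers.

    The same id and key always give the same token, so records of one
    person remain linkable to each other without exposing the original id.
    """
    result = copy.deepcopy(patient)
    token = hmac.new(secret_key, patient["id"].encode("utf-8"), hashlib.sha256)
    result["id"] = token.hexdigest()
    for field in DIRECT_IDENTIFIERS:
        result.pop(field, None)
    return result


def anonymize(patient: dict) -> dict:
    """Drop the id and direct identifiers, and generalize the birth date to the year."""
    result = copy.deepcopy(patient)
    result.pop("id", None)
    for field in DIRECT_IDENTIFIERS:
        result.pop(field, None)
    if "birthDate" in result:
        result["birthDate"] = result["birthDate"][:4]  # keep only YYYY
    return result


# Hypothetical example record, not real data.
patient = {
    "resourceType": "Patient",
    "id": "example-123",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
    "gender": "female",
    "birthDate": "1984-06-15",
}

pseudo = pseudonymize(patient, secret_key=b"demo-key-not-for-production")
anon = anonymize(patient)
```

In this sketch the pseudonymized record keeps a stable token in place of the identifier, so a Research Center could still group records belonging to the same person, whereas the anonymized record carries no identifier at all and only a coarsened birth date.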

Lab head

Dimosthenis Kyriazis
  • Department of Digital Systems
About Dimosthenis Kyriazis
  • Dimosthenis Kyriazis currently works at the Department of Digital Systems, University of Piraeus. Dimosthenis does research in Distributed Computing, Parallel Computing and Software Engineering.

Members (18)

Athanasios Kiourtis
  • University of Piraeus
Argyro Mavrogiorgou
  • University of Piraeus
Chrysostomos Symvoulidis
  • University of Piraeus
George Manias
  • University of Piraeus
Konstantinos Mavrogiorgos
  • University of Piraeus
Spyridon Kleftakis
  • University of Piraeus
George Marinos
  • University of Piraeus
Philip Mavrepis
  • University of Piraeus