FAIR and GDPR Compliant Population Health Data Generation, Processing and Analytics

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Ruduan Plug¹, Yan Liang¹, Mariam Basajja¹, Aliya Aktau¹, Putu Hadi
Purnama Jati², Samson Yohannes Amare³, Getu Tadele Taye³, Mouhamad
Mpezamihigo⁴, Francisca Oladipo⁴, and Mirjam van Reisen¹,²
¹ Leiden University, 2311 EZ Leiden, Netherlands
² Tilburg University, 5037 AB Tilburg, Netherlands
³ Mekelle University, 231 Mekelle, Tigray, Ethiopia
⁴ Kampala International University, 20000 Kampala, Uganda
r.b.f.plug@umail.leidenuniv.nl
Abstract. Generating and analysing patient data in clinical settings
is an inherently sensitive process, requiring collaborative effort between
clinicians and informaticians to generate value from these data while
mitigating risks to the data subject. As a result, efforts to utilize
external patient data face significant challenges. We propose a data-centric
framework based on the FAIR principles and GDPR guidelines to enhance
data management at the point of care. By using the process of data
visiting, a cross-facility method for federated data analytics, we can
automate generation of novel aggregate data which was previously not
realizable. In two sequential studies we show that these techniques,
supported by a data stewardship programme, increase community-wide
involvement in data generation, improve transparency and trust, provide
direct value and data ownership, and enable regulatory and ethically
compliant, cross-national data visiting under curated accessibility
patterns for federated analytics.
Keywords: FAIR Data · GDPR · Data Management · Data Stewardship
· Clinical Data · Biomedical Ontologies · Data Federation · Data Visiting
1 Introduction
The generation and management of clinical Electronic Health Record (EHR)
data requires strong safeguards on adherence to regulations, data security and
protection of patient privacy and confidentiality [19]. These factors complicate
the facilitation of regional analytics and data exchange, which is seen as a
critical factor in global and cross-national population health. Various
methods have been developed to address concerns of data security and privacy
protection [10, 32]. However, these methods tend to be problematic in practical
use and lack ontology-based standards for cross-facility interoperability and the
versatility to enable adherence to regulations set out by the relevant national
Ministry of Health (MoH) and regional legislature.
The study implemented by the Virus Outbreak Data Network (VODAN)
Africa investigates the preparation and use of digital patient data in Africa. The
African continent is least represented in global health data, and the limitations
and challenges on digitisation of health data that lead to biases in globally
available data are well-documented [20]. Highly developed nations generate the
vast majority of the medical data, and as a result see the most representation
and benefit from research, while low-resource and rural areas tend to generate
few data and consequently are underrepresented in health research.
Efforts of developed nations to generate digital patient data from remote and
impoverished regions with vulnerable populations have often led to extractive
practices [15, 3], producing data sets that do not become available to, or
directly benefit, local populations and their health facilities. The transfer of
patient data aggregates from the facility where the data is produced to external
research facilities poses ethical and legal concerns in terms of the ownership of
the data and the link to the point of care [6].
These practices in data generation have led to a lack of trust due to the
absence of standards in data ownership [14] and insufficient procedural
transparency in the generation and use of the data. A lack of data analytics
capacity within facilities compounds the problem by delaying the adoption of
localized information systems within clinics that can enable medical data
generation and regional clinical data exchange [1], even though these localized
data management practices at the point of care are essential to the development
of trustworthy and legally compliant data generation methods [26, 23]. The lack
of ownership and meaningful use of the data further undermines the potential
acceptance of digitisation of patient data by the patient and other
stakeholders. Hence, the quality and completeness of such data can be affected
by the obstacles to adoption of proposed digitisation processes of electronic
patient information [6].
Data management methods based on FAIR have been proposed as data
localization strategies to improve the standards for patient data generation and
interoperability, and GDPR was utilized as a baseline standard to bridge the gap
to governance. This resulted in two studies implemented in Africa, running
from April 2020 to September 2020 and from October 2020 to October 2021 [20].
2 FAIR and GDPR Standards
A central standard for regulatory frameworks within this study is encapsulated
by GDPR, which forms the basis of the initial trial by conceptualizing the
point-of-care as both data processor and data controller [4]. By using these
standards, explicit data ownership for the data subject and full control over
data are provided at local levels while allowing for usage of these data under
informed consent. Within initial conversations with stakeholders across eight
African nations, including Tanzania, Uganda, Ethiopia, Somalia, Nigeria, Kenya,
Tunisia and Zimbabwe, this baseline was found to provide sufficient common
ground while being flexible to more stringent regulations layered upon GDPR as
required by local regulators [28].
[Figure 1. Diagram labels: Electronic Patient Records; Medical Measurements;
Personal Data; Pseudo-Anonymous Data; Anonymous Data; Processed Data;
Aggregated Data. Axes: Accessibility, Specificity, Sensitivity.]
Fig. 1. Levels of Data Processing and Access Control.
An advantage of this approach is that GDPR in itself already provides a
legal framework to enable consent-based exchange of processed, anonymized data,
requiring an assessment of the relevance of the purpose of the data collection.
However, to enable collaborative use of data such as federated analytics, we have
to look towards FAIR and ontology-based metadata to provide a transparent,
consistent and machine-readable structure to data across different health
facilities [13], which can be sourced from the Health Management Information
Systems (HMIS) already in use.
By ensuring FAIR compliance at the point of data generation, we provide
a set of transparent rules for permissions under which data can be found and
accessed, which is essential in forming trust in the management of sensitive
data. A six-level system of access is illustrated in Figure 1, in which personal
data are not permitted to leave the facility, while aggregated, processed and
anonymous data may be exchanged subject to incremental levels of auditing
before clearance is provided [9]. Interoperability is enabled through biomedical
ontologies, defined by research communities, which provide the semantic links
between data; these can then be put into practice through metadata templating
[16]. The World Health Organization SMART guidelines recognize the relevance
of interoperable digital data use at all of these levels, including the
importance of the meaningful use of data for quality health access at the point
of care [12].
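The access rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level names follow the labels in Figure 1, but their ordering, the treatment of pseudo-anonymous data as non-exportable, and the number of audit sign-offs per level are hypothetical choices and are not taken from the study.

```python
from enum import IntEnum
from typing import Dict, Optional


class DataLevel(IntEnum):
    """Hypothetical ordering from most to least sensitive."""
    PERSONAL = 0
    PSEUDO_ANONYMOUS = 1
    ANONYMOUS = 2
    PROCESSED = 3
    AGGREGATED = 4


# Assumed policy: audit sign-offs required before data at a level may leave
# the facility. None means data at that level may never leave.
REQUIRED_AUDITS: Dict[DataLevel, Optional[int]] = {
    DataLevel.PERSONAL: None,
    DataLevel.PSEUDO_ANONYMOUS: None,
    DataLevel.ANONYMOUS: 3,
    DataLevel.PROCESSED: 2,
    DataLevel.AGGREGATED: 1,
}


def may_leave_facility(level: DataLevel, audits_passed: int) -> bool:
    """Return True only if the level is exportable and enough audits passed."""
    required = REQUIRED_AUDITS[level]
    if required is None:
        return False
    return audits_passed >= required
```

In this sketch, more sensitive exportable levels require more sign-offs, which is one possible reading of "incremental levels of auditing".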
3 Study Results
To address concerns about the security and privacy of patient data, which
requires the capacity to purposefully address data production and the assignment
of responsibilities regarding permission, a data stewardship programme was
conceptualized that aims to build a network of local experts on data management
and governance [31]. The interaction between human domain experts and novel
technology is foundational to a versatile platform of trust and expertise suited
to local and regional circumstances, and by bridging the gap between the two,
improvements in trust and safety can be attained. Data stewards are primarily
trained to handle data management and auditing of data processing directly at
the point of care.
Utilizing FAIR, assisted by biomedical ontology services such as NCBO
BioPortal [30], has already shown great potential in managing, analysing and
reusing biomedical samples across research facilities, for which we show an
example in Figure 2. Unique identifiers and data provenance support the
documentation of data ownership, while the use of common terminologies and
semantics through ontologies ensures that analytics across facilities is
possible. Making such techniques common practice for EHR data makes
cross-facility and global analytics of population health data possible without
loss of data ownership or extensive post-processing. This is critical for
observational research with very limited data, such as on rare diseases, which
carries de-anonymization risks, and for time-sensitive analytics such as
measuring the incidence of COVID-19 across geographies.
The first study was conducted with universities within Africa, across Uganda,
Kenya, Ethiopia, Nigeria, Tunisia and Zimbabwe, in a collaboration of Kampala
International University (KIU), Tangaza University, Mekelle University, Addis
Ababa University, Ibrahim Badamasi University, University de Sousse and Great
Zimbabwe University (GZU), together with the Leiden University Medical Center
(LUMC) in Europe [8, 20, 21]. The study consisted of two core components. The
first core component was the sustainable data stewardship programme Training of
Trainers (ToT) to train experts in data process curation and data management,
based on the FAIR principles under GO TRAIN [25].
The data stewards in turn are also equipped with skills to transfer this
expertise to other aspiring data experts, contributing to the UN Sustainable
Development Goals [22]. The training programme has resulted in 30 trained data
stewards who can produce human- and machine-readable vocabulary relevant to
patient data records [28], from which ontologies can be defined that provide
mappings of data to semantics for FAIRification during point-of-care data
production. Adhering to the process of building expertise through ToT, the
technological architecture was developed and FAIR Data Point (FDP) services were
established within clinical settings at medical facilities. The FDPs were
implemented using local deployments of DS Wizard [17] to enable FAIR data
production, for which data generation was modelled on the WHO SARS-CoV-2
electronic Case Report Forms (eCRF) ontology [2], with the resulting data stored
in RDF graph databases.
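To make the idea of ontology-modelled records stored as RDF more concrete, the following minimal Python sketch serialises one case report as N-Triples using only the standard library. The namespaces, the class name CaseReport and the field names are placeholders invented for illustration; they are not the actual terms of the WHO eCRF ontology, and the sketch does not reflect how DS Wizard produces its output.

```python
from typing import Dict, List

# Placeholder IRIs, assumed for illustration only.
ECRF = "https://example.org/ecrf#"
RECORDS = "https://fdp.example.org/record/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Quote a plain string literal with N-Triples escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_ntriples(record_id: str, fields: Dict[str, str]) -> List[str]:
    """Emit one type triple plus one triple per field, in sorted field order."""
    subject = f"<{RECORDS}{record_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{ECRF}CaseReport> ."]
    for name, value in sorted(fields.items()):
        lines.append(f"{subject} <{ECRF}{name}> {literal(value)} .")
    return lines


triples = record_to_ntriples(
    "r-001", {"testResult": "positive", "admissionDate": "2020-06-01"}
)
print("\n".join(triples))
```

Because every field is keyed by an ontology term rather than a local column name, records produced at different facilities from the same template share one vocabulary.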
[Figure 2. NCBI BioSample record]
Pathogen: clinical or host-associated sample from Severe acute respiratory
syndrome coronavirus 2
Identifiers: BioSample: SAMN14656635; Sample name: hCoV-19/USA/WI-179/2020;
  SRA: SRS6514344
Organism: Severe acute respiratory syndrome coronavirus 2
  Viruses; Riboviria; Orthornavirae; Pisuviricota; Pisoniviricetes;
  Nidovirales; Cornidovirineae; Coronaviridae; Orthocoronavirinae;
  Betacoronavirus; Sarbecovirus; Severe acute respiratory syndrome-related
  coronavirus
Package: Pathogen: clinical or host-associated; version 1.0
Attributes:
  strain: hCoV-19/USA/WI-179/2020
  isolate: Homo sapien
  collected by: Milwaukee Public Health Department
  collection date: 2020-03-21
  geographic location: USA: Wisconsin, Milwaukee
  host: Homo sapiens
  host disease: COVID-19
  isolation source: nasal swab
  latitude and longitude: 43.042180 N 87.908670 W
  ARTIC barcode identifiers: NB03
BioProject: PRJNA614504
Submission: UW-Madison, Shelby O'Connor; 2020-04-21
Accession: SAMN14656635; ID: 14656635
Fig. 2. An example of rich, ontology-assisted metadata and associated data already
being successfully applied and used to enable interoperability and reusability in
anonymized pathogen samples isolated from patients [24] (NCBI).
Following deployment, experiments were performed with local, in-residence
data production and subsequent cross-national SPARQL queries using the FAIR
data visiting model [18]. The first such clinical query utilizing the findability
and accessibility framework of FAIR was run on 29 September 2020 between
the FDPs at KIU and LUMC. This study demonstrated the feasibility of querying
data for federated analytics across two continents, involving patient data held
in residence, curated and stored in the place where the data was produced.
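The principle of data visiting, in which the query travels to the data and only an aggregate travels back, can be illustrated with a small Python simulation. This is a sketch under assumptions: the class FairDataPoint, the record fields and the site names are hypothetical, and a plain Python function stands in for the SPARQL query that the study sent to each FDP.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]
Query = Callable[[List[Record]], int]


@dataclass
class FairDataPoint:
    """A facility-local store; records are never returned to the caller."""
    name: str
    records: List[Record] = field(default_factory=list, repr=False)

    def visit(self, query: Query) -> int:
        # The query runs inside the facility; only its aggregate result leaves.
        return query(self.records)


def count_positive(records: List[Record]) -> int:
    """Stand-in for an aggregate SPARQL COUNT query."""
    return sum(1 for r in records if r.get("testResult") == "positive")


def federated_count(points: List[FairDataPoint], query: Query) -> Dict[str, int]:
    """Send the same query to every data point and collect per-site aggregates."""
    return {point.name: point.visit(query) for point in points}


site_a = FairDataPoint("site-a", [{"testResult": "positive"}, {"testResult": "negative"}])
site_b = FairDataPoint("site-b", [{"testResult": "positive"}, {"testResult": "positive"}])
counts = federated_count([site_a, site_b], count_positive)
print(counts, sum(counts.values()))
```

The coordinator only ever sees one integer per site, which mirrors the requirement that patient-level data remain in residence.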
A successful proof of concept was presented on international regulatory
agreements and a clinical implementation of the data ownership preserving
framework modelled using the FAIR concepts and GDPR. During this experiment,
international cooperation and expertise were developed with a focus on
findability and accessibility of clinical patient data, findable under
well-specified and transparent conditions. The aspects of interoperability and
reusability were not operationally implemented during this study, and there was
only one eCRF, as an immutable ontology, which limited the flexibility of use.
In direct continuation of the first trial, a second study was conducted to
address novel methods to combine ontology-assisted technology and community
expertise in order to enable cross-facility interoperability and ultimately
reusability of data [20]. The second study period saw the number of
participating nations increase from six to eight, including clinics and
hospitals from Ethiopia, Kenya, Nigeria, Somalia, Tanzania, Uganda, Tunisia, and
Zimbabwe.
Essential to these efforts were the retooling and deployment of localized CEDAR
[7] instances, which provide an open source platform assisted by BioPortal
ontologies to produce, share and curate metadata templates and the data generated
from these templates in RDF format. This ensures that data has full provenance
during production and provides interoperability through the open and
transparent definitions of the ontologies. Different templates based on the same
ontologies are inherently interoperable on Common Data Elements (CDEs) [11],
while data from different ontologies can be matched by similarly utilizing
common terms and ontological semantic linkages [5, 27], which match the
semantics from one graph structure to another as a translation layer.
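As a rough illustration of both mechanisms, the Python sketch below re-keys local records by the ontology terms their template declares, finds the elements two templates have in common, and applies an alignment table as a translation layer between ontologies. All field names, term identifiers and the alignment itself are hypothetical; real ontology matching [5, 27] is considerably more involved than a lookup table.

```python
from typing import Dict, Set

# Hypothetical templates: local field name -> shared ontology term.
TEMPLATE_A: Dict[str, str] = {"sex": "cde:Sex", "test_result": "cde:TestResult"}
TEMPLATE_B: Dict[str, str] = {
    "gender": "cde:Sex",
    "pcr_outcome": "cde:TestResult",
    "ward": "siteB:Ward",
}

# Hypothetical alignment between two different ontologies (the translation layer).
ALIGNMENT: Dict[str, str] = {"ontoX:BiologicalSex": "cde:Sex"}


def to_terms(record: Dict[str, str], template: Dict[str, str]) -> Dict[str, str]:
    """Re-key a local record by the ontology terms its template declares."""
    return {template[key]: value for key, value in record.items() if key in template}


def translate(record: Dict[str, str], alignment: Dict[str, str]) -> Dict[str, str]:
    """Map terms through the alignment; unmatched terms are kept unchanged."""
    return {alignment.get(term, term): value for term, value in record.items()}


def common_elements(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Terms on which two templates are directly interoperable."""
    return set(a.values()) & set(b.values())


print(sorted(common_elements(TEMPLATE_A, TEMPLATE_B)))
print(to_terms({"gender": "F", "ward": "3"}, TEMPLATE_B))
print(translate({"ontoX:BiologicalSex": "F"}, ALIGNMENT))
```

Fields that map to no shared term, such as the ward in the second template, stay usable locally but do not take part in cross-facility queries.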
[Figure 3. Diagram labels: Data Subject; Clinician; Data Steward; Data
Controller; Data Processor; Regulatory; Production; FAIRification; Aggregation;
Auditing; Facility; Storage; Analytics.]
Fig. 3. FAIR and GDPR-based Framework for Data Processing and Analytics.
Central to the advantages offered by this approach is the engagement of
the scientific community, medical facilities, data stewards and legislature, which
have all been involved in the design and deployment of this architecture. In
addition, broad-scale support was received from both the medical community
and the local MoHs [29]. During the second study, country coordinators
were designated for each country to liaise with local facilities and MoHs,
while technical leads formed the bridge between country coordinators and the
deployment. Data stewards are primarily tasked with guiding and auditing the
day-to-day operation of data generation and processing tasks.
The study pioneered a novel, fully FAIR and GDPR compliant, localized
health data generation procedure as a distributed network of FDPs that can
either function entirely independently or collaborate through data visiting
procedures [20]. The resulting minimum viable product resolved the issue of data
ownership by conducting fully FAIR data production locally and by utilizing
expertise from data stewards to conduct audits on data visiting requests, which
ensures that all data visiting queries, either to specific facilities or across all
indexed FDPs, comply with data ownership standards and regulations. This is
facilitated by means of local data processing, such that the original data never
leaves the confines of the medical facility; only completely anonymized
processed data or aggregates, modelled as federated analytics, are made
available.
The complete procedure of this study is illustrated in Figure 3. This shows
the flow of data from the data owner, in this case the data subject, interpreted
by local clinicians, processed by data stewards using the FAIR data tooling
and then being made available in local storage. Often these data originate from
current health information systems such as DHIS2, from which data can also be
imported into CEDAR as JSON or RDF formatted data. Upon request for data
access using transparent accessibility procedures, under predefined conditions
and permission by the data controller, aggregated data can be made available
upon clearance of audit by the data steward.
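The request path described above can be read as a small state machine, sketched here in Python. The state names, the class VisitRequest and the strict ordering (controller permission first, steward audit second) are assumptions made for illustration; the study does not prescribe this exact sequence or any particular implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    PERMITTED = "permitted"
    CLEARED = "cleared"
    REJECTED = "rejected"


@dataclass
class VisitRequest:
    """A data visiting request moving through permission and audit."""
    requester: str
    purpose: str
    status: Status = Status.SUBMITTED

    def controller_decision(self, permitted: bool) -> None:
        # Assumed step 1: the data controller grants or refuses permission.
        if self.status is not Status.SUBMITTED:
            raise ValueError("controller decides only on submitted requests")
        self.status = Status.PERMITTED if permitted else Status.REJECTED

    def steward_audit(self, passed: bool) -> None:
        # Assumed step 2: the data steward audits a permitted request.
        if self.status is not Status.PERMITTED:
            raise ValueError("audit requires controller permission first")
        self.status = Status.CLEARED if passed else Status.REJECTED

    def may_release_aggregate(self) -> bool:
        """Aggregated results are released only after audit clearance."""
        return self.status is Status.CLEARED


request = VisitRequest("external-researcher", "regional incidence estimate")
request.controller_decision(permitted=True)
request.steward_audit(passed=True)
print(request.may_release_aggregate())
```

Modelling the workflow explicitly makes it impossible to release an aggregate for a request that skipped either the permission or the audit step.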
4 Conclusion
During this study we have investigated, implemented and deployed a novel FAIR,
GDPR compliant data management architecture for curating, repositing and
analysing patient health data across health facilities. We have shown that by
using the FAIR principles, we can utilize biomedical ontologies to formally
structure the data generation process through facility-catered metadata
templating, while retaining interoperability among data sources defined by these
templates. These formal specifications for interoperability provide an essential
component for privacy-oriented federated analytics across health facilities.
With this study we have identified the universal need for the recognition of
data ownership and control of patient data in relation to the health facilities
where data is produced, and the recognition of data origin and legal rights of
the patient as data subject. Data stewardship is proposed as a key instrument
in ensuring there is transparency, community-based trust and accountability for
repositing and processing patient data, as well as being instrumental to
auditing aggregated analytics performed on these data. This has shown
encouraging results with broad support from both health facilities and national
MoHs.
In addition, we recognize the importance of the locale of data generation.
By keeping full control over the data at the most localized level, we ensure that
data are handled in accordance with local regulations and ethical foundations.
Based on the support from legislature and research communities, we have found
evidence that doing so leads to higher engagement in data production within
previously underserved communities. Broad engagement is essential in reducing
data bias and can help ensure that aggregated data are used and analysed
in a way that is meaningful within the local context.
By securely repositing data at the most localized level, while exposing
curated, rich metadata under FAIR, we enable federated data
analytics upon individual, controlled authorization without the risk of exposing
the underlying sensitive data. While generating FAIR data can be enabled using
a systematic ontology-matching approach, by linking the data generation process
to FAIR templates based on domain ontologies, the auditing of data processing
and analytical queries still requires significant knowledge and responsibility to
comply with ethical standards and local regulations, for which data stewardship
forms an essential area of local expertise.
Underlying these findings is the importance of the relationship between
data generation and the (in)direct purpose of such data collection and processing
activities. Significant progress in EHR data analytics can be made by improving
the processes from the very origin of the data and ensuring that these processes
are transparent, well-defined and FAIR, which is in line with the SMART
guidelines presented by the World Health Organization.
References
1. Basajja, M., Suchanek, M., Taye, G.T., Amare, S.Y., Mutwalibi, N., Folorunso, S.,
Plug, R., Oladipo, F., van Reisen, M.: Proof of concept and horizons on deployment
of FAIR in the COVID-19 pandemic. Data Intelligence, Special Issue: Launching an
international FAIR data network for COVID data. Forthcoming (2021)
2. Bonino, L.: WHO COVID-19 rapid version CRF semantic data model. Retrieved
from: https://bioportal.bioontology.org/ontologies/COVIDCRFRAPID (accessed
9 September 2021)
3. Chu, K.M., Jayaraman, S., Kyamanywa, P., Ntakiyiruta, G.: Building research
capacity in Africa: Equity and global health collaborations. PLoS Medicine 11
(2014)
4. Council, E.P.: Regulation (EU) 2016/679 on the protection of natural persons with
regard to the processing of personal data and on the free movement of such data,
and repealing Directive 95/46/EC (General Data Protection Regulation). Official
Journal of the European Union 119, 1–88 (2016)
5. Euzenat, J., Shvaiko, P.: Ontology matching. Springer, Berlin, Heidelberg (2013)
6. Garrib, A., Stoops, N., Mckenzie, A., Dlamini, L.C., Govender, T., Rohde, J.E.,
Herbst, K.: An evaluation of the district health information system in rural South
Africa. South African Medical Journal = Suid-Afrikaanse tydskrif vir geneeskunde
98(7), 549–552 (2008)
7. Gonçalves, R.S., O’Connor, M., Romero, M.M., Egyedi, A.L., Willrett, D.,
Graybeal, J., Musen, M.: The CEDAR workbench: An ontology-assisted environment for
authoring metadata that describe scientific experiments. The Semantic Web – ISWC:
International Semantic Web Conference proceedings. International Semantic Web
Conference 10588, 103–110 (2017)
8. Jacobsen, A., de Miranda Azevedo, R., Juty, N.S., Batista, D., Coles, S.J., Cornet,
R., Courtot, M., Crosas, M., Dumontier, M., Evelo, C.T.A., Goble, C.A.,
Guizzardi, G., Hansen, K.K., Hasnain, A., Hettne, K.M., Heringa, J., Hooft, R.W.,
Imming, M., Jeffery, K.G., Kaliyaperumal, R., Kersloot, M.G., Kirkpatrick, C.R.,
Kuhn, T., Labastida, I., Magagna, B., McQuilton, P., Meyers, N., Montesanti, A.,
van Reisen, M., Rocca-Serra, P., Pergl, R., Sansone, S.A., da Silva Santos, L.O.B.,
Schneider, J., Strawn, G.O., Thompson, M., Waagmeester, A., Weigel, T.,
Wilkinson, M.D., Willighagen, E., Wittenburg, P., Roos, M., Mons, B., Schultes, E.: FAIR
principles: Interpretations and implementation considerations. Data Intelligence 2,
10–29 (2020)
9. Jati, P.H.P., Flikkenschild, E., Meerman, B., Nodehi, S., Plug, R., Oladipo, F.,
van Reisen, M.: Data access, control, and privacy protection on VODAN Africa
architecture. Data Intelligence, Special Issue: Launching an international FAIR data
network for COVID data. Forthcoming (2021)
10. Jin, H., Luo, Y., Li, P., Mathew, J.P.: A review of secure and privacy-preserving
medical data sharing. IEEE Access 7, 61656–61669 (2019)
11. Lin, C.H., Wu, N.Y., Liou, D.M.: A multi-technique approach to bridge electronic
case report form design and data standard adoption. Journal of Biomedical
Informatics 53, 49–57 (2015)
12. Mehl, G., Tunçalp, Ö., Ratanaprayul, N., Tamrat, T., Barreix, M., Lowrance,
D.W., Bartolomeos, K., Say, L., Kostanjsek, N., Jakob, R., Grove, J.T., Mariano,
B., Swaminathan, S.: WHO SMART guidelines: optimising country-level use of
guideline recommendations in the digital age. The Lancet Digital Health (2021)
13. Mons, B., Neylon, C., Velterop, J., Dumontier, M., da Silva Santos, L.O.B.,
Wilkinson, M.D.: Cloudy, increasingly FAIR; revisiting the FAIR data guiding principles for
the European Open Science Cloud. Inf. Serv. Use 37, 49–56 (2017)
14. Musolino, N., Lazdiņ, J., Toohey, J., IJsselmuiden, C.: COHRED fairness index for
international collaborative partnerships. The Lancet 385, 1293–1294 (2015)
15. Odekunle, F.F., Odekunle, R.O., Shankar, S.: Why sub-Saharan Africa lags in
electronic health record adoption and possible strategies to increase its adoption in
this region. International Journal of Health Sciences 11, 59–64 (2017)
16. Park, J.-r.: Metadata quality in digital repositories: A survey of the current state
of the art. Cataloging & Classification Quarterly 47, 213–228 (2009)
17. Pergl, R., Hooft, R.W., Suchánek, M., Knaisl, V., Slifka, J.: "Data Stewardship
Wizard": A tool bringing together researchers, data stewards, and data experts
around data management planning. Data Sci. J. 18, 59 (2019)
18. Plug, R., Liang, Y., Aktau, A., Basajja, M., Oladipo, F., van Reisen, M.:
Terminology on a FAIR-framework for the Virus Outbreak Data Network. Data Intelligence,
Special Issue: Launching an international FAIR data network for COVID data.
Forthcoming (2021)
19. Price, W.N., Cohen, I.G.: Privacy in the age of medical big data. Nature Medicine
25, 37–43 (2019)
20. van Reisen, M., Oladipo, F.O., Stokmans, M., Mpezamihigo, M., Folorunso, S.,
Schultes, E., Basajja, M., Aktau, A., Amare, S., Taye, G.T., Jati, P.H.P., Chindoza,
K., Wirtz, M., Ghardallou, M.E.L., van Stam, G., Ayele, W., Nalugala, R.M.,
Abdullahi, I., Osigwe, O., Graybeal, J., Medhanyie, A.A., Kawu, A.A., Liu, F.,
Wolstencroft, K., Flikkenschild, E., Lin, Y., Stocker, J.W., Musen, M.A.: Design
of a FAIR digital data health infrastructure in Africa for COVID-19 reporting and
research. Advanced Genetics (Hoboken, N.J.) 2 (2021)
21. van Reisen, M., Stokmans, M., Basajja, M., Ong’ayo, A., Kirkpatrick, C.R., Mons,
B.: Towards the tipping point for FAIR implementation. Data Intelligence 2, 264–275
(2020)
22. van Reisen, M., Stokmans, M., Mawere, M., Basajja, M., Ong’ayo, A., Nakazibwe,
P., Kirkpatrick, C.R., Chindoza, K.: FAIR practices in Africa. Data Intelligence 2,
246–256 (2020)
23. Sahay, S., Rashidian, A., Doctor, H.V.: Challenges and opportunities of using DHIS2
to strengthen health information systems in the Eastern Mediterranean Region: A
regional approach. The Electronic Journal of Information Systems in Developing
Countries 86 (2020)
24. Schriml, L.M., Chuvochina, M., Davies, N., Eloe-Fadrosh, E.A., Finn, R.D.,
Hugenholtz, P.B., Hunter, C.I., Hurwitz, B.L., Kyrpides, N.C., Meyer, F., Mizrachi, I.K.,
Sansone, S.A., Sutton, G.G., Tighe, S.W., Walls, R.B.: COVID-19 pandemic reveals
the peril of ignoring metadata standards. Scientific Data 7 (2020)
25. Schultes, E., Mons, A., Mons, B., Kuzak, M., van Gelder, C., Hodson, S.,
Meerman, B., Jansen, M., Bonino, L.O., Drefs, I., Kriegel, K., Sustkova, H.P.,
Dumontier, M., de Miranda Azevedo, R., Presutti, V.: GO TRAIN pillar. Retrieved from:
https://osf.io/za96b/ (accessed 8 September 2021)
26. Shaffer, J.G., Doumbia, S., Ndiaye, D., Diarra, A., Gomis, J.F., Nwakanma, D.C.,
Abubakar, I., Ahmad, A., Affara, M., Lukowski, M., Valim, C., Welty, J.C., Mather,
F.J., Keating, J., Krogstad, D.J.: Development of a data collection and
management system in West Africa: challenges and sustainability. Infectious Diseases of
Poverty 7 (2018)
27. Shvaiko, P., Euzenat, J.: Ontology matching: State of the art and future challenges.
IEEE Transactions on Knowledge and Data Engineering 25, 158–176 (2013)
28. VODAN-Africa: About VODAN Africa. Retrieved from:
https://www.vodan-totafrica.info/vodan-africa.php?i=1&a=about-vodan-africa
(accessed 7 September 2021)
29. VODAN-Africa: VODAN letters of support. Retrieved from:
https://www.vodan-totafrica.info/vodan-africa.php?i=15&a=letters-of-support
(accessed 29 September 2021)
30. Whetzel, P.L., Noy, N., Shah, N.H., Alexander, P.R., Nyulas, C., Tudorache, T.,
Musen, M.A.: BioPortal: enhanced functionality via new web services from the
National Center for Biomedical Ontology to access and use ontologies in software
applications. Nucleic Acids Research 39, W541–W545 (2011)
31. Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M.,
Baak, A., Blomberg, N., Boiten, J.W., da Silva Santos, L.O.B., Bourne, P.E.,
Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O.,
Edmunds, S.C., Evelo, C.T.A., Finkers, R., González-Beltrán, A.N., Gray, A.J.G.,
Groth, P., Goble, C.A., Grethe, J.S., Heringa, J., ’t Hoen, P.A.C., Hooft, R.W.W.,
Kuhn, T., Kok, R.G., Kok, J.N., Lusher, S.J., Martone, M.E., Mons, A., Packer,
A.L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R.C., Sansone, S.A.,
Schultes, E., Sengstag, T., Slater, T., Strawn, G.O., Swertz, M.A., Thompson, M.,
van der Lei, J., van Mulligen, E.M., Velterop, J., Waagmeester, A., Wittenburg,
P., Wolstencroft, K., Zhao, J., Mons, B.: The FAIR guiding principles for scientific
data management and stewardship. Scientific Data 3 (2016)
32. Yang, J.J., Li, J., Niu, Y.: A hybrid solution for privacy preserving medical data
sharing in the cloud environment. Future Gener. Comput. Syst. 43-44, 74–86
(2015)
... Data Intelligence Just Accepted MS. https://doi.org/10.1162/dint_a_00166 Figure 9. Accessibility spectrum of processed personal data [50] The second phase is being rolled out with the participation of Leiden University and over 80 health facilities in 8 countries in Africa. ...
... CEDAR as a FAIR Data Point[50] ...
Article
Full-text available
The incompleteness of patient health data is a threat to the management of COVID-19 in Africa and globally. This has become particularly clear with the recent emergence of new variants of concern. The Virus Outbreak Data Network (VODAN)-Africa has studied the curation of patient health data in selected African countries and identified that health information flows often do not involve the use of health data at the point of care, which renders data production largely meaningless to those producing it. This modus operandi leads to disfranchisement over the control of health data, which is extracted to be processed elsewhere. In response to this problem, VODAN-Africa studied whether or not a design that makes local ownership and repositing of data central to the data curation process would 2 have a greater chance of being adopted. The design team based their work on the legal requirements of the European Union's General Data Protection Regulation (GDPR); the FAIR Guidelines on curating data as Findable, Accessible (under well-defined conditions), Interoperable and Reusable (FAIR); and national regulations applying in the context where the data is produced. The study concluded that the visiting of data curated as machine actionable and reposited in the locale where the data is produced and renders services has great potential for access to a wider variety of data. A condition of such innovation is that the innovation team is intradisciplinary, involving stakeholders and experts from all of the places where the innovation is designed, and employs a methodology of co-creation and capacity-building.
... The curation of health data requires adherence to ethical and legal standards, which becomes a complicated topic when research is carried out across national borders (Plug et al., 2022a). Care has to be taken with regards to data access control, privacy, and local regulations , as well as the interoperability of data to be able to ethically generate value from localised health data. ...
... It has been adopted by numerous research funders and institutions as a way to improve the management and sharing of research data. The FAIR data management principles have formed the basis for data point implementation studies on COVID-19, population health data, life sciences data, and so on (Alves et al., 2020; Basajja et al., 2022; Folorunso et al., 2022; Ghardallou et al., 2022; Kersloot et al., 2021; Plug et al., 2022). Some of these FAIRification implementations were necessitated by challenges such as the need to create persistent identifiers for data so that it can be easily located and cited; how to capture information about the history and lineage of data, including how it was produced, where it came from, and who has worked on it; difficulty in making data interoperable; and the lack of a standard for metadata description. ...
Conference Paper
Academic projects generate vast amounts of valuable data that can greatly benefit the research community if properly managed and shared. However, the lack of structured and standardized design processes for creating FAIR data repositories hinders data accessibility and reuse. In response to this challenge, this study presents a UML (Unified Modelling Language) approach to designing a Findable (F), Accessible (A), Interoperable (I), and Reusable (R) data point repository for academic projects. The need for a structured and standardized design process for creating a FAIR data repository is addressed, with the aim of promoting data accessibility and facilitating the reuse of scholarly resources. The UML models serve as a blueprint for implementation, guiding the development team and facilitating the integration of FAIR principles into the repository's design. Although this study is still in progress towards developing a FAIR Data Point Repository for Academic Projects, the adoption of a UML approach holds significant potential. This approach shows promise in establishing a robust and FAIR data infrastructure that not only supports academic research but also enhances data accessibility and enables the valuable reuse of scholarly resources.
... The former happens through the standardization of the data flow and creation of controlled vocabularies which allows interoperability between clinics in various countries. Data ownership is ensured by a smart aggregating data visiting system (van Reisen et al., 2021; Plug et al., 2022). ...
... RFID and NFC sensors can generate huge amounts of data, worsening the challenge. General-purpose Cloud services then have to be adopted, so that two issues related to the privacy of the patient arise: i) compliance of the monitoring platform with the regulatory framework on sensitive data, e.g., the European General Data Protection Regulation [38]; ii) adoption of proper technical solutions that ensure secure access to the data with no participation of the Cloud service, which, in general, could be an honest-but-curious, or even malicious, actor. For this reason, many solutions have been proposed in the literature to address the major issue of ensuring privacy and data security in Cloud services [39]-[41]. ...
Article
Points-of-care (PoCs) augment healthcare systems by performing care whenever needed and are becoming increasingly crucial for the well-being of the worldwide population. Personalized medicine, chronic illness management, and cost reduction can be achieved thanks to the widespread adoption of PoCs. Significant incentives for PoC deployment are nowadays given by wearable devices and, in particular, by RFID (Radio Frequency IDentification) and NFC (Near Field Communication), which are rising among the technological cornerstones of the healthcare internet of things (H-IoT). To fully exploit recent technological advancements, this paper proposes a system architecture for RFID- and NFC-based PoCs. The architecture combines both interfaces in a unitary framework to benefit from their complementary features, and gathered data are shared with medical experts through secure and user-friendly interfaces that implement the emerging Fast Healthcare Interoperability Resources (FHIR) standard. The selection of the optimal UHF and NFC components is discussed with respect to the employable sensing techniques. The secure transmission of sensitive medical data is addressed by developing a user-friendly "PoC App" that is the first web app exploiting attribute-based encryption (ABE). An application example of the system for monitoring the pH and cortisol levels in sweat is implemented and preliminarily tested by a healthy volunteer.
... The overarching goal of the initiative is to create a system where healthcare providers can collect and retain sovereignty over their patients' records [23], where the data in every section of the project follows the Findable, Accessible, Interoperable and Reusable (FAIR) principles [17]. Significant effort has been poured into making VODAN-A adhere to the FAIR principles and the project has achieved success, particularly on findability and accessibility, as well as in supporting data reuse [16]. This article deals with the question of how advancing interoperability could be considered in VODAN-A. ...
Article
This paper makes the case for using nanopublications to expand the interoperability of the Virus Outbreak Data Analysis Network – Africa, given their similarity to FAIR Digital Objects.
... Cloud-based architectures are convenient for managing the vast amount of data that can be generated by wireless sensors without overwhelming local computer systems [23,24]. Still, the use of external public services raises a two-level issue regarding the privacy of the patient: (i) The monitoring platform must comply with the regulatory framework on sensitive data, e.g., the European GDPR [25], and (ii) technical solutions must be adopted to ensure secure access to the data without involving the cloud service, which, in general, could be an honest-but-curious or even malevolent actor. Hence, ensuring privacy and data security in cloud services is a major issue, which has been widely discussed in the literature [26]. ...
Article
World population and life expectancy have increased steadily in recent years, raising issues regarding access to medical treatments and related expenses. Through last-generation medical sensors, NFC (Near Field Communication) and radio frequency identification (RFID) technologies can enable healthcare internet of things (H-IoT) systems to improve the quality of care while reducing costs. Moreover, the adoption of point-of-care (PoC) testing, performed whenever care is needed to return prompt feedback to the patient, can generate great synergy with NFC/RFID H-IoT systems. However, medical data are extremely sensitive and require careful management and storage to protect patients from malicious actors, so secure system architectures must be conceived for real scenarios. Existing studies do not analyze the security of raw data from the radiofrequency link to cloud-based sharing. Therefore, two novel cloud-based system architectures for data collected from NFC/RFID medical sensors are proposed in this paper. Privacy during data collection is ensured using a set of classical countermeasures selected based on the scientific literature. Then, data can be shared with the medical team using one of two architectures: in the first one, the medical system manages all data accesses, whereas in the second one, the patient defines the access policies. Comprehensive analysis of the H-IoT system can be useful for fostering research on the security of wearable wireless sensors. Moreover, the proposed architectures can be implemented for deploying and testing NFC/RFID-based healthcare applications, such as, for instance, domestic PoCs.
Article
The limited volume of COVID-19 data from Africa raises concerns for global genome research, which requires a diversity of genotypes for accurate disease prediction, including on the provenance of the new SARS-CoV-2 mutations. The Virus Outbreak Data Network (VODAN)-Africa studied the possibility of increasing the production of clinical data, finding concerns about data ownership and the limited use of health data for quality treatment at the point of care. To address this, VODAN-Africa developed an architecture to record clinical health data and research data collected on the incidence of COVID-19, producing these as human- and machine-readable data objects in a distributed architecture of locally governed, linked data. This architecture supports analytics at the point of care and, through data visiting, across facilities for generic analytics. An algorithm was run across FAIR Data Points to visit the distributed data and produce aggregate findings. The FAIR data architecture is deployed in Uganda, Ethiopia, Liberia, Nigeria, Kenya, Somalia, Tanzania, Zimbabwe, and Tunisia.
Article
Efficient response to the pandemic through the mobilization of the larger scientific community is challenged by the limited reusability of the available primary genomic data. Here, the Genomic Standards Consortium board highlights the essential need for contextual genomic data FAIRness, for empowering key data-driven biological questions.
Article
The Data Stewardship Wizard is a tool for data management planning that is focused on getting the most value out of data management planning for the project itself rather than on fulfilling obligations. It is based on FAIR Data Stewardship, in which each data-related decision in a project acts to optimize the Findability, Accessibility, Interoperability and/or Reusability of the data. The background to this philosophy is that the first reuser of the data is the researcher themselves. The tool encourages the consulting of expertise and experts, can help researchers avoid risks they did not know they would encounter by confronting them with practical experience from others, and can help them discover helpful technologies they did not know existed. In this paper, we discuss the context and motivation for the tool, we explain its architecture and we present key functions, such as the knowledge model evolvability and migrations, assembling data management plans, metrics and evaluation of data management plans.
Article
This article investigates expansion of the Internet of FAIR Data and Services (IFDS) to Africa, through the three GO FAIR pillars: GO CHANGE, GO BUILD and GO TRAIN. Introduction of the IFDS in Africa has a focus on digital health. Two examples of introducing FAIR are compared: a regional initiative for digital health by governments in the East Africa Community (EAC) and an initiative by a local health provider (Solidarmed) in collaboration with Great Zimbabwe University in Zimbabwe. The obstacles to introducing FAIR are identified as underrepresentation of data from Africa in IFDS at this moment, the lack of explicit recognition of situational context of research in FAIR at present and the lack of acceptability of FAIR as a foreign and European invention which affects acceptance. It is envisaged that FAIR has an important contribution to solve fragmentation in digital health in Africa, and that any obstacles concerning African participation, context relevance and acceptance of IFDS need to be removed. This will require involvement of African researchers and ICT-developers so that it is driven by local ownership. Assessment of ecological validity in FAIR principles would ensure that the context specificity of research is reflected in the FAIR principles. This will help enhance the acceptance of the FAIR Guidelines in Africa and will help strengthen digital health research and services.
Article
Globally, there is acceptance of the role of strengthened health information systems (HIS) in informing evidence-based decision-making and improving health services delivery. Within the context of increasing demands to generate data of high quality for monitoring progress towards SDGs, we review challenges and approaches in adopting a region-based approach to strengthen HIS within the WHO Eastern Mediterranean Region and its 22 Member States. The HIS situation in the WHO Eastern Mediterranean Region is mixed: while some countries have developed robust HIS, others lag behind on both technological and institutional dimensions. Technically, most countries have not capitalized on the benefits of ICT advancements such as the internet, cloud and mobile computing. As a result, systems remain manual or standalone, and data is not shareable. Institutionally, systems are broadly fragmented, with non-standardized data, weak human resources capacity, and limited evidence of data being used to inform action. In order to consolidate efforts to enhance HIS, we recommend leveraging innovative information and communication technology (ICT) solutions for data capturing, processing, analysis, and reporting at either the individual or aggregate level. Countries need to develop similar processes in their own contexts, while ensuring that they learn from and build upon their neighbors, building a network of networks.
Article
In the digital healthcare era, it is of the utmost importance to harness medical information scattered across healthcare institutions to support in-depth data analysis and achieve personalized healthcare. However, the cyberinfrastructure boundaries of healthcare organizations and privacy leakage threats place obstacles on the sharing of medical records. Blockchain, as a public ledger characterized by its transparency, tamper-evidence, trustlessness and decentralization, can help build a secure medical data exchange network. This paper surveys the state-of-the-art schemes on secure and privacy-preserving medical data sharing of the past decade with a focus on blockchain-based approaches. We classify them into permissionless blockchain-based approaches and permissioned blockchain-based approaches, and analyze their advantages and disadvantages. We also discuss potential research topics on blockchain-based medical data sharing.
Article
Background: Developing and sustaining a data collection and management system (DCMS) is difficult in malaria-endemic countries because of limitations in internet bandwidth, computer resources and numbers of trained personnel. The premise of this paper is that development of a DCMS in West Africa was a critically important outcome of the West African International Centers of Excellence for Malaria Research. The purposes of this paper are to make that information available to other investigators and to encourage the linkage of DCMSs to international research and Ministry of Health data systems and repositories. Methods: We designed and implemented a DCMS to link study sites in Mali, Senegal and The Gambia. This system was based on case report forms for epidemiologic, entomologic, clinical and laboratory aspects of plasmodial infection and malarial disease for a longitudinal cohort study and included on-site training for Principal Investigators and Data Managers. Based on this experience, we propose guidelines for the design and sustainability of DCMSs in environments with limited resources and personnel. Results: From 2012 to 2017, we performed biannual thick smear surveys for plasmodial infection, mosquito collections for anopheline biting rates and sporozoite rates and year-round passive case detection for malarial disease in four longitudinal cohorts with 7708 individuals and 918 households in Senegal, The Gambia and Mali. Major challenges included the development of uniform definitions and reporting, assessment of data entry error rates, unstable and limited internet access and software and technology maintenance. Strengths included entomologic collections linked to longitudinal cohort studies, on-site data centres and a cloud-based data repository.
Conclusions: At a time when research on diseases of poverty in low and middle-income countries is a global priority, the resources available to ensure accurate data collection and the electronic availability of those data remain severely limited. Based on our experience, we suggest the development of a regional DCMS. This approach is more economical than separate data centres and has the potential to improve data quality by encouraging shared case definitions, data validation strategies and analytic approaches including the molecular analysis of treatment successes and failures. Electronic supplementary material: The online version of this article (10.1186/s40249-018-0494-4) contains supplementary material, which is available to authorized users.
Article
A poor health information system has been identified as a major challenge in the health-care system in many developing countries, including sub-Saharan African countries. The electronic health record (EHR) has been shown to be an important tool for improving access to patient information, with an attendant improvement in quality of care. However, the EHR has not been widely implemented or adopted in sub-Saharan Africa. This study sought to identify factors that affect the adoption of an EHR in sub-Saharan Africa and strategies to improve its adoption in this region. A comprehensive literature search was conducted on three electronic databases: PubMed, Medline, and Google Scholar. Articles of interest were those published in English that contained information on factors that limit the adoption of an EHR as well as strategies that improve its adoption in sub-Saharan African countries. The available evidence indicated that there were many factors that hindered the widespread adoption of an EHR in sub-Saharan Africa. These were high costs of procurement and maintenance of the EHR system, lack of financial incentives and priorities, poor electricity supply and internet connectivity, and primary users' limited computer skills. However, strategies such as implementation planning, financial support, appropriate EHR system selection, training of primary users, and the adoption of a phased implementation process have been identified as facilitating the use of an EHR. Wide adoption of an EHR in the sub-Saharan Africa region requires a lot more effort than is assumed because of the current poor level of technological development, lack of required computer skills, and limited resources.
Article
Big data has become the ubiquitous watchword of medical innovation. The rapid development of machine-learning techniques and artificial intelligence in particular has promised to revolutionize medical practice, from the allocation of resources to the diagnosis of complex diseases. But with big data come big risks and challenges, among them significant questions about patient privacy. Here, we outline the legal and ethical challenges big data brings to patient privacy. We discuss, among other topics, how best to conceive of health privacy; the importance of equity, consent, and patient governance in data collection; discrimination in data uses; and how to handle data breaches. We close by sketching possible ways forward for the regulatory system.