International Journal of Applied Research in Social Sciences, Volume 6, Issue 1, January 2024
Okorie, Udeh, Adaga, DaraOjimba, & Oriekhoe, P.No. 1-22 Page 1
ETHICAL CONSIDERATIONS IN DATA COLLECTION AND
ANALYSIS: A REVIEW: INVESTIGATING ETHICAL
PRACTICES AND CHALLENGES IN MODERN DATA
COLLECTION AND ANALYSIS
Gold Nmesoma Okorie1, Chioma Ann Udeh2, Ejuma Martha Adaga3,
Obinna Donald DaraOjimba4, & Osato Itohan Oriekhoe5
1Independent Researcher, Dallas TX, USA
2Independent Researcher, Lagos, Nigeria
3Independent Researcher, NJ, USA
4Department of Information Management, Ahmadu Bello University, Zaria Nigeria
5Independent Researcher, UK
______________________________________________________________________________
Corresponding Author: Obinna Donald DaraOjimba
Corresponding Author Email: donaldojimba@gmail.com
Article Received: 23-10-23 Accepted: 20-12-23 Published: 02-01-24
Licensing Details: Author retains the right of this article. The article is distributed under the terms of the
Creative Commons Attribution-Non Commercial 4.0 License
(http://www.creativecommons.org/licences/by-nc/4.0/) which permits non-commercial use, reproduction
and distribution of the work without further permission provided the original work is attributed as specified
on the Journal open access page.
______________________________________________________________________________
ABSTRACT
In an era where data profoundly influences decision-making across various sectors, this
comprehensive review critically examines the evolving landscape of data science ethics,
particularly focusing on the interplay between technological advancements and ethical standards.
The study aims to investigate and synthesize current ethical practices and challenges in modern
data collection and analysis, tracing the evolution of ethical standards in data science,
understanding the significance of ethical considerations in contemporary data practices, and
exploring the development of global regulatory and ethical frameworks.
OPEN ACCESS
International Journal of Applied Research in Social Sciences
P-ISSN: 2706-9176, E-ISSN: 2706-9184
Volume 6, Issue 1, P.No. 1-22, January 2024
DOI: 10.51594/ijarss.v6i1.688
Fair East Publishers
Journal Homepage: www.fepbl.com/index.php/ijarss
The paper encompasses a systematic literature review, focusing on core ethical principles in data
collection and analytical processes, the roles of consent and privacy, the complexities introduced
by big data, and the intricacies of ethical frameworks across different regions. It bridges the
research gap in data ethics by providing insights into practical ethical frameworks and instructional
models, guiding researchers, policymakers, and practitioners in ethical data handling.
The study concludes that ethical considerations are integral to data science practices and contribute
significantly to societal well-being. It recommends enhanced ethical education in data science,
development of inclusive ethical frameworks, strengthening regulatory oversight, and promoting
public engagement in data ethics discussions. These steps are essential for ensuring responsible
and beneficial use of data in an ethically complex landscape.
Keywords: Data Ethics, Data Science, Ethical Standards, Privacy, Consent, Regulatory.
_____________________________________________________________________________
INTRODUCTION
Overview of Ethical Dimensions in Modern Data Collection and Analysis
The evolution of ethical standards in data science has been a critical area of focus, as the field
rapidly expands and integrates into various sectors. Wang et al. (2019) emphasize the importance
of ethical approaches in data science, particularly in minimizing adverse effects that may arise
during data collection, analysis, and storage. Their research, which involved surveying data
science students and practitioners, highlights the growing awareness and concern for ethical
practices within the community. This study underscores the need for a comprehensive
understanding of the factors influencing ethical practices in data science, a field that is increasingly
impacting every aspect of modern life.
Santos (2023) discusses the expansion of data science, particularly in the context of dataset
standardization. The author points out that with the advent of more sophisticated data analysis
techniques and the availability of vast amounts of data, there is a pressing need to establish data
standards. This is crucial for ensuring interoperability between datasets and software, both
commercial and open-source. Santos's analysis through a Strengths, Weaknesses, Opportunities,
and Threats (SWOT) framework provides a nuanced understanding of the strategic approaches
required for efficient data management, directly impacting ethical considerations in data handling.
In the realm of sports and exercise science research, Harriss, Jones, and MacSween (2022) provide
an updated perspective on ethical standards. Their work, which is embedded in the context of
national and international laws, offers a detailed view of the ethical considerations that have
evolved over time in response to changes in data collection processes, research designs, and
settings. This continuous updating of ethical guidelines reflects the dynamic nature of data science
and the need for adaptable ethical frameworks that can respond to new challenges and
technologies.
Padmapriya and Parthasarathy (2021) delve into the ethical concerns in healthcare research,
particularly in medical image analysis (MIA). They propose a structured approach to ethical data
collection, acknowledging the rapid advancement of data science and artificial intelligence in
healthcare. Their framework aims to guide data scientists in addressing potential ethical concerns
before commencing data analytics on medical datasets. This approach is indicative of the proactive
measures needed to ensure ethical compliance in data science, especially in sensitive areas like
healthcare.
The evolution of ethical standards in data science is marked by a growing recognition of the
complexities and potential risks associated with data collection and analysis. From the general
principles outlined by Wang et al. (2018) to the specific contexts discussed by Santos (2023), Harriss,
Jones, and MacSween (2022), and Padmapriya and Parthasarathy (2021), it is evident that ethical
considerations are becoming increasingly integral to data science practices.
Building upon the initial discussion, it is crucial to delve deeper into the factors that influence
ethical practices in data science, as highlighted by Wang et al. (2018). The ethical landscape in
data science is shaped not only by technological advancements but also by the societal and
cultural contexts in which data is collected and analyzed. The attitudes and feelings of those
involved in data science, including students and practitioners, play a significant role in shaping
these ethical practices. Their perspectives offer valuable insights into the evolving nature of ethical
considerations, reflecting a diverse range of experiences and expectations.
Santos (2023) further emphasizes the importance of dataset standardization in the ethical
management of data. The expansion of data science brings with it the challenge of managing and
sharing vast amounts of data. The establishment of data standards is not just a technical necessity
but also an ethical imperative. By ensuring interoperability and accessibility, these standards
facilitate ethical data sharing and collaboration, which are essential for advancing scientific
knowledge while respecting individual rights and privacy.
The work of Harriss, Jones, and MacSween (2022) in the context of sports and exercise science
research provides a sector-specific example of how ethical standards evolve in response to
changing research methodologies and societal expectations. Their continuous updating of ethical
guidelines in light of new developments in data collection and analysis techniques serves as a
model for other fields. This adaptability is crucial in maintaining the integrity of research and
ensuring that it adheres to both ethical and legal standards.
In the healthcare sector, the approach proposed by Padmapriya and Parthasarathy (2021) for ethical
data collection in medical image analysis underscores the need for sector-specific ethical
frameworks. The sensitive nature of healthcare data demands a high level of ethical scrutiny. The
proposed framework not only addresses the immediate ethical concerns but also anticipates future
challenges, demonstrating a forward-thinking approach to ethical data handling in healthcare
research.
The evolution of ethical standards in data science is also influenced by the increasing public
awareness and concern over data privacy and security. As data becomes a valuable commodity in
the digital age, the ethical implications of data collection, storage, and analysis become more
pronounced. The public's trust in data science practices is contingent upon the transparency and
accountability of these practices. Therefore, establishing robust ethical guidelines and standards is
not only a professional responsibility but also a societal necessity.
Furthermore, the global nature of data science calls for a harmonization of ethical standards across
different regions and cultures. The comparative review of ethical standards, as suggested by Santos
(2023), is essential in this regard. Understanding the similarities and differences in ethical practices
across various regions can lead to the development of more inclusive and universally applicable
ethical frameworks.
Tracing the Evolution of Ethical Standards in Data Science
The evolution of ethical standards in data science reflects the field's rapid growth and its increasing
impact on society. Kuc-Czarnecka and Olczyk (2020) conducted a bibliometric analysis to explore
the development of ethical concerns in the realm of Big Data. Their study reveals that ethical issues
in Big Data, while gaining attention, are still relatively underrepresented in scientific literature.
This finding is significant as it underscores the nascent stage of ethical discourse in data science,
particularly in areas like health and technology where data's role is becoming increasingly critical.
Harriss, Jones, and MacSween (2022) provide insights into the ethical standards in sports and
exercise science research, offering a perspective on how ethical considerations are integrated into
specific research domains. Their work highlights the dynamic nature of ethical standards, which
evolve in response to changes in research methodologies, data collection processes, and societal
norms. This evolution is indicative of the broader trends in data science, where ethical standards
must adapt to the rapidly changing landscape of data usage and technological advancements.
Gordon et al. (2022) present a multi-stakeholder analysis on computing ethics, shedding light on
the diverse perspectives that shape ethical standards in data science. Their study reveals
overlapping concerns among academics, industry professionals, and citizens, particularly
regarding data accuracy, completeness, and representativeness. This diversity of viewpoints is
crucial for understanding the multifaceted nature of ethical challenges in data science, emphasizing
the need for inclusive and comprehensive ethical frameworks.
Kearns and Roth (2021) delve into the ethical implications of algorithm design, exploring how
algorithms, as a core component of data science, can be developed with social awareness. Their
work on "The Ethical Algorithm" highlights the complexity of ethical issues in algorithmic
decision-making, especially in the context of machine learning and artificial intelligence. The
ethical component in algorithm design is critical, as it directly impacts fairness, transparency, and
accountability in data-driven decisions.
The trajectory of ethical standards in data science is marked by an increasing awareness of the
ethical implications of data collection, analysis, and usage. The studies by Kuc-Czarnecka and
Olczyk (2020), Harriss, Jones, and MacSween (2022), Gordon et al. (2022), and Kearns and Roth
(2019) collectively illustrate the growing complexity of ethical challenges in data science. These
challenges stem not only from the technological aspects of data science but also from its societal
impact, necessitating a holistic approach to ethical considerations.
As data science continues to permeate various sectors, the need for robust ethical standards
becomes more pronounced. The evolution of these standards is influenced by a multitude of
factors, including technological advancements, societal expectations, and the diverse perspectives
of stakeholders involved in data science. This evolution is a continuous process, requiring ongoing
dialogue, research, and adaptation to ensure that ethical considerations keep pace with the rapid
developments in the field.
Significance of Ethical Considerations in Contemporary Data Practices
The significance of ethical considerations in contemporary data practices cannot be overstated,
especially in an era where data is ubiquitous and integral to various aspects of society.
Zhang-Kennedy and Chiasson's (2021) research provides a critical insight into consumer perspectives on
privacy regulations and corporate data practices. Their study reveals that while consumers are
generally aware of their privacy rights, there is a notable gap in their ability to exercise these rights
effectively. This gap underscores the importance of not only establishing ethical standards but also
ensuring that these standards are accessible and understandable to the general public. The concept
of a "moral code" based on trust, transparency, control, and access emerges as a key framework
for assessing privacy violations, highlighting the need for ethical practices that go beyond mere
legal compliance.
Andrews et al. (2023) address the ethical considerations specific to the collection of human-centric
image datasets. Their work is particularly relevant in the context of privacy and bias, two critical
issues in data science. The authors propose a set of ethical considerations and practical
recommendations for collecting more ethically-minded human-centric image data, covering
aspects such as purpose, privacy, consent, and diversity. This research is pivotal in guiding data
practitioners towards responsible data curation practices, ensuring that data collection processes
are not only technically sound but also ethically robust.
Fisher et al. (2020) delve into the ethical challenges in eHealth HIV intervention research,
particularly focusing on informational risk in recruitment, data maintenance, and consent
procedures. Their study highlights the tension between protecting participant confidentiality and
the evolving risks posed by online platforms. The need for updated technical competencies and
participant education on privacy protection is emphasized, along with additional protections in
interventions involving peer or community support. This research underscores the evolving nature
of ethical challenges in digital health technologies and the importance of continually adapting
ethical practices to address these challenges.
In the contemporary landscape of data practices, the significance of ethical considerations is
multifaceted. It encompasses not only the protection of individual privacy and rights but also the
broader implications of data usage on societal norms and values. The insights from
Zhang-Kennedy and Chiasson (2021), Andrews et al. (2023) and Fisher et al. (2020) collectively highlight
the need for ethical frameworks that are dynamic, inclusive, and responsive to the changing nature
of data science and its impact on society.
As data continues to play a pivotal role in shaping various sectors, from healthcare to consumer
industries, the ethical dimensions of data practices become increasingly complex. The
development of ethical standards and practices must therefore be an ongoing process, involving a
wide range of stakeholders, including data scientists, ethicists, policymakers, and the public. This
collaborative approach is essential in ensuring that data practices not only adhere to ethical
principles but also contribute positively to societal well-being.
Core Ethical Principles in Data Collection and Analytical Processes
The core ethical principles in data collection and analytical processes are fundamental to ensuring
the integrity and societal acceptance of data science practices. Jameel and Majid (2018) emphasize
the importance of these principles in their discussion on research fundamentals. They highlight
that ethical considerations are not just an addendum but a central component of the research
process, encompassing data collection, analysis, and dissemination. This perspective underscores
the need for researchers to be cognizant of the ethical implications of their work from the outset,
ensuring that their methodologies and analyses adhere to established ethical norms.
Hosseini, Wieczorek, and Gordijn (2022) delve into the ethical issues specific to social science
research employing big data. They identify unique challenges in this domain, such as the
interpretative nature of social science and big data, complexities in managing risks in publication
and reuse of big data, and the lack of regulatory oversight. The authors propose using David
Resnik's research ethics framework, focusing on principles like honesty, carefulness, openness,
efficiency, respect for subjects, and social responsibility. This framework provides a
comprehensive approach to addressing ethical issues in big data research, particularly those related
to methodological biases, data availability, and individual and social harms.
Padmapriya and Parthasarathy (2021) contribute to the discussion by proposing an ethical data
collection framework for medical image analysis. Their work is particularly relevant in the context
of healthcare research, where the stakes are high due to the sensitive nature of the data involved.
The proposed framework addresses the ethical concerns and risks associated with data science
applications in healthcare, emphasizing the need for a structured approach to guide data scientists
in ethical data handling. This approach is crucial in balancing the rapid advancements in data
science and artificial intelligence in healthcare with the ethical imperatives of patient privacy and
data security.
The core ethical principles in data collection and analytical processes are multifaceted and
encompass a range of considerations, from respecting participant privacy and ensuring data
accuracy to managing risks associated with data reuse and publication. The frameworks discussed above should
not only guide researchers in conducting ethically sound research but also foster public trust in
data science practices.
As data science continues to evolve and permeate various sectors, the importance of adhering to
core ethical principles becomes increasingly critical. Researchers and practitioners must be
equipped with the knowledge and tools to navigate the ethical complexities of data collection and
analysis. This involves a continuous process of education, reflection, and adaptation to ensure that
ethical considerations are integrated into every stage of the data science lifecycle.
Consent and Privacy: Pillars of Ethical Data Collection
Consent and privacy stand as fundamental pillars in the ethical landscape of data collection,
particularly in an era where data-driven decision-making is prevalent across various sectors.
Plutzer (2019) explores the intricate balance between maximizing data quality and adhering to
ethical obligations to protect the privacy of respondents and ensure informed consent. The study
highlights how sensitive topics and the consent process can contribute to errors in representation
and measurement, thereby affecting data quality. This underscores the importance of designing
consent processes that are not only ethically sound but also conducive to obtaining high-quality
data.
Kreuter et al. (2020) delve into the challenges and opportunities presented by the European General
Data Protection Regulation (GDPR) in the context of digital data collection. Their study, based on
a German nationwide probability app study, reveals insights into participant willingness to share
digital trace data and the effectiveness of GDPR-compliant consent processes. The findings
suggest that despite being provided with more decision-related information, participants often do
not differentiate between different data requests. This observation points to the need for more
effective communication strategies that can adequately inform and protect participants, thereby
enhancing the ethical integrity of data collection processes.
Lee (2021) addresses the ethical considerations surrounding consent in the genomic research
ecosystem, particularly in the context of precision medicine and large-scale data sharing initiatives.
The review emphasizes the need to balance individual autonomy and privacy with the potential for
public good, highlighting the emerging ethical issues in consent processes. The shifting landscape
of genomic research, with its increasing demands for broad data sharing, calls for careful
consideration of ethical trade-offs and the development of consent models that are adaptable to
these evolving needs.
The ethical considerations of consent and privacy in data collection are complex and multifaceted.
They involve not only the protection of individual rights but also the broader implications of data
usage on societal norms and values. The insights from Plutzer (2019), Kreuter et al. (2020) and
Lee (2021) collectively highlight the need for consent processes that are transparent, respectful of
individual autonomy, and adaptable to the changing nature of data science and technology. These
processes must be designed to address the dual objectives of protecting individual privacy and
ensuring the integrity and utility of the data collected.
Addressing the Ethical Complexities of Big Data
The ethical complexities of big data are a growing concern in the digital age, where the vast
amounts of data collected and analyzed can have significant implications for democracy,
individual rights, and societal norms. Christodoulou and Iordanou (2021) explore the challenges
posed by the use of Artificial Intelligence (AI) and Big Data in digital media, particularly in the
context of democratic societies. Their research highlights the incongruence of Big Data and AI
usage with fundamental democratic principles and human rights, emphasizing the covert
exploitation and erosion of individual agency and autonomy. This study underscores the need for
ethical frameworks that can mitigate the negative implications of Big Data and AI, ensuring that
digital media contributes positively to democratic processes and the well-being of citizens.
Hosseini, Wieczorek, and Gordijn (2022) delve into the ethical issues specific to social science
research employing big data. They identify unique challenges such as the interpretative nature of
social science and big data, complexities in managing risks in publication and reuse, and the lack
of regulatory oversight. The authors propose using David Resnik's research ethics framework,
focusing on principles like honesty, carefulness, openness, efficiency, respect for subjects, and
social responsibility. This framework provides a comprehensive approach to addressing ethical
issues in big data research, particularly those related to methodological biases, data availability,
and individual and social harms.
Weinhardt (2021) discusses the ethical concerns of big data in the social sciences, highlighting its
research potential and the implications arising from its characteristics. The paper points out that
while big data allows for the analysis of actual behavior and networks on a grand scale, it also
raises ethical issues such as the need for documentation and dissemination of methods, data, and
results, the challenges of anonymization and re-identification, and the ability of stakeholders to
handle ethical issues. The risks involved in the misuse of big data, valuable to companies,
criminals, and state actors alike, necessitate a careful approach to its adoption in social science
research.
The ethical complexities of big data demand a multifaceted approach that balances the potential
benefits of data-driven insights with the protection of individual rights and democratic values.
These studies highlight the need for ethical frameworks that are adaptable, transparent, and
inclusive. These frameworks should guide researchers and practitioners in navigating the ethical
challenges posed by big data, ensuring that its usage is aligned with societal values and contributes
to the public good.
As big data continues to play a pivotal role in shaping various sectors, from healthcare to consumer
industries, the significance of addressing its ethical complexities becomes increasingly critical.
Researchers and practitioners must navigate these ethical challenges with a deep understanding of
the evolving regulatory landscape and the diverse needs and expectations of data subjects. This
involves a continuous process of education, reflection, and adaptation to ensure that ethical
considerations are integrated into every stage of the data lifecycle.
Overview of Global Regulatory and Ethical Frameworks in Data Science
The development and implementation of global regulatory and ethical frameworks in data science
are crucial for addressing the challenges posed by the rapid advancement of technology and the
increasing reliance on data. Ochang, Eke, and Stahl (2023) explore the ethical positions that
underpin global brain data governance. Their study focuses on the ethical and legal principles
applicable to 'big brain data', highlighting the need for a global governance framework in this field.
The diversity of ethical and legal principles across different jurisdictions presents a complex
landscape for collaborative efforts in brain data research. This research underscores the importance
of understanding these principles to catalyze the development of a comprehensive global
governance framework.
Georgieva et al. (2022) contribute to the discourse on AI ethics by mapping AI ethical principles
onto the lifecycle of AI-based digital services and products. Their work emphasizes the gap
between theoretical ethical frameworks and practical data science applications. The study provides
a critical reflection on the operationalization of ethics in data science, highlighting the need for
explicit governance models that clarify responsibilities and facilitate the practical implementation
of ethical principles in data science.
Austin (2023) discusses the data governance gap revealed by the COVID-19 pandemic,
particularly in the context of public health emergencies. The paper argues for a broader framework
of data governance, addressing foundational questions about decision-making, accountability, and
oversight in data flows. This approach accommodates emerging data themes such as access to data,
collective decision-making, data intermediaries, and social trust. The experience with contact
tracing apps during the pandemic demonstrates the unresolved governance challenges and the need
for normative frameworks to address these issues effectively.
The overview of global regulatory and ethical frameworks in data science reveals a landscape
marked by diverse challenges and opportunities. These frameworks should guide researchers,
policymakers, and practitioners in navigating the ethical challenges posed by data science,
ensuring that its usage is aligned with societal values and contributes to the public good.
As data science continues to evolve and integrate into various domains, the significance of
developing and implementing robust regulatory and ethical frameworks becomes increasingly
critical. These frameworks must be capable of addressing the complex ethical dilemmas posed by
advancements in technology and the growing volumes of data. Researchers and practitioners must
navigate these challenges with a deep understanding of the evolving regulatory landscape and the
diverse needs and expectations of data subjects. This involves a continuous process of education,
reflection, and adaptation to ensure that ethical considerations are integrated into every stage of
the data lifecycle.
Bridging the Research Gap in Data Ethics
Bridging the research gap in data ethics is essential for ensuring responsible and ethical use of data
in various fields. Hosseini, Wieczorek, and Gordijn (2022) address the ethical issues in social
science research employing big data, highlighting the intersection between big data ethics, social
science research, and research ethics. They emphasize the interpretative character of both social
science and big data, the complexities of managing risks in publication and reuse, and the lack of
regulatory oversight. The study proposes using David Resnik's research ethics framework to
analyze ethical issues in big data social science research, focusing on principles such as honesty,
carefulness, openness, efficiency, respect for subjects, and social responsibility.
Phan et al. (2022) discuss a model for data ethics instruction for non-experts, addressing the
disparities in academic curriculum for data and computational science. They highlight the
significant gaps in ethics training for the next generation of data-intensive researchers and propose
an interdisciplinary workshop approach to meet the need for additional training in data ethics.
Their model emphasizes the importance of highlighting resources that can be used by non-experts
to engage productively with data ethics topics.
Hirsch et al. (2019) explore corporate data ethics and data governance transformations in the age
of advanced analytics and AI. Their research provides insights into why leading companies are
pursuing data ethics and what this looks like in practice. The study offers first-hand accounts of
the ethical dilemmas companies encounter, the emerging substantive frameworks they use to
assess them, and the management processes employed to pursue data ethics. This research
highlights the importance of understanding corporate data ethics in practice and provides useful
ideas for companies seeking to pursue data ethics.
Reed‐Berendt et al. (2022) analyze the ethical implications of big data research in public health,
specifically in the context of the UK-REACH study. They advocate for a "Big Data Ethics by
Design" approach, arguing that ethical values and principles in big data health research projects
are best adhered to when integrated into the project aims and methods at the design stage. This
principle extends the work of those who advocate ethics by design by addressing prominent issues
in big data health research projects.
The research gap in data ethics is characterized by a need for comprehensive ethical frameworks,
practical models for ethics instruction, and an understanding of how ethical principles are
operationalized in different sectors. The insights from Hosseini, Wieczorek, and Gordijn (2022),
Phan et al. (2022), Hirsch et al. (2019) and Reed‐Berendt et al. (2022) collectively highlight the
need for frameworks and models that are adaptable, transparent, and inclusive. These frameworks
and models should guide researchers, policymakers, and practitioners in navigating the ethical
challenges posed by data science, ensuring that its usage is aligned with societal values and
contributes to the public good.
The significance of developing and implementing robust ethical frameworks and instructional
models becomes increasingly critical as data science continues to evolve and integrate into various
domains. These frameworks and models must be capable of addressing the complex ethical
dilemmas posed by advancements in technology and the growing volumes of data. Researchers
and practitioners must navigate these challenges with a deep understanding of the evolving ethical
landscape and the diverse needs and expectations of data subjects. This involves a continuous
process of education, reflection, and adaptation to ensure that ethical considerations are integrated
into every stage of the data lifecycle.
Defining the Scope and Aims of This Comprehensive Review
This comprehensive review aims to critically examine and synthesize the
current state of ethical considerations in data collection and analysis within the rapidly evolving
field of data science. This review intends to provide a thorough exploration of the ethical
dimensions that underpin modern data practices, tracing the evolution of ethical standards and
highlighting their significance in contemporary applications. It delves into core ethical principles
that govern data collection and analytical processes, emphasizing the crucial roles of consent and
privacy as foundational elements in ethical data handling. Furthermore, the review addresses the
multifaceted ethical complexities introduced by big data, exploring how these challenges are being
navigated in various sectors and the implications for global regulatory and ethical frameworks.
The aim is to bridge the existing research gap in data ethics, offering insights into the development
of practical ethical frameworks and instructional models that can guide researchers, policymakers,
and practitioners. By doing so, the review seeks to contribute to the discourse on ethical data
science, providing a nuanced understanding of how ethical considerations are integrated into data
science practices and the impact of these practices on society. The ultimate goal is to foster a
deeper comprehension of the ethical landscape in data science, encouraging responsible and
transparent data practices that align with societal values and contribute positively to the public
good. This comprehensive review, therefore, serves as a valuable resource for academics, industry
professionals, and policymakers, offering a detailed and critical examination of the ethical
dimensions in data science and suggesting pathways for future research and practice.
METHODS
Methodology for Systematic Literature Review and Analysis
The methodology for this systematic literature review in data ethics involves a comprehensive and
structured approach to identify, evaluate, and synthesize relevant research. Davies, Ives, and Dunn
(2015) emphasize the importance of a systematic review in empirical bioethics, highlighting the
need for a consensus on appropriate methodologies. Their review presents various empirical
bioethics methodologies, underscoring the dialogical or consultative nature of these approaches.
This insight is crucial for understanding how normative conclusions in ethical research are justified
and reached.
Stegenga et al. (2018) demonstrate the application of mixed methods in a systematic scoping
review, particularly in the context of big data use in early intervention research. Their approach,
which combines qualitative and quantitative analyses, provides a foundational understanding of
the literature and identifies strengths, challenges, and implications for researchers and
policymakers. This mixed methods approach is particularly relevant for exploring the ethical and
practical implications of big data in various fields.
Establishing Criteria for Assessing Ethical Practices in Data Studies
Establishing criteria for assessing ethical practices in data studies is a critical step in ensuring the
integrity and societal acceptance of data science practices. Neff et al. (2017) propose a practice-
based framework for improving critical data studies and data science. Their framework emphasizes
the interpretive nature of data, the inextricability of data from context, and the sociomaterial
arrangements that produce data. This approach is instrumental in developing criteria for ethical
practices in data studies, focusing on communication, collective sense-making, and the storytelling
nature of data.
Nock et al. (2021) provide a consensus statement on ethical and safety practices for conducting
digital monitoring studies, particularly with individuals at risk of suicide and related behaviors.
Their work establishes criteria for inclusion, informed consent elements, technical and safety
procedures, data review practices, and response strategies for participant risk. This consensus
statement offers valuable guidance for researchers in developing ethical criteria for data studies
involving sensitive topics.
RESULTS OF THE STUDY
Current Ethical Practices in Data Collection and Analysis
The current ethical practices in data collection and analysis are evolving rapidly, influenced by
technological advancements and the increasing complexity of data-driven research. Padmapriya
and Parthasarathy (2021) discuss the ethical challenges and limitations in the data collection
process during medical image analysis (MIA) in healthcare research. They propose a structured
approach to ethical data collection, emphasizing the need for data scientists to address potential
ethical concerns before commencing data analytics on medical datasets. This approach is
indicative of the proactive measures needed to ensure ethical compliance in data science,
especially in sensitive areas like healthcare.
Facca et al. (2020) explore the ethical dimensions and challenges associated with conducting
digital data collection in research involving minors. Their scoping review identifies key ethical
issues such as consent, data handling, minors’ data rights, and the distinction between private and
public data. The study highlights the uncertainty and ethical considerations that arise when minors
are involved in digital technology research, suggesting co-producing ethical practice between
researchers and minors as a mechanism to address these concerns.
Stainton and Iordanova (2017) reflect on the ethics of utilizing travel blog content as a method of
data collection. They argue that due to the diverse and continuously evolving nature of travel blogs,
a blanket ethical approach is insufficient. The paper proposes a set of broad ethical principles for
travel blog analysis, focusing on the blogger's role as a human subject, the public or private nature
of data, the need for informed consent, and the blogger's status as an author or respondent. This
perspective contributes to the body of ethical research by addressing the unique challenges in
analyzing digital content.
The current ethical practices in data collection and analysis are characterized by a heightened
awareness of the ethical implications of data usage and the need for adaptable ethical frameworks.
The studies by Padmapriya and Parthasarathy (2021), Facca et al. (2020), and Stainton and Iordanova
(2017) illustrate diverse aspects of these ethical considerations, ranging from healthcare research to
digital data collection involving minors and the analysis of online content. These developments
reflect a broader trend towards responsible and transparent data handling, ensuring that the benefits
of data science are realized without compromising ethical principles.
Identifying Trends in Ethical Data Analysis Techniques
The field of data science is continuously evolving, with new trends and challenges emerging in
ethical data analysis techniques. Juliardi and Malik (2023) conducted a bibliometric analysis of
the top scientific articles in data science, revealing significant developments in the field. Their
study highlights the importance of data science, big data, data analysis, and machine learning,
indicating a trend towards more sophisticated and ethically aware data analysis techniques.
Singh et al. (2023) delve into contemporary trends in data collection, analysis, and visualization
methods in data science. Their study emphasizes the significant role of artificial intelligence in
augmenting data collection processes and underscores the importance of advanced data analysis
techniques in extracting meaningful insights. The research conducted by Singh et al. points to a
trend of integrating innovative approaches in data science, not solely for technological
advancement but also for ensuring that these advancements align with ethical standards. This trend
is vital for maintaining public trust in data-driven technologies and for ensuring that the benefits
of these technologies are realized responsibly and equitably.
Goyal et al. (2020) discuss the emerging trends and challenges in data science and big data
analytics, focusing on the concerns and complexities faced by data scientists. They address issues
such as scalability, privacy, and trust in data analysis, highlighting the need for ethical
considerations in handling large datasets. The study provides a comparative analysis of the
challenges in data science, suggesting a trend towards more comprehensive and ethically sound
data analysis practices.
The current trajectory in data science indicates a significant shift towards integrating ethical
considerations into data analysis techniques. The works of Juliardi and Malik (2023) and Goyal et
al. (2020) collectively underscore this trend, revealing how ethical considerations are becoming
increasingly central in the field.
Juliardi and Malik's (2023) bibliometric analysis not only highlights the rapid development in data
science but also suggests an underlying movement towards ethical awareness. The prominence of
topics like machine learning and big data in their analysis points to a growing recognition of the
ethical implications associated with these technologies. This trend is indicative of a broader shift
in the data science community, where ethical considerations are increasingly seen as integral to
the research and application of data science methodologies.
Hirsch et al. (2020) explore contemporary trends in data science, further reinforcing the shift
towards ethical considerations in the field. Their emphasis on the ethical use of artificial
intelligence (AI) and data analytics in business decision-making processes reflects a growing
awareness of the need for responsible data handling. The integration of innovative data collection
and analysis methods, as highlighted in their research, is not just about technological advancement
but also about ensuring that these advancements are aligned with ethical standards. This trend is
crucial for maintaining public trust in data-driven technologies and for ensuring that the benefits
of these technologies are realized responsibly and equitably.
Goyal et al. (2020) address the challenges and concerns in data science, particularly in the context
of big data analytics. Their discussion on privacy, scalability, and trust issues in data analysis
underscores the ethical complexities inherent in handling large datasets. The trend towards
addressing these ethical challenges head-on is a positive development in the field. It suggests a
growing recognition of the need for data scientists to not only be skilled in technical aspects of
data analysis but also be adept in navigating the ethical landscape in which these analyses take
place.
Ethical Challenges and Dilemmas in Data Science: A Comprehensive Overview
The ethical challenges and dilemmas in data science are diverse and multifaceted, reflecting the
complexity and rapid evolution of the field. Tanweer et al. (2017) present a case study in the
context of a "data science for social good" project, highlighting the ethical dilemmas encountered
in projects aimed at improving navigation for people with limited mobility. The study illustrates
how ethical decisions in data science are not always clear-cut and often involve navigating a series
of dilemmas, such as balancing privacy concerns with the benefits of data collection and use. This
case study underscores the importance of ethical thinking as an integral part of data science
projects, especially those with social implications.
Myers (2021) delves into the ethical dilemmas, power imbalances, and challenges posed by big
data analytics. The paper discusses the ethical implications of big data, particularly in terms of
how it can create or exacerbate power imbalances. Myers emphasizes the need for ethical
considerations in the design and implementation of big data projects, highlighting the role of
design science research in addressing these challenges. This perspective is crucial for
understanding the broader societal implications of data science and the responsibility of data
scientists to consider the ethical dimensions of their work.
Hesse et al. (2019) explore the ethical challenges in qualitative research in the era of big data.
Their work provides a comprehensive overview of the ethical issues arising from new data sources,
research methods, and forms of data storage that big data introduces to social sciences. The authors
propose five principles for qualitative researchers to navigate these emerging research landscapes,
including valuing methodological diversity, retaining context and specificity, addressing ethical
dilemmas beyond legal concerns, recognizing regional and disciplinary differences, and
considering the entire lifecycle of research. These principles are essential for guiding ethical
decision-making in qualitative research in the context of big data.
The ethical challenges and dilemmas in data science are characterized by a need for careful
consideration of the implications of data collection, analysis, and use. These studies highlight the
diverse ethical considerations that data scientists must navigate, from privacy and consent to power
imbalances and the impact of data on marginalized populations. As data science continues to
evolve, these ethical challenges will likely become more complex, requiring ongoing dialogue,
research, and adaptation to ensure that data science practices are aligned with ethical principles
and societal values.
The Relationship Between Ethical Practices and Data Integrity
The relationship between ethical practices and data integrity is a cornerstone of credible research.
Berg's (2016) exploration of ethics in doctoral research underscores the importance of ethical
considerations throughout the research process. The 'Eight Ethical Principles' that Berg proposes serve as
guidelines for researchers, highlighting the role of ethics in maintaining data integrity. This
perspective is crucial for understanding how ethical considerations underpin the reliability and
validity of research findings.
Masic (2023) discusses the importance of integrity and ethics in research and science publication.
His work underscores the fundamental values of science and research ethics, such as honesty,
transparency, respect, and accountability. The emphasis on good management of scientific activity
and the intrinsic relationship between research integrity and ethics in publication highlights the
critical role of ethical practices in ensuring data integrity.
Comparative Review of Ethical Standards Across Different Regions
The comparative review of ethical standards across different regions is a complex and multifaceted
endeavor, reflecting the diversity of cultural, academic, and professional practices worldwide.
Berg's exploration of the ethical dimension in doctoral research provides a foundational
understanding of how ethical principles are applied in academic settings. The 'Eight Ethical
Principles' outlined by Berg (2016) serve as a universal framework, yet their application can vary
significantly depending on the cultural and regional context. This variation underscores the
importance of contextualizing ethical standards to ensure they are relevant and effective in
different settings.
Gasparyan (2017) sheds light on the diversity of ethical practices in research and science
publication. His analysis emphasizes the fundamental values underpinning scientific research,
such as honesty, transparency, respect, and accountability. However, the application of these
ethical principles can significantly vary across different regions, influenced by local cultural
norms, legal frameworks, and institutional policies. For instance, approaches to authorship,
conflict of interest, and data sharing may differ, reflecting diverse understandings and expectations
of ethical conduct in various academic and research communities.
The comparative analysis also reveals that while some ethical principles are universally
recognized, their interpretation and implementation can be subject to regional variations. For
example, the concept of informed consent, a cornerstone of ethical research, may be understood
and practiced differently in various cultural contexts. Similarly, the handling of sensitive data,
privacy concerns, and the balance between transparency and confidentiality can differ,
necessitating a nuanced approach to ethical research practices in different regions.
Moreover, the review highlights the need for ongoing dialogue and collaboration across regions to
harmonize ethical standards while respecting cultural diversity. This is particularly important in
an increasingly interconnected global research environment, where cross-border collaborations are
common. Establishing common ethical ground, while allowing for regional specificities, is crucial
for fostering international cooperation and ensuring the integrity and credibility of research
conducted across different geographical and cultural landscapes.
Case Studies: Ethical Successes and Failures in Data Handling
Examining case studies of ethical successes and failures in data handling provides valuable
insights into the practical application of ethical principles in various contexts. Ntlhakana et al.
(2021) present a case study from the mining industry in South Africa, focusing on the ethical
challenges and data sharing practices related to occupational hearing loss among platinum miners.
This case highlights the complexities of handling sensitive health data, including issues of
confidentiality, informed consent, and the ethical use of data for research purposes. The study
underscores the importance of clear policies and ethical guidelines in industries where employee
health data is a critical component of research and healthcare delivery.
Kabeyi (2018) discusses ethical and unethical leadership issues through various case studies,
shedding light on the ethical dilemmas faced by organizations and individuals. The paper
emphasizes the role of ethical leadership in creating an environment that fosters ethical decision-
making and behavior. The case studies illustrate how unethical practices can lead to significant
consequences for organizations, employees, and stakeholders, highlighting the need for ethical
vigilance and integrity in leadership.
Hlaing et al. (2023) provide a qualitative analysis of health professions students' approaches to
practice-driven ethical dilemmas. This case-based study offers insights into how future healthcare
professionals navigate complex ethical situations, emphasizing the role of education in shaping
ethical decision-making. The findings reveal the importance of incorporating ethical training in
healthcare education, preparing students to handle ethical challenges effectively in their
professional practice.
Kanengoni et al. (2017) examine the ethical considerations in handling a complaint against a
clinical trial study team. This case study from Zimbabwe highlights the critical role of ethics
committees in protecting participants and ensuring the integrity of research. The investigation into
the complaint demonstrates the importance of transparency, accountability, and adherence to
ethical principles in clinical research, particularly in addressing concerns raised by participants or
external entities.
Public Trust and Perception in Ethical Data Management
Public trust and perception play a pivotal role in ethical data management, influencing how data-
driven systems are received and utilized by society. Hartman et al. (2020) conducted a
comprehensive survey in the UK to gauge public views on data management approaches. Their
findings reveal a general dissatisfaction with the current approach where commercial organizations
control personal data. The public prefers methods that offer them control over their data, involve
regulatory oversight, or provide options to opt out of data gathering. This study highlights the
public's desire for 'good data management,' which aligns only partially with the principles
identified by policy experts and researchers. It underscores the importance of aligning data
practices with public expectations to maintain trust and ethical integrity.
Kim and Choi (2023) conduct an empirical analysis of the relationship between government trust
and public perception, particularly in the context of contracting out public services. Their study
demonstrates that effective contracting out between public and private organizations is greatly
enhanced by strong trust and collaboration among key stakeholders, including the government,
private companies, and civil society organizations. The research underscores the significant impact
of public trust in government and its data management practices on public perception and the
acceptance of outsourcing services. According to Kim and Choi, maintaining high levels of trust
is crucial for ethical data management and the success of public-private partnerships.
The intersection of public trust, perception, and ethical data management is a complex and dynamic
area that requires continuous attention and adaptation. As data-driven systems become
increasingly integrated into various aspects of society, ensuring public trust through ethical data
management practices is crucial for the successful implementation and acceptance of these
systems.
DISCUSSION OF THE RESULTS
Analyzing the Broader Impact of Ethics on Data Science
The broader impact of ethics on data science is a critical area of study, encompassing various
aspects such as privacy, consent, and data utility. Martens (2022) addresses the ethical dimensions
of data science, highlighting both the positive outcomes and negative consequences associated
with the field. The book emphasizes the need for data scientists and business managers to be
trained in ethical considerations, suggesting techniques like differential privacy and explainable
AI to address privacy and discrimination concerns. This work is pivotal in understanding the
balance between ethical concerns and data utility, illustrating the importance of ethical thinking in
data science.
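To make the idea of differential privacy mentioned above more concrete, the following is a minimal illustrative sketch of the Laplace mechanism applied to a counting query. It is not taken from Martens (2022); the function names, the record layout, and the choice of a counting query are assumptions made purely for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables, each with
    mean `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    query has sensitivity 1 and the noise scale is 1 / epsilon.
    `records`, `predicate`, and `epsilon` are hypothetical inputs.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical usage: how many respondents are adults?
respondents = [{"age": a} for a in (17, 25, 40, 16, 33)]
noisy_adults = private_count(respondents, lambda r: r["age"] >= 18, epsilon=1.0)
```

A smaller epsilon adds more noise and therefore gives stronger privacy at the cost of accuracy, which is one concrete form of the balance between ethical concerns and data utility described above.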
Hand (2018) explores the ethical challenges in data science, particularly in the context of big data.
The paper discusses the potential for misuse of data and the need for ethical oversight to ensure a
balance between technological advancement and ethical constraints. Hand delves into various
ethical issues, including data ownership, consent, trustworthiness of algorithms, and privacy. This
comprehensive analysis provides insights into the intrinsic complexity of data ethics and the need
for a structured approach to address these challenges.
MacPherson and Pham (2020) discuss the ethical considerations in health data science, focusing
on the responsibilities that accompany the adoption of digital health technologies. The chapter
emphasizes the importance of community and end-user needs in ethical decision-making,
addressing concerns such as data privacy, security, and consent. The authors advocate for a
community-centered approach, where ethical considerations follow from prioritizing the needs of
the concerned community. This perspective is crucial for understanding the ethical implications of
leveraging technology for global health, especially in resource-poor regions.
The impact of ethics on data science is multifaceted, requiring a careful balance between
technological advancements and ethical considerations. As data science continues to evolve, these
ethical considerations will become increasingly central, guiding the development of new
methodologies and shaping the future of the field.
Navigating the Intersection of Privacy, Consent, and Data Utility
Navigating the intersection of privacy, consent, and data utility in data science is a complex and
evolving challenge. Carrigan, Green, and Rahman-Davies (2021) explore the concept of consent
in data science, particularly in the context of gender relations and the extraction of human
experiences for Big Data. Their study introduces the idea of 'techniques of invisibility,' which
refers to the imbalance between exposure and opacity in data science. This concept is crucial for
understanding how power dynamics and ethical considerations intersect in the field, particularly
in relation to privacy and consent. The authors argue that these techniques, including epistemic
injustice and the Brotherhood, contribute to exploitative relations in data science, highlighting the
need for bidirectional transparency in algorithm production and application.
Arellano et al. (2018) delve into the privacy policies and technological aspects of biomedical data
science. Their review covers key topics such as the Common Rule, HIPAA privacy and security
rules, governance, patients' viewpoints, consent practices, and research ethics. The paper addresses
the challenges of protecting patient privacy while facilitating biomedical research, emphasizing
the importance of ethical data management practices. The authors discuss various technological
solutions, including deidentification methods and privacy-preserving predictive modeling, to
balance the trade-off between data utility and privacy concerns. This work is instrumental in
understanding how privacy policies and technological advancements can work together to ensure
ethical data management in biomedical research.
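As a rough illustration of the trade-off between data utility and privacy that deidentification involves, the sketch below removes direct identifiers, coarsens two quasi-identifiers, and then reports the size of the smallest group of records that share the same quasi-identifier values (the "k" in k-anonymity). It is not the method reviewed by Arellano et al. (2018) and does not by itself satisfy any regulatory standard; the field names, the age bands, and the postcode truncation are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email"}


def deidentify(record):
    """Drop direct identifiers and generalize age and postcode."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    decade = (out["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"   # e.g. 34 becomes "30-39"
    out["zip"] = out["zip"][:3] + "**"      # keep only the first three digits
    return out


def smallest_group(records, quasi_identifiers=("age", "zip")):
    """Return the size of the smallest group sharing all quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical patient records.
patients = [
    {"name": "A", "email": "a@example.org", "age": 34, "zip": "75201", "diagnosis": "flu"},
    {"name": "B", "email": "b@example.org", "age": 37, "zip": "75204", "diagnosis": "asthma"},
    {"name": "C", "email": "c@example.org", "age": 52, "zip": "10001", "diagnosis": "flu"},
]
released = [deidentify(p) for p in patients]
k = smallest_group(released)
```

In this toy data set the third record remains unique on its generalized age and postcode, so further generalization or suppression would be needed before release. Each such step reduces the detail available to researchers, which is the utility cost the paragraph above refers to.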
The intersection of privacy, consent, and data utility in data science requires a nuanced approach
that considers the ethical implications of data collection and analysis. The insights from Carrigan,
Green, and Rahman-Davies (2021) and Arellano et al. (2018) highlight the importance of
transparent and ethical data practices that respect individual privacy while maximizing the utility
of data. As data science continues to evolve, these considerations will become increasingly central,
guiding the development of new methodologies and shaping the future of the field.
Formulating Robust Strategies for Ethical Data Governance
Formulating robust strategies for ethical data governance in data science is essential to address the
ethical challenges and opportunities presented by the field. Egger, Neuburger, and Mattuzzi (2022)
discuss the ethical issues in data science, particularly in the context of privacy rights, data validity,
and algorithm fairness. Their work emphasizes the need for a common conceptual framework for
ethics in data science, highlighting the importance of addressing ethical challenges in areas like
Big Data, Artificial Intelligence, and Machine Learning. This perspective is crucial for developing
comprehensive strategies that ensure ethical practices in data science, particularly in industries like
tourism where data science plays a significant role.
Leonelli's (2019) work on data governance provides a philosophical perspective on the
characteristics of data-centric research and the conceptualization of data. The paper proposes a
relational view of data, where the meaning assigned to data depends on the motivations and
instruments used to analyze them. This approach is fundamental to producing knowledge and
significantly influences its content. Leonelli emphasizes the importance of governance strategies
around data collection, management, and processing, addressing concerns around interpreting data
and assessing their quality. This perspective is instrumental in formulating strategies for ethical
data governance, ensuring that data are collected, managed, and processed in a way that respects
ethical principles and societal values.
Projecting Future Ethical Challenges in Data Science Evolution
The evolution of data science is accompanied by emerging ethical challenges that need to be
addressed to ensure responsible and beneficial use of technology. Da Bormida (2021) discusses
the benefits, threats, and ethical challenges of the Big Data world, emphasizing the 'creep factor'
of Big Data, which refers to the misuse of data that bypasses privacy and data protection laws. The
paper highlights the ethical challenges posed by new surveillance tools, data gathering techniques,
high-tech profiling, automated decision-making, and discriminatory practices. These challenges
underscore the need for ethical frameworks that can guide the responsible use of Big Data and AI,
particularly in the context of sustainable development.
Ryan et al. (2019) explore the ethical side effects of using AI and Big Data to meet the United
Nations' Sustainable Development Goals (SDGs). Through six empirical case studies, the paper
examines potential ethical issues related to the use of smart information systems (SIS) in achieving
various SDGs. The authors argue that while SIS offer great potential for meeting these goals, they
also raise ethical challenges in their implementation. The paper highlights the need for a holistic
approach to address these challenges, ensuring that the use of SIS does not exacerbate or create
new issues for the development community.
The future ethical challenges in data science require a proactive and comprehensive approach to
ensure that technological advancements are aligned with ethical principles and societal values.
CONCLUSION
This review set out to explore ethical considerations in modern data collection and analysis. At
its core was an investigation into the evolving ethical practices and challenges within data science,
focusing on the development of ethical standards, the significance of ethical considerations in
contemporary data practices, and the intricacies of global regulatory and ethical frameworks. The
key findings from this research illuminate a dynamic ethical landscape in data science,
characterized by evolving standards and practices that are increasingly integral to data collection
and analytical processes. Central to these findings is the recognition of consent and privacy as
foundational elements in ethical data handling, alongside the need for adaptable and inclusive
ethical frameworks to address the complexities introduced by big data.
The study helps to narrow the research gap in data ethics by providing insights into the
development of practical ethical frameworks and instructional models. These models can guide
researchers, policymakers, and practitioners through the ethical complexities of data collection
and analysis. The study concludes that ethical considerations must be integrated into every aspect
of data science. As the field continues to expand into various sectors, adherence to ethical
principles becomes increasingly critical. Researchers and practitioners must be equipped with the
knowledge and tools to navigate this landscape, so that data practices not only adhere to ethical
principles but also contribute positively to societal well-being.
In light of these findings, the study recommends enhancing ethical education and training in data
science, equipping professionals with the skills to navigate ethical dilemmas effectively. It also
calls for the development of inclusive ethical frameworks that are adaptable, transparent, and
considerate of the diverse needs and expectations of data subjects. Strengthening regulatory
oversight is crucial to ensure compliance with ethical standards in data science practices.
Additionally, promoting public engagement in discussions about data ethics is vital for maintaining
trust and transparency in data-driven technologies.
This study makes a significant contribution to the discourse on ethical data science, providing a
nuanced understanding of the challenges and opportunities in the field and suggesting pathways
for future research and practice.
References
Andrews, J.T., Zhao, D., Thong, W., Modas, A., Papakyriakopoulos, O., Nagpal, S., & Xiang, A.
(2023). Ethical considerations for collecting human-centric image datasets. arXiv preprint
arXiv:2302.03629. DOI: 10.48550/arXiv.2302.03629
Arellano, A.M., Dai, W., Wang, S., Jiang, X., & Ohno-Machado, L. (2018). Privacy policy and
technology in biomedical data science. Annual Review of Biomedical Data Science, 1, 115-
129. DOI: 10.1146/annurev-biodatasci-080917-013416
Austin, L.M. (2023). COVID-19 and the data governance gap. Annual Review of Law and Social
Science, 19. DOI: 10.1146/annurev-lawsocsci-050520-101947
Berg, C. (2016). The 'rules of engagement': the ethical dimension of doctoral research.
Bormida, M.D. (2021). The big data world: benefits, threats and ethical challenges. In Ethical
Issues in Covert, Security and Surveillance Research (pp. 71-91). Emerald Publishing
Limited. DOI: 10.1108/s2398-601820210000008007
Carrigan, C., Green, M.W., & Rahman-Davies, A. (2021). “The revolution will not be supervised”:
Consent and open secrets in data science. Big Data & Society, 8(2), 20539517211035673.
DOI: 10.1177/20539517211035673
Christodoulou, E., & Iordanou, K. (2021). Democracy under attack: challenges of addressing
ethical issues of AI and big data for more democratic digital media and societies. Frontiers
in Political Science, 3, 682945. DOI: 10.3389/fpos.2021.682945
Davies, R., Ives, J., & Dunn, M. (2015). A systematic review of empirical bioethics
methodologies. BMC Medical Ethics, 16(1), 1-13. DOI: 10.1186/s12910-015-0010-3
Egger, R., Neuburger, L., & Mattuzzi, M. (2022). Data science and ethical issues: between
knowledge gain and ethical responsibility. In Applied Data Science in Tourism:
Interdisciplinary Approaches, Methodologies, and Applications (pp. 51-66). Cham:
Springer International Publishing. DOI: 10.1007/978-3-030-88389-8_4
Facca, D., Smith, M.J., Shelley, J., Lizotte, D., & Donelle, L. (2020). Exploring the ethical issues
in research using digital data collection strategies with minors: A scoping review. Plos
One, 15(8), e0237875. DOI: 10.1371/journal.pone.0237875
Fisher, C.B., Bragard, E., & Bloom, R. (2020). Ethical considerations in HIV eHealth intervention
research: Implications for informational risk in recruitment, data maintenance, and consent
procedures. Current HIV/AIDS Reports, 17, 180-189. DOI: 10.1007/s11904-020-00489-z
Gasparyan, A.Y. (2017). Publication ethics standards should be the same for all peer-reviewed
journals. Croatian Medical Journal, 58(2), 195. DOI: 10.3325/CMJ.2017.58.195
Georgieva, I., Lazo, C., Timan, T., & van Veenstra, A.F. (2022). From AI ethics principles to data
science practice: a reflection and a gap analysis based on recent frameworks and practical
experience. AI and Ethics, 2(4), 697-711. DOI: 10.1007/s43681-021-00127-3
Gordon, D., Stavrakakis, I., Gibson, J.P., Tierney, B., Becevel, A., Curley, A., Collins, M.,
O’Mahony, W., & O’Sullivan, D. (2022). Perspectives on computing ethics: a multi-
stakeholder analysis. Journal of Information, Communication and Ethics in Society, 20(1),
72-90. DOI: 10.1108/jices-12-2020-0127
Goyal, D., Goyal, R., Rekha, G., Malik, S., & Tyagi, A.K. (2020). Emerging trends and challenges
in data science and big data analytics. In 2020 International conference on emerging trends
in information technology and engineering (ic-ETITE) (pp. 1-8). IEEE. DOI: 10.1109/ic-
ETITE47903.2020.316
Hand, D.J. (2018). Aspects of data ethics in a changing world: Where are we now? Big Data, 6(3),
176-190. DOI: 10.1089/big.2018.0083
Harriss, D.J., Jones, C., & MacSween, A. (2022). Ethical standards in sport and exercise science
research: 2022 update. International Journal of Sports Medicine, 43(13), 1065-1070. DOI:
10.1055/a-1957-2356
Hartman, T., Kennedy, H., Steedman, R., & Jones, R. (2020). Public perceptions of good data
management: Findings from a UK-based survey. Big Data & Society, 7(1),
2053951720935616. DOI: 10.1177/2053951720935616
Hesse, A., Glenna, L., Hinrichs, C., Chiles, R., & Sachs, C. (2019). Qualitative research ethics in
the big data era. American Behavioral Scientist, 63(5), 560-583. DOI:
10.1177/0002764218805806
Hirsch, D.D., Bartley, T., Chandrasekaran, A., Norris, D., Parthasarathy, S., & Turner, P.N. (2020).
Business data ethics: emerging trends in the governance of advanced analytics and
AI. Ohio State Legal Studies Research Paper, (628). DOI: 10.2139/ssrn.3828239
Hirsch, D.D., Bartley, T., Chandrasekaran, A., Parthasarathy, S., Turner, P.N., Norris, D., Lamont,
K., & Drummond, C. (2019). Corporate data ethics: Data governance transformations for
the age of advanced analytics and AI. Ohio State Public Law Working Paper, (522). DOI:
10.2139/ssrn.3478826
Hlaing, P.H., Hasswan, A., Salmanpour, V., Shorbagi, S., AlMahmoud, T., Jirjees, F.J., Kawas,
S.A., Guraya, S.Y., & Sulaiman, N. (2023). Health professions students’ approaches
towards practice-driven ethical dilemmas; a case-based qualitative study. BMC Medical
Education, 23(1), 1-9. DOI: 10.1186/s12909-023-04089-4
Hosseini, M., Wieczorek, M., & Gordijn, B. (2022). Ethical issues in social science research
employing big data. Science and Engineering Ethics, 28(3), 29. DOI: 10.1007/s11948-022-
00380-7
Jameel, B., & Majid, U. (2018). Research fundamentals: Data collection, data analysis, and
ethics. Undergraduate Research in Natural and Clinical Science and Technology
Journal, 2, 1-8. DOI: 10.26685/URNCST.39
Juliardi, M., & Malik, I. (2023). A Bibliometric Analysis of Data Science: Trends, Contributions,
and Research Developments. West Science Interdisciplinary Studies, 1(07), 387-397. DOI:
10.58812/wsis.v1i07.81
Kabeyi, M.J. (2018). Ethical and unethical leadership issues, cases, and dilemmas with case
studies. International Journal of Applied Research, 4(7), 373-379. DOI:
10.22271/allresearch.2018.v4.i7f.5153
Kanengoni, M., Ruzario, S., Ndebele, P., Shana, M., Tarumbiswa, F., Musesengwa, R., & Gutsire,
R. (2017). Ethical considerations in the handling of a complaint report against a study team:
case of a clinical trial (Earnest) Participant. BMJ Global Health, 2(Suppl 2). DOI:
10.1136/BMJGH-2016-000260.117
Kearns, M., & Roth, A. (2019). The ethical algorithm: The science of socially aware algorithm
design. Oxford University Press. DOI: 10.56315/pscf3-21kearns
Kim, C., & Choi, T. (2023). Contracting out and the fiscal sustainability of public services. Public
Performance & Management Review, 1-30. DOI: 10.1080/15309576.2023.2204073
Kreuter, F., Haas, G.C., Keusch, F., Bähr, S., & Trappmann, M. (2020). Collecting survey and
smartphone sensor data with an app: Opportunities and challenges around privacy and
informed consent. Social Science Computer Review, 38(5), 533-549. DOI:
10.1177/0894439318816389
Kuc-Czarnecka, M., & Olczyk, M. (2020). How ethics combine with big data: a bibliometric
analysis. Humanities and Social Sciences Communications, 7(1), 1-9. DOI:
10.1057/s41599-020-00638-0
Lee, S.S.J. (2021). The ethics of consent in a shifting genomic ecosystem. Annual Review of
Biomedical Data Science, 4, 145-164. DOI: 10.1146/annurev-biodatasci-030221-125715
Leonelli, S. (2019). Data governance is key to interpretation: Reconceptualizing data in data
science. DOI: 10.1162/99608F92.17405BB6
MacPherson, Y., & Pham, K. (2020). Ethics in health data science. Leveraging Data Science for
Global Health, 365-372. DOI: 10.1007/978-3-030-47994-7_22
Martens, D. (2022). Data science ethics: concepts, techniques, and cautionary tales. Oxford
University Press. DOI: 10.1093/oso/9780192847263.001.0001
Masic, I. (2023). Ethics in research and publication of research articles. South Eastern European
Journal of Public Health. DOI: 10.4119/UNIBI/SEEJPH-2014-43
Myers, M.D. (2021). Big data analytics: Ethical dilemmas, power imbalances, and design science
research. Communications of the Association for Information Systems, 49(1), 19. DOI:
10.17705/1cais.04919
Neff, G., Tanweer, A., Fiore-Gartland, B., & Osburn, L. (2017). Critique and contribute: A
practice-based framework for improving critical data studies and data science. Big
Data, 5(2), 85-97. DOI: 10.1089/big.2016.0050
Nock, M.K., Kleiman, E.M., Abraham, M., Bentley, K.H., Brent, D.A., Buonopane, R.J., Castro‐
Ramirez, F., Cha, C.B., Dempsey, W., Draper, J., & Glenn, C.R. (2021). Consensus
statement on ethical & safety practices for conducting digital monitoring studies with
people at risk of suicide and related behaviors. Psychiatric Research and Clinical
Practice, 3(2), 57-66. DOI: 10.1176/appi.prcp.20200029
Ntlhakana, L., Nelson, G., Khoza-Shangase, K., & Dorkin, E. (2021). Occupational hearing loss
for platinum miners in South Africa: A case study of data sharing practices and ethical
challenges in the mining industry. International Journal of Environmental Research and
Public Health, 19(1), 1. DOI: 10.3390/ijerph19010001
Ochang, P., Eke, D., & Stahl, B.C. (2023). Towards an understanding of global brain data
governance: ethical positions that underpin brain data governance discourse. Frontiers in
Big Data, 6, 1240660. DOI: 10.3389/fdata.2023.1240660
Padmapriya, S.T., & Parthasarathy, S. (2023). Ethical data collection for medical image analysis:
a structured approach. Asian Bioethics Review, 1-14. DOI: 10.1007/s41649-023-00250-9
Pessanha, S. N. (2023). The expansion of data science: dataset standardization. Standards, 3(4),
400-410. DOI: 10.3390/standards3040028
Phan, L., Ali, I., Labou, S., & Foster, E. (2022). A model for data ethics instruction for non-
experts. IASSIST Quarterly, 46(4). DOI: 10.29173/iq1028
Plutzer, E. (2019). Privacy, sensitive questions, and informed consent: Their impacts on total
survey error, and the future of survey research. Public Opinion Quarterly, 83(S1), 169-184.
DOI: 10.1093/POQ/NFZ017
Reed‐Berendt, R., Dove, E.S., Pareek, M., & UK‐REACH Study Collaborative Group (2022). The
ethical implications of big data research in public health: “big data ethics by design” in the
UK‐REACH Study. Ethics & Human Research, 44(1), 2-17. DOI: 10.1002/eahr.500111
Ryan, M., Antoniou, J., Brooks, L., Jiya, T., Macnish, K., & Stahl, B. (2019). Technofixing the
Future: ethical side effects of using AI and big data to meet the SDGs. DOI:
10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00101
Singh, K.U., Pandey, S.K., Yadav, D.P., Singh, T., Kumar, G., & Kumar, A. (2023). Data Science–
A compendious study on statistical methods and visualization techniques. In 2023
International Conference on Computational Intelligence and Sustainable Engineering
Solutions (CISES) (pp. 227-232). IEEE. DOI: 10.1109/CISES58720.2023.10183429
Stainton, H., & Iordanova, E. (2017). An ethical perspective for researchers using travel blog
analysis as a method of data collection. Methodological Innovations, 10(3),
2059799117748136. DOI: 10.1177/2059799117748136
Stegenga, S.M., Munger, K., Squires, J., & Anderson, D. (2018). A mixed methods systematic
scoping review of the use of big data in early intervention research: Ethical and practical
implications. DOI: 10.31219/osf.io/5jg2p
Tanweer, A., Bolten, N., Drouhard, M., Hamilton, J., Caspi, A., Fiore-Gartland, B., & Tan, K.
(2017). Mapping for accessibility: A case study of ethics in data science for social
good. arXiv preprint arXiv:1710.06882.
Wang, Y., Shaw, E., Kruse, B., & Ghods, M. (2018). On identifying factors affecting ethical
practices in data science domains. SMU Data Science Review, 1(2), 2.
Weinhardt, M. (2021). Big data: Some ethical concerns for the social sciences. Social
Sciences, 10(2), 36. DOI: 10.3390/SOCSCI10020036
Zhang-Kennedy, L., & Chiasson, S. (2021). "Whether it's moral is a whole other story": Consumer
perspectives on privacy regulations and corporate data practices. In Seventeenth
Symposium on Usable Privacy and Security (SOUPS 2021) (pp. 197-216).