Article

Introduction: Making sense of data-driven research in the biological and biomedical sciences


Abstract

This article introduces the special issue Data-Driven Research in the Biological and Biomedical Sciences, edited by Sabina Leonelli, published together with the special issue On Nature and Normativity: Normativity, Teleology, and Mechanism in Biological Explanation, edited by Lenny Moss and Daniel J. Nicholson.


... Nonetheless, several aspects of current research practice make reflection on the origin of scientific ideas essential. Ecology is moving into the realm of Big Science (Powers & Hampton, 2019), data-driven research is considered the fourth methodological paradigm (see Leonelli, 2012), and bioinformatics is regarded as a framework that enables scientists to generate new knowledge through innovative approaches for managing and analyzing huge data sets (Michener & Jones, 2012). ...
... Bioinformatics is delighting biology researchers because it is difficult to develop authentic predictive theory in the discipline (Coveney et al., 2016; Weinberg, 2010). Although attractive, the idea of exploratory research as a hypothesis generator is not sufficiently clear (Bunge, 2001; Desjardins et al., 2021; Leonelli, 2012). Given that massive data exploration, e.g., by using machine learning or other algorithms, is expected to increase its influence on ecology and environmental sciences (Nichols et al., 2019), it may be valuable to deepen reflection on the type of ideas that data-driven research might generate, as well as the type of research it stimulates (Leonelli, 2012). ...
... Although attractive, the idea of exploratory research as a hypothesis generator is not sufficiently clear (Bunge, 2001; Desjardins et al., 2021; Leonelli, 2012). Given that massive data exploration, e.g., by using machine learning or other algorithms, is expected to increase its influence on ecology and environmental sciences (Nichols et al., 2019), it may be valuable to deepen reflection on the type of ideas that data-driven research might generate, as well as the type of research it stimulates (Leonelli, 2012). I organize the reflection under three questions: (a) Can observation give rise to novel hypotheses? ...
Preprint
Few textbooks on research methods offer more than a few words of advice on how to devise scientific hypotheses. Big data is conceived as a hypothesis-generator procedure, a disruptive analytical innovation that is reconfiguring ecological research. My theses are (a) the hypotheses that big-data can originate “stricto sensu” are empirical generalizations that do not provide ecological understanding, (b) empirical generalizations may encourage instrumentalist research, but cannot supply ecological explanation, and (c) generalizations emerging from data-driven research can serve as a problem-generating procedure if they are reflected in the context of the theoretical framework surrounding the research. Discovery (e.g., novel patterns shown by big-data analysis) and invention (e.g., hypotheses on mechanisms and processes conjectured by the human mind) are complementary tools in ecological research because they play different epistemological roles. Data-driven research provides a useful analytical tool, but it does not justify any epistemological or methodological paradigm shift.
... According to these drastic views, the data are supposed to speak for themselves, free of theory. The critics of this radical form of empiricism argue that each system, including the ML ones, is designed to capture certain kinds of data (Berry, 2011; Leonelli, 2012) and that, therefore, the results can neither be considered free from theory nor simply speak for themselves free of human bias (Gould, 1981). Between the extreme positions that characterize the debate, it has lately become very common to consider "data-driven science" (Kitchin, 2014) as a hybrid combination of abductive, inductive, and deductive approaches. ...
... First, data are not simply natural, value-free elements that can be abstracted from the world; they are generated through a complex assemblage that actively shapes their constitution and often includes the perpetuation of biases (Ribes & Jackson, 2013). Second, theory is embedded in data: no inductive strategy of knowledge extraction from data occurs in a scientific vacuum, because it is framed by previous findings, theories, experience, and knowledge (Leonelli, 2012). Third, data cannot speak for themselves free of human bias: they are always examined through a particular lens that influences how they are interpreted, and the construction of algorithms is imbued with particular values and contextualized within a particular scientific approach. ...
Article
Full-text available
In a historical moment in which Artificial Intelligence and machine learning have come within everyone's reach, science education needs to find new ways to foster "AI literacy." Since the AI revolution is not only a matter of having introduced extremely performant tools but has also been driving a radical change in how we conceive and produce knowledge, what is needed is not only technical skills but also instruments to engage, cognitively and culturally, with the epistemological challenges that this revolution poses. In this paper, we argue that epistemic insights can be introduced in AI teaching to highlight the differences between three paradigms: the imperative procedural, the declarative logic, and the machine learning based on neural networks (in particular, deep learning). To do this, we analyze a teaching-learning activity designed and implemented within a module on AI for upper secondary school students in which the game of tic-tac-toe is addressed from these three alternative perspectives. We show how the epistemic issues of opacity, uncertainty, and emergence, which the philosophical literature highlights as characterizing the novelty of deep learning with respect to other approaches, allow us to build the scaffolding for establishing a dialogue between the three different paradigms.
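To make the contrast concrete, here is a minimal Python sketch (not taken from the authors' teaching materials) juxtaposing two of the three paradigms on tic-tac-toe: an imperative-procedural win check whose rule is fully readable in the code, and a machine-learning classifier that induces a comparable rule opaquely from labelled boards. The board encoding and network size are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' materials): two paradigms
# applied to tic-tac-toe, assuming a board encoded as nine cells containing
# 1 (X), -1 (O), or 0 (empty).
import numpy as np
from sklearn.neural_network import MLPClassifier

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Imperative-procedural paradigm: the rule is written out explicitly."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

# Machine-learning paradigm: the "rule" is induced from labelled examples
# and lives opaquely in the network weights rather than in readable code.
rng = np.random.default_rng(0)
boards = rng.integers(-1, 2, size=(5000, 9))
labels = np.array([winner(b) for b in boards])
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(boards, labels)
print(model.score(boards, labels))   # accuracy on training data only; a sketch
```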
... Application of technologies such as expression microarrays, large-scale sequencing of DNA and RNA, or mass spectrometry yielded vast amounts of information reminiscent of the massive volumes of data in astronomy and elementary particle physics. Queries of such big data (Bassett et al., 1999; Brent, 2000; Brown & Botstein, 1999; Leonelli, 2012, 2014, 2016; Ratti, 2015; Valencia, 2002; Wang et al., 2009) aspire to detect regularities that would ultimately yield integrative explanatory theories. Although such data-guided explorations are fundamentally distinct from theory- or hypothesis-driven investigations, they are not completely free of theory. ...
... Although such data-guided explorations are fundamentally distinct from theory- or hypothesis-driven investigations, they are not completely free of theory. Thus, theoretical frameworks delineate curation of data, theoretical background informs choice of significant inferences from retrieved pieces of data, and to make sense of emerging knowledge it is necessary to formulate and test provisional hypotheses (Allen, 2001; Colaço, 2018; Kell & Oliver, 2004; Leonelli, 2012; Valencia, 2002). ...
Article
Philosophers of science diverge on the question of what drives the growth of scientific knowledge. Most of the twentieth century was dominated by the notion that theories propel that growth whereas experiments play secondary roles of operating within the theoretical framework or testing theoretical predictions. New experimentalism, a school of thought pioneered by Ian Hacking in the early 1980s, challenged this view by arguing that theory-free exploratory experimentation may in many cases effectively probe nature and potentially spawn higher evidence-based theories. Because theories are often powerless to envisage the workings of complex biological systems, theory-independent experimentation is common in the life sciences. Some such experiments are triggered by compelling observation, others are prompted by innovative techniques or instruments, whereas different investigations query big data to identify regularities and underlying organizing principles. A distinct fourth type of experiment is motivated by a major question. Here I describe two question-guided experimental discoveries in biochemistry: the cyclic adenosine monophosphate mediator of hormone action and the ubiquitin-mediated system of protein degradation. Lacking underlying theories, antecedent databases, or new techniques, the sole guides of the two discoveries were their respective substantial questions. Both research projects were similarly instigated by theory-free exploratory experimentation and continued in alternating phases of results-based interim working hypotheses, their examination by experiment, provisional hypotheses again, and so on. These two cases designate theory-free, question-guided, stepwise biochemical investigations as a distinct subtype of the new experimentalism mode of scientific enquiry.
... The type of data shared is less than 100% due to researchers saying that it is not very important in their field. ... genetic targets, which prompted considerable outcry (Leonelli 2012). In March 2000, U.S. President Bill Clinton announced that the data on the Human Genome Project (HGP) sequence should be made freely available to the entire research community. ...
... In March 2000, U.S. President Bill Clinton announced that the data on the Human Genome Project (HGP) sequence should be made freely available to the entire research community. The HGP propelled discourse and debates on open research data to the forefront of molecular biology research(Leonelli 2012) and spawned a new generation of information infrastructures to generate, integrate, and curate the growing data pools with commonly used tools and analytical methods(Grossman et al. 2016;Vamathevan et al. 2019). As a result, the discipline has been very active in developing infrastructures based upon data commons, one of them being Open Targets (Pujol Priego and Wareham 2018). ...
Article
Full-text available
Government funding entities have placed data sharing at the centre of scientific policy. While there is widespread consensus that scientific data sharing benefits scientific progress, there are significant barriers to its wider adoption. We seek a deeper understanding of how researchers from different fields share their data and the barriers and facilitators of such sharing. We draw upon the notions of epistemic cultures and collective action theory to consider the enablers and deterrents that scientists encounter when contributing to the collective good of data sharing. Our study employs a mixed-methods design by combining survey data collected in 2016 and 2018 with qualitative data from two case studies sampled within two scientific communities: high-energy physics and molecular biology. We describe how scientific communities with different epistemic cultures can employ modularity, time delay, and boundary organisations to overcome barriers to data sharing.
... The application of statistical and computational techniques is seen as crucial to the decoding of the human genome (Garcia-Sancho 2012). The subsequent resurgence of life science has involved the increasing automation of gene sequencing and made available escalating volumes of data (Leonelli 2012;Vermeulen 2016). These changes have been accompanied by the emergence of large-scale life science facilities, enabling new models of scientific research organisation. ...
... Wong (2016) sees this as representing the commodification of research. In contrast, Leonelli (2012) flags that the increasing scale and data intensity of research is "changing what counts as good science" (Leonelli 2012, 3). As we see in the next section, the key driver for BGI was the scientific impact (reflected by papers in top journals from major investigations, often in partnership with leading international research groups) that could be leveraged from the large data sets it could relatively quickly and cheaply generate with its efficient sequencing capabilities. ...
Article
Full-text available
The increasing importance of computational techniques in post-genomic life science research calls for new forms and combinations of expertise that cut across established disciplinary boundaries between computing and biology. These are most marked in large-scale gene sequencing facilities. Here new ways of organising knowledge production, drawing on industrial models, have been perceived as pursuing efficiency and control to the potential detriment of academic autonomy and scientific quality. We explore how these issues are played out in the case of BGI (Beijing Genomics Institute prior to 2008). BGI (in Pinyin, Hua Da Jiyin – Big China Genome) is today the world's largest centre for gene sequencing research. Semi-detached from traditional academic institutions, BGI has developed distinctive models for organising research and for developing expertise, informed by practices in US Information Technology and Life Science Laboratories, that differ from existing models of interdisciplinary research in academic institutions.
... Scientific research in the life sciences increasingly operates through large, international consortia. This is partly driven by the move towards a more data-intensive biology and biomedicine (Leonelli 2012;Vermeulen 2009). For the purposes of this paper, we define consortia as time-limited collective research endeavours, which operate under one or more contractual agreements, and typically have a formal management structure and governance structure (see Fig. 1). ...
... 'Data intensive' or 'data-driven' approaches to the life sciences envisage, and require, new roles and responsibilities for scientists, new infrastructures and new governance arrangements (Swierstra and Efstathiou 2020;Leonelli 2012). The great promise of large datasets is that they can be re-used over time by the wider scientific community to address new questions and attain insights unavailable with traditional datasets. ...
Article
Full-text available
Responsible Research and Innovation ('RRI') is a cross-cutting priority for scientific research in the European Union and beyond. This paper considers whether the way such research is organised and delivered lends itself to the aims of RRI. We focus particularly on international consortia, which have emerged as a common model to organise large-scale, multidisciplinary research in contemporary biomedical science. Typically, these consortia operate through fixed-term contracts, and employ governance frameworks consisting of reasonably standard, modular components such as management committees, advisory boards, and data access committees, to coordinate the activities of partner institutions and align them with funding agency priorities. These have advantages for organisation and management of the research, but can actively inhibit researchers seeking to implement RRI activities. Conventional consortia governance structures pose specific problems for meaningful public and participant involvement, data sharing, transparency, and 'legacy' planning to deal with societal commitments that persist beyond the duration of the original project. In particular, the 'upstream' negotiation of contractual terms between funders and the institutions employing researchers can undermine the ability for those researchers to subsequently make decisions about data, or participant remuneration, or indeed what happens to consortia outputs after the project is finished, and can inhibit attempts to make project activities and goals responsive to input from ongoing dialogue with various stakeholders. Having explored these challenges, we make some recommendations for alternative consortia governance structures to better support RRI in future.
... Furthermore, the participants in a citizen science project could potentially influence the data collection based on non-epistemic motives (Elliott and Rosenberg 2019). Further issues with respect to data acquisition concern the modeling activities necessary to obtain an adequate dataset (Bokulich 2018; Leonelli 2019a) and, related to that, whether data obtained from computer simulations should be considered on a par with observation-based data (Lusk 2016; Parker 2016; …). Epistemological issues regarding data storage could arise, for example, because increasing volumes of data stored on distributed servers could impact the portability of the data, which is essential for the data to serve as prospective evidence for or against scientific claims (Leonelli 2015). ...
... Further issues with respect to data acquisition concern the modeling activities necessary to obtain an adequate dataset (Bokulich 2018; Leonelli 2019a) and, related to that, whether data obtained from computer simulations should be considered on a par with observation-based data (Lusk 2016; Parker 2016; …). Epistemological issues regarding data storage could arise, for example, because increasing volumes of data stored on distributed servers could impact the portability of the data, which is essential for the data to serve as prospective evidence for or against scientific claims (Leonelli 2015). This might be exacerbated by data produced and owned by private companies and institutions (see Leonelli 2019b). ...
Thesis
Full-text available
Recent years have seen a dramatic increase in the volumes of data that are produced, stored, and analyzed. This advent of big data has led to commercial success stories, for example in recommender systems in online shops. However, scientific research in various disciplines including environmental and climate science will likely also benefit from increasing volumes of data, new sources for data, and the increasing use of algorithmic approaches to analyze these large datasets. This thesis uses tools from philosophy of science to conceptually address epistemological questions that arise in the analysis of these increasing volumes of data in environmental science with a special focus on data-driven modeling in climate research. Data-driven models, here, are defined as models of phenomena that are built with machine learning. While epistemological analyses of machine learning exist, these have mostly been conducted for fields characterized by a lack of hierarchies of theoretical background knowledge. Such knowledge is often available in environmental science and especially in physical climate science, and it is relevant for the construction, evaluation, and use of data-driven models. This thesis investigates predictions, uncertainty, and understanding from data-driven models in environmental and climate research and engages in in-depth discussions of case studies. These three topics are discussed in three topical chapters. The first chapter addresses the term “big data”, and rationales and conditions for the use of big-data elements for predictions. Namely, it uses a framework for classifying case studies from climate research and shows that “big data” can refer to a range of different activities. Based on this classification, it shows that most case studies lie in between classical domain science and pure big data. The chapter specifies necessary conditions for the use of big data and shows that in most scientific applications, background knowledge is essential to argue for the constancy of the identified relationships. This constancy assumption is relevant both for new forms of measurements and for data-driven models. Two rationales for the use of big-data elements are identified. Namely, big-data elements can help to overcome limitations in financial, computational, or time resources, which is referred to as the rationale of efficiency. Big-data elements can also help to build models when system understanding does not allow for a more theory-guided modeling approach, which is referred to as the epistemic rationale. The second chapter addresses the question of predictive uncertainties of data-driven models. It highlights that existing frameworks for understanding and characterizing uncertainty focus on specific locations of uncertainty, which are not informative for the predictive uncertainty of data-driven models. Hence, new approaches are needed for this task. A framework is developed and presented that focuses on the justification of the fitness-for-purpose of the models for the specific kind of prediction at hand. This framework uses argument-based tools and distinguishes between first-order and second-order epistemic uncertainty. First-order uncertainty emerges when it cannot be conclusively justified that the model is maximally fit-for-purpose. Second-order uncertainty emerges when it is unclear to what extent the fitness-for-purpose assumption and the underlying assumptions are justified. 
The application of the framework is illustrated by discussing a case study of data-driven projections of the impact of climate change on global soil selenium concentrations. The chapter also touches upon how the information emerging from the framework can be used in decision-making. The third chapter addresses the question of scientific understanding. A framework is developed for assessing the fitness of a model for providing understanding of a phenomenon. For this, the framework draws from the philosophical literature on scientific understanding and focuses on the representational accuracy, the representational depth, and the graspability of a model. Then, based on the framework, the fitness of data-driven and process-based climate models for providing understanding of phenomena is compared. It is concluded that data-driven models can, under some conditions, be fit to serve as vehicles for understanding to a satisfactory extent. This is specifically the case when sufficient background knowledge is available such that the coherence of the model with background knowledge provides good reasons for the representational accuracy of the data-driven model, which can be assessed e.g. through sensitivity analyses. This point is illustrated by discussing a case study from atmospheric physics in which data-driven models are used to better understand the drivers of a specific type of clouds. The work of this thesis highlights that while big data is no panacea for scientific research, data-driven modeling offers new tools to scientists that can be very useful for a variety of questions. All three studies emphasize the importance of background knowledge for the construction and evaluation of data-driven models as this helps to obtain models that are representationally accurate. The importance of domain-specific background knowledge and the technical challenges of implementing data-driven models for complex phenomena highlight the importance of interdisciplinary work. Previous philosophical work on machine learning has stressed that the problem framing makes models theory-laden. This thesis shows that in a field like climate research, the model evaluation is strongly guided by theoretical background knowledge, which is also important for the theory-ladenness of data-driven modeling. The results of the thesis are relevant for a range of methodological questions regarding data-driven modeling and for philosophical discussions of models that go beyond data-driven models.
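As a concrete illustration of the kind of coherence check described above, the following Python sketch (with hypothetical data and variable names such as humidity and noise_var) fits a data-driven model and uses a permutation-based sensitivity analysis to ask whether the drivers the model relies on match background knowledge. It is a toy sketch of the strategy, not a reproduction of the thesis's case studies.

```python
# Minimal sketch (hypothetical data and variable names) of using a
# sensitivity analysis to check a data-driven model against background
# knowledge about which drivers should matter.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n = 2000
humidity = rng.uniform(0, 1, n)        # assumed physically relevant driver
temperature = rng.normal(15, 5, n)     # assumed physically relevant driver
noise_var = rng.normal(0, 1, n)        # a variable with no physical role
target = 2.0 * humidity - 0.3 * temperature + rng.normal(0, 0.2, n)

X = np.column_stack([humidity, temperature, noise_var])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, target)

# If the model is representationally adequate, the physically meaningful
# drivers should dominate the importance scores and the noise variable
# should contribute little; a coherence check, not a proof.
result = permutation_importance(model, X, target, n_repeats=10, random_state=0)
for name, score in zip(["humidity", "temperature", "noise_var"],
                       result.importances_mean):
    print(f"{name}: {score:.3f}")
```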
... For many biomedical researchers, producing valuable knowledge involves generating, collecting, formatting and utilizing vast quantities – and different sorts – of data (Leonelli, 2012; Meloni, 2016). As social scientists studying data-intensive biomedical research, we are interested in how practitioners handle data for the production of knowledge. ...
Article
Full-text available
Data are versatile objects that can travel across contexts. While data’s travels have been widely discussed, little attention has been paid to the sites from where and to which data flow. Drawing upon ethnographic fieldwork in two connected data-intensive laboratories and the concept of domestication, we explore what it takes to bring data ‘home’ into the laboratory. As data come and dwell in the home, they are made to follow rituals, and as a result, data are reshaped and form ties with the laboratory and its practitioners. We identify four main ways of domesticating data. First, through storytelling about the data’s origins, data practitioners draw the boundaries of their laboratory. Second, through standardization, staff transform samples into digital data that can travel well while ruling what data can be let into the home. Third, through formatting, data practitioners become familiar with their data and at the same time imprint the data, thus making them belong to their home. Finally, through cultivation, staff turn data into a resource for knowledge production. Through the lens of domestication, we see the data economy as a collection of homes connected by flows, and it is because data are tamed and attached to homes that they become valuable knowledge tools. Such domestication practices also have broad implications for staff, who in the process of ‘homing’ data, come to belong to the laboratory. To conclude, we reflect on what these domestication processes—which silence unusual behaviours in the data—mean for the knowledge produced in data-intensive research.
... The relevance of statistics in biology is a longstanding one, with great statisticians also being great biologists [1]. Nowadays it is arguably even more so, with increasingly accessible computational power and highly parallelized assays that have revolutionized biology, leading to a paradigm shift from hypothesis-driven research to data-driven research [6]. A more in-depth discussion of the theme can be found in the Special Issue of Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences entitled Data-Driven Research in the Biological and Biomedical Sciences. ...
Preprint
Full-text available
Background Gene expression regulates several of the complex traits observed. In this study, datasets comprising transcriptome information and clinical traits regarding fat composition and vitals were analyzed via several statistical methods in order to find relations between genes and clinical outcomes. Results Biological big data is diverse and numerous, which makes for a complex case study and creates difficulties in establishing a metric. Histological data with semi-quantitative scores proved unreliable to correlate with other vitals, such as cholesterol composition, which complicates prediction of clinical outcomes. A composition of vitals turned out to be a better variable for regression and factors for gene analysis. Several genes were found to be statistically significant after statistical analysis by ANOVA regarding the progressive categories of the preferred clinical variable. Conclusions ANOVA is proposed as a method for genetic information retrieval in order to extract biological meaning from RNA-seq or microarray data, accounting for multiple classes of target variables. It provides a reliable statistical method to associate genes or clusters of genes with particular traits. Supplementary information Supplementary data are available in annexes.
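A minimal sketch of the gene-wise ANOVA strategy described in this abstract, using a hypothetical expression matrix and clinical classes; the dimensions, group labels, and FDR threshold are illustrative assumptions, not the study's actual analysis.

```python
# Minimal sketch (hypothetical expression matrix): per-gene one-way ANOVA
# across progressive clinical categories, followed by multiple-testing
# correction, as the general strategy in the abstract above.
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(1)
n_genes, n_samples = 1000, 60
expression = rng.normal(size=(n_genes, n_samples))      # genes x samples
groups = np.repeat([0, 1, 2], n_samples // 3)           # clinical classes

p_values = np.array([
    f_oneway(*[gene[groups == g] for g in np.unique(groups)]).pvalue
    for gene in expression
])

# Benjamini-Hochberg correction; genes passing the threshold are candidates
# for association with the clinical variable.
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print(f"{reject.sum()} genes significant after FDR correction")
```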
... In this sense, their work building baselines based on data availability resonates with the concept of "data-driven" research, which captures research thought to be led by the generation and collection of vast quantities of data in order to identify new processes and phenomena. However, while developing hypotheses from patterns identified in large datasets may be data-driven, decisions still have to be made about what data to include, how to classify data, and how to interpret data and resulting models (Kell and Oliver, 2004;Leonelli, 2012). Similarly, biological clocks are developed from decisions on the most salient features of data populations, as well as scientific and normative norms for the interpretation and use of biological clocks. ...
Article
Full-text available
This article discusses so-called biological clocks. These technologies, based on aging biomarkers, trace and measure molecular changes in order to monitor individuals' "true" biological age against their chronological age. Drawing on the concept of decay, and building on ethnographic fieldwork in an academic laboratory and a commercial firm, we analyze the implications of the development and commercialization of biological clocks that can identify when decay is "out of tempo." We show how the building of biological clocks rests on particular forms of knowing decay: In the academic laboratory, researchers focus on endo-processes of decay that are internal to the person, but when the technology moves to the market, the focus shifts as staff bracket decay as exo-processes, which are seen as resulting from a person's lifestyle. As the technology of biological clocks travels from the laboratory to the market of online testing of the consumer's biological age, we observe shifting visions of aging: from an inevitable trajectory of decline to a malleable and plastic one. While decay is an inevitable trajectory starting at birth and ending with death, the commercialization of biological clocks points to ways of stretching time between birth and death as individuals "optimize" their biological age through lifestyle changes. Regardless of admitted uncertainties about what is measured and the connection between maintenance and future health outcomes, the aging person is made responsible for their decaying body and for enacting maintenance to slow down decay. We show how the biological clock's way of "knowing" decay turns aging and its maintenance into a life-long concern and highlight the normative implications of framing decay as malleable and in need of intervention.
... Namely, such approaches are usually characterized by the size and complexity of the involved datasets (see De Mauro, Greco, & Grimaldi, 2016; Floridi, 2012; Kitchin, 2014; Mayer-Schönberger & Cukier, 2013; Northcott, 2019; Pietsch, 2016), and by the methodological approaches used to analyze these datasets, namely the use of machine learning and data mining (see Boyd & Crawford, 2012; Kitchin, 2014; Northcott, 2019; Pietsch, 2015, 2016; Veltri, 2017). Because big data is about more than just large volumes of data, it has been suggested to use the term "data-intensive science" (see, e.g., Pietsch, 2015, 2016) or "data-driven science" (Leonelli, 2012) instead. According to Leonelli (2012, p ...
... Neo-institutionalist perspectives (Dobbin & Sutton, 1998; Meyer, 2008, 2010) suggest that while early adopters of open practices (the life sciences, among others) have done so based on practical necessity (in terms of improving certain forms of doing research; Leonelli, 2012; Thessen & Patterson, 2011), later adoptions occur (mainly) based on perceived legitimacy (Dobbin & Sutton, 1998), following 'definitions, principles and purposes that are cognitively constructed in similar ways throughout the world' (Boli & Thomas, cited in Drori, 2008, p. 460). For data sharing, the relevant principles are those of transparency and reproducibility, with the (implicit) expectation that (all) research fields will take a similar trajectory of openness by adopting a specific set of practices. ...
Article
Scientific institutions have increasingly embraced formalized research data management strategies, which involve complex social practices of codifying the tacit dimensions of data practices. Several guidelines to facilitate these practices have been introduced in recent years, for example, the FAIR guiding principles. The aim of these practices is to foster transparency and reproducibility through ‘data sharing,’ the public release of data for unbounded reuse. However, a closer look suggests that many scientists’ practices of data release might be better described as what I call data handovers. These practices are not rooted in the lofty ideals of good scientific practice and global data reuse but in the more mundane necessities of research continuity, which have become more urgent in light of increasing academic mobility. The Austrian scientists interviewed for this study reinterpreted defining features of research data management – such as ensuring findability – as techniques for managing the effects of researcher mobility. This suggests that the adoption of Open Science practices might be dissociated from its stated epistemic goals, and explains why many Open Science initiatives at present are administratively strong but normatively weak.
... There has been great interest in how bioscientific research is becoming "big biology" and thus reshaped through "high-throughput" or "data-driven" approaches (cf. Leonelli 2012; Davies, Frow, and Leonelli 2013). Closer to the practices and technologies of knowledge production, questions have been posed about the so-called omics fields (e.g., genomics, metabolomics, proteomics), and how they "produce new genres of difference and variation" (McNally and Mackenzie 2013, 75). ...
Preprint
In science and technology studies today, there is a troubling tendency to portray actors in the biosciences as "cultural dopes" and technology as having monolithic qualities with predetermined outcomes. To remedy this analytical impasse, this article introduces the concept styles of valuation to analyze how actors struggle with valuing technology in practice. Empirically, this article examines how actors in a bioscientific laboratory struggle with valuing the properties and qualities of algorithms in a high-throughput setting and identifies the copresence of several different styles. The question that the actors struggle with is what different configurations of algorithms, devices, and humans are "good bioscience," that is, what do the actors perform as a good distribution of agency between algorithms and humans? A key finding is that algorithms, robots, and humans are valued in multiple ways in the same setting. For the actors, it is not apparent which configuration of agency and devices is more authoritative, nor is it obvious which skills and functions should be redistributed to the algorithms. Thus, rather than tying algorithms to one set of values, such as "speed," "precision," or "automation," this article demonstrates the broad utility of attending to the multivalence of algorithms and technology in practice.
... The literature discussed above falls, it seems to us, into two broad categories: multi-thematic reflection on Big Data and onto-epistemic reflection on numerical modeling. As for the first strand, philosophy of science has been concerned, among other things, with deconstructing Chris Anderson's famous and controversial article "The End of Theory" [4], for example by showing that data are precisely not "given" [5], that inductive research is never free of prior hypotheses [6], and that correlations in the data deluge are often spurious [7]; it has also dwelt on the use of data to correct models [8], on the predictive capacity of Big Data [9], and on the ethics of their use [10]. As for the second strand, philosophers have mainly reflected on how to characterize numerical simulation, which is most often seen as a new path beyond theory and experiment [11], requiring an entirely new epistemology to be properly conceived [12]. ...
... This is not merely a technical issue that can be handled through technological solutions (Floridi, 2012). The problem requires a cautious approach and renewed reflection with respect to the philosophy of science (Leonelli, 2012). As Kitchin (2013) notes, in the context of the emergence of new forms of empiricist and positivist thinking beyond the social sciences, and their encroachment on the study of social, political and spatial questions, social geography will need to prepare itself adequately to defend its discipline. ...
... Namely, that one attempts to accelerate the process from research, through discovery and innovation, to clinical application and the market. If the ambition is seamless translation, it is worth noting some translation problems: while biological life has, with great force, been successfully translated into digital information, the processes are not as easily reversed, from digital information back into biological life (Leonelli 2012). This results in new translation problems from the laboratory to the clinic. ...
... emerges in response to the copious amounts of data generated from genomics, epigenomics, proteomics, transcriptomics, and metabolomics (Marx 2013, Stephens et al. 2015). The goal of this biology is to computationally manage and analyze big data derived from omics technologies (Leonelli 2012, Schatz 2015). And network biology emerges along with big data biology (Kanaya et al. 2017). ...
... In recent years, the big data revolution has paved the way for the diffusion of data-driven technologies for a whole range of heterogeneous applications. Data-driven solutions have been proposed in biology [1,2], manufacturing [3,4], and finance [5,6], for example, and their popularity can be attributed to the combined effect of three factors: (1) the availability of large amounts of data enabled by the proliferation of sensors for system monitoring; (2) the major advances in data analysis and Machine Learning (ML) algorithms that have transformed our ability to model and extract useful information from data; ...
Preprint
Dimensionality reduction is an important step in the development of scalable and interpretable data-driven models, especially when there are a large number of candidate variables. This paper focuses on dimensionality reduction based on unsupervised variable selection, and in particular on unsupervised greedy selection methods, which have been proposed by various researchers as computationally tractable approximations to optimal subset selection. These methods are largely distinguished from each other by the selection criterion adopted; these criteria include squared correlation, variance explained, mutual information and frame potential. Motivated by the absence in the literature of a systematic comparison of these different methods, we present a critical evaluation of seven unsupervised greedy variable selection algorithms considering both simulated and real world case studies. We also review the theoretical results that provide performance guarantees and enable efficient implementations for certain classes of greedy selection functions, related to the concept of submodularity. Furthermore, we introduce and evaluate for the first time a lazy implementation of the variance explained based forward selection component analysis (FSCA) algorithm. Our experimental results show that: (1) variance explained and mutual information based selection methods yield smaller approximation errors than frame potential; (2) the lazy FSCA implementation has similar performance to FSCA, while being an order of magnitude faster to compute, making it the algorithm of choice for unsupervised variable selection.
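The greedy, variance-explained idea behind FSCA-style selection can be sketched in a few lines of Python. The code below is a generic illustration under simplifying assumptions (full recomputation of scores at every step, no lazy evaluation), not the authors' implementation.

```python
# Minimal sketch of greedy forward variable selection by variance explained
# (the generic idea behind FSCA-style methods; not the authors' algorithm).
import numpy as np

def greedy_variance_selection(X, k):
    """Greedily select k columns of X that maximize residual variance explained."""
    X = X - X.mean(axis=0)                     # centre the data
    residual = X.copy()
    selected = []
    for _ in range(k):
        scores = []
        for j in range(X.shape[1]):
            if j in selected:
                scores.append(-np.inf)
                continue
            v = residual[:, j]
            vv = float(v @ v)
            if vv < 1e-12:                     # column already explained
                scores.append(-np.inf)
                continue
            coefs = residual.T @ v / vv        # regress every column on v
            scores.append(float(np.sum(np.outer(v, coefs) ** 2)))
        best = int(np.argmax(scores))
        selected.append(best)
        # Deflate: remove each column's component along the chosen variable.
        v = residual[:, best]
        residual = residual - np.outer(v, residual.T @ v / (v @ v))
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))
print(greedy_variance_selection(X, k=3))
```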
... Fortunately, the rapid development of high-throughput techniques in genetics, protein analysis and metabolite analyses in the later 20th century has provided the tools to explore meat quality in ever greater detail, using "omics" techniques, where large quantities of data are obtained about the genome, transcriptome, proteome, metabolome, phosphorylome, lipidome or degradome and correlated to variations in meat quality (Munekata, Pateiro, López-Pedrouso, Gagaoua, & Lorenzo, 2021). The experimental approach typically applied in studies using these techniques has been described as data-driven, rather than hypothesis-driven research (Leonelli, 2012;Strasser, 2012). In relation to meat quality, the application of transcriptomics, proteomics and metabolomics has been reviewed by (Guo & Dalrymple, 2017), (Picard, Gagaoua, & Hollung, 2017) and (Bertram, 2017), respectively. ...
Article
Following a century of major discoveries on the mechanisms determining meat colour and tenderness using traditional scientific methods, further research into complex and interactive factors contributing to variations in meat quality is increasingly being based on data-driven “omics” approaches such as proteomics. Using two recent meta-analyses of proteomics studies on beef colour and tenderness, this review examines how knowledge of the mechanisms and factors underlying variations in these meat qualities can be both confirmed and extended by data-driven approaches. While proteomics seems to overlook some sources of variations in beef toughness, it highlights the role of post-mortem energy metabolism in setting the conditions for development of meat colour and tenderness, and also points to the complex interplay of energy metabolism, calcium regulation and mitochondrial metabolism. In using proteomics as a future tool for explaining variations in meat quality, the need for confirmation by further hypothesis-driven experimental studies of post-hoc explanations of why certain proteins are biomarkers of beef quality in data-driven studies is emphasised.
... Evaluating the role and usefulness of data-driven computational models and simulations is complicated in biological contexts for reasons others have explored in detail (see, e.g., Leonelli 2011, 2012, 2016; Stevens 2017). Epidemiology is an even more challenging context for evaluating the role of computational models and simulations than, for example, molecular biology for reasons we will discuss below. ...
Article
There are many tangled normative and technical questions involved in evaluating the quality of software used in epidemiological simulations. In this paper we answer some of these questions and offer practical guidance to practitioners, funders, scientific journals, and consumers of epidemiological research. The heart of our paper is a case study of the Imperial College London (ICL) covid-19 simulator, set in the context of recent work in epistemology of simulation and philosophy of epidemiology.
... One way to gather direct evidence is to look at the corpus of usage (Devitt, 2012). Leonelli (2012) and Pietsch (2013) have discussed the function of big data in philosophy of science, but it should also be realized that big data can be equally well applied to X-Phi studies. Devitt (2012) demonstrated the application of corpus methods to philosophical study with an example. ...
Article
Full-text available
With the rise of experimental philosophy in the twenty-first century, the past two decades have witnessed the experimental turn in the field of philosophy of language. We delineate in this paper the experimental turn in philosophy of language before distinguishing armchair theorizing from empirical testing and highlighting the complementarity between the two approaches, and then carry out an analysis of the experimental tools and methods available for philosophical experiments with examples by classifying them into three major types, viz., the method of survey, the method of big data, and the method of cognitive neuroscience.
... However, data gathering in Physics is steeped in centuries of scientific knowledge and the associated human bias [9][10][11][12]; so, a "blind" algorithm without any information on that bias may lead to wrong predictions. Also, scientific problems often suffer from a paucity of data while involving a large number of variables that interact in complex and non-stationary ways. ...
Preprint
Full-text available
Substitution of well-grounded theoretical models by data-driven predictions is not as simple in engineering and the sciences as it is in social and economic fields. Scientific problems often suffer from a paucity of data, while they may involve a large number of variables and parameters that interact in complex and non-stationary ways, obeying certain physical laws. Moreover, a physically-based model is not only useful for making predictions, but also for gaining knowledge through the interpretation of its structure, parameters, and mathematical properties. The solution to these shortcomings seems to be the seamless blending of the tremendous predictive power of the data-driven approach with the scientific consistency and interpretability of physically-based models. We use here the concept of physically-constrained neural networks (PCNN) to predict the input-output relation in a physical system while, at the same time, fulfilling the physical constraints. With this goal, the internal hidden state variables of the system are associated with a set of internal neuron layers, whose values are constrained by known physical relations, as well as any additional knowledge on the system. Furthermore, when enough data are available, it is possible to infer knowledge about the internal structure of the system and, if parameterized, to predict the state parameters for a particular input-output relation. We show that this approach, besides yielding physically-based predictions, accelerates the training process, reduces the amount of data required to get similar accuracy, partly filters the intrinsic noise in the experimental data and provides improved extrapolation capacity.
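The following PyTorch sketch illustrates the general idea of constraining a network with known physical relations by adding a soft physics penalty to the loss. The toy constraint (the derivative of the learned function must match a known law) and all hyperparameters are illustrative assumptions and do not reproduce the PCNN architecture described above.

```python
# Minimal sketch (hypothetical constraint and data) of physics-constrained
# training: the loss combines a data-fit term with a penalty for violating a
# known physical relation, here that the derivative of E(x) = x^2 is 2x.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 200).unsqueeze(1)
y_obs = x**2 + 0.05 * torch.randn_like(x)     # noisy observations of E(x) = x^2

for step in range(500):
    x_req = x.clone().requires_grad_(True)
    y_pred = net(x_req)
    data_loss = ((y_pred - y_obs) ** 2).mean()

    # Physics penalty: require dy/dx to match the known law 2*x, computed by
    # automatic differentiation of the network output.
    dydx = torch.autograd.grad(y_pred.sum(), x_req, create_graph=True)[0]
    physics_loss = ((dydx - 2 * x_req) ** 2).mean()

    loss = data_loss + 0.1 * physics_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(float(data_loss), float(physics_loss))
```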
... Similarly to GIS, big data entered geography mainly as a new tool to gather and analyze data, but also led to discussions about the need for scientific theories. For instance, Luciano Floridi (2012) and Sabina Leonelli (2012) have argued that we must not only think about big data as a data deluge and a technique, but also rethink its philosophy of science. These authors have pointed out the emergence of new modes of positivism and empiricism under the context of the big data age. ...
Article
Although much has been written about the challenges of big data, there has been little reflection on the historicity of such debates and what we can learn from it. With this in mind, the aim of this article is to situate the epistemological debates over big data in geography historically. We focus on the three most relevant topics in current discussions around big data that have significant historical resonance, namely its methodological challenges, its scientific value, and its positionality. We conclude by arguing that understanding the historical resonance of current big data debates is helpful to find new ways to question its epistemological consequences.
... Evaluating the role and usefulness of data-driven computational models and simulations is complicated in biological contexts for reasons others have explored in detail (see, e.g., Leonelli 2011, 2012, 2016; Stevens 2017). Epidemiology is an even more challenging context for evaluating the role of computational models and simulations than, for example, molecular biology for reasons we will discuss below. ...
Preprint
Full-text available
There are many normative and technical questions involved in evaluating the quality of software used in epidemiological simulations. In this paper we answer some of these questions and offer practical guidance to practitioners, funders, scientific journals, and consumers of epidemiological research. The heart of our paper is a case study of the Imperial College London (ICL) COVID-19 simulator. We contend that epidemiological simulators should be engineered and evaluated within the framework of safety-critical standards developed by the consensus of the software engineering community for applications such as automotive and aircraft control.
... However, see Pietsch (2016), who critically evaluates some of these claims and with whose views I am broadly sympathetic, as well as Leonelli (2012), who considers the impact of Big Data on biological practice. ...
Article
Full-text available
In this paper, I critically evaluate several related, provocative claims made by proponents of data-intensive science and “Big Data” which bear on scientific methodology, especially the claim that scientists will soon no longer have any use for familiar concepts like causation and explanation. After introducing the issue, in section 2, I elaborate on the alleged changes to scientific method that feature prominently in discussions of Big Data. In section 3, I argue that these methodological claims are in tension with a prominent account of scientific method, often called “Inference to the Best Explanation” (IBE). Later on, in section 3, I consider an argument against IBE that will be congenial to proponents of Big Data, namely the argument due to Roche and Sober (2013) that “explanatoriness is evidentially irrelevant”. This argument is based on Bayesianism, one of the most prominent general accounts of theory-confirmation. In section 4, I consider some extant responses to this argument, especially that of Climenhaga (2017). In section 5, I argue that Roche and Sober’s argument does not show that explanatory reasoning is dispensable. In section 6, I argue that there is good reason to think explanatory reasoning will continue to prove indispensable in scientific practice. Drawing on Cicero’s oft-neglected De Divinatione, I formulate what I call the “Ciceronian Causal-nomological Requirement”, (CCR), which states roughly that causal-nomological knowledge is essential for relying on correlations in predictive inference. I defend a version of the CCR by appealing to the challenge of “spurious correlations”, chance correlations which we should not rely upon for predictive inference. In section 7, I offer some concluding remarks.
... Considering the progress achieved in the field of nanotechnology, it can be predicted that nanotechnology will play an important role in the field of science in general and in biomedical science in particular. Furthermore, certain trends in contemporary sciences suggest that boundaries between biomedical sciences will become less distinct and that they will eventually converge into a limited number of highly disciplined fields of biomedical science [10]. A dominant position in biomedical sciences will be assumed by health research that is midway between basic and clinical research and applies findings from basic biomedical sciences to prevent, predict or cure disease for the entire population. ...
Article
Full-text available
In the present study, we envisage the emerging role of information science in the biomedical sector. Information science has grown rapidly in the last few decades and has found a large number of applications. In the medical sector, the principles of information science can be applied to diagnosis, medical record management, treatment, and more. Given this rapidly growing role, the present study is designed to present the roles of information science in the biomedical sector.
Article
Progress in data-driven materials design has significantly accelerated the pace of discovering novel materials and provided a new direction for molecular simulations. The availability of large databases has increased the complexity of predicting new materials data using an accurate set of methods. In this article, an attempt has been made to discuss and highlight the recent developments in materials discovery and innovation. The fourth paradigm of scientific exploration is discussed in terms of three basic activities: capture, curation, and analysis. Computer simulations are considered a major component that can contribute to the development of novel materials. Here, we have reviewed the research work and data published for mechanical, thermal, and tribological properties using the molecular dynamics simulation method for composite materials. The article is intended to provide an understanding of the applications of atomistic simulation techniques for composite materials and the further use of the results for the prediction of new materials databases to reduce materials discovery and commercialization time.
Article
Personalized medicine aims at tailoring treatment to the individual person through the sourcing of multiple health data from the population. The realization of these ambitions rests on the ability to reuse health data. But what does it take to reuse tissue and data collected from individuals in connection with treatment for future purposes? It takes an "enabling public" consisting not only of people providing tissue and data, but also of clinicians, researchers, and civil servants. Based on fieldwork and interviews from Denmark, we investigate how use attaches tissue and data to various actors in the enabling public. We argue that multiple forms of attachments and detachments co-exist and that these persist over time. Attentiveness to the character and coexistence of these attachments is crucial in discussions of the role different actors should play in the governance of tissue and data.
Chapter
Today digital technologies permeate almost every sphere of life and take part in processes of every sort. This poses an unprecedented challenge to the social sciences. This chapter first explores the "power" of algorithms, i.e. sets of instructions that allow carrying out very complex tasks. Algorithms are often considered inherently objective. And yet, they embed their programmers' choices, models, worldviews and ideologies, and thus are shaped by social forces and at the same time act back on society. Algorithms are also used in the social sciences to process large amounts of data and find previously overlooked patterns. While this offers new opportunities to understand society, it also poses new methodological challenges. This chapter also explores the ambition to design a calculator that can act as (or better than) humans—the so-called artificial intelligence—and its quandaries. Keywords: Algorithms; Artificial intelligence; Information technologies
Article
This paper draws on the notion of the asset to better understand the role of innovative research technologies in researchers’ practices and decisions. Faced with both the need to accumulate academic capital to make a living in academia and with many uncertainties about the future, researchers must find ways to anticipate future academic revenues. We illustrate that innovative research technologies provide a suitable means for doing so: First, because they promise productivity through generating interesting data and hence publications. Second, because they allow a signaling of innovativeness in contexts where research is evaluated, even across disciplinary boundaries. As such, enrolling innovative research technologies as assets allows researchers to bridge partly conflicting valuations of productivity and innovativeness they are confronted with. However, the employment of innovative technologies in anticipation of future academic revenues is not always aligned with what researchers value epistemically. Nevertheless, considerations about potential future academic revenues derived from innovative research technologies sometimes seem to override particular epistemic valuations. Illustrating these dynamics, we show that processes of assetization in academia can have significant epistemic consequences which are important to unpack.
Article
Full-text available
A main goal of Precision Medicine is to incorporate and integrate the vast corpora held in different databases on the molecular and environmental origins of disease into analytic frameworks, allowing the development of individualized, context-dependent diagnostic and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at predicting personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize and integrate large datasets combining structured and unstructured formats. This needs to be done under different levels of confidentiality, ideally within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under the currently demanding performance conditions of personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we review some of these constraints and discuss possible avenues to overcome current challenges.
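As a concrete illustration of the kind of predictive modelling described above, here is a minimal sketch that combines numeric "molecular" features with categorical "environmental" ones in a single scikit-learn pipeline. The dataset, feature names, and model choice are hypothetical assumptions for illustration, not the authors' methods or data.

```python
# Minimal sketch: predicting a binary health outcome from heterogeneous patient features
# (numeric "molecular" variables plus categorical "environmental" ones) in one pipeline.
# Data, feature names and model choice are hypothetical, not the authors' methods.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "gene_expression_a": rng.normal(size=n),
    "gene_expression_b": rng.normal(size=n),
    "smoking_status": rng.choice(["never", "former", "current"], size=n),
    "region": rng.choice(["urban", "rural"], size=n),
})
# Synthetic binary outcome loosely driven by the numeric features.
y = (df["gene_expression_a"] + 0.5 * df["gene_expression_b"]
     + rng.normal(scale=0.5, size=n) > 0).astype(int)

preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["gene_expression_a", "gene_expression_b"]),
    ("categorical", OneHotEncoder(), ["smoking_status", "region"]),
])
model = Pipeline([("prep", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])

scores = cross_val_score(model, df, y, cv=5, scoring="roc_auc")
print(f"cross-validated AUC: {scores.mean():.2f}")
```

The pipeline pattern matters here because it keeps preprocessing and model fitting inside each cross-validation fold, which is one practical answer to the data-integration concerns the abstract raises.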
Article
A remarkable development in science over the past thirty years has been the emergence of large databases that support knowledge production, representation and circulation. Increasingly, chains of instruments, people and devices are involved in compiling, organizing and documenting biodiversity data collections. Several digital participatory science initiatives studied between 2013 and 2017 illustrate the diversity of amateur participation using digital technologies at every step of the knowledge process: from contributing local field observations, to identifying species, quality control and validation, to digitizing plant specimens for inclusion in transnational databases. Collaborative platforms and databases emerge as elements endowed with a certain performativity through the ordering possibilities they suggest or impose. Attending to practices in these contexts raises questions about organizational innovations and novel forms of division of labor in participatory science. The participation of large numbers of people with varied backgrounds and expertise results in a distributed and less linear type of biodiversity science. As the product of numerous interventions, data are best seen as relational; their authorship or ownership need not be their defining characteristic. Indeed, the data-generating technologies and practices of participatory science initiatives themselves generate a kind of untethered relational potential that deserves exploration.
Chapter
In this Chapter, data science is characterized as an inductivist approach, i.e. an approach which aims to start from the facts to infer increasingly general laws and theories. This perspective is corroborated first by a case study of successful scientific practice from the field of machine translation and second by an analysis of recent developments in statistics, in particular the shift from so-called data modeling to algorithmic modeling. Over the past century, inductivism has not been well regarded by many scientists and philosophers of science. Given that inductivism is generally considered to be a failed methodology, the fundamental epistemological problem of data science turns out to be the justification of inductivism. Some classic objections against inductivism are revisited, the most pertinent of which is the so-called problem of induction. Without a satisfying solution to the problem of induction, data science seems doomed to failure.
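The shift from data modeling to algorithmic modeling that the chapter describes, often framed as Breiman's "two cultures", can be illustrated with a minimal, purely hypothetical sketch: a parametric linear model and a black-box random forest fitted to the same synthetic nonlinear data. Nothing here is taken from the chapter itself.

```python
# Minimal sketch of the "data modeling" vs "algorithmic modeling" contrast:
# a parametric linear model and a black-box random forest on the same synthetic,
# nonlinear data. Purely illustrative; not an example from the chapter.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=400)  # nonlinear ground truth

# Data modeling: posit a stochastic model (here a linear one) and estimate its parameters.
linear = LinearRegression()
# Algorithmic modeling: treat the mapping as unknown and learn it with a flexible algorithm.
forest = RandomForestRegressor(n_estimators=200, random_state=0)

for name, model in [("linear model", linear), ("random forest", forest)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.2f}")
```

On data like this the algorithmic model typically predicts better while offering no explicit stochastic model of the data-generating process, which is precisely the trade-off at stake in the inductivism debate the chapter revisits.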
Chapter
In his introduction, Tupasela outlines the theoretical contours of population branding. Taking science and technology studies (STS) as his starting point, he locates his work at the intersection of critical data studies (CDS) and nation branding. Using examples from two Nordic countries—Denmark and Finland—he identifies changes during the past ten years that suggest that state-collected and maintained resources, such as biobank samples and healthcare data, have become the object of marketing practices. Tupasela argues that this phenomenon constitutes a novel form of nation branding in which relations between states, individuals, and the private sector are re-aligned. Population branding, which he identifies as originating in the field of medical genetics, has increasingly incorporated marketing practices developed in the private sector in order to market state-controlled resources. The exploration of population branding practices helps provide understanding of how state-controlled big data is increasingly being used to generate new forms of value.
Article
Data Science can be considered a technique or a science. As a technique, it is more interested in the "what" than in the "why" of data: it does not need theories that explain how things work, it just needs the results. As a science, however, working strictly from data and without theories contradicts the post-empiricist view of science, in which theories come before data and data are used to corroborate or falsify theories. Nevertheless, one of the most controversial statements about Data Science is that it is a science that can work without theories. In this conceptual paper, we focus on the science aspect of Data Science: what kind of science is it? We propose a three-phased view of Data Science showing that different theories play different roles in each of the phases we consider. We focus on when theories are used in Data Science rather than on the controversy over whether theories are used at all. In the end, we will see that the statement "Data Science works without theories" is better put as "in some of its phases, Data Science works without the theories that originally motivated the creation of the data."
Article
Full-text available
This article presents the big data context and its relation to scientific knowledge, starting from the following question: does the extraction of information from large volumes of data represent an epistemological change for science in general? The general objective is to reflect on the epistemological implications of the big data context on the basis of the propositions of the epistemologist Karl Popper. Specifically, the objectives are: a) to discuss the theoretical implications of science in the field of epistemology; b) to examine the concepts behind what has come to be called "big data"; c) to analyze the possible impacts of the big data context on scientific knowledge. Methodologically, this is an exploratory and descriptive study of a theoretical nature, intended to support critical reflection on the big data phenomenon and its consequences for scientific practice in any field of knowledge. As a result, a critical analysis is presented of a phenomenon that has received little attention from an epistemological standpoint. The conclusion is that the big data context has revolutionized decision-making processes in companies, but it cannot be claimed that the same revolution is taking place at the epistemological level.
Article
Full-text available
This essay offers an overview of how manuals and handbooks have contributed to the standardization, codification, transmission and revision of knowledge. These instructional and reference texts are distinct from related educational genres such as textbooks and editions due to their focus on practical knowledge. They are also notable for their appearance in diverse times and places, such as ancient Greece, early and medieval China and early modern Europe, as well as modern contexts worldwide. We are especially interested in the role of these often mundane texts in maintaining and resituating old knowledge, whose importance is discounted when scholars focus on innovation. Modern notions of authorship fit poorly with handbooks and manuals, which are generally derivative of other literature, though they often result in more commercially successful texts than their sources. This introduction draws on book history as well as history of science to offer a framework for the volume.
Article
Full-text available
This paper uses several case studies to suggest that 1) two prominent definitions of data, the relational and the representational, do not on their own capture how scientists use data, and 2) a novel perspectival account of data is needed. It then outlines some key features of what this account could look like. Neither the relational nor the representational view fully captures what data are and how they function in science: the representational view is insensitive to the scientific context in which data are used, while the relational account does not fully explain the empirical nature of data and how it is possible for data to be evidentially useful. The perspectival account surmounts these problems by accommodating a representational element to data while recognizing that data depend upon the epistemic context, because they are the product of situated and informed judgements.
Thesis
The main focus of this thesis is to better understand how people make decisions in groups, with a use case in the travel and tourism domain. The goal is to use this knowledge for future research on and design of group decision-support systems, user and group modeling and personalization, and group recommender systems. We take an approach in which we observe actual groups before, during and after their decision-making process, as the groups choose a destination to visit together. In doing so, we investigate a wide range of individual and group characteristics, for instance explicit preferences, travel behavior patterns, personality, group diversity, and social relationships, and their relation to the travel-related group discussions. Owing to the complexity of the tourism product, the travel and tourism domain makes a quite challenging use case for research on group decision-making. To the best of our knowledge, we are the first to have had the opportunity to examine this topic while accounting for so many dimensions. Our findings indicate that current research overlooks the importance of this dimensionality, and that future researchers and practitioners will have to account for more than users' explicit preferences when designing effective group recommender systems, in order to truly help groups in their travel-related decision-making process.
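For readers unfamiliar with group recommender systems, the sketch below illustrates two classic preference-aggregation strategies, average and least misery, which operate on explicit preferences alone; the ratings and destinations are invented for illustration, and the thesis argues that effective systems need to go beyond exactly this kind of information.

```python
# Minimal sketch of two classic preference-aggregation strategies used in group
# recommender systems: "average" and "least misery". Ratings and destinations are
# invented for illustration and are not taken from the thesis.
import numpy as np

destinations = ["Lisbon", "Vienna", "Oslo"]
# Rows are group members, columns are destinations, values are 1-5 ratings.
ratings = np.array([
    [5, 3, 2],
    [2, 4, 4],
    [4, 4, 1],
])

average = ratings.mean(axis=0)      # maximize average satisfaction
least_misery = ratings.min(axis=0)  # avoid leaving any member very unhappy

print("average strategy picks:", destinations[int(average.argmax())])
print("least-misery strategy picks:", destinations[int(least_misery.argmax())])
```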