Article

On the Origin(s) and Development of the Term 'Big Data'

Authors:
Francis X. Diebold

Abstract

I investigate the origins of the now-ubiquitous term "Big Data," in industry and academia, in computer science and statistics/econometrics. Credit for coining the term must be shared. In particular, John Mashey and others at Silicon Graphics produced highly relevant (unpublished, non-academic) work in the mid-1990s. The first significant academic references (independent of each other and of Silicon Graphics) appear to be Weiss and Indurkhya (1998) in computer science and Diebold (2000) in statistics/econometrics. Douglas Laney of Gartner also produced insightful work (again unpublished and non-academic) slightly later. Big Data the term is now firmly entrenched, Big Data the phenomenon continues unabated, and Big Data the discipline is emerging.


... To this end, a horizontal research method was employed, drawing on a large number of sources and pieces of information so as not to exclude potentially useful elements a priori, integrated with vertical research that allowed the collected information to be examined in greater depth through a more accurate and detailed analysis of the elements selected as fundamental for formulating analyses, inferences, and so on. Introduction: In the language of ICT (Information and Communication Technologies), the generic, all-encompassing expression "Big Data" denotes the flow of data generated daily by Internet users, massively and continuously, in the era of social media and accelerating technological development (Diebold, 2012; Laney, 2001). The expression, sometimes rendered as "megadata", indicates "(…) a quantity of data and information so extensive in terms of volume, velocity and variety as to require specific technologies and analytical methods for the extraction of value or knowledge. ...
... The work of Halevi & Moed (2012), which explores the use of the term "Big Data" in recent scientific output, traces its first appearance to a 1970 article on atmospheric and oceanic surveys conducted in the Barbados archipelago. Other authors place the origin later, in the US computing context around the mid-1990s, attributing the first significant academic references to Weiss & Indurkhya (1998) in computer science and to Diebold in statistics/econometrics (Diebold, 2012). ...
... In the case of literature reviews, only the main work is cited, with reference to the titles cited therein. Although some references to Big Data predate 2000, both academic and non-academic, these would be episodes of limited relevance, since the term would have been used without real awareness of the phenomenon (Diebold, 2012). […] the Second World War, and their extension, first to the commercial sector and subsequently to the general public with the spread of personal computing, would represent further milestones along this evolutionary path (Jackson-Barnes, 2023). ...
Book
Full-text available
The "Big Data Governance and Legal Aspects" booklet delves into the governance challenges and legal implications surrounding Big Data in an era marked by the rapid evolution of Emerging and Disruptive Technologies (EDTs). It highlights the essential processes for managing data throughout its lifecycle, emphasizing collection, storage, and analysis while ensuring data security and ethical usage. The text also navigates the balance between privacy and national security, exploring the necessity for ethical and legal frameworks that can address these evolving threats. The publication investigates the dual potential of Big Data: maximizing value for national security and minimizing privacy risks. It discusses the complexity of reconciling privacy with national security, particularly in the context of CBRNe threats. The book includes a comprehensive examination of Open Source Intelligence (OSINT) methods and the deployment of demonstrators to monitor global asymmetric threats. By analyzing regulatory landscapes and presenting case studies, it offers an integrated approach to understanding Big Data's role in contemporary security and defense, providing a valuable resource for policy makers, researchers, and security professionals.
... Big data analysis is essential for analysts, researchers and business people to make better decisions that were previously not attainable. Figure 1 explains the structure of big data, which comprises five dimensions: volume, velocity, variety, value and veracity [2][3]. Volume refers to the size of the data, which mainly concerns how to handle large-scale and high-dimensional databases and their processing needs. ...
... Big data analytics refers to the process of collecting, organizing and analyzing large sets of data to discover patterns and other useful information. Table 1 shows a comparative study of different types of data based on size, characteristics, tools and methods [1][3]. ...
... In this digital era, the development of mobile and wireless technologies has provided a new platform on which people may share their information through social media sites, e.g. Facebook, Twitter and Google+ [3]. In these settings, data may arrive continuously and cannot be stored in computer memory because its size is huge; such data is considered "Big Data". ...
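The "too large for memory" constraint described in the excerpt above is usually met with single-pass (streaming) algorithms that retain only constant-size state. A minimal illustrative sketch, not taken from any of the cited papers:

```python
def running_mean(stream):
    """Compute the mean of an unbounded stream in one pass with O(1) memory."""
    count, mean = 0, 0.0
    for x in stream:
        count += 1
        mean += (x - mean) / count  # incremental (Welford-style) update
    return mean

# Works on a generator, so the data is never materialized in memory at once.
print(running_mean(x for x in range(1, 101)))
```

The same pattern, constant state updated once per record, underlies streaming variance, quantile sketches, and the counting structures used in big-data systems.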
... The concept "Big Data" initially originated in the field of computer science before progressively expanding to other scientific disciplines and to commerce. Scholars date the origin of Big Data back to 1998 [1]. John Mashey, chief scientist at Silicon Graphics (SGI), one of America's high-performance computing companies, pointed out key challenges posed by exponential data-volume growth in an international conference report [1]. Mashey described these challenges as understanding, obtaining, processing, and organizing data. ...
Article
Full-text available
China has developed rapidly in recent years. Chinese President Xi also pointed out: "Big data is a new stage of informatization development." Therefore, promoting and innovating the big data technology industry and building a digital economy with data as the key element will be the main development directions in the next few years. In this article, I review the origin and development, current status, and trends of big data in China, then analyze its future trajectory and offer some thoughts and suggestions.
... According to Diebold (2012), the term "big data" began to be used by scholars and IT professionals, such as himself, John Mashey, Sholom M. Weiss, Nitin Indurkhya, and Douglas Laney, between 1998 and 2001. The most referenced definition of big data is Gartner's, which is based on Laney's (2001) notion. ...
... Indeed, in the IT field, the terms "volume" and "speed" become relative. Diebold (2012) states that, in the field of econometrics, any more than 200 gigabytes (GB) of data is considered to be a large data set. However, in physics, experimental data sets usually contain much larger amounts of data. ...
Article
Full-text available
This research uses an interpretive case study strategy to investigate how big data affects tax audits in Indonesia, both with regard to tax audit management and policy, and to tax auditors’ individual audit assignments. The study reveals that the impact of big data on tax audit exists in two aspects. First, at audit policy level, big data is used as part of risk analysis in order to determine which taxpayers should be audited. Second, at the individual tax audit assignment level, tax auditors must utilise big data in order to acquire and analyse data from taxpayers and other related parties. Big data has the following characteristics: it involves huge volumes of information, it is generated at a high velocity, it includes a wide array of data types, and it contains high uncertainty. Big data can be analysed in order to reinforce the results gained from risk engines as a part of a compliance risk management system at the audit policy level. Meanwhile, at the individual tax audit assignment level, empirical evidence shows that tax auditors may deal with: (1) large volumes of data (hundreds of millions of records) that originated from previous fiscal years (historical records); (2) variations in the format and sources of data acquired from taxpayers which, to some extent, may be giving an auditor the authority to request data in a format that suits their analytical tools—with an inherent risk that the data can only be acquired in its native format; (3) data veracity that requires the tax auditors to review data sources because the adopted data analysis techniques are determined by the validity of data under audit. The main benefit expected to be gained from the implementation of big data analytics in respect of tax audits is the provision of valid and reliable information that evidences that taxpayers are compliant with tax laws.
... Definition and characteristics of big data: BDA was defined in the 1990s by the computer industry and is now used as a catch-all term for anything negative or positive about twenty-first-century technological society [7,8]. Interestingly, before the 2000s, big data was considered a problem [9]. ...
... Digital data, together with Internet access, is the socio-technical product of human interaction within a framework of technological advances and interactions between people and electronic devices (Morales-i-Gras 2022). As a direct consequence of the digitisation process, there has been an exponential growth in the accumulation of large volumes of data, or 'big data' of different types-such as audio, text, images and video-year after year (Diebold 2012;Mashey 1997). In addition, computers connected to the Internet since the 1990s have been joined by other telecommunication devices introduced later, such as mobile phones and, in more recent times, smart watches, TVs, or cars. ...
Article
This article explores the multidisciplinary field of computational sociolinguistics (CSLX). A social network analysis through a bibliometric approach is employed to investigate the landscape of scholarly publications through a series of keyword queries related to ‘Computational Sociolinguistics’. The study analyses publications contributing to this research area, focusing on semantic content analysis. The methodology involves the extraction and processing of data from the database Web of Science followed by semantic similarity assessments and classifications of abstracts using advanced natural language processing techniques to map the thematic clusters within the field as well as co‐authorship analysis. Key findings reveal significant multidisciplinarity in CSLX characterised by the integration of computational methods into sociolinguistic inquiries. The study identifies central topics and collaborative patterns, highlighting the influence of the computational profile over the sociolinguistic one. Conclusively, CSLX is positioned as an emerging field, with its development influenced by technological advancements and the increasing complexity of social interactions in computer‐mediated communication.
... The term was used by the sociologist Charles Tilly in 1984 to refer to the importance of data. From the perspective of computing techniques, Big Data was referenced again in 1987 in the context of a programming principle (Diebold, 2012). Although the web's periods of transformation are not tied to specific dates, they present as transitions driven by digital-economy models and the forces of biopower. ...
... (Reinsel, Gantz, & Rydning, 2018). Diebold (2012) traces the etymology of the term "big data" to the mid-1990s, when it was first used by John Mashey, retired former Chief Scientist at Silicon Graphics, to refer to the management and analysis of enormous data sets (Kitchin, 2014). ...
... More importantly, Hadoop has paved the way for the further development of distributed analytics frameworks in the cloud, as well as AI-driven (artificial intelligence) analytics systems [6], which are important for managing and processing large datasets. The term "Big Data" was coined in this period [7], reflecting the growing recognition of the volume and importance of data. ...
Article
Full-text available
The rapid advancement of artificial intelligence (AI), coupled with the global rollout of 4G and 5G networks, has fundamentally transformed the Big Data landscape, redefining data management and analysis methodologies. The ability to manage and analyze such vast and varied datasets has exceeded the capacity of any individual or organization. This study introduces an enhanced framework that expands upon the traditional four Vs of Big Data (volume, velocity, variety, and veracity) by incorporating six additional dimensions: value, validity, visualization, variability, volatility, and vulnerability. This comprehensive framework offers a novel and straightforward approach to understanding and addressing the complexities of Big Data in the AI era. This article further explores the use of 'Big D', an AI-driven, RAG-based Big Data analytical bot powered by the ChatGPT-4o model. This innovation represents a significant advance in the field, accelerating and deepening the extraction and analysis of insights from large-scale datasets and enabling a more nuanced and comprehensive understanding of intricate data landscapes. In addition, we propose a framework and analytical tools that contribute to the evolution of Big Data analytics, particularly in the context of AI-driven processes.
... Big data is characterized by the three Vs: (i) high velocity, (ii) high variety, and (iii) high volume, as well as by its flexibility, exhaustiveness, resolution, and indexing (Kitchin, 2013; Mayer-Schonberger & Cukier, 2013; Miller & Goodchild, 2015). The term was first used, according to Diebold (2012), in the mid-1990s by Mashey (Chief Scientist at Silicon Graphics) to refer to large data sets. However, until 2008 the term big data saw only residual use, except among academics in the information-technology field or industry. ...
Chapter
Without Abstract Synonyms Digital geographies Definition/Description Big data has played an increasingly important role in geography, providing new opportunities for research due to the new sources of data. These data enrich territories with multiple layers of information, creating opportunities for new forms of spatial analysis. Geography is now rethinking its epistemological approach under this new paradigm, leading to a significant revolution in geographical thought. On the one hand, geography might be becoming a data-driven science. On the other hand, positivism might be re-emerging in a different form. Some of these different positions have questioned geography as a science supported by theory, and it is necessary to find a balance between these extreme perspectives. Technological development has created a myriad of opportunities for producing new sources of information which go beyond the conventional information produced by governmental institutions. Today, any individual can produce new online content, and several companies and different digital actors such as app developers, retail chains, financial institutions, mobile phone operators, security firms, and Internet companies, among others, are actively creating online tools to generate huge databases with information extracted from users. Generally, a major part of this information is privatized, and access is expensive, as corporations commercialize their databases as they are rich in
... Predictive models are especially valuable in periods of economic uncertainty, where they can offer faster and more adaptive forecasts than traditional methods (Diebold, 2012). ...
Book
Full-text available
The book addresses the integration of artificial intelligence (AI) across areas of the economic sciences and business management, exploring its impacts and benefits. The introduction highlights how digital transformation redefines organizational strategies and fosters continuous innovation, improving responsiveness to market demands. Methodologically, case studies and data analysis are used to illustrate the application of AI. The results show that, in accounting, AI automates repetitive tasks, reduces errors and improves accuracy, allowing professionals to focus on strategic activities. In finance, intelligent trading algorithms increase the speed and accuracy of transactions, improving competitiveness and market liquidity. Risk management benefits from predictive models that anticipate potential threats, while regulatory compliance is strengthened through automated monitoring. In terms of sustainable development, AI optimizes resource allocation and improves energy efficiency, contributing to greener and more equitable policies. The conclusion underlines the need for an ethical and transparent approach to AI implementation, in order to ensure fair and responsible decisions.
... When it comes to big data resources and competencies, these variables may lead to heterogeneity among businesses (Chen et al., 2022; Dubey et al., 2019). The term Web 2.0 was introduced first, and a year later Roger Mougalas of O'Reilly Media coined the term "big data" (Diebold, 2012; Sangeetha & Sreeja, 2015). Interestingly, the term "big data" lacks a specific, fixed definition, leading scholars and researchers to describe it as a "dynamic concept" whose meaning evolves with the ever-expanding nature of the phenomenon (Gupta et al., 2018; Sheng et al., 2017). ...
Article
Full-text available
This research article investigates the impact of big data usage on firm performance in the dynamic context of Indian organizations. With a focus on understanding how these organizations leverage big data techniques to enhance various functional areas and the role of strategic planning within this framework, the study contributes to bridging existing knowledge gaps. The study employs the dynamic capability view theory to explore the intricate relationship between big data resources, data analysis capabilities, and competitive advantage in the Indian business ecosystem. Through a comprehensive analysis of empirical data from diverse Indian organizations, the research aims to provide practical insights into the relationship between big data usage and firm performance. The findings not only guide Indian firms in optimizing their utilization of big data but also offer valuable information for policymakers and industry stakeholders. The study further delves into the mediating role of strategic planning in the relationship between big data usage and firm performance. This research article not only contributes to the theoretical foundations of big data analytics and strategic planning but also offers practical guidance for Indian organizations seeking to harness the full potential of these capabilities for enhanced firm performance. The findings underscore the transformative power of big data and emphasize the need for strategic planning to fully capitalize on this transformative potential in the Indian business context.
... On the origin of the term Big Data, see Diebold (2012) in detail. "Big data is also characterized by the ability to render into data many aspects that have never been quantified before: call it 'datafication'." ...
Thesis
Full-text available
Cumulative habilitation thesis on the role of neural machine translation (NMT) as a tool of specialized translational action in the modern digitized and datafied specialized-translation process. The present text is the frame that encloses the cumulative contribution. Its first part builds a broader context for viewing neural machine translation as a tool of specialized translational action and situates the work within translation studies. In the latter part, the considerations developed in the first part and in the cumulative contribution, supplemented by some additional thoughts, converge in a factor model of the situated NMT-supported specialized translation process. This model considers the "fit of the tool, the workpiece and the craftsman" (Holz-Mänttäri 1984:136) within a given sociotechnical and socioeconomic environment.
... Big data: Big data is generated from a multitude of sources, including user-generated content, social media, internet clicks, mobile transactions, and even records of business dealings such as sales and purchase transactions. The term Big Data was first used by Roger Mougalas in 2005 (Diebold, 2012; Sangeetha et al., 2015). The term "big data" has no fixed definition; academics and researchers have suggested it be regarded as a "moving definition", whose meaning changes in step with the ever-changing nature of the phenomenon (Gupta et al., 2018; Sheng et al., 2017). ...
... As the digital data in IR 4.0 is highly complex, often unstructured, and massive, it is called big data, hence establishing a close relationship between IR 4.0 and big data. Diebold (2012) traces the origin of big data as a term in the works of John Mashey and others at Silicon Graphics. The most popular explanation of big data is the three Vs explanation, i.e. volume, velocity and variety, which describes its massive size (volume), rapid speed (velocity) and numerous types (variety). ...
Chapter
Purpose This chapter conceptualises a link between Industrial Revolution 4.0 (IR 4.0), big data, data science and sustainable tourism. Design/Methodology/Approach The author adopts a grounded theory and conceptual approach to endeavour in this exploratory research. Findings The outcome shows a significant rise of big data in the tourism sector under three major dimensions, i.e. business, governance and research. And, some exemplary evidence of institutions promoting the use of big data and data science for sustainable tourism has been discussed. Originality/Value The conceptualised interlinkage of concepts like IR 4.0, big data, data science and sustainable development provides a valuable knowledge resource to policy-makers, researchers, businesses and students.
... The concept of big data (BD), introduced in the 1990s [1], typically refers to a huge information silo consisting of a vast number of datasets distributed in horizontally networked databases. This concept enriches many sectors, including healthcare [2], banking [3], media and entertainment [4], education [5], and transportation [6]. ...
Conference Paper
Full-text available
This study deals with how to develop big data analytics for smart manufacturing. The analytics consists of five integrated systems: big data preparation system, big data exploration system, data visualization system, data analysis system, and knowledge extraction system. The functional requirements of the systems are elucidated. In addition, a JAVA-based system is developed to materialize the proposed analytics. Moreover, the analytics is applied to electrical discharge machining. It is found that the analytics can find out rules to optimize electrical discharge machining. Furthermore, the analytics can be used for other machining operations. Since it is free from big-data inequality, not only large enterprises but also small and medium enterprises can use it to benefit from big data.
Article
Full-text available
This paper presents a systematic approach to developing big data analytics for manufacturing process-relevant decision-making activities from the perspective of smart manufacturing. The proposed analytics consist of five integrated system components: (1) Data Preparation System, (2) Data Exploration System, (3) Data Visualization System, (4) Data Analysis System, and (5) Knowledge Extraction System. The functional requirements of the integrated system components are elucidated. In addition, JAVA™- and spreadsheet-based systems are developed to realize the proposed system components. Finally, the efficacy of the analytics is demonstrated using a case study where the goal is to determine the optimal material removal conditions of a dry Electrical Discharge Machining operation. The analytics identified the variables (among voltage, current, pulse-off time, gas pressure, and rotational speed) that effectively maximize the material removal rate. It also identified the variables that do not contribute to the optimization process. The analytics also quantified the underlying uncertainty. In summary, the proposed approach results in transparent, big-data-inequality-free, and less resource-dependent data analytics, which is desirable for small and medium enterprises—the actual sites where machining is carried out.
... However, the data volume was too large to be handled by traditional database systems. The term "big data" first emerged in the 1990s to describe such data (Diebold, 2012). Big data includes structured data and unstructured data. ...
Article
Full-text available
As the data-analyst job market grows, many colleges and universities have started offering a data analytics curriculum. However, there are potential gaps between the skills business organizations expect of data analysts and the skills universities and colleges teach their students. This study collected 2500+ data-analyst job ads posted on LinkedIn and analyzed them using distribution analysis, cross-tabulation analysis, and cluster analysis. Among many findings, this study identified the five most essential nontechnical skills and the five most essential areas of technical skills. In addition, of the 90+ computer programs business organizations expect data analysts to use, this study identified SQL, Microsoft Excel, Tableau, Python, and Microsoft Power BI as the five most essential for potential data analysts to master.
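The distribution and cross-tabulation steps described in the abstract above can be sketched with standard-library counters; the skill names and the four toy "ads" below are hypothetical stand-ins for the 2,500+ LinkedIn postings the study actually collected:

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus: each ad reduced to its set of requested skills.
ads = [
    {"SQL", "Excel", "Tableau"},
    {"SQL", "Python", "Power BI"},
    {"SQL", "Excel", "Python"},
    {"Tableau", "Power BI"},
]

# Distribution analysis: how often each skill is requested across ads.
freq = Counter(skill for ad in ads for skill in ad)

# Cross-tabulation: how often two skills are requested together in one ad.
pairs = Counter(pair for ad in ads for pair in combinations(sorted(ad), 2))

print(freq.most_common(1))      # SQL is the most requested skill in this sample
print(pairs[("Excel", "SQL")])  # Excel and SQL co-occur in two of the toy ads
```

A real replication would first extract skill mentions from free-text ads (the hard part); the counting itself scales unchanged.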
... Originally, the term Big Data dates to the mid-1990s and first appears in a set of industrial works produced by John Mashey as chief scientist of Silicon Graphics. Initially, he and his team used the term to refer to the management and analysis of massive data sets (Diebold, 2012), evidencing a clear problem of information volume (large quantities of data generated, for example, by mobile phones, weather stations, credit cards, drones or photographs). In academia, the first discussions of and references to the concept appeared a few years later, specifically in the early research of Sholom Weiss and Nitin Indurkhya (1998) in computer science and of Peter Christoffersen and Francis Diebold (2000) in statistics and econometrics. ...
Article
Full-text available
The widely cited article 'The coming crisis of empirical sociology' anticipated an imminent problem facing empirical sociology: its inability to incorporate into its research practice the 'transactional' data characteristic of cognitive capitalism. The progressive extension of technological intermediation into processes of social interaction creates a significant barrier, technoscientific in character but above all economic and political, that relegates sociology (and its methods) to the rearguard of social and market research. In this article we undertake a kind of archaeology of the construction of the concept of 'Big Data', of the main methodological alternatives that have been proposed in response, and of the new perspectives opening up as information becomes concentrated in an ever smaller number of companies.
... Advanced AI and ML algorithms can be integrated into the data analysis process and make use of these extensive data to analyze, predict, and notify farmers of abnormal occurrences, identifying patterns and suggesting solutions to pressing problems in modern animal farming and driving strategies to improve the sector's profitability [8,11]. The definition of big data is somewhat elusive; it is often described in terms of three "Vs": volume, velocity, and variety [12]. ...
Article
Full-text available
Simple Summary In future decades, the demand for poultry meat and eggs is predicted to considerably increase in pace with human population growth. Although this expansion clearly represents a remarkable opportunity for the sector, it conceals a multitude of challenges related to pollution and land erosion, competition for limited resources between animal and human nutrition, animal welfare concerns, limitations on the use of growth promoters and antimicrobial agents, and increasing risks of animal infectious diseases and zoonoses. The increase in poultry production must be achieved mainly through optimization and increased efficiency. The increasing ability to generate large amounts of data (“big data”)—coupled with the availability of tools and computational power to store, share, integrate, and analyze data with automatic and flexible algorithms—offers an unprecedented opportunity to develop tools to maximize farm profitability, reduce socio-environmental impacts, and increase animal and human health and welfare. The present work reviews the application of sensor technologies, specifically, the principles and benefits of advanced statistical techniques and their use in developing effective and reliable classification and prediction models to benefit the farming system. Finally, recent progress in pathogen genome sequencing and analysis is discussed, highlighting practical applications in epidemiological tracking and control strategies. Abstract In future decades, the demand for poultry meat and eggs is predicted to considerably increase in pace with human population growth. Although this expansion clearly represents a remarkable opportunity for the sector, it conceals a multitude of challenges. 
Pollution and land erosion, competition for limited resources between animal and human nutrition, animal welfare concerns, limitations on the use of growth promoters and antimicrobial agents, and increasing risks and effects of animal infectious diseases and zoonoses are several topics that have received attention from authorities and the public. The increase in poultry production must be achieved mainly through optimization and increased efficiency. The increasing ability to generate large amounts of data (“big data”) is pervasive in both modern society and the farming industry. Information accessibility—coupled with the availability of tools and computational power to store, share, integrate, and analyze data with automatic and flexible algorithms—offers an unprecedented opportunity to develop tools to maximize farm profitability, reduce socio-environmental impacts, and increase animal and human health and welfare. A detailed description of all topics and applications of big data analysis in poultry farming would be infeasible. Therefore, the present work briefly reviews the application of sensor technologies, such as optical, acoustic, and wearable sensors, as well as infrared thermal imaging and optical flow, to poultry farming. The principles and benefits of advanced statistical techniques, such as machine learning and deep learning, and their use in developing effective and reliable classification and prediction models to benefit the farming system, are also discussed. Finally, recent progress in pathogen genome sequencing and analysis is discussed, highlighting practical applications in epidemiological tracking, and reconstruction of microorganisms’ population dynamics, evolution, and spread. The benefits of the objective evaluation of the effectiveness of applied control strategies are also considered. 
Although human-artificial intelligence collaborations in the livestock sector can be frightening because they require farmers and employees in the sector to adapt to new roles, challenges, and competencies—and because several unknowns, limitations, and open-ended questions are inevitable—their overall benefits appear to be far greater than their drawbacks. As more farms and companies connect to technology, artificial intelligence (AI) and sensing technologies will begin to play a greater role in identifying patterns and solutions to pressing problems in modern animal farming, thus providing remarkable production-based and commercial advantages. Moreover, the combination of diverse sources and types of data will also become fundamental for the development of predictive models able to anticipate, rather than merely detect, disease occurrence. The increasing availability of sensors, infrastructures, and tools for big data collection, storage, sharing, and analysis—together with the use of open standards and integration with pathogen molecular epidemiology—have the potential to address the major challenge of producing higher-quality, more healthful food on a larger scale in a more sustainable manner, thereby protecting ecosystems, preserving natural resources, and improving animal and human welfare and health.
... machine learning (ML) have effectively increased the power, availability, growth, and impact of AI (OECD, 2019). Although sociologist Charles Tilly (1984) first mentioned big data, Diebold (2012) argued that an unpublished 2001 research note by Douglas Laney (2001) at Gartner significantly enriched the concept. Laney originated the three Vs of the big data framework to describe big data in an up-to-date way (Egan and Haynes, 2019). ...
... But in view of the amount of data and the clear implications of human-wildlife close interactions for general health, conservation, and lay people's perceptions (independent of the context where it happens), we believe our taxonomic approach is justified and our findings extremely valuable for further discussions on the subject. Furthermore, we used web-scraping software and crowdsourcing tools ("Big Data" survey, Diebold, 2012; Gandomi & Haider, 2015), given that these are efficient methodologies to collect data on regional threats to biodiversity (Carneiro & Mylonakis, 2009; Di Minin et al., 2015; Polgreen et al., 2008). These results are highly relevant as they cover today's generation of influencers (Mccallum & Bury, 2013; Pearson et al., 2016; Wankel, 2009). ...
Article
Full-text available
There remains a debate as to whether the display of wild animals in popular media, such as the Internet, contributes toward or erodes conservation behavior. A good model to assess these impacts are capuchin monkeys (genera Cebus and Sapajus), given that they have historically been traded as pets internationally and are among Hollywood’s most famous primate actors. We used crowdsourcing tools to survey social media posts (YouTube and Instagram) and news/reports (on Google) to investigate how these primates are currently portrayed on the Internet. We found 1138 capuchin-related videos on YouTube, and the ones with more than 1 million views mainly (71%) portrayed these animals as pets. Searches on Instagram identified that #capuchinmonkey had 39,000 more posts than #Cebus or #Sapajus, of which the top results (those that generated the most engagement) were posts of individuals in anthropogenic environments and/or close to humans. Our Google search identified an exponential growth of news related to the legal and illegal pet trade of capuchin monkeys since 2017, which could be related to the increase in the reach and engagement of social media posts featuring these primates as pets. Poor scientific knowledge or interest, along with engagement with exotic pet trade content among Internet users, may lead to negative consequences for species conservation. Given the threats facing both capuchin monkeys and other animals, including increasing habitat fragmentation and loss, it is essential to establish clear policies surrounding wildlife content management on social media.
... The data analytical techniques include text analytics, audio analytics, video analytics, social analytics, and predictive analytics. Diebold [7] examined the origins of Big Data, the concept, the phenomenon and the discipline. He stated that Big Data was present at Silicon Graphics (SGI) as early as the mid-1990s, when John Mashey, the retired chief scientist of SGI, produced an SGI presentation entitled "Big Data and the New Wave of InfraStress" in 1998. ...
Article
Full-text available
For modern industry, supply chain optimization is becoming very important; to stay ahead of the competition, companies must be able to optimize their supply chains. This paper examines the applications of Big Data in supply chain management, along with the opportunities, benefits, challenges, and future trends. Big Data plays an important role in various areas of supply chain management, such as procurement planning, logistics, inventory, innovation and product design, operations efficiency and maintenance, product and market strategy development, and network design. The increasing amount of data shared across supply chains in the manufacturing and service sectors justifies the use of Big Data in supply chain management. Benefits that may result from using Big Data in supply chain management include better decision-making, improved efficiency, cost reduction, better risk management, and improved visibility and competitiveness. Challenges facing the introduction and adoption of Big Data in supply chain management include IT capabilities and infrastructure, talent management and HR, information and cyber security, integration and collaboration, information management, governance and compliance, financial implications, and ethical and managerial implications. The study identified areas for future Big Data research in supply chain management, including Big Data models in supply chain management, the application of Big Data to closed-loop supply chain management, the application of Big Data at the supply chain function level, and new Big Data techniques.
... But in view of the amount of data and the clear implications of human-wildlife close interactions for general health, conservation, and lay people's perceptions (independent of the context where it happens), we believe our taxonomic approach is justified and our findings extremely valuable for further discussions on the subject. Furthermore, we used web-scraping software and crowdsourcing tools ("Big Data" survey, Diebold, 2012; Gandomi & Haider, 2015), given that these are efficient methodologies to collect data on regional threats to biodiversity (Carneiro & Mylonakis, 2009; Di Minin et al., 2015; Polgreen et al., 2008). These results are highly relevant as they cover today's generation of influencers (Mccallum & Bury, 2013; Pearson et al., 2016; Wankel, 2009). ...
Article
Full-text available
In the Anthropocene, approximately 70% of all terrestrial ecosystems are highly modified by human activities, and more than half of all primate species in the world are endangered. Here we present the results of a systematic review of published articles with an ethnoprimatology approach, aiming to assess the nationwide pattern and quality of proximity/interaction between human and nonhuman primates in Brazil, a country vulnerable to high deforestation rates while having the highest primate biodiversity in the world. The first article was published 29 years ago, and only 36 articles have been published to date. Most studies were conducted in the Atlantic forest, but a higher number and diversity of interactions was described for the Amazon. Sapajus, being a generalist and semi-terrestrial primate, was the most cited genus and had the greatest diversity of interactions, including garbage foraging and crop-raiding. Alouatta, the second most cited, had more symbolic/mystic relationships. Some specialized or forest-specific primates are scarcely mentioned. Studies carried out in rural and urban environments are almost equal in number but differ in the types of interactions they describe: garbage foraging, crop-raiding by primates, and food offering by humans occur in more urbanized areas, while symbolic/mystic relationships and beliefs around nonhuman primates are described in rural/indigenous settlements. We urge future studies to describe interactions and proximity carefully, specifying the context where they occur. It is important to maintain the growing curve of ethnoprimatological studies in Brazil as a way to aggregate information about different populations of species and help ground conservation strategies of co-existence.
... Big data emerged as a category in the mid-1990s in a presentation by Mashey (1998) of Silicon Graphics Inc (Diebold, 2012); the concept has evolved since then. Laney (2001) mentioned the famous 3 Vs that identified big data: Volume, Value, and Velocity; later, Gartner (2021) incorporated two more, Variability and Value, such that big data came to be summarized in the following definition: high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for improved insight and decision-making (Gartner, 2021). ...
Article
The objective of this research was to explore and characterize the main big data repositories in the social sciences available in 2021. The research design was non-experimental, exploratory, and descriptive. The population consisted of 110 big data repositories located through Google's dataset search engine. The sample comprised the 10 main big data repositories. The results indicated that the most important big data repositories and platforms are centralized in the private sector, primarily in companies based in the United States.
... Autonomous vehicles collect and process huge amounts of data, some of which is personal. This can be defined as big data (Diebold, 2012). ...
Article
Full-text available
At the time of writing, there are already some implementations of autonomous vehicles. This paper approaches the subject from a different perspective by defining what automation means in other areas of transportation, as well as the ethical problems regarding the operation of this kind of vehicle. The later part, of equal importance, discusses the operation of autonomous vehicles from a security and data privacy point of view. The paper tries to define autonomous vehicles by comparison with other modes of transportation that have a higher degree of automation and considers how autonomous vehicles might impact our lives and our privacy.
... Another fundamental work, albeit not of an academic nature, which is recognized in the literature for its important contribution on the subject (Diebold, 2012;Mayer-Schönberger and Cukier, 2013) is the report produced by Doug Laney (2001) of the META Group. This work, which does not explicitly use the expression "big data," presents the growing opportunities in terms of data management according to the three-dimensional profile linked to the so-called "3Vs": volume, velocity, and variety. ...
Article
The growing importance of big data in the current business context is a recognized phenomenon in managerial studies. Many such studies have focused on the changes that the use of big data analytics may bring to business, particularly with reference to management control systems. However, the number and extent of studies attempting to analyze, from an empirical perspective, the opportunities and risks of using big data analytics in control systems appear rather limited. This work presents case studies of three companies that have used big data in their decision-making processes within management control systems. The empirical analysis shows how proper management of big data can represent a fundamental opportunity for the development of managerial control systems, with some possibilities not yet fully explored even by those who have already introduced big data analytics in these systems. Big data quality and privacy protection appear to be the profiles offering the greatest opportunities for future study. Furthermore, new challenges seem to emerge for accountants and controllers, who are now called to rethink how they interpret their professional roles.
... The first significant academic references to Big Data were Weiss and Indurkhya (1998) in computer science and Diebold (2000) in statistics and econometrics (Diebold 2012). The reference to Big Data in those first citations pertained to data sets bigger than normal, but it has since evolved to include a range of characteristics (Leary 2013). ...
Article
Purpose: The explosion of modern information has significantly propelled global medicine into a new era of big data healthcare. Ophthalmology is one of the most prominent medical specialties driven by big data analytics. This study aimed to describe the development status and research hotspots of big data in ophthalmology. Methods: English articles and reviews related to big data in ophthalmology published from January 1, 1999, to April 30, 2024, were retrieved from the Web of Science Core Collection. The relevant information was analyzed and visualized using VOSviewer and CiteSpace software. Results: A total of 406 qualified documents were included in the analysis. The annual number of publications on big data in ophthalmology entered a stage of rapid growth in 2019. The United States (n = 147) led in the number of publications, followed by India (n = 77) and China (n = 69). The L.V. Prasad Eye Institute in India was the most productive institution (n = 50), and Anthony Vipin Das was the most influential author with the most relevant literature (n = 45). Electronic medical records were the primary source of ophthalmic big data, and artificial intelligence served as the principal analytics tool. Diabetic retinopathy, glaucoma, and myopia are currently the main topics of interest in this field. Conclusions: The application of big data in ophthalmology has grown rapidly in recent years, and big data is expected to play an increasingly significant role in shaping the future of research and clinical practice in ophthalmology.
Article
Full-text available
Abstract: In this dossier, we bring together studies that describe and analyze relations constituted under the effect of big data. Drawing on quite distinct spaces (the legal, the bureaucratic, the agricultural, the biotechnological, the economic, the pornographic), we turn to the conflictive and creative agencies implicated in what data make visible and in what they are capable of creating through their potential for hyper-relationality. We then discuss the relation of these voluminous databases to diverse apparatuses, such as applications, satellites, microscopes, documents, and algorithms, capable of arranging and distributing informational elements through comparative practices. We examine procedures for collecting and processing big data in different sectors and institutions; the spread of algorithmic and statistical languages; the growing demands for transparency, precision, and speed; and the link between data processing and the possibilities of democratic and sustainable development. Special attention is given to the ethical and epistemological transformations surrounding big data, as well as to the effects of the mobilization of computational knowledge by specialists, including anthropologists themselves, in different environments of knowledge production.
Chapter
The intersection of adventure travel and big data heralds a new era in understanding, managing, and enhancing nature-based experiences. This chapter explores the multifaceted applications of big data within the realm of adventure tourism, by exploring its role in managing national parks; developing mobile apps; measuring health and fitness; aiding in search, rescue, and disaster management; and assessing and mitigating environmental impact. From leveraging user-generated content for sentiment analysis to employing GPS-enabled technologies for trail mapping and navigation, big data stands as an invaluable resource for enhancing safety, planning, and storytelling during adventurous endeavours. However, amidst big data and technology’s myriad benefits in adventure travel, ethical considerations and potential disruptions to the authentic adventure experience may arise. There are ethical implications associated with data collection and with the dichotomy between harnessing technology for enhancing experiences and the desire to disconnect from digital interfaces in pursuit of unadulterated escapades. The discussion highlights the need for adventure technology, in synergy with big data, to optimize experiences while treading lightly in delicate ecosystems. This synthesis of adventure travel and big data not only presents opportunities for deeper insights and efficient management but also underscores the necessity of ethical contemplation and thoughtful integration to harmonize technology with nature-centric pursuits.
Article
Improvements in the number and resolution of Earth- and satellite-based sensors coupled with finer-resolution models have resulted in an explosion in the volume of Earth science data. This data-rich environment is changing the practice of Earth science, extending it beyond discovery and applied science to new realms. This Review highlights recent big data applications in three subdisciplines—hydrology, oceanography, and atmospheric science. We illustrate how big data relate to contemporary challenges in science: replicability and reproducibility and the transition from raw data to information products. Big data provide unprecedented opportunities to enhance our understanding of Earth’s complex patterns and interactions. The emergence of digital twins enables us to learn from the past, understand the current state, and improve the accuracy of future predictions.
Chapter
Although “smart tourism” has become a catchphrase, multiple interpretations and understandings exist, making it more of a nebulous term than a field with firm foundations. Aiming to contribute to the smart tourism literature, we have located, selected, categorised, and evaluated recent articles in computer science journals, proceedings, and conference papers authored by important academics in the field after years of intensive research. In addition to academic research articles, “grey literature” was unearthed, analysed, and ultimately contributed to the generation of fundamental scientific knowledge. Our contribution in this chapter is a complete and formal conceptualization of the so-called smart tourism sector, which could pave the way for its development and the eventual emergence of a Smart Tourism Culture. In addition, we discuss and analyse the identified concepts/approaches to expand comprehension and knowledge of them and, more importantly, to determine their applicability and role in smart tourism. Thus, the groundwork for future research in each approach is set, impediments and areas requiring greater study are recognised, and places where no further advance can be achieved, are exposed. In particular, the following approaches and concepts recognised in the smart tourism realm are discussed: Theoretical approaches, Privacy and Data Protection, Context Awareness, Cultural Heritage, Recommender Systems, Social Media, Internet of Things, User Experience, Real-Time, User Modelling, Augmented Reality, Artificial Intelligence, Big Data, and Cyber Tourism. After identifying and assessing the aforementioned key topics, we discovered gaps in current knowledge/systems and which were the inspiration to work toward building a series of frameworks that lead to a smart tourist experience. 
By locating habits, tools and cutting-edge technologies, we were able to formulate our research topics and contribute to the field with the frameworks and applications demonstrated in the subsequent chapters.
Article
Full-text available
Micro, Small and Medium-sized Enterprises (MSMEs) are key agents of a nation’s economic growth and a cornerstone of economic sustainability. In Nigeria, MSMEs contribute almost 50% of total GDP and employ over 90 million people. Quite often in Nigeria, MSMEs are established essentially to create employment and fight hunger and poverty. The irony is that the majority of these enterprises do not survive beyond one year after their establishment, no matter how good the quality of their products and/or services may be. Research reveals that one of the major stumbling blocks to business sustainability in Nigeria in the present era of the Internet of Things is the poor visibility of MSMEs to local and global markets. While the Nigerian internet population continues to grow, the online presence of Nigerian MSMEs remains poor. This paper examines the major factors affecting internet presence leverage among MSMEs in Abuja and Lagos, Nigeria, as well as the impact of online presence on MSME profitability and growth. A quantitative approach was adopted, using questionnaires to gauge responses from MSMEs in the two states. The responses were analyzed using the statistical software package SPSS; the results show that lack of technical skills, high internet access fees, poor power supply, and poor internet infrastructure are major barriers to the online presence of MSMEs in Abuja and Lagos. The paper recommends strategies for effective online visibility for Nigerian MSMEs. Keywords: Micro, Small and Medium-sized Enterprises (MSMEs), Leverage, Online Presence, Internet Presence, E-business, Growth and Profitability
Preprint
Full-text available
This paper presents a systematic approach to developing big data analytics for manufacturing process-relevant decision-making activities from the perspective of smart manufacturing. The proposed analytics consists of five integrated system components: 1) data preparation system, 2) data exploration system, 3) data visualization system, 4) data analysis system, and 5) knowledge extraction system. The functional requirements of the integrated systems are elucidated. In addition, JAVA- and spreadsheet-based systems are developed to realize the proposed integrated system components. Finally, the efficacy of the analytics is demonstrated using a case study where the goal is to determine the optimal material removal conditions of a dry electrical discharge machining operation. The analytics identified the variables (among voltage, current, pulse-off time, gas pressure, and rotational speed) that effectively maximize the material removal rate. It also identified the variables that do not contribute to the optimization process. The analytics also quantified the underlying uncertainty. In synopsis, the proposed approach results in transparent, big-data-inequality-free, and less resource-dependent data analytics, which is desirable for small and medium enterprises—the actual sites where machining is carried out.
Conference Paper
Full-text available
In order to contribute to the Sustainable Development Goals (SDGs), this research paper evaluates how big data is used in libraries. It begins with a description of big data's significance and promise before concentrating on how new developments in the field are affecting libraries. It also investigates how far the SDGs are taken into consideration in strategic planning for libraries and how library staff members feel their efforts are influencing the SDGs. The key questions are whether library personnel are aware of the SDGs, whether they are involved in long-term planning, which of the five specifically listed SDGs they believe libraries have influenced the most, how frequently the libraries have been involved in initiatives to encourage those five SDGs, and what initiatives library employees are pursuing to advance them. Problems associated with this data flood include inadequate technical infrastructure, a lack of new task-specific skills, problems with open data legislation, insufficient collaboration between government and libraries, and a lack of enthusiasm and ambition among LIS workers. To deal with these new roles and expectations, libraries must form partnerships with governments, develop access- and privacy-related taxonomies, increase information literacy, and intensify their advocacy for information freedom while balancing it with the protection of individual privacy. They must also advocate for policy reform. Libraries hold great potential for advancing these sustainable development goals.
Chapter
With the use of cutting-edge technology like artificial intelligence, the Internet of Things, and blockchain, the insurance industry is about to enter a new era. Financial innovations will have a wide range of effects on economic activity. The future of the AI- and blockchain-based insurance industry appears bright, and these technologies will considerably aid the expansion of the sector. The COVID-19 pandemic has recently brought to light several inefficiencies of the conventional system, including limited client-insurer interaction capabilities, challenges in targeting consumers, difficulties working from home, a lack of IT support, a lack of transparency, and sparse management. This article examines the impact of cutting-edge financial technology, including AI and blockchain, on the insurance industry, and describes the phenomenon from several angles. ICT-based insurance makes considerable use of electronic and IT equipment for technology resolution without the consumer providing any natural resources to the insurance company. Digital insurance seems to be growing more popular at the moment, but it is essential to understand the problems and challenges it faces, which result in poor penetration. Although there are many advantages for the insured and the insurer, internet insurance providers remain subject to operational, regulatory, and reputational risks, and their customers worry about the safety of their transactions and identity theft. This article aims to illustrate the fundamentals and significance of AI and blockchain technology in the insurance business, looking at the benefits that accrue to both the provider and the customer. Although many studies have already been published, the focus of this research is the insurance industry.
By fusing cutting-edge technical demands with established company competencies, this paper helps management expand their understanding of competing for advantage. Keywords: Artificial intelligence, Blockchain, Technology, InsurTech, Covid-19, Insurance business
Article
The article studies existing views on the economic content of big data. From these views, within which the authors define big data, three approaches are formulated: descriptive-model, utility-digital, and complex-technological. Against the background of the large-scale spread of digital technologies (machine learning, cloud computing, artificial intelligence, augmented and virtual reality, etc.), all of which function thanks to big data, the study of its economic essence is becoming especially relevant. The study finds that big data forms the basis of economic activity in the digital economy, and a definition of big data as a resource of the digital economy is proposed.
Article
All corporate functions at the heart of information exchange are being shaken up by digital innovations that are transforming the ways we work together. Starting from a case of poor social performance, this article examines the influence of HR (Human Resources) big data tools on the steering of social management control (Contrôle de Gestion Sociale, CGS). To measure this influence, the analytical framework crosses the informational control characteristics of HR big data tools with the two levels of social steering: operational and strategic. The results of this exploratory research, conducted in the CGS department of a company referred to as DIODA, show that HR big data tools change the scale of analysis of social steering but remain under the influence of how HR information is organized.