Article

On the Origin(s) and Development of the Term 'Big Data'

Authors:
Francis X. Diebold

Abstract

I investigate the origins of the now-ubiquitous term "Big Data," in industry and academia, in computer science and statistics/econometrics. Credit for coining the term must be shared. In particular, John Mashey and others at Silicon Graphics produced highly relevant (unpublished, non-academic) work in the mid-1990s. The first significant academic references (independent of each other and of Silicon Graphics) appear to be Weiss and Indurkhya (1998) in computer science and Diebold (2000) in statistics/econometrics. Douglas Laney of Gartner also produced insightful work (again unpublished and non-academic) slightly later. Big Data the term is now firmly entrenched, Big Data the phenomenon continues unabated, and Big Data the discipline is emerging.


... Big data analysis is essential for analysts, researchers and business people to make better decisions that were previously unattainable. Figure 1 explains the structure of big data, which contains five dimensions, namely volume, velocity, variety, value and veracity [2] [3]. Volume refers to the size of the data, which mainly concerns how to handle large-scale and high-dimensional databases and their processing needs. ...
... Big data analytics refers to the process of collecting, organizing and analyzing large sets of data to discover patterns and other useful information. Table 1 shows a comparative study of different types of data based on their size, characteristics, tools and methods [1] [3]. ...
... In this digital era, the development of mobile and wireless technologies has provided a new platform in which people may share their information through social media sites, e.g., Facebook, Twitter and Google+ [3]. In these places, data may arrive continuously and cannot be stored in computer memory because its size is huge, and it is therefore considered "Big Data". ...
... According to Diebold (2012), the term "big data" began to be used by scholars and IT professionals, such as himself, John Mashey, Sholom M. Weiss, Nitin Indurkhya, and Douglas Laney, between 1998 and 2001. The most referenced definition of big data is Gartner's, which is based on Laney's (2001) notion. ...
... Indeed, in the IT field, the terms "volume" and "speed" become relative. Diebold (2012) states that, in the field of econometrics, any more than 200 gigabytes (GB) of data is considered to be a large data set. However, in physics, experimental data sets usually contain much larger amounts of data. ...
Article
Full-text available
This research uses an interpretive case study strategy to investigate how big data affects tax audits in Indonesia, both with regard to tax audit management and policy, and to tax auditors’ individual audit assignments. The study reveals that the impact of big data on tax audit exists in two aspects. First, at audit policy level, big data is used as part of risk analysis in order to determine which taxpayers should be audited. Second, at the individual tax audit assignment level, tax auditors must utilise big data in order to acquire and analyse data from taxpayers and other related parties. Big data has the following characteristics: it involves huge volumes of information, it is generated at a high velocity, it includes a wide array of data types, and it contains high uncertainty. Big data can be analysed in order to reinforce the results gained from risk engines as a part of a compliance risk management system at the audit policy level. Meanwhile, at the individual tax audit assignment level, empirical evidence shows that tax auditors may deal with: (1) large volumes of data (hundreds of millions of records) that originated from previous fiscal years (historical records); (2) variations in the format and sources of data acquired from taxpayers which, to some extent, may be giving an auditor the authority to request data in a format that suits their analytical tools—with an inherent risk that the data can only be acquired in its native format; (3) data veracity that requires the tax auditors to review data sources because the adopted data analysis techniques are determined by the validity of data under audit. The main benefit expected to be gained from the implementation of big data analytics in respect of tax audits is the provision of valid and reliable information that evidences that taxpayers are compliant with tax laws.
... So, big data is a large complex data infrastructure that challenges traditional data storing and processing techniques. It is often defined in terms of three properties, which are the 3Vs (volume, velocity, and variety) (Diebold, 2012). Today's digital world contains a massive amount of data. ...
... The credit for coining the now-ubiquitous term "big data" goes to John R. Mashey (Diebold, 2012). It refers to a field that systematically collects, analyses, and extracts information from data sets that are too broad and complicated to manage with conventional application tools for data processing. ...
Chapter
In this chapter, we have discussed a relatively advanced and successful hybrid machine learning workflow that may help to unravel causative agents of disease from high throughput RNA-Seq datasets. The method is then applied to a breast cancer dataset taken from the Gene Expression Omnibus repository, and disease genes associated with breast cancer are identified. Finally, using the PPI network analysis approach, we observed the significance that the detected disease genes possess a role in the causal mechanism of disease. This method discussed here is universal and can be applied to any RNA-Seq data independent of disease.
... On the other hand, in the academic and published literature, the first book in which the term appeared was a 1998 data mining book [96]. All of this is according to the history of the term investigated by Francis Diebold in [97]. Sometimes Big Data is also called Data-Intensive Computing [14,98], or Big Data is referred to as the fourth paradigm [98,99]. ...
... Hadron Collider at the discovery of the Higgs Boson [97]: 1 petabyte (10^15 bytes) per second in 2012 ...
Article
Full-text available
Big Data has changed how enterprises and people manage knowledge and make decisions. However, when talking about Big Data, there are often different definitions of what it is and what it is used for, as there are many interpretations and disagreements. For these reasons, we have reviewed the literature to compile and provide a possible solution to the existing discrepancies between the terms Data Analysis, Data Mining, Knowledge Discovery in Databases, and Big Data. In addition, we have gathered the patterns used in Data Mining, the different phases of Knowledge Discovery in Databases, and some definitions of Big Data according to some important companies and organisations. Moreover, Big Data has challenges that are sometimes the same as its own characteristics. These characteristics are known as the Vs. Nonetheless, depending on the author, the number of Vs ranges from 3 to 5, or even 7. Furthermore, the 4Vs or 5Vs are not the same every time. Therefore, in this survey, we reviewed the literature to explain how many Vs have been detected and explained according to different existing problems. In addition, we detected 7Vs, three of which had subtypes.
... For example, census data (i.e., national survey) is considered small data despite its large volume, as it is conducted only once a decade (Kitchin, 2014a). Furthermore, some argue that since the emergence of the term big data in the industry in the mid-1990s and in academia in the early 2000s (Diebold, 2012), technology advancements in storage abilities have rendered the 3Vs outdated (Batty, 2015; Kitchin & McArdle, 2016), indicating that the most significant factors defining big data are velocity, exhaustivity, and the analysis methods implemented. This paper, therefore, distinguishes between three data types based on the data source (Kitchin, 2014b). ...
... Originating from computer science, econometrics, and statistics (Diebold, 2012), big data research resonated in urbanism studies, where data-intensive methodologies are applied to understand how urban environments materialise and perform (Bibri, 2018). In computational social science, big data technologies utilise large datasets and computational methods to evaluate environmental perception, sentiment, or social connection. ...
Article
Sustainability Rating Systems are standard methods for achieving sustainable development of buildings and urban landscapes. However, they suffer from low adoption and implementation rates, mainly due to labour-intensive evaluation processes. This study explores how recent advancements in big data, combined with the availability of new urban environment datasets, could advance sustainability rating systems in landscape development. We compared existing computational technology (supply) with industry performance evaluation needs (demand) using a systematic review and a survey of Israel's professional communities as a case study. Of the existing indicators, Israeli professionals prioritised measuring socio-ecological indicators of landscapes in development projects, mainly at the urban level. Our review revealed that this level also holds available big data sustainability evaluation methods and technologies: specifically, directed data for measuring ecological indicators, and volunteered and automated data for measuring social indicators. Such supply-demand links could significantly advance evaluation methods towards achieving a broader application of sustainable urban development.
... Furthermore, the demand for the corresponding processing speed also increases [4]. As a consequence of those developments, the conventional technologies and methods of data handling are more and more often no longer sufficient, resulting in the necessity of developing new techniques and establishing modern data analysis paradigms, as constituted by the terms big data and big data analytics (BDA) [5], [6]. Organizations that build the respective capabilities to utilize this new source of insights can thereby enhance their performance [7] through, inter alia, more accurate or faster decision making, cost reductions, an optimization of their offered portfolio of services, or an improvement of customer acquisition and retention [8]- [10]. ...
... However, when dropping the search term big data and just looking for the different variants of the term microservice, the earliest papers found were published in 2003 [246], [247]. The term big data can be traced back even further [5]. This shows that it took more than a decade until the combination of both domains was scientifically explored. ...
Article
Full-text available
Due to the ever increasing amount of data that is produced and captured in today’s world, the concept of big data has risen to prominence. However, implementing the respective applications is still a challenging task. This holds especially true, since a high degree of flexibility is desirable. One potential approach is the utilization of novel decentralized technologies, as in the case of microservices to construct such big data analytics solutions. To obtain an overview of the current situation regarding the corresponding research, using the scientific database Scopus and its provided tools for search and analytics, this bibliometric review provides an analysis of the literature and subsequently discusses avenues for future research.
... Autonomous vehicles collect and process huge amounts of data, some of which are personal. This can be defined as big data (Diebold, 2012). ...
Article
Full-text available
At the time of writing this paper, there are already some implementations of autonomous vehicles. This paper approaches the subject from a different perspective by defining what automation means in other areas of transportation, as well as the ethical problems regarding the operation of this kind of vehicle. The later part, although of equal importance, discusses the operation of autonomous vehicles from a security and data privacy point of view. This paper tries to define autonomous vehicles by comparison with other means of transportation that have a higher degree of automation, and considers how autonomous vehicles might impact our lives and our privacy.
... The first significant academic references to Big Data were by Weiss and Indurkhya (1998) in computer science and Diebold (2000) in statistics and econometrics (Diebold 2012). The reference to Big Data in those first citations pertained to data sets bigger than normal, but since then it has evolved to include a range of characteristics (Leary 2013). ...
... Researchers and practitioners have recognised several key features of big data. It is widely accepted, though, that (1) volume, (2) velocity, and (3) variety (also known as "3Vs") are three dominant characteristics of big data, first identified by Douglas Laney some twenty years ago (Diebold, 2012). Soon after, these became a part of Gartner's big data definition, which was recently slightly modified: "big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation" (Gartner, 2021). ...
Article
Full-text available
Innovative digital technologies and an ever-changing business environment have transformed, and will continue to transform, businesses and industries around the world. This transformation will be even more evident in view of forthcoming technological breakthroughs and advances in big data analytics, machine learning algorithms, cloud-computing solutions, artificial intelligence, the internet of things, and the like. As we live in a data-driven world, technologies are altering work and work-related activities, as well as everyday activities and interactions. This paper is focused on big data and big data analytics (BDA), which are viewed in the paper from an organisational perspective, as a means of improving firm performance and competitiveness. Based on a review of selected literature and research, the paper aims to explore the extent to which big data analytics is utilized in companies, and to highlight the valuable role big data analytics may play in achieving better business outcomes. Furthermore, the paper briefly presents the main challenges that accompany the adoption of big data analytics in companies.
... Therefore, new tools and techniques are needed to deal with the new requirements, and the term big data emerged to describe this phenomenon. Even though the origins of the term are not conclusively clarified (Diebold 2012) and there is also no unified definition for it (Al-Mekhlal and Khwaja 2019; Volk et al. 2020b), most of the relevant literature follows a similar understanding. The arguably most influential description (Chang and Grady 2019) is based on four characteristics, which are sometimes also termed the 4 Vs of big data. ...
Conference Paper
Big data has emerged to be one of the driving factors of today’s society. However, the quality assurance of the corresponding applications is still far from being mature. Therefore, further work in this field is needed. This includes the improvement of existing approaches and strategies as well as the exploration of new ones. One rather recent proposition was the application of test driven development to the implementation of big data systems. Since their quality is of critical importance to achieve good results and the application of test driven development has been found to increase the developed product’s quality, this suggestion appears promising. However, there is a need for a structured approach to outline how the corresponding endeavors should be realized. Therefore, the publication at hand applies the design science research methodology to bridge this gap by proposing a process model for test driven development in the big data domain.
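To make the test-first idea behind such a process model concrete, the sketch below shows a minimal red-green cycle for a small data aggregation step, written in Python with pytest. It is only an illustration of test driven development in a data context, not the process model proposed in the paper; the function name sessions_per_user and the record layout are hypothetical.

```python
# Minimal TDD-style sketch (hypothetical function and data layout):
# the tests are written first and the simplest passing implementation follows.
import pytest


def sessions_per_user(events):
    """Count events per user id from an iterable of (user_id, event) pairs."""
    counts = {}
    for user_id, _event in events:
        counts[user_id] = counts.get(user_id, 0) + 1
    return counts


def test_sessions_per_user_counts_each_user():
    # Written before the implementation; it fails until the code above exists.
    events = [("u1", "click"), ("u2", "click"), ("u1", "purchase")]
    assert sessions_per_user(events) == {"u1": 2, "u2": 1}


def test_sessions_per_user_handles_empty_input():
    assert sessions_per_user([]) == {}


if __name__ == "__main__":
    pytest.main([__file__, "-q"])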
... ML methods that are becoming standard across disciplines for addressing supervised learning problems include decision trees, gradient boosting models, random forests, support vector machines, and neural networks. Such methods are optimized to perform well in areas that have experienced explosive growth in large-scale data [66]. ...
Article
Full-text available
The systematic monitoring of private communications through the use of information technology pervades the digital age. One result of this is the potential availability of vast amounts of data tracking the characteristics of mobile network users. Such data is becoming increasingly accessible for commercial use, while the accessibility of such data raises questions about the degree to which personal information can be protected. Existing regulations may require the removal of personally-identifiable information (PII) from datasets before they can be processed, but research now suggests that powerful machine learning classification methods are capable of targeting individuals for personalized marketing purposes, even in the absence of PII. This study aims to demonstrate how machine learning methods can be deployed to extract demographic characteristics. Specifically, we investigate whether key demographics—gender and age—of mobile users can be accurately identified by third parties using deep learning techniques based solely on observations of the user’s interactions within the network. Using an anonymized dataset from a Latin American country, we show the relative ease with which PII in terms of the age and gender demographics can be inferred; specifically, our neural network model generates an estimate for gender with an accuracy rate of 67%, outperforming decision tree, random forest, and gradient boosting models by a significant margin. Neural networks achieve an even higher accuracy rate of 78% in predicting the subscriber age. These results suggest the need for a more robust regulatory framework governing the collection of personal data to safeguard users from predatory practices motivated by fraudulent intentions, prejudices, or consumer manipulation. We discuss in particular how advances in machine learning have chiseled away a number of General Data Protection Regulation (GDPR) articles designed to protect consumers from the imminent threat of privacy violations.
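As an illustration of the supervised learners named in the preceding snippet (decision trees, random forests, gradient boosting, neural networks), the sketch below trains each of them on a synthetic feature table and compares test-set accuracy with scikit-learn. It is not the paper's pipeline or data; the features standing in for "network interaction" measurements are hypothetical.

```python
# Sketch only: hypothetical synthetic data, not the study's dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 5))  # placeholder features, e.g. call counts, data volume
y = (X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n) > 0).astype(int)  # binary label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
models = {
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "neural network": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                                    random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: accuracy = {accuracy_score(y_te, model.predict(X_te)):.2f}")
```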
... Despite big data being one of today's big trends (Ghasemaghaei and Calic 2020; Volk et al. 2020b), and consequently also the subject of intense scientific discourse (Staegemann et al. 2019b), there is still no universally used definition for the term itself. In fact, not even the origins of the term are completely clear (Diebold 2012). ...
Conference Paper
Knowledge, information, and modern technologies have become some of the most influential drivers of today’s society, consequently leading to the high popularity of the concept of big data (BD). However, actually harnessing it is a demanding task that is accompanied by many barriers and challenges. To facilitate the realization of the corresponding projects, the (big) data science engineering process (BDSEP) has been devised to support researchers and practitioners in the planning and implementation of data-intensive projects by outlining the relevant steps. However, the BDSEP is only geared towards a test-last development approach. With recent works suggesting the application of test driven development (TDD) in the big data domain, it appears reasonable to also provide a corresponding TDD-focused equivalent to the BDSEP. Therefore, in the publication at hand, using the BDSEP as a foundation, the test driven big data science engineering process (TDBDSEP) is proposed, facilitating the application of TDD in the big data domain and further enriching the discourse on BD quality assurance.
... "Big data" originally referred to managing, handling, and analyzing very large datasets and has been used to refer to this ever since the mid-1990s. The term 'Big data' was coined in 1990 by John Mashey, (Diebold, 2012). In the age of the World Wide Web and Web 2.0 technologies, a constant amount of structured and unstructured data is generated from various sources, including email, social media platforms such as Facebook, WhatsApp, LinkedIn, blogs, online transactions, articles, and forums. ...
Article
Full-text available
This study examined the growth rate of Big Data research literature over the period 2001 to 2020. Data were extracted from the WoS and Scopus databases and merged using bibliometric tools in R. The collected data were further refined to remove duplicate records, and a total of 19,667 research papers were finally analyzed. This study aims to determine various scientometric indicators, including the year-wise distribution of records, annual growth rate, compound annual growth rate, and authorship pattern. The article shows an increase in publications from 0.005% to 21.37%, with an annual growth rate of 89.53% and a CAGR of 41.56%. Over the study period, the results reported here confirm that the relative growth rate decreased and the doubling time increased. Authorship modeling showed that 93.66% of articles were co-authored. As the results show, big data research is growing at an alarming rate.
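For readers unfamiliar with the scientometric indicators mentioned above, the following sketch computes compound annual growth rate, relative growth rate, and doubling time from start and end publication counts. The counts used here are hypothetical placeholders, not the figures reported in the cited study.

```python
# Standard scientometric growth measures; publication counts are hypothetical.
import math

pubs_start, pubs_end, years = 10, 4200, 19   # placeholder values

cagr = (pubs_end / pubs_start) ** (1 / years) - 1            # compound annual growth rate
rgr = (math.log(pubs_end) - math.log(pubs_start)) / years    # relative growth rate
doubling_time = math.log(2) / rgr                            # years needed to double output

print(f"CAGR = {cagr:.1%}, RGR = {rgr:.2f}/yr, doubling time = {doubling_time:.1f} yr")
```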
... What is big data? The term big data was first used at Silicon Graphics in the mid-1990s according to Diebold [20]; Cox et al [21] first used big data in publications on data-intensive computing in 1997. The term big data was defined in various ways in recent publications. ...
Article
Full-text available
Introduction Big data technologies have been talked up in the fields of science and medicine. The V-criteria (volume, variety, velocity and veracity, etc) for defining big data are well known and quoted in most research articles; however, big data research into public health is often misrepresented due to certain common misconceptions. Such misrepresentations and misconceptions could mislead study designs, research findings and healthcare decision-making. This study aims to identify the V-eligibility of big data studies and their technologies applied to environmental health and health services research that explicitly claim to be big data studies. Methods and analysis Our protocol follows the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P). A scoping review and/or systematic review will be conducted. The results will be reported using PRISMA for Scoping Reviews (PRISMA-ScR), or PRISMA 2020 and the Synthesis Without Meta-analysis guideline. Web of Science, PubMed, Medline and ProQuest Central will be searched for articles from database inception to 2021. Two reviewers will independently select eligible studies and extract specified data. The numeric data will be analysed with R statistical software. The text data will be analysed with NVivo wherever applicable. Ethics and dissemination This study will review the literature of big data research related to both environmental health and health services. Ethics approval is not required as all data are publicly available and no confidential personal data are involved. We will disseminate our findings in a peer-reviewed journal. PROSPERO registration number CRD42021202306.
... The continuing adoption of technology (i.e., computers, cell phones, and information systems) and the associated large-scale growth of information have led to the "big data" movement (Diebold, 2012;Mayer-Schönberger & Cukier, 2013), where "big data" refers to the large volume of information that no longer fits in the memory that modern computers use for processing (Mayer-Schönberger & Cukier, 2013). According to Boyd and Crawford (2012), the definition of "big data" is composed of technology that includes maximum computation power and accurate algorithms. ...
... Over time, the term 'Big Data' has been used for this. Although it is not exactly known who first used the term, most people credit John R. Mashey (who at the time worked at Silicon Graphics) for making the term popular [10]. ...
... The Turkish equivalent of the term big data is 'büyük veri'. The first uses of the term in a sense close to its present meaning, to express the growth of data, appeared in academic work by Weiss and Indurkhya (1998) in computer science and by Diebold (2000) in statistics (Diebold, 2012). The first work concerning the widespread use of the term was by Roger Magoulas in 2005, who used it to denote highly varied and very large data sets whose size and complex structure make examination with traditional data analysis methods nearly impossible (Emani et al., 2015). ...
Chapter
Full-text available
Today, big data is one of the concepts popularly used in almost all areas of computer science, statistics, and econometrics, above all in the fields of cybercrime and cybersecurity; its Turkish equivalent is 'büyük veri'. The first uses of the term in a sense close to its present meaning, to express the growth of data, appeared in academic work by Weiss and Indurkhya (1998) in computer science and by Diebold (2000) in statistics (Diebold, 2012). The first work concerning the widespread use of the term was by Roger Magoulas in 2005, who used it to denote highly varied and very large data sets whose size and complex structure make examination with traditional data analysis methods nearly impossible (Emani et al., 2015).
... By combining large datasets with advanced modeling approaches, big data technology extracts patterns, performs time-series analyses, and makes predictions that assist problem solving in new ways. Affordances of data-driven logistics and forecasting models may help to alleviate global sustainability pressures like food security, climate resilience and efficient transportation [3]. Developments in natural capital accounting propose new knowledge systems related to ecological functions and open new markets [4]. ...
Article
Full-text available
Many private and public actors are incentivized by the promises of big data technologies: digital tools underpinned by capabilities like artificial intelligence and machine learning. While many shared value propositions exist regarding what these technologies afford, public-facing concerns related to individual privacy, algorithm fairness, and the access to insights require attention if the widespread use and subsequent value of these technologies are to be fully realized. Drawing from perspectives of data science, social science and technology acceptance, we present an interdisciplinary analysis that links these concerns with traditional research and development (R&D) activities. We suggest a reframing of the public R&D 'brand' that responds to legitimate concerns related to data collection, development, and the implementation of big data technologies. We offer as a case study Australian agriculture, which is currently undergoing such digitalization, and where concerns have been raised by landholders and the research community. With seemingly limitless possibilities, an updated account of responsible R&D in an increasingly digitalized world may accelerate the ways in which we might realize the benefits of big data and mitigate harmful social and environmental costs.
... Not only is the amount of those data rapidly rising (Dobre and Xhafa, 2014), (Yin and Kaynak, 2015), but also the demand for their fast processing increases (Kolajo et al., 2019). This led to the establishment of the term big data (Gandomi and Haider, 2015), (Diebold, 2012), respectively big data analytics, which address those new challenges and ways of coping with them. To gain new insights, organizations establish and develop their analytics capabilities, aiming to enhance their operational performance by improving their decision making, reducing costs, amending existing assets or services and establishing new ones (Becker, 2016; Vom Brocke et al., 2009b; Wamba et al., 2017). ...
Conference Paper
Full-text available
Big data has evolved to a ubiquitous part of today’s society. However, despite its popularity, the development and testing of the corresponding applications are still very challenging tasks that are being actively researched in pursuit of ways for improvement. One newly introduced proposition is the application of test driven development (TDD) in the big data domain. To facilitate this concept, existing literature reviews on TDD have been analyzed to extract insights from those sources of aggregated knowledge, which can be applied to this new setting. After introducing the different studies, lessons for the application of TDD in the big data domain are deducted and discussed. Finally, avenues for future works are proposed.
... They mentioned the challenges of working with large unstructured data sets using existing computing systems. They also stated that Big Data probably originated in lunch-table conversations at Silicon Graphics Inc. (SGI) in the mid-1990s, in which John Mashey figured prominently (Diebold, 2012). Despite the references to the mid-nineties, the term became widespread only as recently as 2011. ...
Chapter
Big data is emerging, and the latest developments in technology have spawned enormous amounts of data. Traditional databases lack the capabilities to handle this diverse data, which has led to the employment of new technologies, methods, and tools. This research discusses big data, the available big data analytical tools, and the need to use big data analytics, along with its benefits and challenges. Through research drawing on survey questionnaires, observation of business processes, interviews, and secondary research methods, organizations and companies in a small island state are surveyed to identify which of them use analytical tools to handle big data and the benefits such tools offer to these businesses. Organizations and companies that do not use these tools were also surveyed, and reasons were outlined as to why these organizations hesitate to utilize such tools.
... With these advancements, an increase in processing speed is desirable. Consequently, the performance of conventional methods and technologies is growing insufficient, thus motivating the invention of new techniques and the need for modern data architectures, as denoted by the terms Big Data and Big Data architecture [2]. Organizations that implement the aforementioned technologies can experience a significant increase in performance [3] due to more efficient decision making, reduced expenses and a more result-oriented portfolio of services while improving customer relations [4]. ...
Article
Full-text available
Microservices and Big Data are renowned hot topics in computer science that have gained a lot of hype. While the use of microservices is an approach used in modern software development to increase flexibility, Big Data allows organizations to turn today’s information deluge into valuable insights. Many of those Big Data architectures have rather monolithic elements. However, a new trend arises in which monolithic architectures are replaced with more modularized ones, such as microservices. This transformation provides the benefits of microservices, such as modularity, evolutionary design and extensibility, while maintaining the old monolithic product’s functionality. This is also valid for Big Data architectures. To facilitate the success of this transformation, there are certain beneficial factors. In this paper, those aspects will be presented and the transformation of an exemplary Big Data architecture with somewhat monolithic elements into a microservice-favoured one is outlined.
... Not only is the amount of those data rapidly rising (Dobre and Xhafa, 2014), (Yin and Kaynak, 2015), but also the demand for their fast processing increases (Kolajo et al., 2019). This led to the establishment of the term big data (Gandomi and Haider, 2015), (Diebold, 2012), respectively big data analytics, which address those new challenges and ways of coping with them. To gain new insights, organizations establish and develop their analytics capabilities, aiming to enhance their operational performance by improving their decision making, reducing costs, amending existing assets or services and establishing new ones (Becker, 2016; Vom Brocke et al., 2009b; Wamba et al., 2017). ...
Conference Paper
Full-text available
Due to the constantly increasing amount and variety of data produced, big data and the corresponding technologies have become an integral part of daily life, influencing numerous domains and organizations. However, because of its diversity and complexity, the necessary testing of the corresponding applications is a highly challenging task that lacks maturity and is still being explored. While there are numerous publications dealing with this topic, there is no sufficiently comprehensive overview that conflates those isolated pieces of information into a coherent knowledge base. The publication at hand addresses this grievance by means of an unstructured literature review, proposes a starting point for a corresponding taxonomy to bridge this gap and highlights future avenues for research.
... It cannot be said that there is much substance to this desire, since the characterization of this "human mode" is generally done in a superficial way. Nevertheless, data analysis companies have been offering these possibilities to their potential clients (Kogler Jr., 2018; Diebold, 2012; Kimble and Miliodakis, 2015). ...
Chapter
In this essay we examine the relations between Big Data and cognition, in particular human cognition. The reason for exploring these relations lies in two aspects. First, because in the domain of cognitive science many speculate about the benefits that the use of Big Data analysis techniques may bring to the characterization and understanding of cognition. Second, because the scientific and technological sectors that promote data analysis activities, particularly statistics, computer science, and data science, which are naturally accustomed to working with Big Data, have in recent years used the idea of cognition and its related concepts as terms that seem capable of remedying the deficiencies of the era of symbolic artificial intelligence, in response to the automation of typically human activities in the process of data analysis and the use of its results. On the one hand, we conduct our analysis through the domain of data science, seeking to understand what Big Data actually is and to investigate whether models of cognition could be applied in that sector. On the other hand, we analyze the potential of using Big Data to assist cognitive science. We also seek to provide a certain didactic character at some points, aiming to help non-specialists understand some principles and aspects of the methodology of data science and statistics that are mentioned in the course of our analysis.
... They mentioned the challenges of working with large unstructured data sets using existing computing systems. They also stated that Big Data probably originated in lunch-table conversations at Silicon Graphics Inc. (SGI) in the mid-1990s, in which John Mashey figured prominently (Diebold, 2012). Despite the references to the mid-nineties, the term became widespread only as recently as 2011. ...
Chapter
Full-text available
Big data is emerging, and the latest developments in technology have spawned enormous amounts of data. Traditional databases lack the capabilities to handle this diverse data, which has led to the employment of new technologies, methods, and tools. This research discusses big data, the available big data analytical tools, and the need to use big data analytics, along with its benefits and challenges. Through research drawing on survey questionnaires, observation of business processes, interviews, and secondary research methods, organizations and companies in a small island state are surveyed to identify which of them use analytical tools to handle big data and the benefits such tools offer to these businesses. Organizations and companies that do not use these tools were also surveyed, and reasons were outlined as to why these organizations hesitate to utilize such tools.
... "Recently much good science, whether physical, biological, or social, has been forced to confront-and has often benefited from-the Big Data phenomenon. Big Data refers to the explosion in the quantity (and sometimes, quality) of available and potentially relevant data, largely the result of recent and unprecedented advancements in data recording and storage technology" [2]. On each day we produce about 2.5 quintillion (2.5 X 10 18 ) bytes of data and out of that 90% possibly have been created in last twothree years [3]. ...
Article
Full-text available
In the present work, a brief but exhaustive history of research on time series analysis is presented. The present work also attempts to furnish a detailed outline of the contemporary research status of this branch of study. Finally, comments and thoughts are offered on the future challenges that are to be explored and resolved to continue the successful journey of research in time series analysis. Introduction A time series is a sequential and quantitative manifestation of a system over time. Simply put, a time series is an array of data points, observed or computed characteristically at consecutive time points, usually placed at unvarying time intervals, arising from a system or a process. It appears in engineering, the physical and chemical sciences, the biological and medical sciences, and even in the social sciences. Time series provide corroboration of different theories and models, with modifications, and can sometimes even bring forth a new theory or model. Time series analysis is principally exercised (a) to study the dynamic structure of a process, (b) to explore the dynamic relationship between the variables entailed in a process, (c) to execute seasonal fine-tuning (for example, in economic data, seasonal adjustment of GDP and the rate of unemployment), (d) to perform regression as well as causal analysis, and (e) to generate forecasts (both point and interval estimation), checking carefully the sensitivity to the initial conditions.
... However, it is observed that there is no universally accepted definition of the "big data" phenomenon in the literature. Although "big data" appears to have entered our lives from the early 2000s onwards, Diebold (2012), combining the term with awareness of the big data phenomenon, traces it to Silicon Graphics in the mid-1990s; as a definition, it is described as the handling and analysis of such data sets. In 2010, Apache Hadoop defined big data as "data sets that cannot be captured, managed, and processed by general computers within an acceptable scope." ...
Article
Full-text available
Abstract: Migration and refugee crises unfortunately rank first among the most heart-wrenching and vexatious humanitarian problems of this century. The ongoing civil unrest, civil war, and terrorism in Syria, Afghanistan, Yemen, Venezuela, and Myanmar have driven migrant and refugee flows into neighbouring countries to a level unseen since the Second World War. According to United Nations reports, as of today the number of international migrants has reached 280 million, the number of forcibly displaced people has exceeded 80 million, and the total number of refugees has exceeded 26 million. In the face of this high human mobility, host countries have come under pressure, especially to provide better quality basic public services. Although there are many reasons why the services offered are not at the desired levels, the lack of data is one of the important ones. In the face of the ongoing migration challenges, governments of host countries and civil society organisations need to reconsider individuals and data and find ways to obtain better quality data in order to raise service quality to the necessary level.
In addition, it is of great importance that the data are sufficient and reliable in order to prevent negative consequences such as migrant smuggling, the deaths of those who use the sea route during migration, and the mistreatment migrants face at the points they reach. As a result, the article discusses why states ought to use existing and modern technologies and big data to identify and assist vulnerable migrants in need of protection. It addresses the advantages and opportunities big data offers in the international arena for protecting migrants' right to life and upholding the prohibitions of torture, inhuman and degrading treatment, slavery, and forced labour. However, the article also examines, with the help of a detailed literature review, the disadvantages, limits, and risks associated with the unrestricted use of new technologies, especially in relation to the protection of migrants' right to privacy and data protection.
... Agricultural Big Data is a source of heterogeneous data, because agricultural data are collected from different sources such as public data, private data, industrial data, and governmental data. Therefore, data privacy and security are also required for private and industrial data [4]. For such significant and diversified data [3], we need to develop a large cloud environment for data management and analytics [5]. ...
Article
Full-text available
Commonly, the implementation of agricultural practices (e.g., ploughing, sowing, watering, pest control, and harvesting) depends purely on climate, recommendations from previously experienced rules, and governmental policies. To fulfil the term smart farming, (i) we employed real-time applications over sensors to capture climate changes of soil and atmosphere; (ii) we defined agricultural practice rules by applying machine learning techniques over the last five years of data; (iii) by federating the real-time data from the field sensors with the rules, we define the time for implementation of each practice. This federation eliminates many malfunctions of older approaches to smart farming for precision agriculture.
... Big Data as an emerging discipline. Diebold (2012) underlines the large areas taken over from other branches such as computer science, information systems, econometrics and statistics. He concludes that Big Data is a perfect illustration of an "interdisciplinary" discipline. ...
Article
Full-text available
With the emergence of Big Data and the increasing market penetration of ad retargeting advertising, the advertising industry's interest in using this new online marketing method is rising. Retargeting is an innovative technology based on Big Data. People who have gone to a merchant site and window-shopped but not purchased can be re-pitched with the product they showed an interest in. Therefore, click rates and conversion rates are dramatically enhanced by retargeting. However, in spite of the increasing number of companies investing in retargeting, there is little academic research on this topic. In this paper we explore the links between retargeting, perceived intrusiveness and brand image. As the results show the importance of perceived intrusiveness, ad repetition and ad relevance, we introduce new analytical perspectives on online strategies with the goal of facilitating collaboration between consumers and marketers.
Article
The article contains a study of existing views on the economic content of big data. From among the views within which authors define big data, the descriptive-model, utility-digital and complex-technological approaches are formulated. Against the background of the large-scale spread of digital technologies (machine learning, cloud computing, artificial intelligence, augmented and virtual reality, etc.), which function thanks to big data, the study of their economic essence is becoming especially relevant. As a result, it was found that big data form the basis of economic activity in the digital economy. The definition of big data as a resource of the digital economy is proposed.
Article
The objective of this research was to explore and characterize the main big data repositories available in the area of social sciences in 2021. The research design was non-experimental, exploratory, and descriptive. The population consisted of 110 big data repositories located through Google's dataset search engine. The sample corresponded to the 10 main big data repositories. The results indicated that the most important big data repositories and platforms are centralized in the private sector, located fundamentally in US companies.
Article
The growing importance of big data in the current business context is a recognized phenomenon in managerial studies. Many such studies have been focused on possible changes from the use of big data analytics in business, and with reference to management control systems. However, the number and extent of studies attempting to analyze the opportunities and risks of using big data analytics in control systems from an empirical perspective appear rather limited. This work conducts case studies analyzing three companies that have used big data in their decision-making processes within management control systems. The empirical analysis shows how proper management of big data can represent a fundamental opportunity for the development of managerial control systems, with some possibilities not yet fully explored even by those who have already introduced big data analytics in these systems. Big data quality and privacy protection appear to be the profiles presenting the greatest opportunities for future study. Furthermore, new challenges seem to emerge for accountants and controllers, who now are called to a new approach regarding how they should interpret their professional roles.
Chapter
The Big Data problem is the computational challenge of dealing with a humongous volume of information. With the advent of next-gen sequencing technologies and other means, a huge amount of data is collected and stored every day. To process these data and extract fruitful information, mathematical descriptors alone are not sufficient. So, this chapter focuses on combining the bioinformatics concept of alignment-free sequence descriptors with Big Data architecture to find an approachable solution to the problem.
Article
This article has been withdrawn: please see Elsevier Policy on Article Withdrawal (https://www.elsevier.com/about/our-business/policies/article-withdrawal). This article has been withdrawn as part of the withdrawal of the Proceedings of the International Conference on Emerging Trends in Materials Science, Technology and Engineering (ICMSTE2K21). Subsequent to acceptance of these Proceedings papers by the responsible Guest Editors, Dr S. Sakthivel, Dr S. Karthikeyan and Dr I. A. Palani, several serious concerns arose regarding the integrity and veracity of the conference organisation and peer-review process. After a thorough investigation, the peer-review process was confirmed to fall beneath the high standards expected by Materials Today: Proceedings. The veracity of the conference also remains subject to serious doubt and therefore the entire Proceedings has been withdrawn in order to correct the scholarly record.
Conference Paper
Full-text available
Abstract: The convergence of the areas of Data Networks and Artificial Intelligence is of special interest today due to the high degree of interconnection prevailing in contemporary society, which has been maximized by the pandemic context imposed by COVID-19. For its part, Machine Learning and its strengths can yield highly useful tools for cybersecurity analysts and network administrators, provided that the models used are trained and implemented correctly for the task at hand and their results are communicated reliably and understandably to technical staff. This constitutes a highly interesting challenge for this project, since it will provide tools in both areas for academic and professional development.
Article
Full-text available
Laboratory medicine is a digital science. Every large hospital produces a wealth of data each day—from simple numerical results from, e.g., sodium measurements to highly complex output of “-omics” analyses, as well as quality control results and metadata. Processing, connecting, storing, and ordering extensive parts of these individual data requires Big Data techniques. Whereas novel technologies such as artificial intelligence and machine learning have exciting applications for the augmentation of laboratory medicine, the Big Data concept remains fundamental for any sophisticated data analysis in large databases. To make laboratory medicine data optimally usable for clinical and research purposes, they need to be FAIR: findable, accessible, interoperable, and reusable. This can be achieved, for example, by automated recording, connection of devices, efficient ETL (Extract, Transform, Load) processes, careful data governance, and modern data security solutions. Enriched with clinical data, laboratory medicine data allow a gain in pathophysiological insights, can improve patient care, or can be used to develop reference intervals for diagnostic purposes. Nevertheless, Big Data in laboratory medicine does not come without challenges: the growing number of analyses and the data derived from them are a demanding task to be taken care of. Laboratory medicine experts are and will be needed to drive this development, take an active role in the ongoing digitalization, and provide guidance for their clinical colleagues engaging with the laboratory data in research.
Article
Full-text available
A smart city is more than its mere technological components. From a legal standpoint, smartness means a civic-enabling regulatory environment, access to technological resources, and openness to the political decision-making process. No doubt, the core asset of this socio-technical revolution is the data generated within the urban context. However, national and EU law does not provide a specific regulation for using this data. Indeed, the next EU data strategy, with the open data and non-personal data legislation and the forthcoming Data Act, aims to promote a more profitable use of urban and local big data. Nonetheless, at present, the latter still lacks a consistent approach to this issue. A thorough understanding of the smart city requires, first of all, the reconceptualization of big data in terms of urban data. Existing definitions and studies about this topic converge on the metropolises of East Asia and, sometimes, the USA. Instead, we approach the issues experienced in medium-size cities, focusing on the main Italian ones. Especially in this specific urban environment, data can help provide better services, automate administrations, and further democratization only if they are understood holistically - as urban data. Cities, moreover, are a comprehensive source of data themselves, both collected from citizens and urban things. Among the various types of data that can be gathered, surveillance recordings play a crucial role. On the one hand, video surveillance is essential for many purposes, such as protecting public property, monitoring traffic, controlling high-security risk areas, and preventing crime and vandalism. From another standpoint, these systems can be invasive towards citizens' rights and freedoms: in this regard, urban data collected from video surveillance systems may be shared with public administrations or other interested entities only after they have been anonymized. Even this process needs to be aligned with the transparency and participation values that inform the city's democracy. Thus, the anonymization process must be fully compliant with data protection legislation, looking for the most appropriate legal basis and assessing all the possible sources of risks to the rights and freedoms of people (DPIA). Urban data, indeed, is a matter of local democracy. The availability of data and the economy of platforms can significantly transform a city's services and geography as well as citizens' lifestyles. However, the participation of citizens to express their views on both the use of urban data for public policy and the regulation of the digital economy is still a challenge. The paper aims to analyze the projects of some Italian cities - including Milan, Rome, and Turin - which have tried to introduce participatory urban data management tools and to highlight the possible challenges of a democratic management of service platforms and data transfer for social and economic development.
Article
Full-text available
With a continuously increasing amount and complexity of data being produced and captured, traditional ways of dealing with their storing, processing, analysis and presentation are no longer sufficient, which has led to the emergence of the concept of big data. However, not only the implementation of the corresponding applications is a challenging task, but also the proper quality assurance. To facilitate the latter, in this publication, a comprehensive structured literature metareview on the topic of big data quality assurance is presented. The results will provide interested researchers and practitioners with a solid foundation for their own quality assurance related endeavors and therefore help in advancing the cause of quality assurance in big data as well as the domain of big data in general. Furthermore, based on the findings of the review, worthwhile directions for future research were identified, providing prospective authors with some guidance in this complex environment.
Chapter
INTRODUCTION: Organizations act with the motive of sustaining their existence under changing environmental conditions and gaining sustainable competitive advantage. Consequently, for organizations that have gained sustainable competitive advantage, maintaining their current position is a dynamic process. At the core of gaining competitive advantage within this dynamic process lies the correct analysis and understanding of changing environmental conditions. This foundation is built by obtaining the right information, processing it, and transforming it into knowledge (Acar, 2008, p. 53). According to Jensen (2005, p. 54), creating knowledge means organizing data within a formula and using it for a productive purpose in a specific context. Three distinct concepts stand out in this process: data, information, and knowledge. In terms of these concepts, there are two basic transformation processes: (i) organizing data with a certain formula and transforming it into information, and (ii) transforming information into knowledge by relating it to, or using it for, a productive purpose in a specific context. In this regard, data related to information are brought together, sorted, classified, and filtered for a specific purpose. This is why data are the basic raw material of information and knowledge production. The increasing importance of data places it in a significant position for organizations (Yılmaz, 2009, p. 98).
Chapter
The Internet of Things (IoT) connects humans and machines by means of intelligent technology such as sensors or actuators. This makes it possible to network everyday objects or machines in an industrial environment via the internet. The data generated in this way is also known as Big Data. Master data management (MDM) can offer great potential in dealing with data and data quality by providing a set of guidelines for data management and thus enabling a common view of it. In this paper, different approaches for the use of master data management in the context of IoT are analysed. For this purpose, a classification of the possible uses in the different design or functional areas is given in order to highlight areas of master data management with particular potential for use. The analysed results show that of the three design areas of enterprise-wide MDM, the system level is most frequently represented.
Thesis
Full-text available
This empirical thesis, written for the academic degree of "Master of Arts - Business Administration", examines the challenges of introducing predictive analytics into corporate financial forecasting and, based on a maturity model, provides recommendations for a successful implementation.
Chapter
Full-text available
The aim of the chapter is to identify the competences of Data Scientists in Brazil and to indicate their levels of seniority by mapping proficiency in 25 competences across five disciplines - Business, Technology, Mathematics, Programming and Statistics. It uses an exploratory method through a survey containing 34 questions. The questions cover each professional's proficiency on an evaluation scale ranging from apprentice to specialist, as well as length of experience and field of activity. The survey gathered 98 respondents, all trained Data Scientists active in the area. The results show that the most developed competencies across all profiles are Structured Data and Project Management, while the lowest indexes are in Natural Language Processing and Cloud Administration. In the analysis by area of activity, most professionals presented above-average skills in their main area of work and also in at least one competency of another discipline, characterizing an interdisciplinary profile. Only a small share of participants (3.7%) is considered a specialist in all five major disciplines. It is concluded that most Data Scientists have an intermediate level in four or five areas of knowledge (74.2% and 66.7% respectively), while 33.3% are specialists in a specific discipline. (text in PT-BR only)
Article
Monitoring economic conditions in real time, or nowcasting, and Big Data analytics share some challenges, sometimes called the three “Vs”. Indeed, nowcasting is characterized by the use of a large number of time series (Volume), the complexity of the data covering various sectors of the economy, with different frequencies and precision and asynchronous release dates (Variety), and the need to incorporate new information continuously and in a timely manner (Velocity). In this paper, we explore three alternative routes to nowcasting with Bayesian Vector Autoregressive (BVAR) models and find that they can effectively handle the three Vs by producing, in real time, accurate probabilistic predictions of US economic activity and a meaningful narrative by means of scenario analysis.
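As a rough illustration of the general idea summarized above - and not of the paper's actual models - the following Python sketch estimates a small VAR under a crude ridge-style shrinkage prior (a simplified stand-in for a Minnesota prior) and produces a one-step-ahead point nowcast. The variable names, lag length and shrinkage strength are illustrative assumptions.

# Minimal Bayesian VAR sketch with ridge-style (Minnesota-like) shrinkage.
# Illustrative only: the toy data, lag length and shrinkage strength are
# assumptions, not taken from the cited paper.
import numpy as np

def bvar_posterior_mean(Y, p=2, lam=0.2):
    """Posterior mean of VAR(p) coefficients under a zero-mean Gaussian
    prior with precision 1/lam**2 on every coefficient (a crude stand-in
    for a Minnesota prior)."""
    T, n = Y.shape
    # Build the lagged regressor matrix X = [1, Y_{t-1}, ..., Y_{t-p}].
    rows = []
    for t in range(p, T):
        lags = np.concatenate([Y[t - l] for l in range(1, p + 1)])
        rows.append(np.concatenate(([1.0], lags)))
    X = np.asarray(rows)            # (T-p) x (1 + n*p)
    Z = Y[p:]                       # (T-p) x n
    k = X.shape[1]
    prior_prec = np.eye(k) / lam**2
    prior_prec[0, 0] = 1e-6         # leave the intercept essentially unpenalised
    # Ridge/Bayesian posterior mean: (X'X + V0^{-1})^{-1} X'Z
    B = np.linalg.solve(X.T @ X + prior_prec, X.T @ Z)
    return B                        # (1 + n*p) x n coefficient matrix

def one_step_nowcast(Y, B, p=2):
    """One-step-ahead point prediction from the posterior-mean coefficients."""
    lags = np.concatenate([Y[-l] for l in range(1, p + 1)])
    x = np.concatenate(([1.0], lags))
    return x @ B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.standard_normal((200, 3)).cumsum(axis=0)  # toy monthly indicators
    B = bvar_posterior_mean(Y, p=2, lam=0.2)
    print("nowcast:", one_step_nowcast(Y, B, p=2))

A full treatment along the lines of the paper would also have to handle mixed frequencies, asynchronous release dates and full posterior simulation, which this sketch deliberately omits.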
Article
Comparison of competing statistical models is an essential part of psychological research. From a Bayesian perspective, various approaches to model comparison and selection have been proposed in the literature. However, the applicability of these approaches depends on the assumptions about the model space M. Also, traditional methods like leave-one-out cross-validation (LOO-CV) estimate the expected log predictive density (ELPD) of a model to investigate how the model generalises out-of-sample, and they quickly become computationally inefficient when sample size becomes large. Here, a tutorial on Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV), which is computationally more efficient, is provided. It is shown how Bayesian model selection can be scaled efficiently for big data via PSIS-LOO-CV in combination with approximate posterior inference and probability-proportional-to-size subsampling. First, several model views and the Bayesian model comparison methods available in each are discussed. The Bayesian logistic regression model is then used as a running example to show how to apply the method in practice and to demonstrate that it provides ELPD estimates similarly accurate to those of LOO-CV or information criteria. Subsequently, the power and exponential law models relating reaction times to practice are used to demonstrate the approach with more complex models. Guidance is provided on how to compare competing models based on the ELPD estimates and how to conduct posterior predictive checks to safeguard against overconfidence in one of the models under consideration. The intended audience are researchers who practice mathematical modelling and comparison, possibly with large datasets, and who are well acquainted with Bayesian statistics.
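As a hedged, minimal sketch of the workflow the tutorial describes - not the paper's own code - the following Python example fits two Bayesian logistic regressions with PyMC on simulated data and compares them with PSIS-LOO-CV via ArviZ. The simulated data, priors and model names are assumptions for illustration only.

# PSIS-LOO-CV model comparison sketch with PyMC and ArviZ.
# The data-generating process, priors and model names are illustrative
# assumptions, not the paper's running examples.
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.5 * x))))  # toy binary outcomes

def fit_logistic(include_slope):
    with pm.Model():
        alpha = pm.Normal("alpha", 0.0, 2.5)
        mu = alpha
        if include_slope:
            beta = pm.Normal("beta", 0.0, 2.5)
            mu = alpha + beta * x
        pm.Bernoulli("y", logit_p=mu, observed=y)
        # Store pointwise log-likelihoods so ArviZ can run PSIS-LOO afterwards.
        return pm.sample(1000, tune=1000, chains=2,
                         idata_kwargs={"log_likelihood": True},
                         progressbar=False, random_seed=1)

idata_full = fit_logistic(include_slope=True)
idata_null = fit_logistic(include_slope=False)

# Pareto-smoothed importance sampling LOO for one model ...
print(az.loo(idata_full, pointwise=True))
# ... and a ranked comparison of the ELPD estimates across both models.
print(az.compare({"intercept+slope": idata_full,
                  "intercept-only": idata_null}, ic="loo"))

The subsampling and approximate-inference refinements discussed in the paper for very large datasets are not reproduced here; the sketch only shows the basic ELPD-based comparison step.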
Article
The control of healthcare costs is a central concern of health plan operators in Brazil, especially after the regulatory framework established by Law 9656/1998 and the creation of the National Health Agency (Agência Nacional de Saúde) in January 2000. Several obligations regarding member care were introduced, requiring administrative and financial adjustments from the operators. In view of this, it is important to adopt cost-control strategies in order to pursue economic and financial viability in the medium and long term. Information technology tools bring new perspectives for controlling healthcare disbursements, since they enable strategies to optimize processes and anticipate care for health plan beneficiaries. The present study presents the results obtained by applying Big Data and Data Analytics tools to reduce preliminary intermediation notifications (NIPs) and court injunctions at a large health plan operator. It was observed that these tools can predict beneficiaries' behaviour with respect to filing NIPs and injunctions, contributing to improving the quality of customer service and to reducing expenses.
Article
The present study has been carried out to find out the research performance in the field of big data in India during 2010-2019 as indexed in the Scopus database. A total of 7,502 research publications were found in Scopus during the 10-year period 2010-2019 and were downloaded for analysis with the help of Microsoft Excel and VOSviewer. The study also provides a comparative analysis of different scientometric parameters, including total research productivity, yearly research output, authorship pattern, international collaboration, institution-wise collaboration, degree of collaboration, RGR & Dt, source-wise research publication, subject-wise publication, and most prolific authors. The study revealed that publication in the field of big data increased exponentially during the study period 2010-2019. Most of the research was published as conference papers, i.e. 53.57% (2019). It is also observed that the most preferred subject area is "Computer Science". The relative growth rate decreased from 2010 (0.73) to 2019 (0.25), while the doubling time of publications increased from 0.95 in 2010 to 2.72 in 2019. Advances in Intelligent Systems and Computing, with 422 (5.62%) publications, is the most popular source of publications on big data research in India, and the degree of collaboration is 0.93. It is concluded that scientific research production in the field of big data is increasing exponentially due to collaboration among researchers from different subjects.
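For readers unfamiliar with the scientometric quantities mentioned above, the standard formulas are RGR = (ln W2 - ln W1) / (T2 - T1) and Dt = ln 2 / RGR, where W is the cumulative publication count at time T. The small Python sketch below applies them to hypothetical cumulative counts; it may differ in detail from the authors' exact computation.

# Standard scientometric formulas for relative growth rate (RGR) and
# doubling time (Dt). The cumulative counts used in the example are
# hypothetical, not figures reported in the cited study.
import math

def relative_growth_rate(w1, w2, t1, t2):
    """RGR = (ln W2 - ln W1) / (T2 - T1), with W the cumulative publication count."""
    return (math.log(w2) - math.log(w1)) / (t2 - t1)

def doubling_time(rgr):
    """Dt = ln(2) / RGR (approximately 0.693 / RGR)."""
    return math.log(2) / rgr

# Example with hypothetical cumulative counts for two consecutive years:
rgr = relative_growth_rate(w1=5000, w2=7502, t1=2018, t2=2019)
print(f"RGR = {rgr:.2f}, Dt = {doubling_time(rgr):.2f} years")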