Unicorn data scientist: the rarest of breeds
Saša Baškarada and Andy Koronios
University of South Australia, Mawson Lakes, Australia
Abstract
Purpose: Many organizations are seeking unicorn data scientists, that rarest of breeds that can do it all.
They are said to be experts in many traditionally distinct disciplines, including mathematics, statistics,
computer science, artificial intelligence, and more. The purpose of this paper is to describe the authors' pursuit of
these elusive mythical creatures.
Design/methodology/approach: Qualitative data were collected through semi-structured interviews with
managers/directors from nine Australian state and federal government agencies with relatively mature data
science functions.
Findings: Although the authors failed to find evidence of unicorn data scientists, they are pleased to report
on six key roles that are considered to be required for an effective data science team. Primary and secondary
skills for each of the roles are identified and the resulting framework is then used to illustratively evaluate
three data science Master-level degrees offered by Australian universities.
Research limitations/implications: Given that the findings presented in this paper have been based on a
study with large government agencies with relatively mature data science functions, they may not be directly
transferable to less mature, smaller, and less well-resourced agencies and firms.
Originality/value: The skills framework provides a theoretical contribution that may be applied in practice
to evaluate and improve the composition of data science teams and related training programs.
Keywords: Data analytics, Skills, Definition, Framework, Data science, Business analytics
Paper type: Research paper
1. Introduction
Data science is an emerging applied discipline focused on facilitating organizational decision
making through the development of statistical models that extract knowledge from raw data
(Patil and Davenport, 2012). Extracted knowledge may describe what happened, explain why
something happened, and predict what may or is likely to happen. In spite of the current
popularity of data science, many organizations lack a clear understanding of the required roles
(e.g. data scientist, data analyst, data engineer, business expert, system expert, and software
engineer) and skills (e.g. domain, information technology, and quantitative) (Linden et al., 2015;
Harris et al., 2013). For instance, it is frequently stated that a data scientist is someone who is
better at programming than a statistician and better at statistics than a computer scientist.
Noting that data science requires domain knowledge and a broad set of quantitative skills,
Waller and Fawcett (2013) highlight that there is "a dearth of literature on the topic and many
questions" (p. 77). Accordingly, they call for more research on skills that are needed by data
scientists. Given the breadth of skills required, they conclude that it may not be realistic to
expect any one person to possess all the relevant expertise. Nevertheless, short of including
virgins[1] in their employee benefits packages, companies are making every effort to attract
data scientists who can do it all (Press, 2015). Given their almost mythical status, a growing
number of data professionals are starting to refer to such rare individuals, who are said to excel
in a wide range of traditionally distinct disciplines, as unicorns (Stodder, 2015; Bertolucci, 2013).
Yet, it is surprising to note that scholarly literature has so far failed to investigate the nature of
this potentially new species. That is the purpose of this paper.
Program, Vol. 51 No. 1, 2017, pp. 65-74
© Emerald Publishing Limited, 0033-0337
DOI 10.1108/PROG-07-2016-0053
Received 11 July 2016
Revised 8 December 2016
Accepted 15 December 2016
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0033-0337.htm
2. Literature review
There is a substantial overlap between data science, data analytics, and Big Data
organizational capabilities (Laney et al., 2015). Nevertheless, data science is generally viewed
as "a set of fundamental principles that support and guide the principled extraction of
information and knowledge from data" (Provost and Fawcett, 2013). Proposed core skills
include mathematics and statistics (e.g. data mining, hypothesis testing, and predictive
analytics), computer science (e.g. data structures and algorithms), and domain expertise
(Dhar, 2013; Finzer, 2013). Other required skills include data integration, transformation,
and loading, as well as data visualization (Yang and Liu, 2013).
By tracing the evolution of business intelligence and analytics from structured
content in relational database management systems and data warehouses (currently widely
adopted in industry), through unstructured Web 2.0-based content (currently
widely researched in industry and academia), to mobile and sensor-based
content (an emerging area of research), Chen et al. (2012) identify a number of relevant
skills, including analytical and IT skills (drawing from statistics and computer science),
business and domain knowledge (e.g. accounting, finance, management, marketing,
logistics, and operation management), and communication skills required to interact with
relevant decision makers. They conclude that due to its emphasis on "key data management
and information technologies, business-oriented statistical analysis and management
science techniques, and broad business discipline exposure" (p. 1182), the discipline of
information systems is uniquely placed to provide a valuable contribution (Lee and
Mirchandani, 2010; Stevens et al., 2011).
A study commissioned by the Joint Information Systems Committee, a UK
non-departmental public body focused on championing the importance and potential of
digital technologies in UK education and research, found that data scientists need a
wide range of skills, including domain expertise, computing, and people skills (Swan and
Brown, 2008). The study noted that although there is some variation in the skills that are
possessed by data scientists, they are all expected to have substantial competency in the
domain in which they operate.
Based on a survey with 250 respondents, Harris et al. (2013) identified five skill groups
that are applicable to data scientists, including business, machine learning and Big Data,
math and operations research, programming, and statistics. Depending on their level of
competence in each of these skill sets, data scientists may then either tackle the entire
analytics process, or predominantly focus on technical problems of managing data, research
methods and statistics, or deriving business value through analytics.
Although there is a widespread belief that hard scientists (particularly physicists) tend
to produce the best data scientists (Loukides, 2011), other professionals that may assume data
science roles include data mining experts, operations researchers, statisticians, actuaries,
econometricians, equity analysts, process control engineers, and the like (Linden et al., 2015).
Even though library engagement in research data management is a relatively recent
phenomenon (Corrall et al., 2013), and the relevant roles and responsibilities are yet to be
settled (Corrall, 2012; Cox and Pinfield, 2014; Madrid, 2013; Xia and Wang, 2014;
Cox and Corrall, 2013; Cassella and Morando, 2012), librarians and information science
professionals may contribute vital data curation, preservation, and archiving skills to
ensure safe custody of research outputs (Swan and Brown, 2008; Pryor and Donnelly, 2009).
Based on a systematic review of 600 peer-reviewed library and information science papers
published between 2000 and 2014 in English, Vassilakaki and Moniarou-Papaconstantinou
(2015) identify six roles that information professionals have adopted, two of which are of
particular relevance to data science. Technology specialists may facilitate the development,
management, and promotion of institutional repositories for research output, while
knowledge managers may also contribute to the management of such repositories as well as
facilitate relevant communication and knowledge flows throughout the organization.
Corrall (2010) notes the emergence of composite, hybrid and blended library and information
science professionals as evidenced by overlapping roles and broad skillsets. She classifies
data science as a hybrid specialty comprising aspects of information technology and media
(conduit), library and information science (content), and academic and professional
discipline (context) expertise. Librarians and information science professionals may also
provide support for public engagement with science, as well as facilitate public access to
research data sets (Lyon, 2012).
Undergraduate data science degrees have focused on data visualization, data
manipulation/wrangling, computational statistics, machine learning, as well as related
topics like spatial analysis, text mining, network science, and Big Data (Baumer, 2015;
Hardin et al., 2015). Others have also emphasized oral and written communication as well as
social, ethical, and legal issues (Anderson et al., 2014). Given the multidisciplinary nature of the
topic, it has been observed that the teaching of individual data science courses as part of more
general (e.g. business) degrees presents a particular challenge as many students may lack
relevant background knowledge (Wang and Gu, 2016).
3. Method
Qualitative data were collected through semi-structured interviews (DiCicco-Bloom and
Crabtree, 2006) with nine managers/directors from nine Australian state and federal
government agencies with relatively mature data science functions. Their official titles
included such descriptors as research, innovation, analytics, and policy (e.g. manager policy
and research, and director enterprise analytics). While some were responsible for relatively
small teams comprising approximately five people, others were responsible for several
dozen specialist staff. As this study adopts a qualitative rather than a quantitative
approach, there was no requirement to select a statistically representative sample.
Instead, the interviewees were identified through personal contacts and selected based on
their professional roles and willingness/availability to participate in the study. Being
semi-structured, the interviews were guided by a number of high-level questions
(see Appendix) pertaining to the nature of the relevant work being conducted in each
agency, key roles and skills, team composition, and broad challenges and opportunities.
Ad hoc probing and follow-up questions were employed to seek clarification, elicit
additional information, and explore emerging themes (Baškarada, 2014). Data analysis,
which occurred concurrently with data collection, employed the constant comparative
method to identify and categorize key constructs (Glaser, 1965).
The resulting framework was then used to illustratively evaluate three data science
Master-level degrees offered by three Australian universities. As the objective was not to
produce any universal generalizations, but instead to simply illustrate the applicability of
the framework developed, the universities were selected in a haphazard manner. Several
universities/degrees were excluded from the analysis because they did not provide
sufficiently detailed course descriptions online.
4. Results and discussion
The authors failed to find evidence of a unicorn data scientist. In other words,
all interviewees agreed that it is unrealistic to expect one person to match the
expertise of specialists across a number of distinct disciplines. Instead, they all
sought to build effective multidisciplinary teams. Six key roles that are considered to be
required for an effective data science team are outlined below.
4.1 Roles
The six key roles that are considered to be required for an effective data science team
are domain expert, data engineer, statistician, computer scientist, communicator,
and team leader.
4.1.1 Domain expert. This study confirmed the absolute importance of domain expertise,
one of the most frequently quoted data science skills in the literature (Linden et al., 2015;
Waller and Fawcett, 2013; Dhar, 2013; Finzer, 2013; Chen et al., 2012; Swan and Brown, 2008;
Laney et al., 2015). One of the participants observed: "You have to have a very good
understanding of processes and government policy. It's absolutely critical." Without domain
expertise, data scientists (or data science teams) lack the context needed to turn raw data
into meaningful information (Baškarada and Koronios, 2013). Furthermore, the ability to ask
relevant questions, generate relevant hypotheses, and ultimately interpret results is
underpinned by deep domain expertise. Given the large variety and complexity of many
organizations, continuous access to other business experts outside of the data science team
is also important (Linden et al., 2015). For instance, a number of interviewees explained how
they "validate" analytical insights through workshops with subject matter experts.
Accordingly, domain experts work very closely with all other team members with the
possible exception of computer scientists.
4.1.2 Data engineer. Literature identifies both data preparation and data quality as
critical inputs to effective analytics (Herschel et al., 2015; Randall and Beyer, 2014;
Baškarada and Koronios, 2014). Most of the interviewees referred to the "garbage in,
garbage out" principle, emphasizing the importance of high-quality data. It has been
observed that depending on the volume, velocity, variety, and veracity of data, data
preparation, which includes extraction, cleaning, enrichment, and transformation,
can consume up to 80 percent of effort (Linden et al., 2015). This was confirmed by many
interviewees, with one of them noting: "Most of our effort goes on data wrangling and
cleaning. All these data in data lakes are of no use unless they are first properly prepared."
In contrast to business intelligence systems, which operate on semantically consistent data
warehouses (which transform all data into a common format), data science teams may
operate on semantically inconsistent data lakes (which keep all data in their original format)
(Heudecker and White, 2014). Accordingly, in contrast to business intelligence systems,
which require relatively infrequent data preparation (only when the data warehouse is built
or modified), data science efforts require ongoing data preparation. As a result, having a
data engineer as a permanent member of a team is much more important in the context of
data science than in the context of business intelligence.
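The four data preparation steps named above (extraction, cleaning, enrichment, and transformation) can be illustrated with a minimal Python sketch. This is an illustration only, not a tool or process reported by any interviewee: the sample records, the field names, and the STATE_NAMES lookup are hypothetical, and the cleaning rule (dropping incomplete records) is just one of many possible choices.

```python
import csv
import io

# Hypothetical raw extract: inconsistent casing, stray whitespace, a missing value.
RAW_CSV = """agency,state,claims
 Health ,sa,120
Transport,SA,
health,vic,95
"""

# Hypothetical reference data used for enrichment.
STATE_NAMES = {"SA": "South Australia", "VIC": "Victoria"}


def extract(text):
    """Extraction: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning: normalize text fields and drop rows with a missing count."""
    cleaned = []
    for row in rows:
        claims = row["claims"].strip()
        if not claims:
            continue  # garbage in, garbage out: skip incomplete records
        cleaned.append({
            "agency": row["agency"].strip().title(),
            "state": row["state"].strip().upper(),
            "claims": int(claims),
        })
    return cleaned


def enrich(rows):
    """Enrichment: add the full state name from the reference data."""
    return [{**row, "state_name": STATE_NAMES.get(row["state"], "Unknown")}
            for row in rows]


def transform(rows):
    """Transformation: aggregate claims per agency, ready for analysis."""
    totals = {}
    for row in rows:
        totals[row["agency"]] = totals.get(row["agency"], 0) + row["claims"]
    return totals


prepared = enrich(clean(extract(RAW_CSV)))
claims_by_agency = transform(prepared)
print(claims_by_agency)
```

Because a data lake keeps records in their original format, steps like these would have to be rerun whenever new data arrive, which is consistent with the need for ongoing data preparation described above.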
4.1.3 Statistician. Statisticians are at the core of data science teams. They form a bridge
between domain experts, data engineers, and computer scientists. For instance, they may
refine and formalize questions and ideas from domain experts, request relevant data from data
engineers, and guide computer scientists in relation to data analysis. One of the interviewees
observed: "Data science is statistics plus. Statistics is at the core of everything we do." Given
that data science efforts are increasingly undertaken in the context of Big Data, statisticians
require special expertise for dealing with large data sets. For instance, they need to be able to
identify and maximize opportunities for automation. According to one participant: "It's not
just your traditional stats. With Big Data the focus is shifting to data mining and machine
learning." In addition to traditional skills like experimental design and hypothesis testing,
these statisticians also require a solid understanding of skills that are at the intersection of
statistics and computer science. As such, their approach needs to be much more applied than
the approach traditionally followed by academic statisticians. For instance, academic
statisticians are traditionally conservative in terms of being very cautious about making
inferences to unobserved events and entities. This conservatism may still have its place in
some applications of data science (e.g. health and public policy), but may need to be relaxed in
other problem domains with less inherent risk. As one participant observed: "We don't need
academic rigor. It's much more important to produce something quickly. We don't need a
100% solution; 80% is usually good enough."
4.1.4 Computer scientist. Exponential growth in the volume, velocity and variety of
data has led to the development of new Big Data software tools and technologies
(e.g. Apache Hadoop, MapReduce, and Spark). Computer scientists require proficiency in
such tools and technologies, in relevant programming languages like R and Python, and in
cluster and cloud computing. They need these skills to implement and optimize (e.g. in the
context of real-time analytics) the processing (e.g. sorting, aggregating, searching, matching,
and concatenating) and analysis of large data sets. One of the interviewees noted:
"These are very complex tools, and they are rapidly evolving, too. We need someone to keep
on top of all the latest developments, like the Apache stack. That's a full-time job." Given
that much (perhaps most) data are unstructured, computer scientists also require skills in
text analytics and natural language processing. In general, study participants placed strong
emphasis on agile processes and open source software.
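As a rough illustration of the processing style behind tools such as MapReduce, the following single-machine Python sketch counts words in a few pieces of unstructured text using separate map, shuffle, and reduce steps. It does not use the Hadoop or Spark APIs; the sample documents and all function names are hypothetical, and a real cluster would distribute each phase across many nodes.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical unstructured records, e.g. free-text comments.
DOCUMENTS = [
    "data quality matters",
    "Big Data needs data engineers",
]


def map_phase(doc):
    """Map: emit a (word, 1) pair for every token in one document."""
    return [(word.lower(), 1) for word in doc.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the grouped values for each key."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}


pairs = [pair for doc in DOCUMENTS for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

The map and reduce functions only ever see one document or one key at a time, which is what makes this pattern suitable for aggregating large data sets on a cluster.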
4.1.5 Communicator. From a practical perspective, data science is largely pointless
unless it can effect organizational change. Accordingly, the ability to effectively communicate
with relevant decision makers becomes critical. This includes exploration of relevant
problems and opportunities as well as communication of eventual results. As relevant decision
makers frequently do not have advanced statistical skills, any findings need to be presented
in a form that is visually appealing, easy to understand, and ultimately convincing. At the
same time, complexity, simplifying assumptions, and contextual dependencies also need to be
appreciated and effectively communicated. It was frequently observed that decision-makers
"do not want data, they want answers." As such, storytelling becomes a critical skill.
An interviewee observed: "We need to challenge their (decision-makers') assumptions, change
their mental models. These are time-poor people, so we need to be able to capture their
attention quickly." This requires a different approach to the one followed by academic
statisticians who have traditionally communicated with other statisticians. Communicators
form a bridge between data science teams and relevant decision makers. Internally, within the
data science team, communicators form a bridge between the team leader, the statistician, and
the domain expert.
4.1.6 Team leader. This role is most like the mythical unicorn in the sense that the team
leader requires some understanding of all the other roles in order to bring everyone together
and manage resources, tasks, and deliverables. One interviewee observed: "I have been around
for a while. I have been in similar roles for more than 30 years." In addition to requiring
extensive project management expertise, the team leader is responsible for ensuring that
any ethical, privacy, and security norms and expectations are adhered to. Working closely
with the communicator, the team leader is responsible for developing relevant business
cases and estimating expected return on investment.
4.2 Primary and secondary skills
Although each role is associated with, and based on, primary expertise, no role can operate
in isolation. In other words, in order to enable interaction within a data science team,
each role requires one or more secondary skills. Table I details primary and secondary skills
for each of the roles identified in this paper. As such, it identifies the degree of interaction
between the roles. It also highlights that interactions between the roles are asymmetric,
and identifies the degree of asymmetry.
For instance, although the data engineer requires some domain expertise in order to be able
to seek relevant information and guidance from the domain expert, domain experts may be
able to provide such information and guidance even if they have no data preparation skills.
As there is no need for any direct interaction between the domain expert and the
computer scientist, domain experts generally do not require any computer science skills and
vice versa. Domain experts do, however, require some statistical skills in order to facilitate
identification/generation of relevant questions and hypotheses, as well as interpretation of
results. They do not necessarily require any specialist communication expertise. Besides
requiring some domain expertise in order to be able to extract, clean, enrich, and transform
relevant data, data engineers do not necessarily require any other secondary skills. Statisticians
require some domain expertise in order to be able to refine and formalize questions and ideas
from domain experts, and some data preparation expertise in order to be able to provide
informed guidance to data engineers. They do not necessarily require specialized computer
science, communication, or management skills. Computer scientists do not necessarily require
any domain expertise, communication, or management skills. They do, however, require some
data preparation skills, and reasonably advanced statistical skills. Communicators require
significant domain expertise in order to effectively communicate with relevant decision makers.
They also require some statistical skills in order to be able to present analytical findings in a
form that is visually appealing, easy to understand, and ultimately convincing. As noted above,
team leaders are most like the mythical unicorns in the sense that they require some
understanding of all the other roles in order to bring everyone together and manage resources,
tasks, and deliverables.
Table I indicates that the team leader role requires the greatest breadth of skills, and that
domain expertise and statistics are at the core of data science. They are closely followed by
data preparation skills. In contrast to the core skills, computer science, communication,
and management skills are somewhat less central, although not necessarily less important.
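The role and skill relationships described in this section can be encoded as a small lookup structure. The Python sketch below is one possible encoding and not a reproduction of Table I: the numeric levels (2 for a primary skill, 1 for any secondary skill) are an assumption made for illustration, and finer distinctions in the text (e.g. "some" versus "reasonably advanced") are deliberately flattened. The helper names are hypothetical.

```python
# Skill abbreviations: DE, domain expertise; DP, data preparation; S, statistics;
# CS, computer science; C, communication; M, management.
SKILLS = ["DE", "DP", "S", "CS", "C", "M"]

# Assumed levels: 2 = primary skill, 1 = secondary skill, absent = not required.
# Entries follow the prose of Section 4.2; the grading itself is an assumption.
ROLES = {
    "domain expert": {"DE": 2, "S": 1},
    "data engineer": {"DP": 2, "DE": 1},
    "statistician": {"S": 2, "DE": 1, "DP": 1},
    "computer scientist": {"CS": 2, "DP": 1, "S": 1},
    "communicator": {"C": 2, "DE": 1, "S": 1},
    "team leader": {"M": 2, "DE": 1, "DP": 1, "S": 1, "CS": 1, "C": 1},
}

# Primary skill of each role (the single entry graded 2).
PRIMARY = {role: next(s for s, level in skills.items() if level == 2)
           for role, skills in ROLES.items()}


def roles_requiring(skill):
    """Roles that need the given skill at any level."""
    return sorted(role for role, skills in ROLES.items() if skill in skills)


def breadth(role):
    """Number of distinct skill areas a role draws on."""
    return len(ROLES[role])


def needs(role_a, role_b):
    """Level at which role_a requires role_b's primary skill (0 if not at all)."""
    return ROLES[role_a].get(PRIMARY[role_b], 0)


skill_demand = {skill: len(roles_requiring(skill)) for skill in SKILLS}
widest_role = max(ROLES, key=breadth)
engineer_needs_domain = needs("data engineer", "domain expert")
domain_needs_engineer = needs("domain expert", "data engineer")
print(skill_demand, widest_role, engineer_needs_domain, domain_needs_engineer)
```

Under this encoding, domain expertise and statistics are each required by five roles and data preparation by four, the team leader spans all six skill areas, and the data engineer depends on domain expertise while the domain expert does not depend on data preparation, which mirrors the asymmetry described above.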
4.3 Culture
A scientific (as opposed to an engineering) approach to data science implies a certain culture.
For instance, as outcomes are by definition not known at the start (as opposed to
engineering where one starts with a predefined outcome), failure needs to be expected and
accepted. This requires a supportive leadership and cultural environment, with sufficient time
and resources to test new approaches and ideas, as well as a mechanism for implementing
good ideas. The primary focus of data science teams should be on developing proof of
concept prototypes. Accordingly, such teams should not be expected to deliver mature,
production-level products. Instead, separate software engineering teams should be engaged
for that purpose. A scientific approach also implies that organizational data science efforts
represent an iterative journey rather than a destination.
5. Applying the framework
Next, we use the above framework to illustratively evaluate three data science Master-level
degrees offered by three Australian universities. Two of those (case A and case C)
have a duration of two years full-time, while the third one (case B) has a duration of
one year full-time. Table II details relevant academic and professional admission
requirements for each case. Table III details courses comprising each degree and skills
developed in each course. The mapping between courses and applicable skills was based on
course descriptions provided on the universities' websites.
Table I. Primary and secondary skills
Table II. Admission requirements
Case | Academic | Professional
A | A Bachelor's degree in mathematics, computer science, physics, engineering, accounting, finance, or economics | At least three years of professional experience
B | An Honors degree, a graduate certificate, or a graduate diploma in mathematics, computer science, statistics, physics, engineering, economics, or finance | None
C | A Bachelor's degree in mathematics or information technology, or a graduate certificate/diploma in data science | None
Table III. Courses
Case | Course | Skill areas marked (of six)
A | Introduction to Data Science | 6
A | Statistics for Data Science | 4
A | Data and Algorithms | 5
A | Project Management | 2
A | Visualization and Communication | 1
A | Evidence-Based Decision Making | 4
A | Project 1 | 4
A | Project 2 | 6
A | Specialized Elective ×4 | 6
B | Introduction to Data Science | 4
B | Data Mining | 3
B | Elective ×2 | 4
B | Information Visualization | 2
B | Computational Statistics | 2
B | Capstone Project | 6
C | Big Data | 5
C | Programming for Data Science | 2
C | Elective ×2 | 5
C | Predictive Analytics | 4
C | Machine Learning | 2
C | Project 1 | 6
C | Social Media Analytics | 2
C | Customer Analytics | 3
C | Project 2 | 4
C | Advanced Analytics 1 | 2
C | Advanced Analytics 2 | 3
C | Capstone Project | 6
Notes: Each count is the number of skill areas, out of six, marked for that course. The skill areas are
DE, domain expertise; DP, data preparation; S, statistics; CS, computer science; C, communication;
M, management
Cases A and C have a reasonable coverage across the skill areas identified in our framework.
Case B, on the other hand, addresses domain expertise and management skill requirements
only in the final capstone project. The lack of coverage in case B may partly be attributed to
its shorter duration of one year full-time, in contrast to two years full-time for cases A and C.
Nevertheless, most courses, with the exception of electives, are generally very broad, thus
offering limited opportunity for specialization. This is particularly acute in case B, where the
two elective courses do not even have to be related to data science. It may be argued that this
limitation is somewhat offset by the academic admission requirements, which serve to select
students with quantitative expertise. However, given the lack of any communication and
management prerequisites for admission, it may be argued that cases B and C do not
provide sufficient opportunities for deep specialization in these skill areas. In contrast,
case A offers two courses specifically focused on project management, and visualization and
communication. Furthermore, four specialized case A elective courses provide an
opportunity for deeper specialization in several relevant skill areas, including research
methods, advanced statistics, databases, software development, ethics, law, and policy.
In addition, a variety of case studies are used to illustrate the relevance of domain
expertise. The admission requirement of at least three years of professional experience is also
useful for ensuring some familiarity with teamwork and management concepts. In case C, two
elective courses are used to provide an opportunity to students with an information
technology background to develop skills in statistics and probabilities, and to students with a
mathematics background to develop skills in relational databases and warehouses, as well as
business intelligence and analytics.
The above analysis indicates that these universities aim to produce quasi-unicorns. Given
the limited opportunity for specialization in the skill areas identified in our framework, prior
academic and professional experience becomes critical. Without deep expertise in any of the
roles identified in our framework, such graduates may not be able to effectively contribute to
multidisciplinary data science teams. Nevertheless, they may prove valuable to smaller
agencies and firms with limited resources who may have to rely on such quasi-unicorns.
6. Conclusion
While many universities are now offering degrees in data science, and many organizations are
seeking to hire individual data scientists, the findings presented in this paper suggest that it
may be more beneficial to view data science from a multidisciplinary team perspective.
The paper identified six key roles considered essential for an effective data science team, and
shared skills required for effective within-team interaction. The skills framework provides a
theoretical contribution that may be applied in practice to evaluate and improve the
composition of multidisciplinary data science teams and related training programs. However,
given that our findings have been based on a study with large government agencies with
relatively mature data science functions, they may not be directly transferable to less mature,
smaller, and less well-resourced agencies and firms, who may instead have to rely on individual
"unicorn" data scientists. Given that the illustrative case studies highlighted a potential gap in
opportunities for academic specialization in relation to the roles identified in our framework,
future studies may wish to explore how higher education institutions may effectively partner
with private and public organizations in order to address this potential problem.
Note
1. Those unfamiliar with the reference may wish to note that according to medieval lore unicorns are
only tamable by virgins who, as a result, may be used by hunters as unicorn bait.
References
Anderson, P., Bowring, J., McCauley, R., Pothering, G. and Starr, C. (2014), "An undergraduate degree in
data science: curriculum and a decade of implementation experience", Proceedings of the
45th ACM Technical Symposium on Computer Science Education, ACM, pp. 145-150.
Baškarada, S. (2014), "Qualitative case study guidelines", The Qualitative Report, Vol. 19 No. 40, pp. 1-25.
Baškarada, S. and Koronios, A. (2013), "Data, information, knowledge, wisdom (DIKW): a semiotic
theoretical and empirical exploration of the hierarchy and its quality dimension", Australasian
Journal of Information Systems, Vol. 18 No. 1, pp. 5-24.
Baškarada, S. and Koronios, A. (2014), "A critical success factor framework for information quality
management", Information Systems Management, Vol. 31 No. 4, pp. 276-295.
Baumer, B. (2015), "A data science course for undergraduates: thinking with data", The American
Statistician, Vol. 69 No. 4, pp. 334-342.
72
PROG
51,1
Downloaded by University of South Australia At 02:07 06 January 2018 (PT)
Bertolucci, J. (2013), Are you recruiting a data scientist, or unicorn?,InformationWeek, available at:
www.informationweek.com/big-data/big-data-analytics/are-you-recruiting-a-data-scientist-or-
unicorn/d/d-id/899843 (accessed November 12, 2015).
Cassella, M. and Morando, M. (2012), Fostering new roles for librarians: skills set for repository
managers results of a survey in Italy,Liber Quarterly, Vol. 21 Nos 3/4, pp. 407-428.
Chen, H., Chiang, R.H. and Storey, V.C. (2012), Business intelligence and analytics: from Big Data to
big impact,MIS Quarterly, Vol. 36 No. 4, pp. 1165-1188.
Corrall, S. (2010), Educating the academic librarian as a blended professional: a review and case
study,Library Management, Vol. 31 Nos 8/9, pp. 567-593.
Corrall, S. (2012), Roles and responsibilities: libraries, librarians and data, in Pryor, G. (Ed.), Managing
Research Data, Facet, London, pp. 141-151.
Corrall, S., Kennan, M.A. and Afzal, W. (2013), Bibliometrics and research data management services:
emerging trends in library support for research,Library Trends, Vol. 61 No. 3, pp. 636-674.
Cox, A.M. and Corrall, S. (2013), Evolving academic library specialties,Journal of the American
Society for Information Science and Technology, Vol. 64 No. 8, pp. 1526-1542.
Cox, A.M. and Pinfield, S. (2014), Research data management and libraries: current activities and
future priorities,Journal of Librarianship and Information Science, Vol. 46 No. 4, pp. 299-316.
Dhar, V. (2013), Data science and prediction,Communications of the ACM, Vol. 56 No. 12, pp. 64-73.
DiciccoBloom, B. and Crabtree, B.F. (2006), The qualitative research interview,Medical Education,
Vol. 40 No. 4, pp. 314-321.
Finzer, W. (2013), The data science education dilemma,Technology Innovations in Statistics
Education, Vol. 7 No. 2, pp. 1-9.
Glaser, B.G. (1965), The constant comparative method of qualitative analysis,Social Problems, Vol. 12
No. 4, pp. 436-445.
Hardin, J., Hoerl, R., Horton, N.J., Nolan, D., Baumer, B., Hall-Holt, O., Murrell, P., Peng, R., Roback, P.
and Temple Lang, D. (2015), Data science in statistics curricula: preparing students to
think with data’”,The American Statistician, Vol. 69 No. 4, pp. 343-353.
Harris, H., Murphy, S. and Vaisman, M. (2013), Analyzing the Analyzers: An Introspective Survey of Data
Scientists and Their Work,OReilly Media.
Herschel, G., Linden, A. and Duncan, A.D. (2015), Seven Best Practices for Your Big Data Analytics
Projects, Gartner, Stamford, CT.
Heudecker, N. and White, A. (2014), The Data Lake Fallacy: All Water and Little Substance, Gartner,
Stamford, CT.
Laney, D., Kart, L., Jain, A. and Linden, A. (2015), How Data Scientist Skills and Qualifications Differ
from Those of BI Analysts and Statisticians, Gartner, Stamford, CT.
Lee, K. and Mirchandani, D. (2010), Dynamics of the importance of IS/IT skills,Journal of Computer
Information Systems, Vol. 50 No. 4, pp. 67-78.
Linden, A., Kart, L., Randall, L., Beyer, M.A. and Duncan, A.D. (2015), Staffing Data Science Teams,Gartner,
Stamford, CT.
Loukides, M. (2011), What is Data Science? OReilly Media, Inc.
Lyon, L. (2012), The informatics transform: re-engineering libraries for the data decade,International
Journal of Digital Curation, Vol. 7 No. 1, pp. 126-138.
Madrid, M.M. (2013), A study of digital curator competences: a survey of experts,The International
Information & Library Review, Vol. 45 Nos 3/4, pp. 149-156.
Patil, T. and Davenport, D. (2012), Data scientist: the sexiest job of the 21st century,Harvard Business
Review, available at: https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
(accessed December 11, 2015).
Press, G. (2015), The hunt for unicorn data scientists lifts salaries for all data analytics professionals,
Forbes, available at: www.forbes.com/sites/gilpress/2015/10/09/the-hunt-for-unicorn-data-
scientists-lifts-salaries-for-all-data-analytics-professionals/ (accessed November 12, 2015).
73
Unicorn data
scientist
Downloaded by University of South Australia At 02:07 06 January 2018 (PT)
Provost, F. and Fawcett, T. (2013), Data science and its relationship to Big Data and data-driven
decision making,Big Data, Vol. 1 No. 1, pp. 51-59.
Pryor, G. and Donnelly, M. (2009), Skilling up to do data: whose role, whose responsibility, whose
career?,International Journal of Digital Curation, Vol. 4 No. 2, pp. 158-170.
Randall, L. and Beyer, M.A. (2014), Data Preparation is Not an Afterthought, Gartner, Stamford, CT.
Stevens, D., Totaro, M. and Zhu, Z. (2011), Assessing IT critical skills and revising the MIS
curriculum,The Journal of Computer Information Systems, Vol. 51 No. 3, pp. 85-95.
Stodder, D. (2015), Chasing the data science unicorn,TDWI, available at: https://tdwi.org/articles/
2015/01/06/chasing-the-data-science-unicorn.aspx (accessed November 12, 2015).
Swan, A. and Brown, S. (2008), The skills, role and career structure of data scientists and curators:
an assessment of current practice and future needs, Report to the JISC, Key Perspectives,
Playing Place.
Vassilakaki, E. and Moniarou-Papaconstantinou, V. (2015), A systematic literature review informing library
and information professionalsemerging roles,New Library World, Vol. 116 Nos 1/2, pp. 37-66.
Waller, M.A. and Fawcett, S.E. (2013), Data science, predictive analytics, and Big Data: a revolution
that will transform supply chain design and management,Journal of Business Logistics, Vol. 34
No. 2, pp. 77-84.
Wang, J. and Gu, L. (2016), Challenges of teaching data science in a business school,Issues in
Information Systems, Vol. 17 No. 3, pp. 209-217.
Xia, J. and Wang, M. (2014), Competencies and responsibilities of social science data librarians:
an analysis of job descriptions,College & Research Libraries, Vol. 75 No. 3, pp. 362-388.
Yang, L. and Liu, X. (2013), Teaching business analytics,Frontiers in Education Conference IEEE,
IEEE, pp. 1516-1518.
Appendix. High-level interview questions
(1) Could you please tell us about your organization/agency?
(2) Could you please tell us about your group/team?
    - How many members?
    - What are their skills/roles?
    - How do they work together?
(3) Could you please tell us about your role in your group/team?
(4) What are some of the key challenges facing your group/team?
(5) What do you see as potential future opportunities for your group/team?
(6) What do you look for when you hire data scientists?
(7) What are your thoughts on individual data scientists who excel in all the required skills?
    - Have you come across any/many such individuals?
Corresponding author
Saša Baškarada can be contacted at: baskarada@gmail.com
... There is a multitude of studies conducted on data science skills in various distinct domains, apart from studies conducted in general or business domains (Baškarada & Koronios, 2017;Costa & Santos, 2017;Gardiner et al., 2018;Jiang & Chen, 2022;Verma et al., 2019). Unfortunately, in big data and data science, despite being an extremely important phenomenon, literature is scarce on these topics in tourism including literature on "data science skills." ...
... Domain knowledge can also be defined as "any information that is not explicitly present in data" (Piatetsky-Shapiro, 1990, p. 68). Domain knowledge is required by the data scientist to interpret raw data into meaningful information (Baskarada and Coronios, 2013 as cited in Baškarada & Koronios, 2017). Most data science projects begin as a real-world domainspecific problem that can be solved using a data-driven approach; as a result, a data scientist needs to have domain expertise that allows him/her to understand the problem, its criticality, and design a data-driven solution, which effectively resolves the issue (Kelleher & Tierney, 2018). ...
... These methods are used throughout the data science process for data investigation and in the comparison of different models and analyses produced during data science projects (Kelleher & Tierney, 2018, p. 23). In data science, the statistical approach needs to be more applied; hence, in addition to traditional skills like hypothesis testing or experimental design, these statisticians require a clear understanding of skills that are at the intersection of computer science and statistics (Baškarada & Koronios, 2017). Machine learning (ML) involves the application of many advanced statistical and computing techniques; however, a data scientist involved in the applied aspects of ML need not write his/her version of the ML algorithm (Kelleher & Tierney, 2018, pp. ...
Chapter
Data scientists have become one of the coolest professions of the twenty-first century. Amid the rising and unmet demand for data scientists, organizations are offering the most lucrative salary packages possible to this profession, and education institutions are scrambling to offer data science courses. On the other hand, despite tourism being a highly information-cum-data intensive sector, where big data and digitalization increasingly play a greater role, the literature on data science is extremely limited. Therefore, the present study has attempted to scholarly postulate the needs, skills, and scope of data scientists in the tourism sector. The needs section is built upon three types of trends such as megatrends, micro-trends, and sectoral trends, which justify the need for data scientists in tourism. A general notion of the data scientist skillset, the main focus of this study, is conceptualized in the next section. The present study also calls for future in-depth research studies on the aspect of data science skills especially in the context of the tourism domain. Lastly, the scope of data scientists in the tourism field has been framed across the business sector, research sector, governance sector, and smart tourism, including the future growth prospects and the importance of these skills in human lives and society.
... observes that most accredited LIS programmes are now offering courses and programmes that are focusing on "information" rather than "libraries". This, according to the author, is advantageous as it opens more opportunities for LIS graduates to work in various diverse departments and organisations as opposed to only being inclined to the library [2]. indicate that organisations and institutions are not fully aware of the requirements, roles and skills, needed for data scientists, hence, they seek professionals who can do it all, and who are able to "to excel in a wide range of traditionally distinct disciplines, as unicorns (Stodder, 2015; Bertolucci, 2013)" (cited in Ref. [2]; p. 65). ...
... This, according to the author, is advantageous as it opens more opportunities for LIS graduates to work in various diverse departments and organisations as opposed to only being inclined to the library [2]. indicate that organisations and institutions are not fully aware of the requirements, roles and skills, needed for data scientists, hence, they seek professionals who can do it all, and who are able to "to excel in a wide range of traditionally distinct disciplines, as unicorns (Stodder, 2015; Bertolucci, 2013)" (cited in Ref. [2]; p. 65). ...
Article
Full-text available
Libraries are currently undergoing drastic changes; these changes are a result of the proliferation of advanced technology, change in users’ information-seeking behaviour and equally the diversity of information resources. As such, libraries and librarians are no longer enjoying the monopoly they used to enjoy as the sole providers of information. With the new changes, libraries are expected not only to be the custodians of information resources, but also facilitators of the same. This new role calls for libraries and librarians to have adequate skills and knowledge in a wide range of subjects that can enable them to survive the competitive environment. This study aims at establishing effective ways of incorporating business courses into LIS programmes in universities in Hungary as a strategy for enhancing economic development and sustainability in the country. The study used a literature review approach in analysing the implementation of business courses in Library and Information Sciences (LIS) programmes among the ALA (American Library Association) accredited programmes. The study established correlations between various ALA-accredited programmes that had incorporated business courses in their programmes. Using ALA-accredited programmes as a model, the study sought to analyse an appropriate model for restructuring LIS programmes in Hungary. From the findings, it was revealed that most ALA-accredited programmes had embraced various business courses in their programmes, although, it was noted that the majority of the programmes had business courses as electives. It was also observed that various titles of business courses amongst the ALA programmes were diverse and varied. From the analysis of this study, it was established that the incorporation of business courses in the LIS programme is beneficial, since most universities, globally, are trending towards the concept of entrepreneurial universities. 
However, there needs to be an appropriate strategy of ensuring that the courses chosen are market driven.
... The requirement for building multidisciplinary teams is not new in the field, this was already concluded by Baškarada and Koronios (2017). What is surprising, is the fact that although we have standardized methods and tools, there is some evidence that these are not being used by data science teams (Martinez et al., 2021;Khalajzadeh et al., 2020). ...
... This would expand the scope of the methodologies to cover the entire ML lifecycle, including MLOps oriented concerns such as memory and compute resource management and optimization, quality assurance, model governance and -retraining, and monitoring of live models. (Ferrero et al., 2020;Verma et al., 2021, Zhang et al., 2020Mao et al., 2019;Piorkowski et al., 2021;van Stijn, nd., Staron et al., 2021;Hukkelberg and Berntzen, 2019;Hind et al., 2019;McDavid et al., 2021;Khalajzadeh et al., 2020;Wang et al., 2019) Collaborative practices 12 (Ferrero et al., 2020;Staron et al., 2021;Verma et al., 2021;Zhang et al., 2020;Mao et al., 2019;Piorkowski et al., 2021;Park et al., 2021;Hukkelberg and Berntzen, 2019;McDavid et al., 2021;Martinez et al., 2021;Martín-Noguerol et al., 2021;Wang et al., 2019) Hindrances to teamwork 10 (Arpteg et al., 2018;Zhang et al., 2020;Mao et al., 2019;Piorkowski et al., 2021;Park et al., 2021;Hukkelberg and Berntzen, 2019;Hind et al., 2019;Martinez et al., 2021;Amershi et al., 2019;Passi and Jackson, 2018) Workflow 8 (Verma et al., 2021;Zhang et al., 2020;Mao et al., 2019;Piorkowski et al., 2021;Antoniou and Mamdani, 2021;McDavid et al., 2021;Martinez et al., 2021;Khalajzadeh et al., 2020) (Baškarada and Koronios, 2017;Ferrero et al., 2020) Although the collected literature mostly deals with the work of data scientists, this does not mean that none of it is relevant to MLOps. The above lifecycle models (Martinez et al., 2021), or common workflows, can be considered as critical tools for a discipline that aims to standardize the entire ML lifecycle. ...
Article
Full-text available
Machine learning operations (MLOps) is an emerging and complex subject area involving experts from several fields and backgrounds. Its main purpose is to enable a more standardized and effective approach to building and maintaining machine learning systems. Machine learning projects have an extremely high failure rate. One of the reasons behind this is the lack of teams designed for building these systems. At the same time, machine learning projects can carry great business risks. This paper takes a scoping review approach in assessing the state of the current literature about multidisciplinary teamwork within the context of MLOps. Most of the literature reviewed on collaboration and teamwork focuses on the intimately related field of data science. These articles are analyzed, and a synthesis is presented of the gaps in the current literature for collaboration within data science. Recommendations for further research directions are given for MLOps.
... Different studies are trying to establish what is the optimum level of mutual understanding between data scientists and doctors, [14,15,16]. It is obvious that a data scientist should not expect a medical doctor to have knowledge regarding model development if understandable terms are not used in the communications. ...
... This communication system focuses on three pillars: the committee of doctors and AI models that include deep learning and statistical learning algorithm, statistical analysis, and operational research, [16][17][18]. The model is highrisk because it includes the communication of people that come from different backgrounds (medicine and computer science) on a very sensitive and emotional subject, but it is also high-gain. ...
Article
Full-text available
The last two years have taught us that we need to change the way we practice medicine. Due to the COVID-19 pandemic, obstetrics and gynecology setting has changed enormously. Monitoring pregnant women prevents deaths and complications. Doctors and computer data scientists must learn to communicate and work together to improve patients’ health. In this paper we present a good practice example of a competitive/collaborative communication model for doctors, computer scientists and artificial intelligence systems, for signaling fetal congenital anomalies in the second trimester morphology scan.
... Despite the promises of self-service and empowerment that come with commercial dashboarding systems, we heard from many of our participants that the process of making decisions from complex data within a large organization requires specialized knowledge and skills. This echoes arguments that encourage organizations to hire teams of specialists rather than chase after unicorns [12,20,40]. Dashboarding systems, however, are traditionally designed with single users in mind. ...
Article
Full-text available
Many long-established, traditional manufacturing businesses are becoming more digital and data-driven to improve their production. These companies are embracing visual analytics in these transitions through their adoption of commercial dashboarding systems. Although a number of studies have looked at the technical challenges of adopting these systems, very few have focused on the socio-technical issues that arise. In this paper, we report on the results of an interview study with 17 participants working in a range of roles at a long-established, traditional manufacturing company as they adopted Microsoft Power BI. The results highlight a number of socio-technical challenges the employees faced, including difficulties in training, using and creating dashboards, and transitioning to a modern digital company. Based on these results, we propose a number of opportunities for both companies and visualization researchers to improve these difficult transitions, as well as opportunities for rethinking how we design dashboarding systems for real-world use.
... As companies' hiring data scientists find that it is difficult to find a so-called "unicorn data scientist" [1], we conducted our experiments and analysis using companies' job postings for a data scientist position, job seekers' CVs for that position, and a curriculum from a master's program in data science. However, our investigated methods and our final recommendation system can be applied to other job positions as well. ...
Article
Full-text available
Skills are the common ground between employers, job seekers and educational institutions which can be analyzed with the help of artificial intelligence (AI), specifically natural language processing (NLP) techniques. In this paper we explore a state-of-the-art pipeline that extracts, vectorizes, clusters, and compares skills to provide recommendations for all three players—thereby bridging the gap between employers, job seekers and educational institutions. As companies hiring data scientists report that it is increasingly difficult to find a so-called "unicorn data scientist" [1], we conduct our experiments and analysis using companies’ job postings for a data scientist position, job seekers’ CVs for that position, and a curriculum from a master's program in data science. However, our investigated methods and our final recommendation system can be applied to other job positions as well. Our best system combines Sentence-BERT [2], UMAP [3], DBSCAN [4], and K-means clustering [5]. To also evaluate feedback from potential users, we conducted a survey, in which the majority of employers’, job seekers’ and educational institutions’ representatives state that with the help of our automatic recommendations, processes related to skills are more effective, faster, fairer, more explainable, more autonomous and more supported.
... Recent years have brought more widespread application of data science in many fields -from business and economics, through public policy, to science and education (Voulgaris, 2014). The acquired body of knowledge can help describe what happened, explain why something happened, and predict what might happen (Baškarada & Koronios, 2017), contributing to the optimization and increase in the efficiency of the processes (Granville, 2014). ...
Article
Full-text available
Research data management (RDM) poses a significant challenge for academic organizations. The creation of library research data services (RDS) requires assessment of their maturity, i.e., the primary objective of this study. Its authors have set out to probe the nationwide level of library RDS maturity, based on the RDS maturity model, as proposed by Cox et al. (2019), while making use of natural language processing (NLP) tools, typical for big data analysis. The secondary objective consisted in determining the actual suitability of the above-referenced tools for this particular type of assessment. Web scraping, based on 72 keywords, and completed twice, allowed the authors to select from the list of 320 libraries that run RDS, i.e., 38 (2021) and 42 (2022), respectively. The content of the websites run by the academic libraries offering a scope of RDM services was then appraised in some depth. The findings allowed the authors to identify the geographical distribution of RDS (academic centers of various sizes), a scope of activities undertaken in the area of research data (divided into three clusters, i.e., compliance, stewardship, and transformation), and overall potential for their prospective enhancement. Although the present study was carried within a single country only (Poland), its protocol may easily be adapted for use in any other countries, with a view to making a viable comparison of pertinent findings.
Purpose Data science promises new opportunities for organizational decision-making. Data scientists arguably play an important role in this regard and one can even observe a certain “buzz” around this nascent occupation. This paper enquires into how data scientists construct their occupational identity and the challenges they experience when enacting it. Design/methodology/approach Based on semi-structured interviews with data scientists working in different industries, the authors explore how these actors draw on their educational background, work experiences and perception of the contemporary digitalization discourse to craft their occupational identities. Findings The authors identify three main components of data scientists’ occupational identity: a scientific mindset, an interest in sophisticated forms of data work and a problem-solving attitude. The authors demonstrate how enacting this identity is sometimes challenged through what data scientists perceive as either too low or too high expectations that managers form towards them. To address those expectations, they engage in outward-facing identity work by carrying out educational work within the organization and (paradoxically) stressing both prestigious and non-prestigious parts of their work to “tame” the ambiguity and hype they perceive in managers’ expectations. In addition, they act upon themselves to better appreciate managers’ perspectives and expectations. Originality/value This study contributes to research on data scientists as well as the accounting literature that often refers to data scientists as new competitors for accountants. It cautions scholars and practitioners alike to be careful when discussing the possibilities and limitations of data science concerning advancements in accounting and control.
Chapter
Usually employers, job seekers and educational institutions use AI in isolation from one another. However, skills are the common ground between these three parties which can be analyzed with the help of AI. Employers want to automatically check which of their required skills are covered by applicants’ CVs and know which courses their employees can take to acquire missing skills. Job seekers want to know which skills from job postings are missing in their CV and which study programs they can take to acquire missing skills. In addition, educational institutions want to make sure that skills required in job postings are covered in their curricula, and they want to recommend study programs. Consequently, we investigated several natural language processing techniques to extract, vectorize, cluster and compare skills, thereby connecting and supporting employers, job seekers and educational institutions. Our application Skill Scanner uses our best algorithms and outputs statistics and recommendations for all groups. The results of our survey demonstrate that the majority finds that with the help of Skill Scanner, processes related to skills are carried out more effectively, faster, fairer, more explainably, and in a more supported manner. In total, 89% of all participants are not averse to apply our recommendation system for their tasks, and 67% of job seekers would certainly use it.
Chapter
Face recognition is one of the most popular applications in video surveillance systems and computer vision. The researches of face recognition in recent years have been shown that their applications are widely used in practice. Particularly, during the pandemic of Covid-19, there were a lot of researches relating to face recognition with and without mask. The accuracy of the face recognition algorithms is depended on technical issues, implemented solutions and models of data processing. In this paper, we propose an improved method for face recognition based on deep learning techniques and data augmentation. Our contribution of the proposed method is focused on the following steps: (1) obtaining and pre-processing data for training dataset based on image processing techniques (i.e. noise removal, mask wearing). (2) Creating a trained model of new dataset based on the Inception Resnet-v1. (3) Building an application for face recognition in timekeeping of a company. We use the two popular face datasets which are open source and publicity available: Casia-WebFace [1] for training and LFW [2] for validation. Comparing the several methods, the accuracy of our method is higher in case with mask and the processing time is very fast in the real time.KeywordsFace detectionFace recognitionDeep learningCNN modelsInception-resnet data augmentation
Chapter
Full-text available
Data science, a new discovery paradigm, is potentially one of the most significant advances of the early twenty-first century. Originating in scientific discovery, it is being applied to every human endeavor for which there is adequate data. While remarkable successes have been achieved, even greater claims have been made. Benefits, challenge, and risks abound. The science underlying data science has yet to emerge. Maturity is more than a decade away. This claim is based firstly on observing the centuries-long developments of its predecessor paradigms—empirical, theoretical, and Jim Gray’s Fourth Paradigm of Scientific Discovery (Hey et al., The fourth paradigm: data-intensive scientific discovery Edited by Microsoft Research, 2009) (aka eScience, data-intensive, computational, procedural)—and secondly on my studies of over 150 data science use cases, several data science-based startups, and, on my scientific advisory role for Insight (https://www.insight-centre.org/), a Data Science Research Institute (DSRI) that requires that I understand the opportunities, state of the art, and research challenges for the emerging discipline of data science. This chapter addresses essential questions for a DSRI: What is data science? What is world-class data science research? A companion chapter (Brodie, On Developing Data Science, in Braschler et al. (Eds.), Applied data science – Lessons learned for the data-driven business, Springer 2019) addresses the development of data science applications and of the data science discipline itself.
Article
Full-text available
Purpose The purpose of this paper is to discuss strategies for maximizing organizational absorptive capacity. Design/methodology/approach The views presented here have been derived from authors’ extensive research and professional experience. Support for the claims made is provided through anecdotal evidence and related literature. Findings The viewpoint discusses how organizational absorptive capacity may be maximized through actions and interactions of a wide range of individual, managerial, organizational, and inter-organizational factors. Originality/value The viewpoint may assist practitioners with developing strategies for improving vicarious learning. From a theoretical perspective, the claims made in the paper present fertile ground for future empirical testing.
Article
Full-text available
U.S. business executives and educators need to be continuously aware of the knowledge and skills required for IS/IT professionals to meet current and future technological trends. This paper attempts to investigate the dynamics of the importance of IS/IT skills from the perspective of 70 IS/IT managers using latent growth curve modeling. The overall results suggest that 1) the importance of most IS/IT skills is continually increasing over time, 2) that wireless communications and applications, mobile commerce applications and protocols, IS security, Web applications, services, and protocols, and data management are the top five rapidly growing skills; 3) that IS security, data management, project management and other business skills, Web applications, services, and protocols, and wireless communications and applications are expected to be the most important five skills in the future. Based on these results, implications and recommendations for IS/IT educators, researchers, and practitioners are provided.
Article
Full-text available
Business intelligence and analytics (BI&A) has emerged as an important area of study for both practitioners and researchers, reflecting the magnitude and impact of data-related problems to be solved in contemporary business organizations. This introduction to the MIS Quarterly Special Issue on Business Intelligence Research first provides a framework that identifies the evolution, applications, and emerging research areas of BI&A. BI&A 1.0, BI&A 2.0, and BI&A 3.0 are defined and described in terms of their key characteristics and capabilities. Current research in BI&A is analyzed and challenges and opportunities associated with BI&A research and education are identified. We also report a bibliometric study of critical BI&A publications, researchers, and research topics based on more than a decade of related academic and industry publications. Finally, the six articles that comprise this special issue are introduced and characterized in terms of the proposed BI&A research framework.
Reviews opportunities and challenges for libraries and librarians in the research data arena, with reference to published reports and case studies of emerging practice, supplemented by evidence from university and library websites. Looks at connections between research data management (RDM) and established library roles and responsibilities to explore whether RDM represents an incremental step in professional practice or a paradigm shift in collection development and service delivery requiring fundamental rethinking of roles, responsibilities, and competencies to create “next-generation librarianship,” drawing on experiences and opinions of practitioners in the field. Also discusses professional education and continuing development needs for library engagement with research data, referring particularly to initiatives in the USA.
Although widely used, the qualitative case study method is not well understood. Due to conflicting epistemological presuppositions and the complexity inherent in qualitative case-based studies, scientific rigor can be difficult to demonstrate, and any resulting findings can be difficult to justify. For that reason, this paper discusses methodological problems associated with qualitative case-based research and offers guidelines for overcoming them. Due to its nearly universal acceptance, Yin's six-stage case study process is adopted and elaborated on. Moreover, additional principles from the wider methodological literature are integrated and explained. Finally, some modifications to the dependencies between the six case study stages are suggested. It is expected that following the guidelines presented in this paper may facilitate the collection of the most relevant data in the most efficient and effective manner, simplify the subsequent analysis, as well as enhance the validity of the resulting findings. The paper should be of interest to students (honour, masters, doctoral), academics, and practitioners involved with conducting and reviewing qualitative case-based studies.
The aim of this research was to define competences for digital curators and to validate a Delphi process in the context of Library, Archives, and Museum (LAM) curriculum development. The objective of the study was to obtain consensus regarding competence statements for LAM digital curators. The Delphi method, a research technique typically used to develop a consensus of opinion on topics for which there is little previously documented knowledge, was used to specify digital curator competences in the LAM context. Three rounds of questionnaires, with controlled feedback and space for comments and suggestions, were sent to panel members. A five-point Likert scale was employed in the questionnaire. Consensus was reached when a competence statement received a mode higher than 3, a mean greater than 3.5, and a standard deviation smaller than 1.0. Response rates for rounds I, II, and III were 70% (n = 16), 87.5% (n = 14), and 94% (n = 15), respectively. Of the 18 digital curator competences listed in the first-round questionnaire, 13 (70%) achieved consensus as necessary competences for an advanced-level digital curator. Respondents' other input, such as comments and suggestions, was also analyzed. An additional 23 digital curator competence statements were suggested by the panel in round I and further developed in subsequent rounds. In round II, 12 (30%) competence statements achieved consensus. The final round and editing of competence statements led to 20 statements that describe what a well-prepared digital curator trained to participate in digital curation work should be able to do.