Towards Interactive Recommender Systems with the Doctor-in-the-Loop

Andreas Holzinger (1), André Calero Valdez (1, 2), Martina Ziefle (2)
(1) Holzinger Group, HCI-KDD, Institute for Medical Informatics, Medical University Graz
(2) Human-Computer Interaction Center, RWTH-Aachen University
Abstract
Recommender systems are a perfect example of automatic Machine Learning (aML), which is the
fastest growing field in computer science generally and in health informatics specifically. The
general goal of ML is to develop algorithms which can learn and improve over time and can be used
for predictions and decision support, which is of central interest to health informatics. Whilst
automatic approaches greatly benefit from big data with many training sets, experts in the health
domain are often confronted with a small number of complex data sets or with rare events, where
aML approaches suffer from insufficient training samples. Here interactive Machine Learning (iML)
may be of help, which can be defined as “algorithms that can interact with agents and can optimize
their learning behaviour through these interactions, where the agents can also be human”. Such a
human can be an expert, for example a medical doctor, and this “doctor-in-the-loop” can be
beneficial in solving computationally hard problems, e.g., subspace clustering, protein folding, or
k-anonymization of health data, where human expertise can help to reduce an exponential search
space through heuristic selection of samples. Consequently, what would otherwise be an NP-hard
problem can be greatly reduced in complexity through the input and assistance of a human expert
agent involved in the learning phase. Important future research lies in the combined use of human
and computer intelligence in the context of hybrid multi-agent recommender systems, which can also
harness crowdsourcing for joint decision making; this can be very helpful, e.g., in the diagnosis
and treatment of rare diseases.
Published by the Gesellschaft für Informatik e.V. 2016 in
B. Weyers, A. Dittmar (eds.):
Mensch und Computer 2016 Workshopbeiträge, 4-7 September 2016, Aachen.
Copyright © 2016 held by the authors.
http://dx.doi.org/10.18420/muc2016-ws11-0001
1 Introduction
Originally the term machine learning (ML) was defined as “... artificial generation of
knowledge from experience”, and the first studies were performed with games, namely with
the game of checkers (Samuel, 1959). Today, ML is the fastest growing technical field, at the
intersection of informatics and statistics, tightly connected with data science and knowledge
discovery; and health informatics is among the greatest challenges (Jordan & Mitchell,
2015), (Le Cun, Bengio & Hinton, 2015), (Lake, Salakhutdinov & Tenenbaum, 2015).
In daily life we often have to make decisions without sufficient experience or personal
background knowledge of the alternatives, so we rely on the recommendations of other
people. Recommendations are therefore a firm part of natural human social interaction
(Taraghi, Grossegger, Ebner & Holzinger, 2013), (Desrosiers & Karypis, 2011).
Apart from these naïve daily observations, decision making and decision support is the supreme
discipline in health informatics and the core competency in the health sciences (McNeil,
Keeler & Adelstein, 1975), (Croskerry & Nimmo, 2011), (Holzinger, 2014).
Recommender systems primarily assist and augment these natural social processes. In a
typical recommender system people provide recommendations as inputs, which the system
then aggregates and directs to appropriate recipients. In some cases, the primary
transformation is in the aggregation; in others the system’s value lies in its ability to make
good matches between the recommenders and those seeking recommendations (Resnick &
Varian, 1997).
2 Background and Related Work
The underlying theory of recommender systems lies in collaborative filtering. The idea stems
from Xerox PARC, where it was first applied in the Tapestry system (Goldberg, Nichols, Oki &
Terry, 1992).
2.1 Collaborative Filtering
Collaborative filtering (CF) allows users to tag content so that other users benefit from
this tagging. Beyond manual tagging, automatic usage-based tagging can be applied.
Consequently, CF may be considered a special case of usage mining, which relies on
previous recommendations by other users in order to predict which among a set of items are
most interesting for the current user (Srivastava, Cooley, Deshpande & Tan, 2000). This
helps to answer the question “what is interesting?” (Miller & Sittig, 1990), (Silvia, 2005),
which is, together with the question “what is relevant?”, among the grand research questions
in decision making and decision support, a growing research area in both machine
learning and health informatics (Tulabandhula & Rudin, 2014), (Holzinger, 2014).
Systems following such approaches are generally called recommender systems: humans
provide recommendations as inputs, which the system then aggregates and directs to
appropriate recipients. In some cases, the primary transformation is in the aggregation; in
others the system’s value lies in its ability to make good matches between the recommenders
and those seeking recommendations (Resnick & Varian, 1997).
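To make the aggregation step concrete, the following is a minimal sketch of user-based collaborative filtering, assuming a small in-memory rating table and cosine similarity between users. The user names, item names, ratings and the function names `cosine` and `predict` are hypothetical and serve only as illustration; the similarity measure is one common choice, not one prescribed by the cited works.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: recommendation strength on a 1-5 scale}.
ratings = {
    "alice": {"paper_a": 5, "paper_b": 3, "paper_c": 4},
    "bob":   {"paper_a": 4, "paper_b": 2, "paper_c": 5, "paper_d": 5},
    "carol": {"paper_a": 1, "paper_b": 5, "paper_d": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item, ratings):
    """Similarity-weighted average of the other users' ratings for `item`.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        w = cosine(ratings[user], their)
        num += w * their[item]
        den += abs(w)
    return num / den if den else None

# Predicted interest of "alice" in an item she has not rated yet.
score = predict("alice", "paper_d", ratings)
```

In this sketch the recommendations of users who resemble the current user count more strongly, which is the sense in which the system "aggregates and directs" recommendations to appropriate recipients.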
Recommender systems are part of the daily experience of most Web users today and
have manifold applications in different domains, ranging from e-commerce (product
recommendations) to science (e.g. recommending papers to reviewers), with seemingly endless
application possibilities. In contexts with increasing information overload, people have to use
a variety of strategies to make choices (Jannach, Zanker, Felfernig & Friedrich, 2010).
Technically, recommender systems store a data table that records for each user/item pair
whether the user has made a recommendation for the item or not, and also the strength of the
recommendation. Typical approaches are content-based, user-based or hybrid.
Bayesian classifiers, non-negative matrix factorization or singular value
decomposition are used to cluster documents, users and interests for recommendations.
These numerical approaches are extended by machine learning approaches, e.g. deep
learning (Wang, Wang & Yeung, 2015) or evolutionary computing
(da Silva, Camilo, Pascoal & Rosa, 2016).
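The user/item table and its factorization can be sketched as follows, assuming NumPy is available. The matrix `R`, the rank `k` and the helper `low_rank_scores` are hypothetical. Treating a missing recommendation as the value 0 is a simplifying assumption of this sketch; practical systems usually handle missing entries explicitly.

```python
import numpy as np

# Hypothetical user/item table: rows are users, columns are items, entries are
# recommendation strengths (0 = no recommendation recorded for that pair).
R = np.array([
    [5.0, 4.0, 0.0, 1.0, 0.0],
    [4.0, 5.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0, 0.0],
    [1.0, 0.0, 4.0, 5.0, 1.0],
])

def low_rank_scores(R, k):
    """Rank-k reconstruction of R via truncated singular value decomposition."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Two latent factors stand in for groups of users with similar interests.
scores = low_rank_scores(R, k=2)

# For user 0, pick the not-yet-recommended item with the highest reconstructed score.
unseen = np.where(R[0] == 0)[0]
best = unseen[np.argmax(scores[0, unseen])]
```

The low-rank reconstruction fills the empty cells with values inferred from users and items that share latent factors, which is how such decompositions group users and interests.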
However, understanding how a neural network completes its task is still hard, if not
impossible. Visual interpretations of neuron-feature relationships have been impressively
demonstrated using feed-forward networks in Google's Deep Dream project. Using
high-dimensional non-visual data makes this task far more complicated, and there are
many open research routes for future work.
2.2 Interactive Machine Learning with the human-in-the-loop
Interactive Machine Learning (iML) can be defined as algorithms that can interact with both
computational agents and human agents and can optimize their learning behaviour through
these interactions (Holzinger, 2016b), (Holzinger, 2016a). In active learning such agents are
called oracles (Settles, 2011).
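The role of the oracle can be illustrated with a minimal sketch of pool-based active learning, assuming a one-dimensional threshold concept. The function `oracle` is a stand-in for the human expert, and its hidden threshold, the pool and the query budget are hypothetical. The learner always queries the sample about which it is most uncertain, so the expert labels only a handful of samples instead of the whole pool.

```python
def oracle(x):
    """Stand-in for the human expert: labels a sample on request.

    The hidden threshold 0.62 is a hypothetical ground truth.
    """
    return x >= 0.62

def active_learn_threshold(pool, ask, budget):
    """Pool-based active learning of a 1-D threshold classifier.

    Each round queries the unlabeled sample nearest to the current boundary
    estimate (the most uncertain one) and shrinks the interval in which the
    true threshold can lie.
    """
    lo, hi = min(pool), max(pool)  # the threshold is assumed to lie in [lo, hi]
    unlabeled = sorted(pool)
    queries = 0
    while queries < budget:
        candidates = [x for x in unlabeled if lo < x < hi]
        if not candidates:
            break
        boundary = (lo + hi) / 2
        x = min(candidates, key=lambda c: abs(c - boundary))
        unlabeled.remove(x)
        queries += 1
        if ask(x):
            hi = x  # positive label: the threshold is at or below x
        else:
            lo = x  # negative label: the threshold is above x
    return (lo + hi) / 2, queries

pool = [i / 100 for i in range(101)]
estimate, used = active_learn_threshold(pool, oracle, budget=10)
```

For this simple concept the number of expert queries grows only logarithmically with the pool size, a small-scale analogue of a human reducing a large search space through well-chosen samples.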
2.3 When is the human-in-the-loop beneficial?
There is evidence that humans sometimes still outperform ML algorithms, e.g., in the
instinctive, often almost instantaneous interpretation of complex patterns, for example in
diagnostic radiologic imaging: A promising technique to fill the semantic gap is to adopt an
expert-in-the-loop approach, to integrate the physician’s high-level expert knowledge into
the retrieval process by acquiring his/her relevance judgments regarding a set of initial
retrieval results (Akgul et al., 2011). Despite these findings, so far there is little
quantitative evidence on the effectiveness and efficiency of iML algorithms. Moreover, there is
practically no evidence on how such interaction may actually optimize these algorithms, even
though “natural” intelligent agents are present in large numbers in our world and have been
studied by cognitive scientists for quite a while (Gigerenzer & Gaissmaier, 2011). A very recent
work builds probabilistic kernel machines that encapsulate human support and
inductive biases, because state-of-the-art ML algorithms perform badly on a number of
extrapolation problems which would otherwise be very easy for humans to solve (Wilson,
Dann, Lucas & Xing, 2015).
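One classic way to feed an expert's relevance judgments back into a retrieval process is a Rocchio-style query update, sketched below. This is an illustrative stand-in and not necessarily the method of the cited work; the feature vectors, image names and weights `alpha`, `beta` and `gamma` are hypothetical.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant and away from non-relevant results."""
    dim = len(query)

    def centroid(vectors):
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    pos, neg = centroid(relevant), centroid(non_relevant)
    return [alpha * query[i] + beta * pos[i] - gamma * neg[i] for i in range(dim)]

def dot(a, b):
    """Inner product used here as the retrieval score."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical 3-feature image descriptors.
images = {
    "img1": [0.9, 0.1, 0.0],
    "img2": [0.1, 0.9, 0.2],
    "img3": [0.2, 0.8, 0.1],
}
query = [0.5, 0.5, 0.0]

# Simulated expert judgment on the initial results: img2 is relevant, img1 is not.
updated = rocchio(query, relevant=[images["img2"]], non_relevant=[images["img1"]])
ranking = sorted(images, key=lambda name: dot(updated, images[name]), reverse=True)
```

In this toy setting all three images score equally under the initial query. After a single judgment, the unjudged image that resembles the relevant one is ranked above the rejected one.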
2.4 Trust for the doctor-in-the-loop
The doctor-in-the-loop (DiL) is a new paradigm in information-driven medicine. It pictures
the doctor as an authority inside a loop who not only supplies an expert system with data and
information, but also interactively manipulates algorithms and tools (Holzinger, 2016a),
(Holzinger, 2016b). Before this DiL paradigm can be implemented in any such system for
use in real-world clinical medicine, the trustworthiness of such a system must be assured
(O'Donovan & Smyth, 2005). It is well known that publicly accessible adaptive systems such as
collaborative recommender systems present a huge security problem, mostly due to the fact
that potential attackers cannot easily be distinguished from end users. Apart from technical
risks, such attacks may lead to a degradation of user trust in the objectivity and accuracy of
such a system. A major issue for further research is to model attacks and to examine their
impact on recommendation algorithms.
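As a minimal sketch of what modelling an attack can look like, the example below injects fake profiles that give a target item the top rating (a so-called push attack) and measures the shift in a simple item-mean predictor. The profiles, item names and the functions `item_mean` and `push_attack` are hypothetical; real attack models and recommenders are considerably more elaborate.

```python
def item_mean(profiles, item):
    """Non-personalized predictor: mean rating of `item` over all profiles that rate it."""
    values = [p[item] for p in profiles if item in p]
    return sum(values) / len(values) if values else None

def push_attack(profiles, target, n_fake, max_rating=5):
    """Return the profiles plus n_fake injected profiles that give `target` the top rating."""
    return profiles + [{target: max_rating} for _ in range(n_fake)]

# Hypothetical genuine end user profiles: item -> rating on a 1-5 scale.
genuine = [
    {"item_x": 2},
    {"item_x": 1},
    {"item_x": 2, "item_y": 4},
    {"item_y": 5},
]

before = item_mean(genuine, "item_x")
after = item_mean(push_attack(genuine, "item_x", n_fake=3), "item_x")
shift = after - before  # how far three fake profiles move the prediction
```

Because the injected profiles are indistinguishable from those of end users at the data level, even a few of them shift the prediction noticeably. An expert who reviews implausible recommendations is one possible safeguard.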
One benefit of the DiL paradigm could be that hybrid algorithms may provide a higher
degree of robustness (Mobasher, Burke, Bhaumik & Williams, 2007). Having the doctor as an
authority inside a loop with an expert system, in order to support the (automated) decision
making with expert knowledge, includes not only support in pattern finding and the supply of
external knowledge, but also the inclusion of data on actual patients, as well as treatment
results and possible additional (side) effects that relate to previous decisions of
semi-automated systems. In this sense, the DiL concept can be seen as an extension of the
increasingly frequent use of knowledge discovery for the enhancement of medical treatments
together with human expertise: the expert knowledge of the doctor is enriched with additional
information and expert know-how (Kieseberg, Weippl & Holzinger, 2016), (Kieseberg et al.,
2016), (Kieseberg, Frühwirt, Weippl & Holzinger, 2015).
3 Future Outlook
We are very much interested in applying recommender systems to problems in
health informatics, where there is not much previous work. As of 5 June 2016, a search of the
Web of Science for the title “recommender systems health” returns 17 results, the oldest
dating back to 2007 and the most cited having 14 citations.
A five-page research statement on the use of recommender systems for personalized health
education by Fernandez-Luque, Karlsen & Vognild (2009) argues that these systems do not
take advantage of the increasing amount of educational resources freely available on the
Web, and points out that finding and matching the relevant ones is a difficult problem.
Sezgin & Ozkan (2013) presented a four-page review on health recommender systems at
EHB 2013, in which they emphasize the increasing importance of so-called
context, Health Recommender Systems (HRS) which are presented as complementary tools
in decision making processes in health care services and have potential to increase usability
and acceptance of technologies and reduce information overload in many processes.
A very important area of future research is measuring and benchmarking recommender systems,
particularly in terms of end user acceptance (Ziefle, Klack, Wilkowska & Holzinger,
2013) and satisfaction, and personalizing the system exactly to the needs, demands and
requirements of the end user. This opens many future research issues, bringing diversity
and personalization not just to the contents of recommendation lists, but to the
recommendation process itself (Zhou et al., 2010). Quality issues of recommender systems
will be crucial for applications in the health domain.
Most of all, more comprehensive quality measures are urgently sought, but they need much
theoretical and experimental future work (Herlocker, Konstan, Terveen & Riedl, 2004). The
problem is still that most metrics focus on accuracy and ignore e.g. serendipity and coverage,
which matter for answering the question “what is interesting?”, a question that is highly
important for health informatics. There are well-known techniques by which algorithms can
trade off reduced serendipity and coverage for improved accuracy (such as only recommending
items for which there are many ratings). Since users value all three attributes in many
applications, these algorithms may be more accurate but less useful. For algorithm designers
this is a difficult task, where again the DiL paradigm can be very helpful, because the question
“what is interesting?” is inherently subjective and of human nature. We need comprehensive
quality measures that combine accuracy with other attributes such as serendipity and coverage,
so that algorithm designers can make sensible trade-offs to serve users better.
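The three attributes can be made concrete with a minimal sketch, assuming mean absolute error as the accuracy metric, catalog coverage as the coverage metric, and a deliberately simple serendipity proxy. Definitions of serendipity vary in the literature, so the proxy below, like all names and data in the example, is a hypothetical choice for illustration.

```python
def mean_absolute_error(predicted, actual):
    """Accuracy: average absolute gap between predicted and actual ratings."""
    keys = predicted.keys() & actual.keys()
    if not keys:
        return None
    return sum(abs(predicted[k] - actual[k]) for k in keys) / len(keys)

def catalog_coverage(recommendation_lists, catalog):
    """Coverage: share of the catalog that appears in at least one recommendation list."""
    recommended = set().union(*recommendation_lists.values())
    return len(recommended & set(catalog)) / len(catalog)

def serendipity(recommended, relevant, obvious):
    """Simple proxy: share of recommended items that are relevant but not 'obvious',
    where 'obvious' means present in a popularity-based baseline list."""
    if not recommended:
        return 0.0
    hits = [i for i in recommended if i in relevant and i not in obvious]
    return len(hits) / len(recommended)

# Hypothetical evaluation data.
catalog = ["a", "b", "c", "d", "e"]
lists = {"u1": ["a", "b"], "u2": ["a", "c"]}
predicted = {("u1", "a"): 4.0, ("u1", "b"): 3.0}
actual = {("u1", "a"): 5.0, ("u1", "b"): 3.0}

mae = mean_absolute_error(predicted, actual)
coverage = catalog_coverage(lists, catalog)
surprise = serendipity(lists["u2"], relevant={"a", "c"}, obvious={"a"})
```

Reporting the three numbers side by side makes the trade-off visible: a system that only recommends heavily rated items may lower its error while its coverage and serendipity shrink.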
Serendipity is discovery of interesting items by accident, and is one of the cornerstones of
scientific progress. However, “what is interesting” is a hard question, and is even hard to
define as it is an essentially human construct (Beale, 2007).
Another very important research issue is trust in recommender systems. In the past they have
proven to be an important response to the information overload problem by providing end users
with more proactive and personalized information services (O'Donovan & Smyth,
2005), but there are many open research questions about the factors that play a role in guiding
recommendations, and future work must particularly consider gender and age (Ziefle, Röcker &
Holzinger, 2011). This is also related to user satisfaction, and Herlocker, Konstan, Terveen
& Riedl (2004) recommend that four questions deserve future attention: 1) For different
metrics, what is the level of change needed before end users notice or user behaviour
changes? 2) To which metrics are end users most sensitive? 3) How does end user sensitivity
to accuracy depend on other factors such as the interface? 4) How do factors such as
coverage and serendipity affect user satisfaction? Moreover, Herlocker et al. (2004) state that
if these questions are answered, it may be possible to build a predictive model of user
satisfaction that would permit more extensive offline evaluation. We intentionally
emphasize the term end user, as it will be of particular importance to focus on
particular end user groups, e.g. health practitioners, clinical doctors and biomedical
researchers, among whom there will be great differences.
Also very interesting is the combination of collaborative filtering with content-based
approaches to recommender systems, i.e., approaches that make predictions based on
background knowledge of specific characteristics of end users, which is a huge topic in
preference learning (Fürnkranz, Hüllermeier, Cheng & Park, 2012).
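One simple way to combine the two approaches is a weighted hybrid, sketched below under the assumption that both the collaborative score and the content match are scaled to the interval [0, 1]. The feature sets, the weight and the function names are hypothetical, and Jaccard overlap is only one of many possible content measures.

```python
def jaccard(a, b):
    """Overlap between two feature sets, in [0, 1]."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_score(cf_score, user_features, item_features, weight=0.5):
    """Weighted hybrid: blend a collaborative score with a content match.

    `weight` is the share given to the collaborative score; the remainder goes
    to the match between end user characteristics and item characteristics.
    """
    content = jaccard(user_features, item_features)
    return weight * cf_score + (1 - weight) * content

# Hypothetical end user characteristics and item characteristics.
score = hybrid_score(
    cf_score=0.8,
    user_features={"cardiology", "imaging"},
    item_features={"imaging", "radiology"},
    weight=0.6,
)
```

The weight could be tuned per end user group, which would be one place where background knowledge about specific characteristics of end users enters the prediction.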
Today, recommender systems assist Web users in the daily process of identifying
items that fulfil their wishes, requirements, demands and needs. They have been applied in
e-commerce settings for quite a while with great success, yet still need much future
research (Felfernig et al., 2013).
Tomorrow, the next big thing will be the application of such systems in the health informatics
domain, for the benefit of patients, doctors and hospital managers, and indeed everybody, to stay
healthy and fit.
Acknowledgements
We would like to thank the anonymous reviewers for their constructive comments on an
earlier version of this manuscript. The authors thank the German Research Council DFG for
its kind support of the research in the excellence cluster “Integrative Production
Technology in High Wage Countries”.
References
Akgul, C. B., Rubin, D. L., Napel, S., Beaulieu, C. F., Greenspan, H. & Acar, B. 2011. Content-Based
Image Retrieval in Radiology: Current Status and Future Directions. Journal of Digital Imaging,
24, (2), 208-222, doi:10.1007/s10278-010-9290-9.
Beale, R. 2007. Supporting serendipity: Using ambient intelligence to augment user exploration for
data mining and Web browsing. International Journal of Human-Computer Studies, 65, (5),
421-433.
Croskerry, P. & Nimmo, G. 2011. Better clinical decision making and reducing diagnostic error. The
journal of the Royal College of Physicians of Edinburgh, 41, (2), 155-162.
Da Silva, E. Q., Camilo, C. G., Pascoal, L. M. L. & Rosa, T. C. 2016. An evolutionary approach for
combining results of recommender systems techniques based on collaborative filtering. Expert
Systems with Applications, 53, 204-218, doi:10.1016/j.eswa.2015.12.050.
Desrosiers, C. & Karypis, G. 2011. A comprehensive survey of neighborhood-based recommendation
methods. Recommender Systems Handbook, 107-144.
Felfernig, A., Jeran, M., Ninaus, G., Reinfrank, F. & Reiterer, S. 2013. Toward the next generation of
recommender systems: applications and research challenges. Multimedia Services in Intelligent
Environments. Springer, pp. 81-98.
Fernandez-Luque, L., Karlsen, R. & Vognild, L. K. Challenges and opportunities of using
recommender systems for personalized health education. MIE, 2009. 903-907.
Fürnkranz, J., Hüllermeier, E., Cheng, W. & Park, S.-H. 2012. Preference-based reinforcement
learning: a formal framework and a policy iteration algorithm. Machine Learning, 89, (1-2),
123-156, doi:10.1007/s10994-012-5313-8.
Gigerenzer, G. & Gaissmaier, W. 2011. Heuristic Decision Making. Annual Review of Psychology, 62,
451-482, doi:10.1146/annurev-psych-120709-145346.
Goldberg, D., Nichols, D., Oki, B. M. & Terry, D. 1992. Using collaborative filtering to weave an
information tapestry. Communications of the ACM, 35, (12), 61-70.
Herlocker, J. L., Konstan, J. A., Terveen, L. G. & Riedl, J. T. 2004. Evaluating collaborative filtering
recommender systems. ACM Transactions on Information Systems, 22, (1), 5-53,
doi:10.1145/963770.963772.
Holzinger, A. 2014. Lecture 8 Biomedical Decision Making: Reasoning and Decision Support.
Biomedical Informatics. Springer, pp. 345-377.
Holzinger, A. 2016a. Interactive Machine Learning (iML). Informatik Spektrum, 39, (1), 64-68,
doi:10.1007/s00287-015-0941-6.
Holzinger, A. 2016b. Interactive Machine Learning for Health Informatics: When do we need the
human-in-the-loop? Springer Brain Informatics (BRIN), 3, (2), 119-131,
doi:10.1007/s40708-016-0042-6.
Jannach, D., Zanker, M., Felfernig, A. & Friedrich, G. 2010. Recommender systems: an introduction,
Cambridge University Press.
Jordan, M. I. & Mitchell, T. M. 2015. Machine learning: Trends, perspectives, and prospects. Science,
349, (6245), 255-260, doi:10.1126/science.aaa8415.
Kieseberg, P., Frühwirt, P., Weippl, E. & Holzinger, A. 2015. Witnesses for the Doctor in the Loop. In:
Guo, Y., Friston, K., Aldo, F., Hill, S. & Peng, H. (eds.) Brain Informatics and Health, Lecture
Notes in Artificial Intelligence LNAI 9250. Cham, Heidelberg, Berlin: Springer, pp. 369-378,
doi:10.1007/978-3-319-23344-4_36.
Kieseberg, P., Malle, B., Frühwirt, P., Weippl, E. & Holzinger, A. 2016. A tamper-proof audit and
control system for the doctor in the loop. Brain Informatics, 1-11, doi:10.1007/s40708-016-0046-2.
Kieseberg, P., Weippl, E. & Holzinger, A. 2016. Trust for the Doctor-in-the-Loop. European Research
Consortium for Informatics and Mathematics (ERCIM) News: Tackling Big Data in the Life
Sciences 104, (1), 32-33.
Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. 2015. Human-level concept learning through
probabilistic program induction. Science, 350, (6266), 1332-1338, doi:10.1126/science.aab3050.
Le Cun, Y., Bengio, Y. & Hinton, G. 2015. Deep learning. Nature, 521, (7553), 436-444,
doi:10.1038/nature14539.
McNeil, B. J., Keeler, E. & Adelstein, S. J. 1975. Primer on Certain Elements of Medical Decision
Making. New England Journal of Medicine, 293, (5), 211-215,
doi:10.1056/NEJM197507312930501.
Miller, P. L. & Sittig, D. F. 1990. The Evaluation of Clinical Decision Support Systems - What is
Necessary versus What is Interesting. Medical Informatics, 15, (3), 185-190.
Mobasher, B., Burke, R., Bhaumik, R. & Williams, C. 2007. Toward trustworthy recommender
systems: An analysis of attack models and algorithm robustness. ACM Transactions on Internet
Technology (TOIT), 7, (4), 23, doi:10.1145/1278366.1278372.
O'Donovan, J. & Smyth, B. Trust in recommender systems. Proceedings of the 10th international
conference on Intelligent user interfaces (IUI 2005), 2005. ACM, 167-174,
doi:10.1145/1040830.1040870.
Resnick, P. & Varian, H. R. 1997. Recommender systems. Communications of the ACM, 40, (3), 56-58,
doi:10.1145/245108.245121.
Samuel, A. L. 1959. Some studies in machine learning using the game of checkers. IBM Journal of
research and development, 3, (3), 210-229, doi:10.1147/rd.33.0210.
Settles, B. 2011. From theories to queries: Active learning in practice. In: Guyon, I., Cawley, G., Dror,
G., Lemaire, V. & Statnikov, A. (eds.) Active Learning and Experimental Design Workshop 2010.
Sardinia: JMLR Proceedings, pp. 1-18.
Sezgin, E. & Ozkan, S. A systematic literature review on Health Recommender Systems. E-Health and
Bioengineering Conference (EHB), 2013, 2013. IEEE, 1-4.
Silvia, P. J. 2005. What is interesting? Exploring the appraisal structure of interest. Emotion, 5, (1), 89.
Srivastava, J., Cooley, R., Deshpande, M. & Tan, P.-N. 2000. Web usage mining: Discovery and
applications of usage patterns from web data. ACM SIGKDD Explorations Newsletter, 1, (2),
12-23.
Taraghi, B., Grossegger, M., Ebner, M. & Holzinger, A. 2013. Web Analytics of user path tracing and
a novel algorithm for generating recommendations in Open Journal Systems. Online Information
Review, 37, (5), 672-691, doi:10.1108/OIR-09-2012-0152.
Tulabandhula, T. & Rudin, C. 2014. On combining machine learning with decision making. Machine
Learning, 97, (1-2), 33-64, doi:10.1007/s10994-014-5459-7.
Wang, H., Wang, N. & Yeung, D.-Y. Collaborative deep learning for recommender systems.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, 2015. ACM, 1235-1244.
Wilson, A. G., Dann, C., Lucas, C. & Xing, E. P. The Human Kernel. In: Cortes, C., Lawrence, N. D.,
Lee, D. D., Sugiyama, M. & Garnett, R., eds. Advances in Neural Information Processing Systems,
NIPS 2015, 2015 Montreal. 2836-2844.
Zhou, T., Kuscsik, Z., Liu, J. G., Medo, M., Wakeling, J. R. & Zhang, Y. C. 2010. Solving the apparent
diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of
Sciences of the United States of America, 107, (10), 4511-4515, doi:10.1073/pnas.1000488107.
Ziefle, M., Klack, L., Wilkowska, W. & Holzinger, A. 2013. Acceptance of Telemedical Treatments -
A Medical Professional Point of View. In: Yamamoto, S. (ed.) Human Interface and the
Management of Information. Information and Interaction for Health, Safety, Mobility and Complex
Environments, Lecture Notes in Computer Science LNCS 8017. Berlin Heidelberg: Springer, pp.
325-334, doi:10.1007/978-3-642-39215-3_39.
Ziefle, M., Röcker, C. & Holzinger, A. 2011. Medical Technology in Smart Homes: Exploring the
User's Perspective on Privacy, Intimacy and Trust. 35th Annual IEEE Computer Software and
Applications Conference Workshops COMPSAC 2011. Munich: IEEE, pp. 410-415,
doi:10.1109/COMPSACW.2011.75.
Authors
Holzinger, Andreas
Currently, Andreas is Visiting Professor for Machine Learning in
Health Informatics at the Faculty of Informatics at Vienna University
of Technology. His research interests are in supporting human
intelligence with machine learning to help to solve problems in
biomedical informatics and the life sciences. Andreas obtained a
Ph.D. in Cognitive Science from Graz University in 1998 and his
Habilitation in Computer Science from Graz University of
Technology in 2003. Andreas is Associate Editor of Knowledge and
Information Systems (KAIS), and member of IFIP WG 12.9
Computational Intelligence.
Calero Valdez, André
André Calero Valdez studied computer science at RWTH
Aachen University and holds a PhD in Psychology, also from RWTH
Aachen University. He is a senior researcher at the Human-Computer
Interaction Center of RWTH Aachen University and visiting
professor with the HCI-KDD group in Graz, Austria. His thesis dealt
with the user-centered design of small screen devices for
diabetes patients. He currently conducts research on
knowledge management, social media, and decision support through
visualizations. The aim is to manage the complexity of information by
applying human-computer interaction principles.
Ziefle, Martina
Martina Ziefle holds the chair of communication science and is a
founding member of the Human-Computer Interaction Center of
RWTH Aachen University. Her research addresses
human-human and human-machine communication, with a
research focus on technology acceptance for various technologies
with respect to user diversity.
... HITL approach varies among researchers, with some emphasizing the quality impact of human interaction, while others focus on supervision and feedback, or reducing the cost of labeling data by integrating human knowledge and experience. According to Arambepola et al. [1], HITL is the combination of both human and machine intelligence supporting the creation of Machine Learning (ML) models, The goal of HITL is to provide intelligent and efficient automation for system improvements through human intervention [6]. Arambepola et al. define HITL as a method that allows users to interact with the system and provide extra data and criteria to evaluate its performance or "fix" potential problems that may arise in real-world applications, such as equity assessments and biases. ...
... From the result of 13 articles, we review their abstracts, keywords, and conclusions, remaining 7 papers to the final list. The articles that made it to the final list are [1,2,4,6,9,10,14]. ...
... The work presented in [14] and [6] is a specialization of HITL. [14] proposes a HITL process named teacher-in-the-loop in the customization of Multimodal Learning Analytics (MMLA) solutions that uses data from different sources such as video, audio, and text to better understand the learning process. ...
Chapter
The concept of “human-in-the-loop” (HITL) has gained increased attention in the field of educational recommendation systems (ERS). ERS aims to provide personalized learning experiences by suggesting relevant learning resources or activities to individual learners. HITL in ERS involves incorporating human intervention and decision making into the recommendation process, leveraging the unique capabilities of both human and machine algorithms. This approach recognizes the importance of human expertise, preferences, and context in the learning process and seeks to enhance the effectiveness and relevance of recommendations by actively involving users in the recommendation process. In this paper, we explore the concept of HITL in ERS by performing a systematic review of the literature. The systematic review examines the key components of HITL in ERS, including the integration of human feedback, the role of machine learning algorithms, and the impact on the overall recommendation process. The findings of this systematic review provide insights into the current state of HITL in educational recommendation systems and highlight areas for future research and development in this promising field.
... However, some of the recent researchers have investigated the effectiveness of the AI-based systems which includes a part for humans instead of completely automating a system by removing human involvement from the task (Bhardwaj et al., 2014). This concept is called Human-in-the-Loop (HITL) and the main objective of this approach is to provide efficient, intelligent automation for system improvements through human feedback (Holzinger, Valdez and Ziefle, 2016). Here, humans are directly involved in the training, tuning and testing of the ML algorithms. ...
... One way is, making decisions through a specialized professional in the domain (Zanzotto, 2019). For example, a domain specialist doctor can work as an adviser for the Doctor-in-the-loop system in the medical field (Holzinger, Valdez and Ziefle, 2016). On the other hand, machines make decisions based on the knowledge extracted from data. ...
Conference Paper
Full-text available
Employing cuprous oxide (Cu2O) photoelectrodes in photoelectrochemical cells to generate hydrogen by water splitting is beneficial. Conventionally, it is limited in practice because of the well-known reasons of its inherent corrosiveness and poor conversion efficiencies. In this study, we have investigated the possibility of improving the efficiency of Cu2O photoelectrode in the form of p-n homojunction together with sulphidation. Initially, the optimum pH values for the n- and p-Cu2O thin film deposition baths are determined as 6.1 and 13 for Ti/n-Cu2O/p-Cu2O in photoelectrochemical cell configuration. Then, at these pH values the duration of n- and p-Cu2O thin film deposition is optimized by forming Ti/n-Cu2O/p-Cu2O photoelectrode. In this study, we found that at 45 minutes of n-Cu2O and 50 minutes of p-Cu2O thin film deposition together with sulphidation forms relatively high efficient Ti/n-Cu2O/p-Cu2O photoelectrode resulting Solar-To-Hydrogen (STH) conversion efficiency of 0.9%. In addition, current-voltage characteristic of the best Cu2O homojunction photoelectrode exhibits more negative shift in onset of photocurrent which indicates that photocurrent generation and transportation have improved by the formation of homojunction and further been enhanced by sulphidation.
Conference Paper
It is undeniable that modern computers are incredibly fast and accurate. However, computers cannot ‘think’ (act intelligently) as humans do unless they are trained to learn from past knowledge. Despite their intelligence, humans are comparatively slow in computational tasks. However, the combination of the computational capacity of computers and human intelligence could produce powerful systems beyond imagination. This concept is called Human-in-the-Loop (HITL), where both human and machine intelligence support the creation of Machine Learning (ML) models. HITL design is an emerging technology which is used in many domains such as autonomous vehicle technology, health systems and interactive system implementations. In this research, we systematically reviewed past research on HITL systems with the objective of identifying key benefits and limitations of the HITL design. This systematic review was conducted by analyzing 68 research papers published in top-ranked journals and conferences during the past decade. The papers were selected using keyword-based searching and the references of the most cited HITL research papers. The PRISMA model was used to exclude irrelevant papers, and keyword-based clustering was used to identify the frequent keywords in the selected papers. Although the HITL design often improves the performance of intelligent interactive systems, it has certain drawbacks when compared to fully manual or fully automated systems, such as making decisions with emotional bias and being unable to take actions when demanded. Thus, we comprehensively discuss the approaches proposed by recent researchers to overcome some of the issues of existing HITL designs.
... Further examples and perspectives on human-in-the-loop approaches can be found in [14,21,35]. Examples of domain-specific roles for humans include the doctor-in-the-loop [15] and the analyst-in-the-loop [6]. ...
Chapter
This study investigates three challenges for developing machine learning-based self-service web apps for consumers. First, we argue that user research must accompany the development of ML-based products so that they better serve users’ needs at all stages of development. Second, we discuss the data sourcing dilemma in developing consumer-oriented ML-based apps and propose a way to solve it by implementing an interaction design that balances the workload between users and computers according to the ML component’s performance. To dynamically define the role of the user-in-the-loop, we monitor user success and ML performance over time. Finally, we propose a lightweight typology of ML-based systems to assess the generalizability of our findings to other ML use cases. Our case study uses a newly developed web application that allows consumers to analyze their heating bills for potential energy and cost savings. Based on domain-specific data values extracted from user-provided document images, an assessment of potential savings is derived and reported back to the user.
... This is sometimes debated in AI (see also Demartini, 2015; Holzinger et al., 2016). It is an important aspect to consider in the teaching module: The human is shaped by AI, but the human can also shape the AI; that is the general idea in the HIS-concept of shaping and being shaped by technology. ...
Article
The domain of data science is a large field, combining statistics, computer science, and sociocultural issues. It is an open question which topics and which contents can and should be implemented in school, e.g., from the perspective of computer science education. A pilot course is designed by computer science and statistics educators at the Paderborn University, addressing upper secondary students within a design-based research project. This paper concentrates on the second of four modules, in which machine learning and neural networks are addressed. Some individual phases of the module are presented, followed by a metaperspective of the curriculum development that contributes to our project and further research questions.
... A promising future awaits snake identification as AI begins to complement static photos, diagrams, and audio-visual media, interactive multiple-access keys, species checklists that can be customized to particular locations, dynamic range maps, and online communities in which people share species observations and identifications in "next-generation field guides" (Farnsworth et al., 2013). In particular, we emphasize the need to keep "humans in the loop" in order to validate labels in training datasets as well as AI predictions, particularly for healthcare applications (Holzinger, 2016; Holzinger et al., 2016). ...
Article
We trained a computer vision algorithm to identify 45 species of snakes from photos and compared its performance to that of humans. Both human and algorithm performance is substantially better than randomly guessing (null probability of guessing correctly given 45 classes = 2.2%). Some species (e.g., Boa constrictor) are routinely identified with ease by both algorithm and humans, whereas other groups of species (e.g., uniform green snakes, blotched brown snakes) are routinely confused. A species complex with largely molecular species delimitation (North American ratsnakes) was the most challenging for computer vision. Humans had an edge at identifying images of poor quality or with visual artifacts. With future improvement, computer vision could play a larger role in snakebite epidemiology, particularly when combined with information about geographic location and input from human experts.
... Information and, above all, metainformation can "drift" through user interaction, especially when algorithms determine the presentation of information (e.g., through evaluation, sympathy). The integration of human oversight into doctor-in-the-loop approaches could be interesting [17]. ...
Chapter
“In case of side effects please consult your physician or pharmacist”, used to be the advice for questions regarding the intake of medicine or other health-related issues. Nowadays, the Internet has become the favored place to find this kind of information. However, the quality of online health information is mixed. This becomes an issue when people use online information for important health decisions. According to which criteria do users select the found information? To understand which elements on a website convince people to trust the information or not, we have conducted a study with two objectives: first, to identify factors that trigger credibility; second, to investigate to what extent both the media presentation and the severity of the associated disease influence the assessment of credibility. Possible factors were first collected in three focus groups (N = 17) and then operationalized in a questionnaire. We collected 184 responses, presenting and evaluating three different health websites with different disease complexity and severity (mild vs. life-threatening). The results show that complex information is preferred for more serious diseases. In addition, the disease has a significant influence on the criteria.
Article
This paper presents work on recommending healthcare-related journal papers by understanding the semantics of terms from the papers that users have referred to in the past. In other words, user profiles based on user interest within the healthcare domain are constructed from the kind of journal papers read by the users. Multiple user profiles are constructed for each user based on the different categories of papers read. The proposed approach goes down to the granular level of extrinsic and intrinsic relationships between terms and clusters highly semantically related domain terms, where each cluster represents a user interest area. The semantic analysis of terms starts with co-occurrence analysis to extract the intra-couplings between terms; the inter-couplings are then extracted from the intra-couplings, and finally clusters of highly related terms are formed. The experiments showed improved precision for the proposed approach compared to the state-of-the-art technique, with a mean reciprocal rank of 0.76.
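For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of how a mean reciprocal rank (MRR) is commonly computed: for each query, take 1 divided by the position of the first relevant item in the ranked list (0 if none appears), then average over queries. The function names and the paper identifiers are hypothetical and chosen for illustration only; nothing here reproduces the evaluation setup of the cited work.

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevants: Sequence[Set[Hashable]],
) -> float:
    """Average the reciprocal rank over all (ranking, relevant set) pairs."""
    pairs = list(zip(rankings, relevants))
    if not pairs:
        raise ValueError("need at least one query")
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


# Hypothetical example: three users, each with a ranked list of paper IDs.
rankings = [["p3", "p1", "p7"], ["p2", "p9", "p4"], ["p5", "p6", "p8"]]
relevants = [{"p1"}, {"p2"}, {"p0"}]  # the third user's relevant paper is never ranked
mrr = mean_reciprocal_rank(rankings, relevants)
```

In this toy data the first relevant paper appears at rank 2, rank 1, and not at all, so the three reciprocal ranks are 1/2, 1, and 0.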
Article
Nowadays, a vast amount of clinical data scattered across different sites on the Internet hinders users from finding helpful information for improving their well-being. Besides, the overload of medical information (e.g., on drugs, medical tests, and treatment suggestions) has brought many difficulties to medical professionals in making patient-oriented decisions. These issues raise the need to apply recommender systems in the healthcare domain to help both end-users and medical professionals make more efficient and accurate health-related decisions. In this article, we provide a systematic overview of existing research on healthcare recommender systems. Different from existing related overview papers, our article provides insights into recommendation scenarios and recommendation approaches. Examples thereof are food recommendation, drug recommendation, health status prediction, healthcare service recommendation, and healthcare professional recommendation. Additionally, we develop working examples to give a deep understanding of recommendation algorithms. Finally, we discuss challenges concerning the development of healthcare recommender systems in the future.
Chapter
The rapid growth of digital health information has elevated the application and egress of data analytics in the healthcare industry. As one proposed solution, health recommender systems (HRS) have emerged for patient-oriented decision making to recommend better healthcare advice based on profile health records (PHR) and patient databases. The HRS can enhance healthcare systems and simultaneously manage patients suffering from a range of different diseases by employing predictive analytics and recommending appropriate treatments. A content-based recommender system (CBRS) is a customized HRS approach that concentrates on the evaluation of a patient’s history and ‘learns’, through machine learning (ML), to generate predictions. Additionally, CBRS intends to offer individualized and trusted information to patients regarding their health status. The CBRS is usually applied in the case of medical document recommenders, where patients give their preferences in the form of ratings after receiving recommendations and positively ranked items are recommended to the patient. The CBRS and associated popular ML algorithms are discussed in this chapter. Subsequently, the basic concepts, feature extraction methods, similarity measures, and ranking are presented and discussed. The privacy preservation phase is also discussed, particularly how data is protected and intruders are prohibited from altering valuable information. Finally, the challenges and open issues are deliberated.
Article
The “doctor in the loop” is a new paradigm in information driven medicine, picturing the doctor as authority inside a loop supplying an expert system with data and information. Before this paradigm is implemented in real environments, the trustworthiness of the system must be assured. In this paradigm, the doctor acts as an authority inside a loop with an expert system in order to support the (automated) decision making with expert knowledge. This information not only includes support in pattern finding and supplying external knowledge, but also the inclusion of data on actual patients, as well as treatment results and possible additional (side-) effects that relate to previous decisions of this semi-automated system. The concept of the “doctor in the loop” is basically an extension of the increasingly frequent use of knowledge discovery for the enhancement of medical treatments together with the “human in the loop” concept (see [1], for instance): The expert knowledge of the doctor is incorporated into “intelligent” systems (e.g., using interactive machine learning) and enriched with additional information and expert know-how. Using machine learning algorithms, medical knowledge and optimal treatments are identified. This knowledge is then fed back to the doctor to assist him/her.
Article
The “doctor in the loop” is a new paradigm in information-driven medicine, picturing the doctor as authority inside a loop supplying an expert system with information on actual patients, treatment results, and possible additional (side-)effects, including general information in order to enhance data-driven medical science, as well as giving back treatment advice to the doctor himself. While this approach can be very beneficial for new medical approaches like P4 medicine (personal, predictive, preventive, and participatory), it also relies heavily on the authenticity of the data and thus increases the need for secure and reliable databases. In this paper, we propose a solution in order to protect the doctor in the loop against responsibility derived from manipulated data, thus enabling this new paradigm to gain acceptance in the medical community. This work is an extension of the conference paper Kieseberg et al. (Brain Informatics and Health, 2015), which includes extensions to the original concept.
Conference Paper
The “doctor in the loop” is a new paradigm in information driven medicine, picturing the doctor as authority inside a loop supplying an expert system with information on actual patients, treatment results and possible additional (side-)effects, as well as general information in order to enhance data driven medical science, as well as giving back treatment advice to the doctor himself. While this approach offers several positive aspects related to P4 medicine (personal, predictive, preventive and participatory), it also relies heavily on the authenticity of the data and increases the reliance on the security of databases, as well as on the correctness of machine learning algorithms. In this paper we propose a solution in order to protect the doctor in the loop against responsibility derived from manipulated data, thus enabling this new paradigm to gain acceptance in the medical community.
Article
Machine learning (ML) is the fastest growing field in computer science, and health informatics is among the greatest challenges. The goal of ML is to develop algorithms which can learn and improve over time and can be used for predictions. Most ML researchers concentrate on automatic machine learning (aML), where great advances have been made, for example, in speech recognition, recommender systems, or autonomous vehicles. Automatic approaches greatly benefit from big data with many training sets. However, in the health domain, sometimes we are confronted with a small number of data sets or rare events, where aML-approaches suffer from insufficient training samples. Here interactive machine learning (iML) may be of help, having its roots in reinforcement learning, preference learning, and active learning. The term iML is not yet well used, so we define it as “algorithms that can interact with agents and can optimize their learning behavior through these interactions, where the agents can also be human.” This “human-in-the-loop” can be beneficial in solving computationally hard problems, e.g., subspace clustering, protein folding, or k-anonymization of health data, where human expertise can help to reduce an exponential search space through heuristic selection of samples. Therefore, what would otherwise be an NP-hard problem reduces greatly in complexity through the input and the assistance of a human agent involved in the learning phase.
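Because this abstract names active learning as one root of iML, a minimal sketch of a single human-in-the-loop round may help make the idea concrete. It uses uncertainty sampling, a common active-learning strategy that is swapped in here purely for illustration and is not the method of the cited paper. The names `most_uncertain`, `interactive_round`, `predict` and `ask_expert` are hypothetical, and the "expert" is simulated by a simple function.

```python
from typing import Callable, Dict, List, Sequence


def most_uncertain(probabilities: Sequence[float]) -> int:
    """Index of the sample whose predicted probability is closest to 0.5."""
    if not probabilities:
        raise ValueError("no unlabeled samples left")
    return min(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))


def interactive_round(
    pool: List[float],
    predict: Callable[[float], float],
    ask_expert: Callable[[float], int],
    labeled: Dict[float, int],
) -> float:
    """Pick the least certain sample, ask the human for its label, store it."""
    idx = most_uncertain([predict(x) for x in pool])
    sample = pool.pop(idx)                 # remove it from the unlabeled pool
    labeled[sample] = ask_expert(sample)   # the human agent supplies the label
    return sample


# Hypothetical usage: the model's score is the feature itself, and the
# simulated expert labels anything at or above 0.4 as positive.
pool = [0.1, 0.45, 0.9]
labeled: Dict[float, int] = {}
chosen = interactive_round(pool, lambda x: x, lambda x: int(x >= 0.4), labeled)
```

In a real setting the model would be retrained on `labeled` after each round, and `ask_expert` would be an interface presented to the human expert rather than a function.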
Article
Recommender systems (RS) are often used as guides, helping users to discover products of interest. Many techniques and approaches for generating effective recommendations are available to system designers. On the one hand, this is useful because each application scenario may have a best-fitting solution; on the other hand, it makes it harder to select the best technique for each state of the database. Choosing the best technique for each new state thus becomes too difficult and too frequent a task to perform manually. One of the big challenges in RS is making the techniques more useful in real-world scenarios. Therefore, automating or supporting the design decision is an important task for improving the usability of RS and reducing their cost. Although many works aim to improve the performance of RS in particular scenarios, only a few try to help designers select or combine techniques as the application's state changes. This work therefore proposes an evolutionary approach, called Invenire, that automates the choice of techniques by combining the results of different recommendation techniques. This new approach uses a search algorithm to optimize the combination of techniques and can inspire hybrid methods and expert systems on how to automate them. To evaluate the proposal, experiments were performed with a dataset from MovieLens and different collaborative filtering approaches. The results show that Invenire outperforms each collaborative filtering approach used separately in all contexts addressed. The improvement achieved varies from 3.6% to 118.99%, depending on the combination found and the experiment executed. Thus, the proposal was able to increase the accuracy of the generated recommendations and automate the combination of techniques.
Article
People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy. People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation. We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.
Conference Paper
Recommendation systems work as counselors, guiding people in the discovery of products of interest. There are various techniques and approaches in the literature for generating recommendations. This is interesting because it emphasizes the diversity of options; on the other hand, it can leave the system designer in doubt about which technique is best to use. Each of these approaches has its particularities and depends on the context in which it is applied. Thus, the decision to choose among techniques becomes too complex to be made manually. This article proposes an evolutionary approach for combining the results of recommendation techniques in order to automate the choice of techniques and reduce errors in recommendations. To evaluate the proposal, experiments were performed with a dataset from MovieLens and several collaborative filtering techniques. The results show that the combining methodology proposed in this paper performs better than any single collaborative filtering technique used separately in the context addressed. The improvement varies from 9.02% to 48.21%, depending on the technique and the experiment executed.