e-RARE: interactive diagnostic assistance for rare diseases through dynamic
taxonomies
Giovanni Maria Sacco
Dipartimento di Informatica, Università di Torino
sacco@di.unito.it
Abstract
A system for the computer-aided guided interactive
diagnosis of rare diseases is presented. It is based on
dynamic taxonomies, a knowledge management model
that allows the guided interactive exploration of
complex information bases. The problem of the
diagnosis of pathologies is recast as a problem of
exploring and thinning out candidate pathologies on
the basis of clinical signs and other observable
features. Our early experience with this tool is
discussed both in terms of advantages with respect to
existing approaches, and in terms of extensions and
improvements.
1. Introduction
Medical diagnosis and treatment are becoming
increasingly difficult because of the increase in
diseases that are not locally endemic and the
emergence of very rare pathologies that affect an
extremely small percentage of the population.
Physicians are very often required to diagnose and
correctly treat diseases that they are not familiar with
and might not even have heard of. Especially in this
context, but also with more frequent pathologies,
computer assistance can become an essential tool in
healthcare, especially if it supports all relevant
information such as protocols, exceptions, best
practices, condition-specific guidelines, etc., in addition
to diagnosis.
The field of computerized diagnosis has a relatively
long history, dating back to the 1970s. However,
traditional approaches to computer-assisted diagnosis,
based on knowledge-based systems, have not been
successful. Several theories of diagnosis were proposed
in the past, with an evolution from systems in which
diagnostic knowledge from experts was captured in the
form of empirical classification rules [4] to theoretical
and model-based approaches, such as set-covering [8],
abductive diagnosis [5] and consistency-based
diagnosis [6].
Regardless of the theory of diagnosis used, the
principal limitation of traditional diagnostic systems is
their system-centric rather than user-centric approach:
the system is in charge of diagnosis, and the user is
basically an input device, used to supply the system
with observations. Not taking users into account is one
of the known causes of failure in knowledge-based
systems [3]. First, it dramatically limits their
application in areas such as healthcare, where highly
skilled professional users do not readily accept a slave-
master relationship. Second, traditional systems are
often unable to explain to users the reasons for their
diagnosis. Finally, the cost of building and maintaining
a diagnostic system based on AI techniques may very
well be so large as to be impractical.
Alternatively, the diagnostic problem can be
recast in terms of information access rather than in
terms of artificial intelligence. Complex knowledge-
based architectures can be replaced by a collection of
(possibly standardized) electronic texts that describe
pathologies, which can be searched by traditional
techniques such as information retrieval. Although this
approach greatly simplifies knowledge base creation
and maintenance, the limitations of information
retrieval have been known for some time [1] and they
indicate that locating the right information may be
quite hard, and the result usually not exhaustive.
More recently, systems based on hypertext
technology have been proposed. An example is
OncoDoc, a system for the assisted selection of clinical
practice guidelines for cancer treatment [2, 15].
OncoDoc encodes domain knowledge in the form of a
decision tree implemented through a number of pages
linked by hypertext links: the physician is presented
with a sequence of choices that lead her to the
guideline to be applied. Systems based on decision
trees are less system-centric than conventional
knowledge-based diagnostic systems, but user
interaction is still quite rigid and follows predefined
paths. Creation and maintenance are expensive, and the
addition of a single new pathology may well disrupt
the entire structure, and consequently user familiarity
with the system.
The application of dynamic taxonomies [9, 10] to
the problem of diagnosing pathologies from their
clinical signs and patient features (age, country, etc.)
was proposed by Sacco [12, 14]. Dynamic taxonomies
(also more recently known as faceted search systems)
are a general knowledge management model, and are
applied to very diverse areas [13]. The idea in the
present context is to use their exploratory search
paradigm to allow users to focus on a clinical sign, and
immediately see at the same time which pathologies
exhibit that sign, and, most importantly, which other
clinical signs are related to the currently selected
sign(s). The taxonomic summary of related clinical
signs guides the user to refine his search, but leaves him
completely in charge of the interaction: it is his
responsibility to select the appropriate signs among the
available signs, though the system guides him by
discarding unrelated signs that would result in a dead-
end.
As discussed in [12], this approach is especially
interesting in applications such as the early screening
of rare diseases, which must be performed by
physicians who are not familiar with these pathologies.
Rare or “orphan” diseases are diseases that affect an
extremely low percentage of patients: some of these
diseases have fewer than ten known cases over the entire
world population. There is a continuing effort in
studying these diseases and at present almost 6000
different rare diseases are described in Orphanet [7],
currently the most authoritative database in this field.
The system discussed here is based on a subset of
this database (over 2000 diseases) that is classified by
clinical signs, and works as an exploratory search
front-end to the Orphanet database, replacing its
diagnostic system. It is freely available to the public at
www.erare.di.unito.it.
2. Dynamic Taxonomies Reviewed
Dynamic taxonomies are a general knowledge
management model for complex, heterogeneous
information bases. The intension is a taxonomy that
does not require any other relationships in addition to
subsumptions (e.g., IS-A and PART-OF relationships).
Unlike traditional taxonomies, dynamic
taxonomies are multidimensional, i.e. items are
classified under several topics at any level of
abstraction. A concept C is just a label that identifies
the set of the items classified directly under C or under
any of C’s descendants (i.e. the deep extension of C).
There are two important consequences of this
approach. First, logical operations (and, or, not) on
concepts can be performed by the corresponding set
operations on their extension. Second, dynamic
taxonomies can find all the concepts related to a given
concept C: these concepts represent the conceptual
summary of C. Concept relationships other than
subsumptions are inferred through the extension only,
according to the following extensional inference rule:
two concepts A and B are related iff there is at least
one item D in the infobase which is classified at the
same time under A (or under one of A’s descendants)
and under B (or under one of B’s descendants). For
example, we can infer an (unnamed) relationship
between Michelangelo and Rome, if an item that is
classified under Michelangelo and Rome exists in the
infobase. At the same time, since Rome is a descendant
of Italy, a relationship between Michelangelo and
Italy can also be inferred. The extensional inference rule
can be seen as a device to infer relationships on the
basis of empirical evidence.
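As an illustration only, the deep extension and the extensional inference rule can be sketched in a few lines of Python. The toy taxonomy, the items, and the names PARENT, CLASSIFIED, deep_extension and related are assumptions made for this example; they are not part of the model's definition or of any existing implementation.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child concept -> parent concept.
PARENT = {
    "Rome": "Italy",
    "Florence": "Italy",
    "Italy": "Places",
    "Michelangelo": "Artists",
    "Raphael": "Artists",
}

# Hypothetical shallow classification: item -> concepts it is filed under.
CLASSIFIED = {
    "Pieta": {"Michelangelo", "Rome"},
    "David": {"Michelangelo", "Florence"},
    "School of Athens": {"Raphael", "Rome"},
}


def ancestors_or_self(concept):
    """Yield the concept itself and then every ancestor up to the root."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)


def deep_extension():
    """Map each concept to the items under it or under any descendant."""
    ext = defaultdict(set)
    for item, concepts in CLASSIFIED.items():
        for concept in concepts:
            for ancestor in ancestors_or_self(concept):
                ext[ancestor].add(item)
    return ext


EXT = deep_extension()


def related(a, b):
    """Extensional inference rule: A and B share at least one item."""
    return bool(EXT[a] & EXT[b])
```

With this toy data, related("Michelangelo", "Italy") holds because an item classified under Rome also falls in the deep extension of Italy. Boolean operations on concepts reduce to set operations on EXT, for instance EXT["Michelangelo"] & EXT["Rome"] for AND and EXT["Michelangelo"] - EXT["Rome"] for AND NOT.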
Dynamic taxonomies can be used to browse and
explore the infobase in several ways. Normally, the
user is initially presented with a tree representation of
the initial taxonomy for the entire infobase. Each
concept label also has a count of all the items classified
under it (i.e. the cardinality of items(C) for all C’s).
The initial user focus F is the universe (i.e. all the items
in the infobase). In the simplest case, the user can then
select a concept C in the taxonomy and zoom over it.
The zoom operation changes the current state in two
ways. First, concept C is used to refine the current
focus F, which becomes F = F ∩ items(C). Items not in
the focus are discarded. Second, the tree representation
of the taxonomy is modified in order to summarize the
new focus. All and only the concepts related to F are
retained and the count for each retained concept C’ is
updated to reflect the number of items in the focus F
that are classified under C’. The reduced taxonomy is a
conceptual summary of the set of documents identified
by F, in exactly the same way as the original
taxonomy was a conceptual summary of the universe.
The retrieval process can then be seen as an iterative
thinning of the information base: the user selects a
focus, which restricts (thins out) the information base
by discarding all the items not in the current focus.
Only the concepts used to classify the items in the
focus, and their ancestors, are retained. These concepts,
which summarize the current focus, are those and only
those concepts that can be used for further refinements.
From the human-computer interaction point of view,
the user is effectively guided to reach his goal, by a
clear and consistent listing of all possible alternatives.
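The zoom operation and the resulting reduced taxonomy can be sketched as follows. This is a minimal sketch under stated assumptions: the infobase ITEMS is invented, each item is stored with all of its concepts (ancestors included) to keep the example short, and the function names items_of, zoom and reduced_taxonomy are hypothetical.

```python
# Hypothetical infobase: item -> every concept it falls under
# (ancestors already included, i.e. the deep classification).
ITEMS = {
    "Pieta": {"Artists", "Michelangelo", "Places", "Italy", "Rome"},
    "David": {"Artists", "Michelangelo", "Places", "Italy", "Florence"},
    "School of Athens": {"Artists", "Raphael", "Places", "Italy", "Rome"},
    "Mona Lisa": {"Artists", "Leonardo", "Places", "France", "Paris"},
}


def items_of(concept):
    """Return items(C), the deep extension of a concept."""
    return {item for item, concepts in ITEMS.items() if concept in concepts}


def zoom(focus, concept):
    """Refine the focus by intersecting it with items(C)."""
    return focus & items_of(concept)


def reduced_taxonomy(focus):
    """Return the concepts related to the focus, each with its item count
    restricted to the focus. Concepts with no item in the focus are absent."""
    counts = {}
    for item in focus:
        for concept in ITEMS[item]:
            counts[concept] = counts.get(concept, 0) + 1
    return counts


universe = set(ITEMS)            # the initial focus is the whole infobase
focus = zoom(universe, "Rome")   # the user zooms on one concept
summary = reduced_taxonomy(focus)
```

After zooming on Rome, the summary retains Michelangelo and Raphael (one item each) and drops Florence, France, Paris and Leonardo, which can no longer lead to a non-empty result.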
3. Rare diseases and Orphanet
Rare diseases are diseases that affect an extremely
low percentage of the world population. They are also
called orphan diseases because it is not economically
feasible to conduct pharmacological research without
public subsidies for these pathologies. Rare diseases
are infrequent over the entire world population. Hence,
they do not include diseases that can be endemic in
some regions and extremely rare elsewhere, although
from our perspective the diagnostic problems and the
solutions are similar.
The single major problem in this context is the early
screening of rare pathologies that has to be performed
by general practitioners with inherently no experience
in the pathologies they have to deal with. This
screening does not require actual diagnosis or
treatment, but rather, on the basis of a suspected
pathology, the physician will contact a specific expert
group on that pathology for subsequent care.
The critical point is therefore how to provide timely
and complete information to the general physician, and
how to assist her in arriving at candidate pathologies.
This is the mission of Orphanet, established in 1997 by
the French Ministry of Health (Direction Générale de
la Santé) and the INSERM (Institut National de la
Santé et de la Recherche Médicale), and directed by
Dr. Ségolène Aymé. Orphanet is a portal for rare
diseases, providing all the available information on
rare diseases, but also on research/treatment groups,
laboratories, resources for patients, etc: in short, a
healthcare encyclopedia for rare diseases.
In this context, we are interested in the function
“search by signs” provided by the portal. This function
works on over 2000 diseases that have been classified
by clinical signs (unfortunately, this means that over
4000 diseases are still not classified). In Orphanet, this
diagnostic section is organized as an information
retrieval tool that allows the user to enter up to 5 signs,
which are either directly entered or selected through a
taxonomy of signs (called, somewhat improperly, a
thesaurus). Figure 1 shows access to the taxonomy,
figure 2 shows the result of a query on macrocephaly
presented as a flat list of diseases that exhibit that sign.
4. e-RARE: access through dynamic
taxonomies
e-RARE is designed as a dynamic taxonomy front-
end to the database created by Orphanet. The target is
the general physician who has the responsibility of the
early screening of potential rare diseases. So the
system is currently targeted neither at the layman nor
at the specialist in rare diseases. The current
application is powered by Knowledge Processors’
Universal Knowledge Processor [16], the first web-
based commercial dynamic taxonomy system, donated
by Knowledge Processors for this application. Based
on patented techniques, this engine provides real-time
operations even on very large datasets.
The main idea here is to cast the diagnostic problem
in terms of guided information access, rather than in
terms of logic manipulation or in terms of traditional
information retrieval. In the dynamic taxonomy used
here, items are pathological situations. Each item is
classified under one or more concepts that represent
features that characterize it. Such features may be
clinical signs or causes of the pathology, but also
locations in which a pathological situation is common,
or age groups that exhibit it, etc. In short, all the
available information that can be used to characterize
an item can be used as features. As an example, in the
present application, in addition to clinical signs, the
Age of Onset is used. Signs and features are
taxonomically organized in order to support abstraction
(generalization/specialization) in accessing the
knowledge base and to simplify the access to a specific
feature, especially when many features are used in
general applications.
The diagnostic process is an interactive exploration
of the knowledge base. Initially, the user is presented
with the original taxonomy of features. From this
taxonomy, he will focus on the first symptom, say
Head: Skull abnormality: Macrocephaly. The zoom
operation will present him with a reduced taxonomy in
which all and only those features, e.g. Systems:
Neurological functional anomalies:
Ataxia/incoordination, related to macrocephaly are
retained (see figure 3). Related features are
automatically computed on the basis of the extensional
inference rule defined in the previous section: thus
macrocephaly and ataxia are related iff there is an item
in the knowledge base that is classified under both
concepts (or one of their descendants). Figure 4 shows
the pathologies that exhibit both signs. These
symptoms can be used to further restrict the focus, and
hence the number of candidate items to be manually
inspected.
Although dynamic taxonomies support any boolean
combinations of concepts, a simpler refinement in
AND is used here: i.e., at each stage the user selects a
concept that is AND’ed with the current focus. This
seems more appropriate for assertive interactions such
as diagnosis, in which an observed clinical sign is
selected at each stage.
The reduced taxonomy guides the user to reach his
goal because it indicates related features and symptoms
and therefore focuses the attention of the user. At the
same time, he can see unexpected features that can be
used to discriminate among items. This is especially
important, since we believe that the practical
importance of diagnostic systems is in dealing with the
unknown and unexpected as in the case at hand.
Unlike traditional diagnosis, the
diagnostic process can terminate at any time, and
usually when the number of candidate items in the
current focus is sufficiently small for manual
inspection. The goal here is not to find the exact set of
pathologies, but rather a sufficiently small set of
candidate pathologies that the user will finally and
independently evaluate.
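The AND-only refinement and its open-ended termination can be sketched as below. Everything in the example is an assumption made for illustration: the pathology names and their feature assignments are placeholders rather than Orphanet data, and refine, remaining_signs and the max_candidates threshold are hypothetical names.

```python
# Placeholder knowledge base: pathology -> features it is classified under.
# Names and assignments are invented for illustration, not Orphanet data.
PATHOLOGIES = {
    "Disease A": {"Macrocephaly", "Ataxia", "Age of onset: infancy"},
    "Disease B": {"Macrocephaly", "Ataxia", "Age of onset: adult"},
    "Disease C": {"Macrocephaly", "Sign X", "Age of onset: infancy"},
    "Disease D": {"Ataxia", "Sign X", "Age of onset: adult"},
}


def refine(observed_signs, max_candidates=2):
    """AND each observed sign into the focus, stopping as soon as the
    focus is small enough for manual inspection."""
    focus = set(PATHOLOGIES)
    used = []
    for sign in observed_signs:
        if len(focus) <= max_candidates:
            break
        narrowed = {p for p in focus if sign in PATHOLOGIES[p]}
        if not narrowed:
            # A sign exhibited by no pathology in the focus is skipped here;
            # in the interface described above it would not be offered at all.
            continue
        focus = narrowed
        used.append(sign)
    return focus, used


def remaining_signs(focus, used):
    """Signs still selectable: those exhibited by some pathology in the focus."""
    offered = set()
    for pathology in focus:
        offered |= PATHOLOGIES[pathology]
    return offered - set(used)
```

For example, refine(["Macrocephaly", "Ataxia", "Age of onset: adult"]) stops after two signs with two candidates left, and remaining_signs then lists only the two age-of-onset values, which are the features that still discriminate between them.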
A slightly modified version of the Orphanet
taxonomy is used. The original taxonomy had a top-
level fanout of 43, which is much larger than the fanout
of 10 recommended by [10], and requires the scan of
too many elements for discrimination. In order to
reduce this initial effort, an additional top level,
shown in figure 4, was added. The access taxonomy is
initially provided in English and Italian, but the
extension to the other languages supported by
Orphanet (French, Spanish and German) is planned in
the short term. Multilingual support is one of the
important benefits of dynamic taxonomies [10].
5. Conclusions and Future Research
The system was not yet available to the general
public at the time of writing this paper, and was shown
to and used by only a limited number of physicians.
The initial feedback was encouraging: the summary by
taxonomic signs was considered very important by
physicians not familiar with rare diseases, who fully
understood its importance in guiding the search, in the
actual discovery of unknown candidate pathologies,
and finally in the confidence of having considered all
the aspects of the problem before arriving at a
conclusion. We believe this last benefit of diagnostic
systems based on dynamic taxonomies to be especially
relevant, because it is very hard in practice to be
exhaustive and guarantee a high standard of quality in
diagnosis.
On the other hand, experts in rare diseases tended to
use the system, improperly, in a bottom-up way:
starting from diseases of which they knew the signs,
they used the taxonomic feedback only as a check on
the correctness of classification. They did not need and
consequently often failed to see the importance of
taxonomic sign summaries for exploration and
diagnosis. This is not surprising, and it must be
stressed that experts are not the target of this new
diagnostic paradigm.
Our current plans are in two directions: a practical
direction and a research direction. On the practical
side, we are working to extend the coverage of the
application, by supporting additional languages, and
most importantly, by integrating the additional
information provided by Orphanet on clinics,
laboratories and professionals in the diagnostic tool.
The goal is to provide the user with complete relevant
information at each stage.
Research is currently in progress on the extension of
dynamic taxonomies from boolean to fuzzy logic [11],
in order to capitalize on the current classification of
Orphanet that assigns a frequency of occurrence to the
various signs on a scale of Very frequent, Frequent and
Occasional. There is obviously no research challenge
in ranking pathologies by decreasing weights of search
signs, but using these weights to focus the user
attention on more relevant signs in the taxonomic
summary is an interesting problem involving both
modeling and human factors.
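The straightforward part, ranking candidate pathologies by the weights of the selected signs, might look like the sketch below. The numeric weights attached to the three frequency labels, the use of a sum to combine them, the placeholder classification KB and the function name rank are all assumptions of this example; the text above specifies none of them.

```python
# Assumed numeric weights for the three frequency labels;
# no numbers are given in the text.
WEIGHT = {"Very frequent": 1.0, "Frequent": 0.6, "Occasional": 0.2}

# Placeholder classification: pathology -> {sign: frequency label}.
# Invented for illustration, not Orphanet data.
KB = {
    "Disease A": {"Macrocephaly": "Very frequent", "Ataxia": "Occasional"},
    "Disease B": {"Macrocephaly": "Frequent", "Ataxia": "Very frequent"},
    "Disease C": {"Macrocephaly": "Occasional"},
}


def rank(selected_signs):
    """Rank the pathologies that exhibit all selected signs by decreasing
    summed weight (ties broken alphabetically by name)."""
    scored = []
    for name, signs in KB.items():
        if all(sign in signs for sign in selected_signs):
            score = sum(WEIGHT[signs[sign]] for sign in selected_signs)
            scored.append((name, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Summation is an arbitrary choice here; a fuzzy conjunction would more commonly take the minimum of the weights. The harder question raised above, how to use the weights to order the signs in the taxonomic summary, is not addressed by this sketch.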
We conclude by noting that a thorough
classification by clinical signs and other features is a
sine qua non condition for the application of diagnostic
techniques based on dynamic taxonomies. Thus, the
pioneering work of Orphanet needs to be extended to
other rare diseases and, most importantly, to other
pathology domains, in order to benefit from the
important improvements offered by dynamic
taxonomies over traditional techniques.
6. References
[1] Blair, D. C., Maron, M. E., “An evaluation of retrieval
effectiveness for a full-text document-retrieval system”,
Comm of the ACM, 28:3, 1985, pp. 289 - 299
[2] Bouaud, J, et al., “Hypertextual navigation
operationalizing generic clinical practice guidelines for
patient-specific therapeutic decisions”, Proc AMIA Symp.,
1998, pp. 488-92.
[3] Brézillon, P., “Context in problem solving: A survey”,
The Knowledge Engineering Review, 14(1) 1999, pp.1-34.
[4] Buchanan, B. G., and Shortliffe, E.H., Rule-based Expert
Systems: the MYCIN Experiments of the Stanford Heuristic
Programming Project, Addison-Wesley, Reading, MA, 1984.
[5] Console, L., Theseider Dupré, D., and Torasso, P., “A
theory of diagnosis for incomplete causal models”, Proc. 11th
Intern. Joint Conf. on Artificial Intelligence, Detroit, MI,
1989, pp. 1311– 1317.
[6] de Kleer, J., Mackworth, A. K., and Reiter, R.,
“Characterizing diagnoses and systems”, Artificial
Intelligence 52, 1992, pp. 197–222.
[7] Orphanet, www.orpha.net
[8] Reggia, J. A., Nau, D. S., and Wang, Y., “Diagnostic
expert systems based on a set-covering model”, Internat. J.
Man-Machine Studies 19, 1983, pp. 437–460.
[9] Sacco, G. M., “Navigating the CD-ROM”, Proc. Int.
Conf. Business of CD-ROM, 1987
[10] Sacco, G. M., “Dynamic Taxonomies: A Model for
Large Information Bases”, IEEE Transactions on Knowledge
and Data Engineering 12, 2 (2000), pp. 468-479.
[11] Sacco, G. M., “Uniform access to multimedia
information bases through dynamic taxonomies”,
IEEE 6th Int. Symp. on Multimedia Software Engineering,
Miami, December 13-15, 2004
[12] Sacco, G. M., “Guided interactive diagnostic systems”
IEEE Symposium on Computer-Based Medical Systems,
Dublin, 2005
[13] Sacco, G. M., “Some Research Results in Dynamic
Taxonomy and Faceted Search Systems”, SIGIR'2006
Workshop on Faceted Search, August 2006 Seattle, WA,
USA
[14] Sacco, G. M., “Guided interactive diagnostic
assistance”, in: Encyclopedia of Healthcare Information
Systems, Wickramasinghe N. and Geisler E. eds., Medical
Information Science Reference, 2008
[15] Séroussi B, et al., “OncoDoc, a successful experiment of
computer-supported guideline development and
implementation in the treatment of breast cancer”, Artif Intell
Med; 22(1), 2001, pp. 43–64.
[16] Knowledge Processors’ Universal Knowledge
Processor, www.knowledgeprocessors.com, US Patents
6,763,349 and 7,340,451 (other patents pending)
Figure 1 – Orphanet taxonomy and simple search interface
Figure 2 – Result for a query on Macrocephaly
Figure 3 – Symptoms and diseases for Macrocephaly
Figure 4 – Symptoms and diseases for Macrocephaly and Ataxia
... A method for ranking the search results in faceted search based on fuzzy logic is presented in [17], and [47] develops a card sorting approach for specifying user facets independently from the indexing ontologies. The faceted (view-based) search paradigm [35, 15, 24, 40] is based on facet anal- ysis [34] , a classification scheme introduced in information sciences by S. R. Ranganathan already in the 1930's. The idea of faceted search has been invented and developed independently by several research groups, and is also called view-based search 32 [35] and dynamic taxonomies [40]. ...
... The faceted (view-based) search paradigm [35, 15, 24, 40] is based on facet anal- ysis [34] , a classification scheme introduced in information sciences by S. R. Ranganathan already in the 1930's. The idea of faceted search has been invented and developed independently by several research groups, and is also called view-based search 32 [35] and dynamic taxonomies [40]. ...
Chapter
Full-text available
Cultural heritage is a promising application domain for semantic web technologies due the semantic richness and heterogeneity of cultural content, and the distributed ways in which the content is created in memory organizations and by citizens. This chapter overviews issues and research related to creating semantic portals for publishing cultural heritage collections and other content on the web.
... We demonstrate how we have applied this approach over the data of FishBase 2 . However, the approach is generic, i.e. it can be applied over any kind of objects described by a number of attributes (or metadata), and it could be exploited for identifying a phenomenon in general; identification tasks are important in many domains, e.g. in patent search (for investigating whether an idea is already covered by existing patents) [3], for identifying a rare disease [7], for diagnostics of car breakdowns, and others. ...
Conference Paper
Full-text available
There are various ways and corresponding tools that the marine biologist community uses for identifying one species. Species identification is essentially a decision making process comprising steps in which the user makes a selection of characters, figures or photographs, or provides an input that restricts other choices, and so on, until reaching one species. In many cases such decisions should have a specific order, as in the textual dichotomous identification keys. Consequently, if a wrong decision is made at the beginning of the process, it could exclude a big list of options. To make this process more flexible (i.e. independent of the order of selections) and less vulnerable to wrong decisions, in this paper we investigate how an exploratory search process, specifically a Preference-enriched Faceted Search (PFS) process, can be used to aid the identification of species. We show how the proposed process covers and advances the existing methods. Finally, we report our experience from applying this process over data taken from FishBase, the most popular source for marine resources. The proposed approach can be applied over any kind of objects described by a number of attributes.
... The faceted search paradigm is based on facet analysis [103], a classification scheme introduced in information sciences by S. R. Ranganathan in the 1930's. The idea of faceted search [52] was invented and developed independently by several research groups, and is also called view-based search [117] and dynamic taxonomies [127]. The idea was integrated with Semantic Web ontologies in [65] and with fuzzy logic in [60]. ...
Article
Cultural Heritage (CH) data is syntactically and semantically heterogeneous, multilingual, semantically rich, and highly interlinked. It is produced in a distributed, open fashion by museums, libraries, archives, and media organizations, as well as individual persons. Managing publication of such richness and variety of content on the Web, and at the same time supporting distributed, interoperable content creation processes, poses challenges where traditional publication approaches need to be re-thought. Application of the principles and technologies of Linked Data and the Semantic Web is a new, promising approach to address these problems. This development is leading to the creation of large national and international CH portals, such as Europeana, to large open data repositories, such as the Linked Open Data Cloud, and massive publications of linked library data in the U.S., Europe, and Asia. Cultural Heritage has become one of the most successful application domains of Linked Data and Semantic...
... If content in semantic search is indexed using language-neutral concept URIs, and their labels are available in different languages, multilinguality can be supported. A widely employed semantic search and browsing technique in semantic CH portals is view-based or faceted search9596979899. Here, the user can make several simultaneous selections from orthogonal facets (e.g., object type, place, time, creator). ...
Chapter
Full-text available
This chapter turns to the application of semantic technologies to areas where text is not dominant, but rather audiovisual content in the form of images, 3D objects, audio, and video/television. Non-textual digital content raises new challenges for semantic technology in terms of capturing the meaning of that content and expressing it in the form of semantic annotation. Where such annotations are available in combination with expressive ontologies describing the target domain, such as television and cultural heritage, new and exciting possibilities arise for multimedia applications.
... These degrees can be provided by humans or be the result of automated tasks. For instance, in [19] they express the degree of a feature that is extracted from data, in [7] they express the certainty of membership of a document to an aspect (automatically retrieved from the answer set of a keyword query), in [17] the membership of an object to a cluster, in [8] the evaluation scores of products under hierarchically organized criteria, in [13] they express the frequency of symptoms in a disease. Furthermore in an open environment like the Web, we may have data of various degrees of credibility, as well data which are copies or modifications of other data. ...
Conference Paper
Full-text available
In several domains we have objects whose descriptions are accompanied by a degree expressing their strength. Such degrees can have various application specific semantics, such as relevance, precision, certainty, trust, etc. In this paper we consider Fuzzy RDF as the representation framework for such “weighted” descriptions and associations, and we propose a novel model for browsing and exploring such sources, which allows formulating complex queries gradually and through plain clicks. Specifically, and in order to exploit the fuzzy degrees, the model proposes interval-based transition markers. The advantage of the model is that it significantly increases the discrimination power of the interaction, without making it complex for the end user.
Poster
Full-text available
Eine Möglichkeit zur verbesserten Versorgung von Menschen mit seltenen Erkrankungen stellen Big Data Anwendungen dar. Innerhalb des BMG- geförderten Projekts "Einsatzmöglichkeiten und klinischer Nutzen von Big Data Anwendungen im Kontext seltener Erkrankungen (BIDA-SE)" werden ein Nutzungsszenario entwickelt und Empfehlungen für den Einsatz von Big Data Anwendungen für die Versorgung von Menschen mit seltenen Erkrankungen formuliert.
Conference Paper
There are various ways and corresponding tools that the marine biologist community uses for identifying one species. Species identification is essentially a decision making process comprising steps in which the user makes a selection of characters, figures or photographs, or provides an input that restricts other choices, and so on, until reaching one species. In many cases such decisions should have a specific order, as in the textual dichotomous identification keys. Consequently, if a wrong decision is made at the beginning of the process, it could exclude a big list of options. To make this process more flexible (i.e. independent of the order of selections) and less vulnerable to wrong decisions, in this paper we investigate how an exploratory search process, specifically a Preference-enriched Faceted Search (PFS) process, can be used to aid the identification of species. We show how the proposed process covers and advances the existing methods. Finally, we report our experience from applying this process over data taken from FishBase, the most popular source for marine resources. The proposed approach can be applied over any kind of objects described by a number of attributes.
Article
The initial diagnosis of rare diseases is difficult because they are infrequent and doctors do not often see or recognize their symptoms. Developing tools to assist in this diagnosis would provide a way to facilitate medical practice in this area. The broader goal of this project is to develop such a tool, which we name Rare Disease Discovery, RDD for short (http://disease-discovery.udl.cat). This tool is designed to identify rare diseases on the basis of a patient’s symptoms. To create it, several software entities were designed and integrated. First, a database of symptoms associated with every human rare disease known was designed and implemented. This database was derived from information available from Orphanet. Orphanet provides gold-standard data regarding rare diseases in the world. Second, a user-friendly website was also designed, implemented and connected to the database. This website connects the users, the database, and the third software entity, a disease prediction engine. Overall, we create an accurate, efficient and user-friendly diagnosis tool that can be quickly learnt and handled by medical doctors for the prediction of rare diseases.
Conference Paper
A system for computer-aided guided interactive diagnosis is presented. It is based on dynamic taxonomies, a knowledge management model that allows the guided interactive exploration of complex information bases. Clinical diagnosis is performed by exploring and thinning out candidate pathologies on the basis of clinical signs and other observable features in a guided yet practitioner-centered way. Early experience with this approach is discussed in terms of opportunities for a global approach to diagnosis and treatment.
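The thinning-out process described in this abstract can be illustrated with a minimal sketch. It is not the e-RARE implementation: the toy taxonomy, the disease names, and the helper names (`extension_index`, `zoom`) are all hypothetical. The sketch assumes that a pathology classified under a concept is also classified under every ancestor of that concept, and that selecting several concepts means intersecting their candidate sets.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child concept -> parent concept (None marks a root).
PARENT = {
    "signs": None,
    "fever": "signs",
    "rash": "signs",
    "joint pain": "signs",
    "onset": None,
    "childhood": "onset",
    "adult": "onset",
}

# Hypothetical classification of candidate pathologies under concepts.
CLASSIFICATION = {
    "disease A": {"fever", "rash", "childhood"},
    "disease B": {"fever", "joint pain", "adult"},
    "disease C": {"rash", "adult"},
}


def ancestors_and_self(concept):
    """Yield the concept followed by each of its ancestors up to the root."""
    while concept is not None:
        yield concept
        concept = PARENT[concept]


def extension_index(classification):
    """Map every concept to the items classified under it or its descendants."""
    index = defaultdict(set)
    for item, concepts in classification.items():
        for concept in concepts:
            for c in ancestors_and_self(concept):
                index[c].add(item)
    return index


def zoom(index, focus):
    """Return the candidates matching all focus concepts and the reduced taxonomy.

    The reduced taxonomy keeps only concepts that still have at least one
    candidate, together with the number of candidates under each.
    """
    candidates = set().union(*index.values())
    for concept in focus:
        candidates &= index.get(concept, set())
    reduced = {
        concept: len(items & candidates)
        for concept, items in index.items()
        if items & candidates
    }
    return candidates, reduced


index = extension_index(CLASSIFICATION)
candidates, reduced = zoom(index, ["fever", "adult"])
```

Because the selection is an intersection of sets, the order in which the practitioner picks signs does not change the result, and concepts with no remaining candidates (here "rash" and "childhood") drop out of the reduced taxonomy.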
Article
Context appears in Artificial Intelligence (AI) as a challenge for the coming years as shown by the various scientific events focusing on context held since 1995. However, context is already considered in other domains, such as Natural Language Processing, although through few aspects of context. We present in this paper a survey of the literature dealing directly and explicitly with context whatever the domain is. This permits us to have a clear view of the context in AI. One of the conclusions of this survey is to point out the existence of different types of context along knowledge representation, the mechanisms of reasoning on the knowledge, and the interaction of the computer system with humans.
Article
A novel taxonomic model for structuring and accessing large heterogeneous information bases is presented. The model is designed to simplify both classification and access by computer-illiterate people. It defines simple and intuitive operations to access large information bases at the conceptual level and at different levels of abstraction, in a totally assisted way, through a simple, yet effective visual interface. The model can also be used to summarize result sets computed by other query methods, such as information retrieval, shape retrieval, etc., and to provide user maps for complex hypermedia networks. The experience gained by applying this model to commercial applications is reported.
Book
Full-text available online at http://www.shortliffe.net/Buchanan-Shortliffe-1984/MYCIN Book.htm or at http://aitopics.org/publication/rule-based-expert-systems-mycin-experiments-stanford-heuristic-programming-project Artificial intelligence, or AI, is largely an experimental science—at least as much progress has been made by building and analyzing programs as by examining theoretical questions. MYCIN is one of several well-known programs that embody some intelligence and provide data on the extent to which intelligent behavior can be programmed. As with other AI programs, its development was slow and not always in a forward direction. But we feel we learned some useful lessons in the course of nearly a decade of work on MYCIN and related programs. In this book we share the results of many experiments performed in that time, and we try to paint a coherent picture of the work. The book is intended to be a critical analysis of several pieces of related research, performed by a large number of scientists. We believe that the whole field of AI will benefit from such attempts to take a detailed retrospective look at experiments, for in this way the scientific foundations of the field will gradually be defined. It is for all these reasons that we have prepared this analysis of the MYCIN experiments.
Article
Despite the proliferation of implemented clinical practice guidelines, there is still little evidence of physicians' compliance with formal standards. The ONCODOC project proposes a framework for elaborating generic decision support guidelines in a document-based paradigm with a knowledge-based approach. It was first applied to assist clinicians in the treatment of breast cancer patients. Therapeutic expertise has been encoded as a decision tree. The decision process is driven by the clinician who interactively browses a hypertext version of the decision tree. During the navigation, he incrementally assigns values to decision parameters on the basis of his free interpretation of his patient's condition and thus builds a clinical context leading to patient-specific therapeutic recommendations. These guidelines are distributed on a hospital intranet and are evaluated at the point of care in an oncology department.
Article
An evaluation of a large, operational full-text document-retrieval system (containing roughly 350,000 pages of text) shows the system to be retrieving less than 20 percent of the documents relevant to a particular search. The findings are discussed in terms of the theory and practice of full-text document retrieval.
Article
Most approaches to model-based diagnosis describe a diagnosis for a system as a set of failing components that explains the symptoms. In order to characterize the typically very large number of diagnoses, usually only the minimal such sets of failing components are represented. This method of characterizing all diagnoses is inadequate in general, in part because not every superset of the faulty components of a diagnosis necessarily provides a diagnosis. In this paper we analyze the concept of diagnosis in depth exploiting the notions of implicate/implicant and prime implicate/implicant. We use these notions to consider two alternative approaches for addressing the inadequacy of the concept of minimal diagnosis. First, we propose a new concept, that of kernel diagnosis, which is free of this problem with minimal diagnosis. This concept is useful to both the consistency and abductive views of diagnosis. Second, we consider restricting the axioms used to describe the system to ensure that the concept of minimal diagnosis is adequate.
Article
This paper proposes that a generalization of the set covering problem can be used as an intuitively plausible model for diagnostic problem solving. Such a model is potentially useful as a basis for expert systems in that it provides a solution to the difficult problem of multiple simultaneous disorders. We briefly introduce the theoretical model and then illustrate its application in diagnostic expert systems. Several challenging issues arise in adopting the set covering model to real-world problems, and these are also discussed along with the solutions we have adopted.
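The set covering view of diagnosis can be made concrete with a small sketch. It is an illustration under stated assumptions, not the model from the paper: the causal table `CAUSES` is invented, the function name `minimum_covers` is hypothetical, and minimum cardinality is used here as the parsimony criterion. The search is brute force, which is only practical for small knowledge bases.

```python
from itertools import combinations

# Hypothetical causal knowledge: disorder -> manifestations it can cause.
CAUSES = {
    "d1": {"m1", "m2"},
    "d2": {"m2", "m3"},
    "d3": {"m3", "m4"},
    "d4": {"m1", "m4"},
}


def minimum_covers(causes, observed):
    """Return every smallest set of disorders that explains all observed manifestations.

    Returns an empty list when some observed manifestation has no known cause.
    """
    observed = set(observed)
    # Only disorders that explain at least one observed manifestation can help.
    relevant = sorted(d for d, ms in causes.items() if ms & observed)
    for size in range(len(relevant) + 1):
        covers = [
            set(combo)
            for combo in combinations(relevant, size)
            if observed <= set().union(*(causes[d] for d in combo))
        ]
        if covers:
            return covers
    return []


single = minimum_covers(CAUSES, {"m1", "m2"})    # one disorder suffices
multiple = minimum_covers(CAUSES, {"m1", "m3"})  # needs two simultaneous disorders
```

The second call shows how multiple simultaneous disorders arise naturally: no single disorder in the toy table explains both "m1" and "m3", so every answer is a pair.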
Conference Paper
One of the problems of the recent approaches to problem solving based on deep knowledge is the lack of a formal treatment of incomplete knowledge. However, dealing with incomplete models is fundamental to many real-world domains. In this paper we propose a formal theory of causal diagnostic reasoning, dealing with different forms of incompleteness both in the general causal knowledge (missing or abstracted knowledge) and in the data describing a specific case under examination. Different forms of nonmonotonic reasoning (hypothetical and circumscriptive reasoning) are used in order to draw and confirm conclusions from incomplete knowledge. Multiple fault solutions are treated in a natural way and parsimony criteria are used to rank alternative solutions.
Article
Originally published as textual documents, clinical practice guidelines have poorly penetrated medical practice because their editorial properties do not allow the reader to easily solve, at the point of care, a given medical problem. However, despite the proliferation of implemented clinical practice guidelines as decision support systems providing easy access to patient-centered information, there is still little evidence of high physician compliance with guideline recommendations. Apart from physicians' psychological reluctance, the incompleteness of guideline knowledge and the impreciseness of the terms used, another reason may be that, although suited to average patients, clinical practice guideline recommendations are not a substitute for the physician-controlled clinical judgement that should be applied to each actual individual patient. Therefore, computer-based approaches based on the automation of context-free operationalization of guideline knowledge, although providing uniform optimal strategies for problem-focused care delivery, may generate inappropriate inferences for a specific patient that the physician does not follow in practice. Rather than providing automated decision support, ONCODOC allows the clinician to control the operationalization of guideline knowledge through his hypertextual reading of a knowledge base encoded as a decision tree. In this way, he has the opportunity to interpret the information provided in the context of his patient, thereby controlling his categorization to the closest matching formal patient. In a life-size experiment, ONCODOC demonstrated good appropriation of the system by physicians, with significantly high compliance scores. We successfully tested the implemented strategy and the knowledge base in a second medical institution, thus providing a notable example of reuse and sharing of encoded guideline knowledge across institutions.