Wolfgang Fuhl’s research while affiliated with University of Tübingen and other places


Publications (2)


Figure 2. Proposed pipeline. (a) Sequences of length 1500 nt originating from four superkingdoms are used as input. (b) From each sequence, k-mer profiles (the relative frequency of all 4^k possible words of length k) are extracted and used as features. (c) Training data are balanced using an undersampling approach. Dense regions of the feature space are thinned out. This reduces the size of the training set. (d and e) The balanced and curated training data are used to train simple supervised learning classifiers. (f) Depending on the taxonomic rank of the given label, the test sequences are taxonomically classified at the superkingdom, phylum, or genus level.
Figure 3. (a) Performance evaluation of a classifier (ensemble of bagged decision trees) trained on imbalanced (none) or balanced training data of the distantly related dataset using different grid sizes G. Relative k-mer frequencies were used as features. Mean MAP values and 1σ-intervals over different choices of k ∈ {1, 2, 3, 4, 5} are shown. (b) Effect of data balancing (N = 6 × 10^5; G = 10) on the sample distribution of the distantly related training set. C = (C_before + C_after) / 2 describes the average sample count per grid cell before and after data balancing. ΔC = C_after − C_before describes the number of samples that were removed per grid cell.
Figure 4. Performance comparison in terms of MAP of several pipelines implementing our approach (upper four bars) with state-of-the-art methods. Note that performance values for state-of-the-art methods were taken from Mock et al. (2022). Results for the distantly related (a) and final model dataset (b) are shown. The best classification performances are labeled by asterisks.
Performance evaluation of different ensemble classifiers trained on the distantly related dataset's balanced training data (G = 10; N = 6 × 10^5).
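
The k-mer profile described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name kmer_profile, the lexicographic ordering of words, and the choice to skip windows containing characters other than A, C, G, T are assumptions made for illustration.

```python
from itertools import product


def kmer_profile(seq, k):
    """Return the relative frequency of each of the 4**k DNA words of length k.

    Words are ordered lexicographically over the alphabet ACGT (an assumption;
    the source does not state an ordering).
    """
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    total = 0
    # Slide a window of length k over the sequence.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        # Assumption: windows with ambiguity codes such as N are ignored.
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


profile = kmer_profile("ACGTACGT", 2)
print(len(profile))  # number of features, 4**2
```

For a 1500 nt input and k = 5, such a profile would have 4^5 = 1024 entries, which is the kind of fixed-length feature vector a simple classifier can consume.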
Improving taxonomic classification with feature space balancing
  • Article
  • Full-text available

July 2023 · 43 Reads · 2 Citations

Bioinformatics Advances

Wolfgang Fuhl · Susanne Zabel

Modern high-throughput sequencing technologies, such as metagenomic sequencing, generate millions of sequences that need to be assigned to their taxonomic rank. Modern approaches either apply local alignment against existing databases, as in MMseqs2, or use deep neural networks, as in DeepMicrobes and BERTax. Due to the increasing size of datasets and databases, alignment-based approaches are expensive in terms of runtime. Deep learning-based approaches can require specialized hardware and consume large amounts of energy. In this article, we propose to use k-mer profiles of DNA sequences as features for taxonomic classification. Although k-mer profiles have been used before, we were able to significantly increase their predictive power by applying a feature space balancing approach to the training data. This greatly improved the generalization quality of the classifiers. We have implemented different pipelines using our proposed feature extraction and dataset balancing in combination with different simple classifiers, such as bagged decision trees or feature subspace KNNs. By comparing the performance of our pipelines with state-of-the-art algorithms, such as BERTax and MMseqs2, on two different datasets, we show that our pipelines outperform these in almost all classification tasks. In particular, sequences from organisms that were not part of the training were classified with high precision.

Availability and implementation: The open-source code and the code to reproduce the results are available in Seafile at https://tinyurl.com/ysk47fmr.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.
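
The feature space balancing mentioned in the abstract is described in the figure captions only as an undersampling step that thins out dense regions of a grid with size G. The following Python sketch shows one way such a step could look. It is an assumption-laden illustration, not the published algorithm: the function balance_by_grid, the per-cell cap max_per_cell, and the random choice of which samples to keep are all hypothetical, and the parameter N from the captions is not modelled.

```python
import random
from collections import defaultdict


def balance_by_grid(features, labels, grid_size, max_per_cell, seed=0):
    """Keep at most max_per_cell samples in each cell of a regular grid.

    features: list of vectors with entries in [0, 1], as is the case for
              relative k-mer frequencies.
    grid_size: number of bins per feature dimension (G in the captions).
    max_per_cell: hypothetical cap; the source does not give this rule.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for idx, vec in enumerate(features):
        # Map each coordinate to a bin index in 0 .. grid_size - 1.
        cell = tuple(min(int(v * grid_size), grid_size - 1) for v in vec)
        cells[cell].append(idx)
    kept = []
    for members in cells.values():
        # Dense cells are thinned by random undersampling; sparse cells are kept whole.
        if len(members) > max_per_cell:
            members = rng.sample(members, max_per_cell)
        kept.extend(members)
    kept.sort()  # preserve the original sample order
    return [features[i] for i in kept], [labels[i] for i in kept]


X = [[0.05, 0.05]] * 5 + [[0.95, 0.95]]
y = ["a"] * 5 + ["b"]
X_bal, y_bal = balance_by_grid(X, y, grid_size=10, max_per_cell=2)
print(len(X_bal))
```

Only occupied cells are stored, so the sketch stays usable even when the number of possible cells (G raised to the number of features) is very large.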


Attitudes of medical students toward AI in medicine (fears about AI in various areas of medicine).
Attitudes of medical students toward AI in medicine (statements about the use of AI and chatbots).
Chatbots for future docs: exploring medical students’ attitudes and knowledge towards artificial intelligence and medical chatbots

February 2023 · 318 Reads · 98 Citations

Artificial intelligence (AI) in medicine and digital assistance systems such as chatbots will play an increasingly important role in future doctor-patient communication. To benefit from the potential of this technical innovation and ensure optimal patient care, future physicians should be equipped with the appropriate skills. Accordingly, a suitable place for the management and adaptation of digital assistance systems must be found in the medical education curriculum. To determine how much medical students already know about AI chatbots, particularly in the healthcare setting, this study surveyed medical students of the University of Luebeck and the University Hospital of Tuebingen. Using standardized quantitative questionnaires and qualitative analysis of group discussions, the attitudes of medical students toward AI and chatbots in medicine were investigated. From this, relevant requirements for the future integration of AI into the medical curriculum could be identified. The aim was to establish a basic understanding of the opportunities, limitations, and risks, as well as potential areas of application of the technology. The participants (N = 12) were able to develop an understanding of how AI and chatbots will affect their future daily work. Although basic attitudes toward the use of AI were positive, the students also expressed concerns. There were high levels of agreement regarding the use of AI in administrative settings (83.3%) and research with health-related data (91.7%). However, participants expressed concerns that data protection may be insufficiently guaranteed (33.3%) and that they might be increasingly monitored at work in the future (58.3%). The evaluations indicated that future physicians want to engage more intensively with AI in medicine. In view of future developments, AI and data competencies should be taught in a structured way and integrated into the medical curriculum.

Citations (2)


... Similar hierarchical approaches for microbial data were previously used for predicting the taxonomies from different data types such as rRNA sequences [21, 22] and Fourier-transform infrared spectroscopy [23, 24]. To the best of our knowledge, such a loss has not been applied in the field of taxonomic classification of DNA sequences, where the predictions are either limited to one taxonomic level [25-28] or the functionality of the deep hierarchical loss is approximated by chaining neural network layers [29]. ...

Reference:

Taxometer: Improving taxonomic classification of metagenomics contigs
Improving taxonomic classification with feature space balancing

Bioinformatics Advances

... These ethical issues are worsened by the opacity of knowledge about AI chatbot algorithms. Students and even instructors often lack a full picture of how the underlying AI works, and skepticism about chatbot outputs is common (Moldt et al. 2023). This issue is compounded by the relative opaqueness of various AI models, whose inner functioning is black-boxed, which obscures how data and decisions are processed. ...

Chatbots for future docs: exploring medical students’ attitudes and knowledge towards artificial intelligence and medical chatbots