Project

BIOfid : Specialized Information Service for Biodiversity Research

Goal: BIOfid will be established as a Specialized Information Service (SIS, or FID = Fachinformationsdienst in German) for Biodiversity Research. The SIS is jointly designed by three project partners, namely (a) the Universitätsbibliothek Johann Christian Senckenberg in Frankfurt am Main, (b) Prof. Mehler, Text Technology Lab at the Goethe-University Frankfurt, and (c) the Senckenberg Nature Research Society. The work program comprises four modules:

A. An innovative text mining pilot project will mobilize data that are relevant for biodiversity research but so far available only in print.

B. The second module deals with digitizing a selected body of specialized biodiversity literature, which is intended to provide source material for modules A and C.

C. A platform for open access journals will be established that facilitates non-commercial publishing and long-term availability of professional e-journals.

D. The SIS will purchase highly specialized literature which is unavailable online. This literature will be available in several ways, for instance via interlibrary loan.


Date: 1 July 2017 - 30 June 2020

Project log

Marco Schmidt
added an update
"Der Palmengarten", the journal of Frankfurt's Palmengarten, is now available on the BIOfid open access journal platform: https://ojs.ub.uni-frankfurt.de/Palmengarten . All digital articles available so far have been uploaded and can be browsed or searched by keyword. New issues will be added one year after the print version appears.
This lecture covers the preparation of a biology dictionary for use in text mining of biological documents. The eventual aim is to map the terms of this and related lexica onto Wikidata, yielding a better-controlled vocabulary that enables terminology-driven text mining.
This talk deals with the OCR errors we detected in historical documents from a corpus of botanical journals (see UB Frankfurt 2013). Such an error analysis is needed to learn which errors any text mining procedure has to face when processing documents of this sort. Our main finding is that OCR already works relatively well, except for document segments with more demanding layouts. In any event, OCR errors are yet another source of mostly lexical variants that text mining and natural language processing have to handle automatically. This basically means that we will need to perform some kind of OCR post-processing.
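One simple way to picture OCR post-processing for lexical variants is to match each token against a domain lexicon and replace near misses. The sketch below is only an illustration under stated assumptions: the mini-lexicon, the function names and the similarity cutoff are hypothetical, and Python's standard `difflib` stands in for whatever method the project actually uses.

```python
import difflib

# Hypothetical mini-lexicon of German botanical terms; a real one would be
# drawn from a domain dictionary.
LEXICON = ["Pflanze", "Wurzel", "Blatt", "Staubblatt", "Kelch"]


def correct_token(token, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon entry for an OCR token, or the token unchanged."""
    if token in lexicon:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio and
    # drops everything below the cutoff.
    matches = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(text):
    """Correct a whitespace-separated text token by token."""
    return " ".join(correct_token(tok) for tok in text.split())
```

For example, `correct_token("Pflanzc")` would be mapped to the lexicon entry `Pflanze`, while a token with no close lexicon entry is left as it is. A fixed cutoff is a blunt instrument: too low and correct out-of-lexicon words get overwritten, too high and real OCR errors slip through.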
This talk explains the first steps in mapping bio-ontologies to Wikidata. The mapping is needed to build a semantic bridge between these ontologies on the one hand and English, German, French and possibly also Latin texts on the other. The final aim is to let users employ any ontological entities, classes and relations that denote biological objects to search even historical documents on a semantic level.
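The idea of searching across languages through shared concepts can be sketched with a small label index. Everything here is an assumption made for illustration: the concept records, the placeholder identifiers (`EX:1`, `EX:2`, which are not real Wikidata IDs) and the function names are hypothetical, and the token matching is deliberately naive.

```python
from collections import defaultdict

# Hypothetical concept records with labels in English, German, French and
# Latin. The IDs are placeholders, not real Wikidata identifiers.
CONCEPTS = {
    "EX:1": {"en": "oak", "de": "Eiche", "fr": "chêne", "la": "Quercus"},
    "EX:2": {"en": "beech", "de": "Buche", "fr": "hêtre", "la": "Fagus"},
}


def build_label_index(concepts):
    """Map every case-folded label, in any language, to its concept IDs."""
    index = defaultdict(set)
    for concept_id, labels in concepts.items():
        for label in labels.values():
            index[label.casefold()].add(concept_id)
    return index


def search_documents(term, documents, concepts, index):
    """Return IDs of documents mentioning any label of the concept(s) named by term."""
    concept_ids = index.get(term.casefold(), set())
    labels = {
        label.casefold()
        for concept_id in concept_ids
        for label in concepts[concept_id].values()
    }
    hits = []
    for doc_id, text in documents.items():
        # Naive tokenization: split on whitespace, strip common punctuation.
        tokens = {tok.strip(".,;:()").casefold() for tok in text.split()}
        if tokens & labels:
            hits.append(doc_id)
    return hits
```

With an index built by `build_label_index(CONCEPTS)`, a query for `oak` would also find a German text that only mentions `Eiche`, because both labels resolve to the same concept. Inflected forms, multi-word names and OCR variants are not handled by this sketch.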
Gerwin Kasperek
added a project goal