Automatic Extraction of Pedagogic Metadata from Learning Content.

I. J. Artificial Intelligence in Education 01/2008; 18:97-118.
Source: DBLP

ABSTRACT Annotating learning material with metadata makes it easy for different learning and tutoring systems to reuse it. Several metadata standards have been developed to represent learning objects and courses. A learning system needs pedagogic attributes including document type, topic, coverage of concepts, and, for each concept, its significance and role. Moreover, a flexible and reusable repository of e-learning materials requires that documents be annotated with such metadata automatically as far as possible. This paper describes attributes that represent some important pedagogic characteristics of learning materials. To reduce the overhead of manual annotation, we have explored the feasibility of annotating learning materials with metadata automatically. This facilitates the creation of an open e-learning repository for storing the annotated learning materials, which can be used by learning systems. The automatic annotation is based on a domain knowledge base and uses a number of techniques, including standard classification algorithms and the parsing and analysis of documents. The results show a fair degree of accuracy, which may be improved in the future using more sophisticated algorithms.

  • ABSTRACT: Personalization is increasingly vital, especially for enterprises that need to reach their customers. The key challenge in supporting personalization is the need for rich metadata, such as metadata about structural relationships, subject/concept relations between documents, and cognitive metadata about documents (e.g. the difficulty of a document). Manual annotation of large knowledge bases with such rich metadata is not scalable. Automatic mining of cognitive metadata is also challenging, since it is very difficult to capture the underlying intellectual knowledge of a document automatically. Meanwhile, Web content is increasingly multilingual, since a growing amount of the data generated on the Web is non-English. Current metadata extraction systems are generally based on English content, and this needs to change fundamentally for them to adapt to the changing dynamics of the Web. To alleviate these problems, we introduce a novel automatic metadata extraction framework. It is based on a novel fuzzy method for automatic cognitive metadata generation and uses different document parsing algorithms to extract rich metadata from multilingual enterprise content, using the newly developed DocBook, Resource Type and Topic ontologies. Since the metadata generation process is based upon DocBook-structured enterprise content, our framework is focused on enterprise documents and content that loosely follows DocBook formatting. DocBook is a common documentation format for formally producing corporate data, and it is adopted by many enterprises. The proposed framework is illustrated and evaluated on the English, German and French versions of the Symantec Norton 360 knowledge bases. The user study showed that the proposed fuzzy-based method generates reasonably accurate values, with an average precision of 89.39% on the metadata values of document difficulty, document interactivity level and document interactivity type. The proposed fuzzy inference system achieves improved results compared to a rule-based reasoner for difficulty metadata extraction (~11% enhancement). In addition, user-perceived metadata quality scores (mean of 5.57 out of 6) were found to be high, and automated metadata analysis showed that the extracted metadata is of high quality and can be suitable for personalized information retrieval.
    Web Semantics: Science. 01/2011;
  • ABSTRACT: Personalized search and browsing are increasingly vital, especially for enterprises that need to reach their customers. A key challenge in supporting personalization is the need for rich metadata, such as cognitive metadata about documents. Given the size of large knowledge bases, manual annotation is neither scalable nor feasible. On the other hand, automatic mining of cognitive metadata is challenging, since it is very difficult to capture the underlying intellectual knowledge of documents automatically. To alleviate this problem, we introduce a novel metadata extraction framework, which is based on fuzzy information granulation and a fuzzy inference system for automatic cognitive metadata mining. The user evaluation study shows that our approach provides reasonable precision rates for difficulty, interactivity type, and interactivity level on the 100 documents examined. In addition, the proposed fuzzy inference system achieves improved results compared to a rule-based reasoner for document difficulty metadata extraction (11% improvement).
    HT'11, Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia, Eindhoven, The Netherlands, June 6-9, 2011; 01/2011
  • ABSTRACT: The World Wide Web contains a large amount of learning content, and an e-learner can retrieve learning materials from it. Different learners may have different learning requirements. Because preferences and requirements can vary greatly across individuals, a personalized retrieval system must be tailored to provide each learner with the learning materials they require. The retrieval system should decide whether a document is relevant to the learner based on the curriculum requirement, the learner profile and the type of the learning material. We have implemented an information retrieval system for retrieving learning materials that satisfy the learners' needs. To produce personalized search results, the system draws on the learner profile, the domain knowledge and the automatically metadata-annotated documents retrieved from the Web. To evaluate the performance of the system, many queries were processed by our system.
