James Sanger's scientific contributions

Publications (2)

Book
Text mining tries to solve the crisis of information overload by combining techniques from data mining, machine learning, natural language processing, information retrieval, and knowledge management. In addition to providing an in-depth examination of core text mining and link detection algorithms and operations, this book examines advanced pre-pro...

Citations

... This solution is called text categorization [2]. Text categorization is the task of automatically sorting a set of documents into categories (or classes, or topics) from a predefined set [3]. There are various reasons for using document categorization. ...
... Traditionally, text classification is seen as a machine learning (ML) problem in which documents are represented by numerical feature vectors, e.g., TF-IDF, which are then fed to a classifier (e.g., support vector machine, linear regression). The classifier is trained on a labelled sample to build a model, which is in turn used to predict the class label of an input text document (see, e.g., [7]). Over the last decade, Deep Learning (DL) approaches have been widely acknowledged as the mainstream technology for text classification and have gradually replaced traditional machine learning techniques. ...
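The traditional pipeline described in the snippet above (documents turned into TF-IDF vectors, a classifier trained on labelled samples, then a label predicted for new text) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the method of the cited papers: it uses the Python standard library only, the toy documents and labels are invented for the example, the smoothed IDF formula is one common variant, and a simple nearest-centroid classifier stands in for the support vector machine mentioned in the snippet. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for this toy example.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency computed from the training documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text, idf):
    """L2-normalised TF-IDF vector as a sparse dict; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each class (a stand-in for training an SVM)."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, value in tfidf_vector(doc, idf).items():
            sums[label][term] += value
    return {
        label: {term: value / counts[label] for term, value in terms.items()}
        for label, terms in sums.items()
    }


def predict(text, idf, centroids):
    """Return the class whose centroid has the largest dot product with the document vector."""
    vec = tfidf_vector(text, idf)

    def score(label):
        return sum(v * centroids[label].get(t, 0.0) for t, v in vec.items())

    # sorted() makes tie-breaking deterministic (alphabetical order of labels).
    return max(sorted(centroids), key=score)


# Invented toy training sample: two predefined categories.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the central bank raised interest rates",
    "stock markets fell after the rates decision",
]
train_labels = ["sports", "sports", "finance", "finance"]

idf = fit_idf(train_docs)
centroids = train_centroids(train_docs, train_labels, idf)

print(predict("the bank cut interest rates", idf, centroids))
print(predict("a late goal won the match", idf, centroids))
```

In practice, a library implementation of TF-IDF and a stronger classifier would normally replace these hand-written pieces; the sketch is only meant to make the vectorize, train, predict sequence concrete.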