KARE Knowledgeware

About the lab

Our research focuses on the application of deep neural networks (DNN) to natural language understanding (NLU). We are particularly interested in applying semantic co-encoders to create general-purpose sentence embeddings that can be used for ranking, classification and intent recognition. We are also interested in domain entity recognition and cognitive automation.

Our research ultimately points to project MIND, which aims to create the MIND for businesses. Read more on

Featured projects (1)

MIND aims to create a central AI capable of democratising, structuring and exploiting the information flowing across businesses, for both internal and external use.

Featured research (4)

It has become common practice to use word embeddings, such as those generated by word2vec or GloVe, as inputs for natural language processing tasks. Such embeddings can aid generalisation by capturing statistical regularities in word usage and by capturing some semantic information. However, they require the construction of large dictionaries of high-dimensional vectors from very large amounts of text and have limited ability to handle out-of-vocabulary words or spelling mistakes. Some recent work has demonstrated that text classifiers using character-level input can achieve similar performance to those using word embeddings. Where character input replaces word-level input, it can yield smaller, less computationally intensive models, which helps when models need to be deployed on embedded devices. Character input can also help to address out-of-vocabulary words and spelling mistakes. It is thus of interest to know whether character embeddings can be used in place of word embeddings without harming performance. In this paper, we investigate the use of character embeddings versus word embeddings when classifying short texts such as sentences and questions. We find that the models using character embeddings perform just as well as those using word embeddings whilst being much smaller and taking less time to train. Additionally, we demonstrate that using character embeddings makes the models more robust to spelling errors.
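As a rough illustration of the character-level input discussed in the abstract above, the following minimal sketch maps a short text to a fixed-length sequence of character ids. The alphabet, the id scheme and the `max_len` value are assumptions made for this example and are not taken from the paper.

```python
import string

# Hypothetical alphabet: lowercase letters, digits and a little punctuation.
ALPHABET = string.ascii_lowercase + string.digits + " ?!.,'"
PAD_ID, UNK_ID = 0, 1  # reserved ids for padding and unknown characters
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_chars(text, max_len=32):
    """Lowercase, truncate to max_len, map characters to ids, then pad."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in text.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


clean = encode_chars("where is my order?")
typo = encode_chars("where is my ordre?")

# Count the positions at which the two encodings disagree.
differing = sum(a != b for a, b in zip(clean, typo))
vocab_size = len(ALPHABET) + 2  # characters plus the two reserved ids
```

In this sketch the lookup table has only a few dozen entries, and a misspelling changes only the affected positions of the input rather than turning a whole word into an out-of-vocabulary token.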
As the demand for routing, auto-responding, labelling, etc. increases, automated email classification and the responsive actions it triggers have become an area of increased interest in both supervised and unsupervised text learning. However, the large number of disparate classes required in a training set for any investigated domain seems to be an obstacle to improving on the literature baselines. We analyse the performance of six state-of-the-art research approaches against a highly constrained, domain-driven text corpus, including variations in testing, to identify the best possible approach for dealing with larger documents such as emails. We identify the Memory Network as the best candidate among popular neural networks, due to its top accuracy among the models compared, faster prediction and ability to train on limited data.
Detecting calls to action in natural text conversations, such as emails, chats and instant messages, is a pre-processing step that enables assistive technologies, such as those used in virtual assistants and chatbots, to capture and handle human-to-human interactions. Different forms of question detection are often used as a proxy solution for detecting intended actions. In this position paper, we provide empirical evidence that, despite their accuracy in detecting questions, both state-of-the-art industrial and academic question detection technologies are not good proxy solutions for inferring intended actions. We provide empirical data to demonstrate that the problem should be approached differently and, as a comparison, we present results obtained with a solution derived from our patent-pending intention inference technology.
In many machine learning tasks, a model needs to be presented with both correct and incorrect examples during training. For instance, given a query, a search engine can be trained to distinguish the relevant (positive) documents from the irrelevant (negative) ones. While each query is associated with a handful of relevant documents, the number of irrelevant documents can be vast. This imbalance can bias a document retrieval model, while the sheer volume of irrelevant documents can result in long training times. In this paper, we show the effect of tailored negative sampling on the performance of the Deep Structured Semantic Model (DSSM). We show that a naive random sampling method outperforms more sophisticated ways of selecting negative data.
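The naive random sampling mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_negatives`, the integer document ids and the value of `k` are hypothetical, and the paper's actual sampling procedure and DSSM training setup are not reproduced here.

```python
import random


def sample_negatives(corpus_ids, positive_ids, k, rng):
    """Draw up to k distinct negatives uniformly from the non-relevant documents."""
    positives = set(positive_ids)
    candidates = [doc for doc in corpus_ids if doc not in positives]
    return rng.sample(candidates, min(k, len(candidates)))


rng = random.Random(0)           # seeded so that runs are repeatable
corpus = list(range(1000))       # hypothetical document ids
positives = [3, 17, 256]         # documents relevant to one query
negatives = sample_negatives(corpus, positives, k=4, rng=rng)

# Label 1 marks a relevant document, label 0 a sampled irrelevant one.
training_examples = [(doc, 1) for doc in positives] + [(doc, 0) for doc in negatives]
```

Sampling a small, fixed number of negatives per query keeps the positive-to-negative ratio under control and avoids iterating over the whole set of irrelevant documents during training.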

Lab head

Michele Sama

Members (1)

Chris Pebber
  • Not confirmed yet

Alumni (7)

Ritwik Dilip Kulkarni
  • University of Helsinki
Simon Rawles
  • Queen Mary, University of London
Mercè Vintró
Stelios Kapetanakis