Christoph Miksovic’s research while affiliated with IBM Research - Thomas J. Watson Research Center and other places


Publications (9)


Adapting LLMs for Structured Natural Language API Integration
  • Conference Paper

January 2024 · 3 Reads

Robin Chan · [...] · Abdel Labbi


A Goal-Driven Natural Language Interface for Creating Application Integration Workflows

June 2022 · 20 Reads · 7 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Web applications and services are increasingly important in a distributed internet filled with diverse cloud services and applications, each of which enables the completion of narrowly defined tasks. Given the explosion in the scale and diversity of such services, their composition and integration for achieving complex user goals remains a challenging task for end-users and requires a lot of development effort when specified by hand. We present a demonstration of the Goal Oriented Flow Assistant (GOFA) system, which provides a natural language solution to generate workflows for application integration. Our tool is built on a three-step pipeline: it first uses Abstract Meaning Representation (AMR) to parse utterances; it then uses a knowledge graph to validate candidates; and finally uses an AI planner to compose the candidate flow. We provide a video demonstration of the deployed system as part of our submission.


Business Entity Matching with Siamese Graph Convolutional Networks

May 2021 · 6 Reads · 7 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Data integration has been studied extensively for decades and approached from different angles. However, this domain remains largely rule-driven and lacks universal automation. Recent developments in machine learning, and in particular deep learning, have opened the way to more general and efficient solutions to data-integration tasks. In this paper, we demonstrate an approach that allows modeling and integrating entities by leveraging their relations and contextual information. This is achieved by combining siamese and graph neural networks to effectively propagate information between connected entities and support high scalability. We evaluated our approach on the task of integrating data about business entities, demonstrating that it outperforms both traditional rule-based systems and other deep learning approaches.
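
The idea of combining a siamese setup with graph convolution can be illustrated with a minimal sketch. This is not the authors' model: the single mean-aggregation layer, the cosine comparison, and every name below (`gcn_layer`, `siamese_score`, the toy graphs) are assumptions chosen to keep the example small. The point it shows is that the same weights encode both graphs, and that information from connected entities flows into each node's embedding before the comparison.

```python
import math


def gcn_layer(features, adjacency, weight):
    """One simplified graph-convolution step (hypothetical, for illustration).

    Each node is averaged with its neighbors, then passed through a shared
    linear map and ReLU. `adjacency` is a 0/1 matrix without self-loops.
    """
    n, out_dim = len(features), len(weight[0])
    hidden = []
    for i in range(n):
        group = [i] + [j for j in range(n) if adjacency[i][j]]
        mean = [sum(features[j][k] for j in group) / len(group)
                for k in range(len(features[i]))]
        hidden.append([
            max(0.0, sum(mean[k] * weight[k][o] for k in range(len(mean))))
            for o in range(out_dim)
        ])
    return hidden


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def siamese_score(graph_a, node_a, graph_b, node_b, weight):
    """Encode both graphs with the SAME weights and compare two nodes."""
    emb_a = gcn_layer(graph_a[0], graph_a[1], weight)[node_a]
    emb_b = gcn_layer(graph_b[0], graph_b[1], weight)[node_b]
    return cosine(emb_a, emb_b)


# Toy data: two nodes with orthogonal features, with and without an edge.
identity = [[1.0, 0.0], [0.0, 1.0]]
isolated = ([[1.0, 0.0], [0.0, 1.0]], [[0, 0], [0, 0]])
connected = ([[1.0, 0.0], [0.0, 1.0]], [[0, 1], [1, 0]])
```

In the toy data, the two isolated nodes have orthogonal embeddings, while the same two nodes become indistinguishable once an edge lets their features mix, which is the propagation effect the abstract refers to.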


Business Entity Matching with Siamese Graph Convolutional Networks
  • Preprint
  • File available

May 2021 · 18 Reads

Data integration has been studied extensively for decades and approached from different angles. However, this domain remains largely rule-driven and lacks universal automation. Recent developments in machine learning, and in particular deep learning, have opened the way to more general and efficient solutions to data-integration tasks. In this paper, we demonstrate an approach that allows modeling and integrating entities by leveraging their relations and contextual information. This is achieved by combining siamese and graph neural networks to effectively propagate information between connected entities and support high scalability. We evaluated our approach on the task of integrating data about business entities, demonstrating that it outperforms both traditional rule-based systems and other deep learning approaches.


Figure 1: An example of an enterprise Knowledge Graph having 3 entity types and 5 link types. The red dotted arrow is the missing link information.
Figure 2: The attention mechanism of RelAtt.
Dataset statistics.
Results of link prediction on FB15k-237, WN18 and Comp dataset.
Statistics of the query graphs


Knowledge Graph Embedding using Graph Convolutional Networks with Relation-Aware Attention

February 2021 · 463 Reads

Knowledge graph embedding methods learn embeddings of entities and relations in a low-dimensional space that can be used for various downstream machine learning tasks such as link prediction and entity matching. Various graph convolutional network methods have been proposed that use different types of information to learn the features of entities and relations. However, these methods assign the same weight (importance) to all neighbors when aggregating information, ignoring the role of the different relations with the neighboring entities. To this end, we propose a relation-aware graph attention model that leverages relation information to compute different weights for the neighboring nodes when learning embeddings of entities and relations. We evaluate our proposed approach on link prediction and entity matching tasks. Our experimental results on link prediction on three datasets (one proprietary and two public) and results on unsupervised entity matching on one proprietary dataset demonstrate the effectiveness of the relation-aware attention.
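
The relation-aware weighting described in this abstract can be illustrated with a minimal sketch. It is not the paper's model: the scoring function used here (one attention vector applied to the concatenated entity, relation, and neighbor embeddings, followed by LeakyReLU and softmax, in the style of graph attention networks) and all names, including `attn_vector`, are assumptions. The sketch only shows how a relation embedding can change the weight a neighbor receives during aggregation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_aware_aggregate(node, neighbors, attn_vector):
    """Aggregate neighbor embeddings with relation-dependent weights.

    node:        embedding of the center entity (list of floats)
    neighbors:   list of (relation_embedding, neighbor_embedding) pairs
    attn_vector: hypothetical scoring vector of length 3 * len(node)
    """
    scores = []
    for rel, nbr in neighbors:
        raw = dot(attn_vector, node + rel + nbr)  # `+` concatenates the lists
        scores.append(raw if raw > 0 else 0.2 * raw)  # LeakyReLU, slope 0.2
    weights = softmax(scores)
    aggregated = [
        sum(w * nbr[i] for w, (_, nbr) in zip(weights, neighbors))
        for i in range(len(node))
    ]
    return aggregated, weights


# Toy example: two neighbors reached through two different relations.
node = [1.0, 0.0]
neighbors = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [1.0, 0.0])]
favor_first_relation = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
uniform = [0.0] * 6
```

With `uniform`, both neighbors get the same weight, which corresponds to the equal-weight aggregation the abstract criticizes. With `favor_first_relation`, the neighbor connected through the first relation dominates the aggregated embedding.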


Linking IT Product Records

March 2020 · 25 Reads · 2 Citations

Communications in Computer and Information Science

Today’s enterprise decision making relies heavily on insights derived from vast amounts of data from different sources. To acquire these insights, the available data must be cleaned, integrated and linked. In this work, we focus on the problem of linking records that contain textual descriptions of IT products.


Fig. 1: Preprocessing and runtime pipeline.
Fig. 3: S-curves (left: full; right: zoomed; x-axis: Jaccard similarity; y-axis: matching probability)
Fig. 4: Different scoring functions for different MinHash configurations: r = 4, b = 10 (encircled data point), followed by r = 5, b = 18 and r = 6, b = 30 (x-axis: precision [%]; y-axis: recall [%])
Fig. 5: Scalability Analysis of the RL System


Fast Record Linkage for Company Entities

December 2019 · 118 Reads · 18 Citations

Record linkage is an essential part of nearly all real-world systems that consume structured and unstructured data coming from different sources. Typically no common key is available for connecting records. Massive data integration processes often have to be completed before any data analytics and further processing can be performed. In this work we focus on company entity matching, where company name, location and industry are taken into account. Our contribution is a highly scalable, enterprise-grade end-to-end system that uses rule-based linkage algorithms in combination with a machine learning approach to account for short company names. Linkage time is greatly reduced by an efficient decomposition of the search space using MinHash. Based on real-world ground truth datasets, we show that our approach reaches a recall of 91% compared to 73% for baseline approaches, while scaling linearly with the number of nodes used in the system.
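
How MinHash can decompose the search space is sketched below. This is a generic textbook-style banding scheme, not the system described in the abstract: the character 3-gram shingling, the seeded BLAKE2b hashing, the default of 4 rows and 10 bands, and all function names are assumptions. Records are only compared when they land in the same bucket for at least one band, so most dissimilar pairs are never scored.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Character k-grams of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def seeded_hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens, num_hashes):
    """One minimum per seeded hash function; similar sets share many minima."""
    return [min(seeded_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def candidate_pairs(records, rows=4, bands=10):
    """Return record-id pairs that share at least one band bucket."""
    buckets = defaultdict(set)
    for rec_id, text in records.items():
        sig = minhash_signature(shingles(text), rows * bands)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(rec_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical company names used only for illustration.
records = {
    "a": "IBM Research Zurich",
    "b": "IBM  Research zurich",
    "c": "Acme Plumbing Ltd",
}
pairs = candidate_pairs(records)
```

Only the pairs returned by `candidate_pairs` would then go through a more expensive scoring step, such as the rule-based and learned scoring the abstract mentions.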


Fast Record Linkage for Company Entities

July 2019 · 104 Reads

Record linkage is an essential part of almost all real-world systems that consume data coming from different sources, structured and unstructured. Typically, no common key is available to connect the records. Often, massive data cleaning and data integration processes have to be completed before any data analytics and further processing can be performed. Though record linkage is often seen as a somewhat tedious but necessary step, it can reveal valuable insights into the data at hand. These insights guide further analytic approaches over the data and support data visualization. In this work we focus on company entity matching, where company name, location and industry are taken into account. The matching is done on the fly to accommodate real-time processing of streamed data. Our contribution is a system that uses rule-based matching algorithms for scoring operations, which we extend with a machine learning approach to account for short company names. We propose an end-to-end, highly scalable, enterprise-grade system. Linkage time is greatly reduced by efficient decomposition of the search space using MinHash. High linkage accuracy is reached by the proposed thorough scoring process of the matching candidates. Based on two real-world ground-truth datasets, we show that our approach reaches a recall of 91% compared to 86% for baseline approaches. These results are achieved while scaling linearly with the number of nodes used in the system.
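
The figure captions listed for this work mention S-curves and MinHash configurations such as r = 4, b = 10. Assuming the common convention that r is the number of rows per band and b the number of bands (the listing itself does not define them), the standard banding S-curve can be computed as follows. The function names are hypothetical, and the formula is the usual textbook one, not something taken from the paper.

```python
def match_probability(similarity, rows, bands):
    """Probability that two records with the given Jaccard similarity share
    at least one band bucket: 1 - (1 - s**rows)**bands."""
    return 1.0 - (1.0 - similarity ** rows) ** bands


def approx_threshold(rows, bands):
    """Common approximation of the similarity at which the S-curve rises
    steeply: (1 / bands) ** (1 / rows)."""
    return (1.0 / bands) ** (1.0 / rows)


# The three configurations named in the figure caption, as (rows, bands).
configs = [(4, 10), (5, 18), (6, 30)]
thresholds = [approx_threshold(r, b) for r, b in configs]
```

More rows per band make each band stricter, which favors precision, while more bands give each pair more chances to collide, which favors recall. Tuning these two numbers is how the trade-off plotted in such precision and recall figures is typically moved.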

Citations (5)


... Our work therefore simulates the use of xAI for explanatory debugging [19,22] with concept-based explanations [21], also called the "glitch detector task" [41,43]. We investigate how xAI may improve people's mental models for AI [2,10], and how personalized xAI will affect people's ability to accurately identify when their assistant is correct or incorrect (i.e., if the agent adapts to the user, will the user make fewer mistakes?). Our contributions include: ...

Reference:

Towards Balancing Preference and Performance through Adaptive Personalized Explainability
Follow the Successful Herd: Towards Explanations for Improved Use and Mental Models of Natural Language Systems
  • Citing Conference Paper
  • March 2023

... Existing literature suggests that supervised learning methods applied to record linkage provide results superior to those from unsupervised methods, such as K-means (e.g., Christen 2012). Recent advances include combining graph convolutional networks and siamese networks to leverage the relationships and contextual information in knowledge graphs (Krivosheev et al. 2021). Nevertheless, supervised approaches require a substantial volume of high-quality training data, which might be difficult and too expensive to obtain in practice, as the data distribution is highly unbalanced towards negative (non-match) pairs. ...

Business Entity Matching with Siamese Graph Convolutional Networks
  • Citing Article
  • May 2021

Proceedings of the AAAI Conference on Artificial Intelligence

... Recently, there have been several applications in which first multiple plans are generated and then the users are involved in the selection process. Some of these applications are in the area of patient monitoring, enterprise risk management, conversational systems (Chakraborti et al. 2022; Rizk et al. 2020; Sreedharan et al. 2020b), and web service composition (Brachman et al. 2022). However, the user interfaces for interacting with such systems have received little attention. ...

A Goal-Driven Natural Language Interface for Creating Application Integration Workflows
  • Citing Article
  • June 2022

Proceedings of the AAAI Conference on Artificial Intelligence

... MinHash algorithm, when used with the LSH forest data structure, represents a text similarity method that approximates the Jaccard set similarity score [32]. MinHash was used to replace the large sets of string data with smaller "signatures" that still preserve the underlying similarity metric, hence producing a signature matrix, but a pair-wise signature comparison was still needed. Here the LSH Forest comes into play. ...

Fast Record Linkage for Company Entities