Serdar Kadioglu’s research while affiliated with Brown University and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (48)


Ner4Opt example: Given a problem description in free-form natural language text, the goal of Ner4Opt is to extract the highlighted key information about variables, parameters, constraint direction, limits, objective, and optimization direction. Demo available on Hugging Face Spaces.
LLMs as modeling assistants: A motivational example showing an incorrect model (left) generated from the problem description and a correct model (right) generated from the same description when annotated with Ner4Opt entities.
Ner4Opt CRF example: Given an input sentence, the extracted and transformed features of each token are fed into a conditional random field to produce the recognized entities.
Regular automaton (sketch) to capture features for OBJ_NAME extraction.
Example pattern for a given objective-name entity as the union of its part-of-speech and dependency tags.


Ner4Opt: named entity recognition for optimization modelling from natural language
  • Article

November 2024 · 2 Reads · Constraints

Serdar Kadıoğlu · [...] · Karthik Uppuluri · [...] · Ravisutha Srinivasamurthy

Solving combinatorial optimization problems involves a two-stage process that follows the model-and-run approach. First, a user is responsible for formulating the problem at hand as an optimization model, and then, given the model, a solver is responsible for finding the solution. While optimization technology has enjoyed tremendous theoretical and practical advances, the process has remained unchanged for decades. To date, transforming problem descriptions into optimization models remains a barrier to entry. To relieve users of the cognitive task of modeling, we study named entity recognition to capture components of optimization models, such as the objective, variables, and constraints, from free-form natural language text, and coin this problem Ner4Opt. We show how to solve Ner4Opt using classical techniques based on morphological and grammatical properties and modern methods that leverage pre-trained large language models and fine-tune transformer architectures on optimization-specific corpora. For best performance, we present their hybridization combined with feature engineering and data augmentation to exploit the language of optimization problems. We improve over the state-of-the-art for annotated linear programming word problems. Large language models (LLMs) are not yet versatile enough to turn text into optimization models or extract optimization entities. Still, when augmented with Ner4Opt annotations, the compilation accuracy of LLM-generated models improves significantly. We open-source our Ner4Opt library, release our training and fine-tuning procedures, and share our trained artifacts. We identify several next steps and discuss important open problems toward automated modeling.
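
The paper ships with an open-source library. Below is a minimal usage sketch, assuming the `Ner4Opt` constructor, its `model_type` argument, and the `get_entities` method match the released package; the entity labels in the comment follow the paper's description.

```python
# Minimal usage sketch for the open-source ner4opt library (pip install ner4opt).
# The constructor argument and entity labels are assumptions based on the paper.
from ner4opt import Ner4Opt

problem = (
    "A farmer has 100 acres and 60 hours of labor per week. "
    "Wheat yields $300 per acre and corn $200 per acre. "
    "Maximize the total profit."
)

ner = Ner4Opt(model_type="hybrid")  # hybrid: classical features + fine-tuned transformer
entities = ner.get_entities(problem)

# Each entity is expected to carry a label such as VAR, PARAM, LIMIT,
# CONST_DIR, OBJ_NAME, or OBJ_DIR, plus its character span in the text.
for entity in entities:
    print(entity)
```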


Scalable iterative pruning of large language and vision models using block coordinate descent

November 2024 · 4 Reads

Pruning neural networks, which involves removing a fraction of their weights, can often maintain high accuracy while significantly reducing model complexity, at least up to a certain limit. We present a neural network pruning technique that builds upon the Combinatorial Brain Surgeon but solves an optimization problem over a subset of the network weights in an iterative, block-wise manner using block coordinate descent. The iterative, block-based nature of this pruning technique, which we dub "iterative Combinatorial Brain Surgeon" (iCBS), allows for scalability to very large models, including large language models (LLMs), that may not be feasible with a one-shot combinatorial optimization approach. When applied to large models like Mistral and DeiT, iCBS achieves higher performance metrics at the same density levels compared to existing pruning methods such as Wanda. This demonstrates the effectiveness of this iterative, block-wise pruning method in compressing and optimizing the performance of large deep learning models, even while optimizing over only a small fraction of the weights. Moreover, our approach allows for a quality-time (or cost) tradeoff that is not available when using a one-shot pruning technique alone. The block-wise formulation of the optimization problem enables the use of hardware accelerators, potentially offsetting the increased computational costs compared to one-shot pruning methods like Wanda. In particular, the optimization problem solved for each block is quantum-amenable in that it could, in principle, be solved by a quantum computer.
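
A minimal sketch of the iterative, block-wise pruning loop described above (not the authors' code; a greedy magnitude criterion stands in for the per-block combinatorial solve, which iCBS formulates as a quantum-amenable optimization problem):

```python
# Illustrative block coordinate descent pruning: sweep over blocks of weights
# and, within each block, re-decide which weights to keep under a density target.
import numpy as np

def block_prune(weights: np.ndarray, density: float = 0.5,
                block_size: int = 64, sweeps: int = 3) -> np.ndarray:
    w = weights.ravel()
    mask = np.ones(w.size, dtype=bool)
    for _ in range(sweeps):
        starts = np.random.permutation(np.arange(0, w.size, block_size))
        for start in starts:
            idx = np.arange(start, min(start + block_size, w.size))
            # Block-level budget: a proportional share of the global budget.
            k = int(round(density * idx.size))
            scores = np.abs(w[idx])  # saliency proxy; iCBS scores loss change
            block_mask = np.zeros(idx.size, dtype=bool)
            if k > 0:
                block_mask[np.argsort(scores)[-k:]] = True
            mask[idx] = block_mask
    return np.where(mask, w, 0.0).reshape(weights.shape)

pruned = block_prune(np.random.randn(8, 32), density=0.5)
print("density:", np.count_nonzero(pruned) / pruned.size)
```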


Surrogate Modeling to Address the Absence of Protected Membership Attributes in Fairness Evaluation

October 2024 · 4 Reads · ACM Transactions on Evolutionary Learning and Optimization

It is imperative to ensure that artificial intelligence models perform well for all groups including those from underprivileged populations. By comparing the performance of models for the protected group with respect to the rest of the population, we can uncover and prevent unwanted bias. However, a significant drawback of such binary fairness evaluation is its dependency on protected group membership attributes. In various real-world scenarios, protected status for individuals is sparse, unavailable, or even illegal to collect. This paper extends the previous work on binary fairness metrics to relax the requirement on deterministic membership to its surrogate counterpart under a probabilistic setting. We show how to conduct binary fairness evaluation when exact protected attributes are not available, but their surrogates as likelihoods are accessible. In theory, we prove that inferred metrics calculated from surrogates are valid under standard statistical assumptions. In practice, we demonstrate the effectiveness of our approach using publicly available data from the Home Mortgage Disclosure Act and simulated benchmarks that mimic real-world conditions under different levels of model disparity. We extend the results from previous work to include comparisons with alternative model-based methods and we develop further practical guidance based on our extensive simulation. Finally, we embody our method in open-source software that is readily available for use in other applications.
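
A minimal sketch of the core idea (not the paper's implementation): replace hard group indicators with membership likelihoods when computing a binary fairness metric such as the statistical parity difference.

```python
# Inferred statistical parity difference from surrogate membership likelihoods.
import numpy as np

def inferred_statistical_parity(y_pred, p_protected):
    """y_pred: binary model decisions. p_protected: P(protected | surrogate)."""
    y = np.asarray(y_pred, dtype=float)
    p = np.asarray(p_protected, dtype=float)
    rate_protected = np.sum(p * y) / np.sum(p)          # likelihood-weighted rate
    rate_rest = np.sum((1.0 - p) * y) / np.sum(1.0 - p)
    return rate_protected - rate_rest

# Toy example with membership likelihoods, e.g., inferred from geography.
print(inferred_statistical_parity(
    y_pred=[1, 0, 1, 1, 0, 1, 0, 0],
    p_protected=[0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6, 0.4],
))
```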


Integrating optimized item selection with active learning for continuous exploration in recommender systems

April 2024 · 11 Reads · 1 Citation · Annals of Mathematics and Artificial Intelligence

Recommender Systems have become the backbone of personalized services that provide tailored experiences to individual users, yet designing new recommendation applications with limited or no available training data remains a challenge. To address this issue, we focus on selecting the universe of items for experimentation in recommender systems by leveraging a recently introduced combinatorial problem. On the one hand, selecting a large set of items is desirable to increase the diversity of items. On the other hand, a smaller set of items enables rapid experimentation and minimizes the time and the amount of data required to train machine learning models. We first present how to optimize for such conflicting criteria using a multi-level optimization framework. Then, we shift our focus to the operational setting of a recommender system. In practice, to work effectively in a dynamic environment where new items are introduced to the system, we need to explore users’ behaviors and interests continuously. To that end, we show how to integrate the item selection approach with active learning to guide randomized exploration in an ongoing fashion. Our hybrid approach combines techniques from discrete optimization, unsupervised clustering, and latent text embeddings. Experimental results on well-known movie and book recommendation benchmarks demonstrate the benefits of optimized item selection and efficient exploration.
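
The tension between item diversity and set size admits a compact illustration. In the sketch below, a greedy max-coverage heuristic over item topics stands in for the multi-level discrete optimization used in the paper (names and data are toy values):

```python
# Greedy selection of a small item universe that still covers diverse topics.
def greedy_item_selection(item_topics: dict, budget: int):
    selected, covered = [], set()
    remaining = dict(item_topics)
    for _ in range(min(budget, len(item_topics))):
        best = max(remaining, key=lambda i: len(remaining[i] - covered))
        if not remaining[best] - covered:
            break  # no remaining item adds new coverage
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered

items = {
    "movie_a": {"sci-fi", "action"},
    "movie_b": {"romance"},
    "movie_c": {"sci-fi"},
    "movie_d": {"documentary", "history"},
}
# Selects movie_a and movie_d: four topics covered with only two items.
print(greedy_item_selection(items, budget=2))
```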


Building Higher-Order Abstractions from the Components of Recommender Systems

March 2024 · 7 Reads · 2 Citations · Proceedings of the AAAI Conference on Artificial Intelligence

We present a modular recommender system framework that tightly integrates yet maintains the independence of individual components, thus satisfying two of the most critical aspects of industrial applications, generality and specificity. On the one hand, we ensure that each component remains self-contained and is ready to serve in other applications beyond recommender systems. On the other hand, when these components are combined, a unified theme emerges for recommender systems. We present the details of each component in the context of recommender systems and other applications. We release each component as an open-source library, and most importantly, we release their integration under MAB2REC, an industry-strength open-source software for building bandit-based recommender systems. By bringing standalone components together, Mab2Rec realizes a powerful and scalable toolchain to build and deploy business-relevant personalization applications. Finally, we share our experience and best practices for user training, adoption, performance evaluation, deployment, and model governance within the enterprise and the broader community.
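
A sketch of the bandit core that Mab2Rec builds on, using the open-source MABWISER component named in this document (toy arms and feedback; standard MABWISER usage rather than the full Mab2Rec pipeline):

```python
# Bandit-based recommendation core with MABWISER (pip install mabwiser).
from mabwiser.mab import MAB, LearningPolicy

arms = ["item_1", "item_2", "item_3"]                           # candidate items
decisions = ["item_1", "item_1", "item_2", "item_3", "item_2"]  # items shown
rewards = [1, 0, 1, 0, 1]                                       # user feedback

mab = MAB(arms=arms, learning_policy=LearningPolicy.EpsilonGreedy(epsilon=0.1))
mab.fit(decisions=decisions, rewards=rewards)

print(mab.predict())               # next item to recommend
print(mab.predict_expectations())  # expected reward per item
```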


Figure 3. Simple example rule that yields a complex decision tree. The optimal rule for this dataset can be stated as AtLeast3(f0, …, f4), yet the decision tree trained on this dataset has 19 split nodes and is not easily interpretable.
Figure 8. Evolution of the objective function (in this case, the accuracy) in a short example run of the native local solver with non-local moves on the Breast Cancer dataset with max_complexity = 10. The settings for the solver are num_starts = 3 and num_iterations = 100, and the temperatures follow a geometric schedule from 0.2 to 10⁻⁴. The first iteration of each start is indicated by a vertical solid red line. The first num_iterations_burn_in = 50 iterations of each start are defined as the burn-in period (shaded in blue), in which no non-local moves are proposed. After that, non-local moves are proposed when there is no improvement in the accuracy for patience = 10 iterations. The proposals of non-local moves use a subset of the samples of size max_samples = 100 and are indicated by vertical dashed black lines.
Explainable Artificial Intelligence Using Expressive Boolean Formulas

November 2023 · 18 Reads · 5 Citations · Machine Learning and Knowledge Extraction

We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability) according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule- and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special-purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over the subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations. Therefore, using specialized or quantum hardware could lead to a significant speedup through the rapid proposal of non-local moves.
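
The expressivity gain over plain trees can be seen with the AtLeast operator from Figure 3: a single operator captures a rule whose decision-tree equivalent needs many splits. A self-contained sketch on toy data (the local-search trainer and ILP/QUBO subsolvers are not shown):

```python
# Evaluating an expressive Boolean rule, AtLeast3(f0, ..., f4), on toy data.
import numpy as np

def at_least(k: int, features: np.ndarray) -> np.ndarray:
    """AtLeast-k operator: true when at least k Boolean features are true."""
    return features.sum(axis=1) >= k

X = np.array([  # 6 samples, 5 Boolean features f0..f4
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
])
y = np.array([1, 0, 1, 0, 1, 0])  # labels generated by the hidden rule

print("accuracy:", np.mean(at_least(3, X) == y))  # 1.0 with a single operator
```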


Surrogate Membership for Inferred Metrics in Fairness Evaluation

October 2023 · 10 Reads · 3 Citations · Lecture Notes in Computer Science

As artificial intelligence becomes more embedded into daily activities, it is imperative to ensure models perform well for all subgroups. This is particularly important when models include underprivileged populations. Binary fairness metrics, which compare model performance for protected groups to the rest of the model population, are an important way to guard against unwanted bias. However, a significant drawback of these binary fairness metrics is that they require protected group membership attributes. In many practical scenarios, protected status for individuals is sparse, unavailable, or even illegal to collect. This paper extends binary fairness metrics from deterministic membership attributes to their surrogate counterpart under the probabilistic setting. We show that it is possible to conduct binary fairness evaluation when exact protected attributes are not immediately available but their surrogate as likelihoods is accessible. Our inferred metrics calculated from surrogates are proved to be valid under standard statistical assumptions. Moreover, we do not require the surrogate variable to be strongly related to protected class membership; inferred metrics remain valid even when membership in the protected and unprotected groups is equally likely for many levels of the surrogate variable. Finally, we demonstrate the effectiveness of our approach using publicly available data from the Home Mortgage Disclosure Act and simulated benchmarks that mimic real-world conditions under different levels of model disparity.


Integrating Optimized Item Selection with Active Learning for Continuous Exploration in Recommender Systems

August 2023 · 48 Reads

Recommender Systems have become the backbone of personalized services that provide tailored experiences to individual users, yet designing new recommendation applications with limited or no available training data remains a challenge. To address this issue, we focus on selecting the universe of items for experimentation in recommender systems by leveraging a recently introduced combinatorial problem. On the one hand, selecting a large set of items is desirable to increase the diversity of items. On the other hand, a smaller set of items enables rapid experimentation and minimizes the time and the amount of data required to train machine learning models. We first present how to optimize for such conflicting criteria using a multi-level optimization framework. Then, we shift our focus to the operational setting of a recommender system. In practice, to work effectively in a dynamic environment where new items are introduced to the system, we need to explore users' behaviors and interests continuously. To that end, we show how to integrate the item selection approach with active learning to guide randomized exploration in an ongoing fashion. Our hybrid approach combines techniques from discrete optimization, unsupervised clustering, and latent text embeddings. Experimental results on well-known movie and book recommendation benchmarks demonstrate the benefits of optimized item selection and efficient exploration.


Figure 1: The proposed framework
Holy Grail 2.0: From Natural Language to Constraint Models

August 2023 · 160 Reads

Twenty-seven years ago, E. Freuder highlighted that "Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it". Nowadays, CP users have great modeling tools available (like MiniZinc and CPMpy), allowing them to formulate the problem and then let a solver do the rest of the job, getting closer to the stated goal. However, this still requires the CP user to know the formalism and respect it. Another significant challenge lies in the expertise required to effectively model combinatorial problems. All this limits the wider adoption of CP. In this position paper, we investigate a possible approach to leverage pre-trained Large Language Models to extract models from textual problem descriptions. More specifically, we take inspiration from the Natural Language Processing for Optimization (NL4OPT) challenge and present early results with a decomposition-based prompting approach to GPT models.
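
As a concrete target for such a pipeline, the sketch below shows the kind of CPMpy model an LLM could be prompted to emit from a short textual description. The toy problem is hypothetical, and the decomposition-based prompting itself is not reproduced here.

```python
# Target representation: a CPMpy model for "choose quantities of two products
# to maximize profit, subject to a labor-hours limit" (pip install cpmpy).
import cpmpy as cp

qty = cp.intvar(0, 100, shape=2, name="qty")      # decision variables
profit = 3 * qty[0] + 5 * qty[1]                  # objective

model = cp.Model(2 * qty[0] + 4 * qty[1] <= 120)  # labor-hours constraint
model.maximize(profit)

if model.solve():
    print("qty:", qty.value(), "profit:", profit.value())
```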


Explainable AI using expressive Boolean formulas

June 2023 · 8 Reads

We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability), according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule-based and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations, and therefore using specialized or quantum hardware could lead to a speedup by fast proposal of non-local moves.


Citations (29)


... In response to the pressing need for standardized and transparent evaluation pipelines, numerous frameworks [3,8,14,15,20,21,25,28,30,32,39,40] have been proposed and adopted by the community, aiming to provide a systematic and rigorous approach to the evaluation of recommender systems and facilitating reproducibility, comparability, and advancement within the field. However, the majority of these frameworks have been conceived as standalone tools aiming to address the entire recommendation pipeline, yet they often fail to comprehensively tackle every aspect of the multifaceted recommendation process. ...

Reference:

DataRec: A Framework for Standardizing Recommendation Data Processing and Analysis
Building Higher-Order Abstractions from the Components of Recommender Systems
  • Citing Article
  • March 2024

Proceedings of the AAAI Conference on Artificial Intelligence

... Black box (BB) models are prevalent in [...]. To address these challenges, researchers have proposed diverse methods. Visual rule discovery [16] seeks to extract understandable rules from complex models, while approaches like Boolean formulas [17] aim to provide explicit, logical explanations. Methods exploring the impact of input variables on predictions [18] and quantifying attributions and interactions using game theory [19] further contribute to enhancing model transparency. ...

Explainable Artificial Intelligence Using Expressive Boolean Formulas

Machine Learning and Knowledge Extraction

... Beyond recommenders, we utilized JURITY in the fairness evaluation of propensity models, a well-known industry application for lead generation and marketing campaign optimization. Specifically, we studied two practical scenarios that are often neglected in the academic literature: when the protected membership attribute of individuals is not available, and when the ground truth label is not available (Thielbar et al. 2023). ...

Surrogate Membership for Inferred Metrics in Fairness Evaluation
  • Citing Chapter
  • October 2023

Lecture Notes in Computer Science

... 1. Feature selection and generation: SELECTIVE, SEQ2PAT (Kadıoğlu et al. 2023; Ghosh et al. 2022), and TEXTWISER (Kilitçioğlu and Kadıoğlu 2021) 2. Recommendation models: MABWISER (Strong, Kleynhans, and Kadıoğlu 2019, 2021; Kilitçioğlu and Kadıoğlu 2022) 3. Performance and fairness evaluation: JURITY (Michalský and Kadıoğlu 2021; Cheng, Kilitçioğlu, and Kadıoğlu 2022) 4. Integration framework: MAB2REC (this paper) ...

Seq2Pat: Sequence‐to‐pattern generation to bridge pattern mining with machine learning

... To answer Q4 and evaluate the sensitivity of ISP with respect to the underlying item embeddings, we solve P_max_cover@t on the books dataset with a fixed t = 100 and q = 0.1. We experiment with several complementary embeddings using TextWiser [38]. Besides our baseline TF-IDF, we employ FastText and Word2Vec to learn word vectors [10, 40], GloVe [41] embeddings to learn global word representations, and Byte-Pair [42] embeddings to learn character-level information. ...

Representing the Unification of Text Featurization using a Context-Free Grammar
  • Citing Article
  • May 2021

Proceedings of the AAAI Conference on Artificial Intelligence

... 1. Feature selection and generation: SELECTIVE, SEQ2PAT (Kadıoğlu et al. 2023; Ghosh et al. 2022), and TEXTWISER (Kilitçioğlu and Kadıoğlu 2021) 2. Recommendation models: MABWISER (Strong, Kleynhans, and Kadıoğlu 2019, 2021; Kilitçioğlu and Kadıoğlu 2022) 3. Performance and fairness evaluation: JURITY (Michalský and Kadıoğlu 2021; Cheng, Kilitçioğlu, and Kadıoğlu 2022) 4. Integration framework: MAB2REC (this paper) ...

Dichotomic Pattern Mining Integrated With Constraint Reasoning for Digital Behavior Analysis

Frontiers in Artificial Intelligence

... The two-stage template mining algorithm allows for scalable and generalizable analysis of large-scale datasets of varying lengths and diverse event sequential patterns. The first stage, frequent sequential pattern mining [67], provides a comprehensive dataset overview and avoids generating unwieldy templates that can be challenging for experts to interpret and define. It also allows users to flexibly add self-defined constraints on template compositions to accommodate their needs [81,83]. ...

Seq2Pat: Sequence-to-Pattern Generation for Constraint-Based Sequential Pattern Mining
  • Citing Article
  • June 2022

Proceedings of the AAAI Conference on Artificial Intelligence

... Let us finish our treatment of the warm-start procedure with a concrete example using a specific recommendation model. Contextual multi-armed bandits (CMAB) are commonly used in recommendation settings thanks to their principled approach to balancing the exploration-exploitation trade-off [25][26][27] with regret guarantees. ...

MABWISER: Parallelizable Contextual Multi-armed Bandits
  • Citing Article
  • June 2021

International Journal of Artificial Intelligence Tools

... We recently introduced a multi-level optimization approach to select items to be included in the initial randomized experimentation at the inception of a recommender system on Day-0 [8]. Our selection procedure is designed to maximize knowledge transfer between user responses and minimize the time-to-market for personalization. ...

Optimized Item Selection to Boost Exploration for Recommender Systems
  • Citing Chapter
  • June 2021

Lecture Notes in Computer Science

... The area of machine learning has made great strides in recent years, especially with regard to deep learning models, which perform remarkably well in a variety of applications, from natural language processing to image recognition [1]. However, these models often operate as black boxes that are difficult to understand and interpret [2], which makes them unsuitable for use in sensitive industries like finance [3] and healthcare [4]. Moreover, uncertainty in predictions is not naturally taken into account by the deterministic structure of conventional deep learning models [5], which is important for making well-informed decisions in practical situations. ...

Modeling uncertainty to improve personalized recommendations via Bayesian deep learning

International Journal of Data Science and Analytics