Furui Liu's research while affiliated with Huawei Technologies and other places

Publications (17)

Preprint
In the presence of unmeasured confounders, we address the problem of treatment effect estimation from data fusion, that is, multiple datasets collected under different treatment assignment mechanisms. For example, marketers may assign different advertising strategies to the same products at different times/places. To handle the bias induced by unme...
Preprint
Deriving a good variable selection strategy in branch-and-bound is essential for the efficiency of modern mixed-integer programming (MIP) solvers. With MIP branching data collected during the previous solution process, learning to branch methods have recently become superior over heuristics. As branch-and-bound is naturally a sequential decision ma...
Preprint
Full-text available
Collaborative multi-agent reinforcement learning (MARL) has been widely used in many practical applications, where each agent makes a decision based on its own observation. Most mainstream methods treat each local observation as an entirety when modeling the decentralized local utility functions. However, they ignore the fact that local observation...
Preprint
There is evidence that representation learning can improve a model's performance over multiple downstream tasks in many real-world scenarios, such as image classification and recommender systems. Existing learning approaches rely on establishing the correlation (or its proxy) between features and the downstream task (labels), which typically results in...
Preprint
Debiased recommendation has recently attracted increasing attention from both industry and academic communities. Traditional models mostly rely on the inverse propensity score (IPS), which can be hard to estimate and may suffer from the high variance issue. To alleviate these problems, in this paper, we propose a novel debiased recommendation frame...
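For context, the inverse propensity score (IPS) estimator that this abstract contrasts against can be sketched as follows. This is the generic textbook form, not the paper's proposed framework, and all names are illustrative.

```python
import numpy as np

def ips_estimate(rewards, propensities):
    """Classic IPS estimator of a policy's value from logged feedback.

    Each observed reward is reweighted by 1 / propensity of the logged
    action. The estimate is unbiased, but small propensities inflate the
    weights, which is the high-variance issue the abstract refers to.
    """
    return np.mean(rewards / propensities)

# Toy log: four interactions with known logging propensities.
rewards = np.array([1.0, 0.0, 1.0, 1.0])
propensities = np.array([0.5, 0.8, 0.25, 0.5])
print(ips_estimate(rewards, propensities))  # → 2.0
```

Note how the third interaction, logged with propensity 0.25, contributes four times its raw reward; misestimated or tiny propensities amplify this effect, motivating debiasing frameworks that avoid IPS altogether.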
Preprint
Full-text available
Recent studies have shown that introducing communication between agents can significantly improve overall performance in cooperative Multi-agent reinforcement learning (MARL). In many real-world scenarios, communication can be expensive and the bandwidth of the multi-agent system is subject to certain constraints. Redundant messages that occupy the...
Article
Cutting plane methods play a significant role in modern solvers for tackling mixed-integer programming (MIP) problems. Proper selection of cuts would remove infeasible solutions in the early stage, thus largely reducing the computational burden without hurting the solution accuracy. However, the major cut selection approaches heavily rely on heuris...
Preprint
Domain generalization aims to learn knowledge invariant across different distributions while semantically meaningful for downstream tasks from multiple source domains, to improve the model's generalization ability on unseen target domains. The fundamental objective is to understand the underlying "invariance" behind these observational distribution...
Preprint
Centralized Training with Decentralized Execution (CTDE) has been a popular paradigm in cooperative Multi-Agent Reinforcement Learning (MARL) settings and is widely used in many real applications. One of the major challenges in the training process is credit assignment, which aims to deduce the contributions of each agent according to the global re...
Conference Paper
Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data. The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations. However, in real scenarios, factors with semantics are not necessaril...
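The independence assumption that this abstract revisits usually enters the VAE framework through the KL term of the objective, which pulls the latent posterior toward a factorized standard normal prior. A minimal sketch of that term (generic beta-VAE-style formulation, not this paper's method; names are illustrative):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims.

    In beta-VAE-style disentanglement this term is upweighted (beta > 1)
    to push latent factors toward independence -- exactly the assumption
    the abstract argues breaks down when semantic factors are correlated.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

# A posterior that already matches the prior incurs zero KL cost.
mu, logvar = np.zeros(4), np.zeros(4)
print(gaussian_kl(mu, logvar))  # → 0.0
```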
Preprint
Cutting plane methods play a significant role in modern solvers for tackling mixed-integer programming (MIP) problems. Proper selection of cuts would remove infeasible solutions in the early stage, thus largely reducing the computational burden without hurting the solution accuracy. However, the major cut selection approaches heavily rely on heuris...
Preprint
The capability of imagining internally with a mental model of the world is vitally important for human cognition. If a machine intelligent agent can learn a world model to create a "dream" environment, it can then internally ask what-if questions -- simulate the alternative futures that haven't been experienced in the past yet -- and make optimal d...
Preprint
This paper proposes a Disentangled gEnerative cAusal Representation (DEAR) learning method. Unlike existing disentanglement methods that enforce independence of the latent variables, we consider the general case where the underlying factors of interests can be causally correlated. We show that previous methods with independent priors fail to disent...
Preprint
Full-text available
Adversarial Training (AT) is proposed to alleviate the adversarial vulnerability of machine learning models by extracting only robust features from the input, which, however, inevitably leads to severe accuracy reduction as it discards the non-robust yet useful features. This motivates us to preserve both robust and non-robust features and separate...
Preprint
Learning disentanglement aims at finding a low dimensional representation, which consists of multiple explanatory and generative factors of the observational data. The framework of variational autoencoder is commonly used to disentangle independent factors from observations. However, in real scenarios, the factors with semantic meanings are not nec...

Citations

... To remedy this issue, ALC-MADDPG [39] and CASEC [40] use a predefined threshold to prune low-correlation relationships between agents, while G2ANet [41] and S2RL [42] introduce a sparse attention mechanism to learn the dynamics among agents. Furthermore, REFIL [13] randomly partitions entities into sparse sub-groups and forces each sub-group to learn the same objective function as all entities. ...
... Through both theoretical and empirical explorations, DRL has been shown to benefit in three respects: i) Invariance: an element of the disentangled representations is invariant to changes in external semantics [5], [6], [7], [8]; ii) Integrity: the disentangled representations are each aligned with real semantics and are capable of generating observed, undiscovered, and even counterfactual samples [9], [10], [11], [12]; and iii) Generalization: the representations are intrinsic and robust rather than capturing confounded or biased semantics, and thus generalize to downstream tasks [13], [14], [15]. ...
... For example, one can learn fast approximations of strong branching, an effective but time-consuming branching strategy commonly used in B&B [2,22,26,43]. One may also learn cutting strategies [8,21,39], node selection/pruning strategies [20,42], or decomposition strategies [38] with ML models. The role of ML models in these approaches can be summarized as approximating useful mappings or parameterizing key strategies in MILP solvers; these mappings/strategies usually take an MILP instance as input and output its key properties. ...
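The "fast approximation of strong branching" idea in the citation context above amounts to fitting a cheap surrogate scorer on features of candidate variables, then branching on the highest-scoring candidate. The sketch below uses a plain least-squares model on synthetic features; the feature set and model are illustrative assumptions, not any cited paper's method.

```python
import numpy as np

# Hypothetical per-candidate features collected at past B&B nodes
# (e.g. pseudocosts, fractionality); values here are synthetic.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
w_true = np.array([1.5, -0.5, 0.8, 0.1])
y_train = X_train @ w_true  # stand-in for expensive strong-branching scores

# Fit a cheap linear surrogate of the strong-branching score.
w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def pick_branching_variable(candidate_features):
    """Score each candidate with the surrogate and branch on the best."""
    scores = candidate_features @ w_hat
    return int(np.argmax(scores))

candidates = rng.normal(size=(10, 4))
print(pick_branching_variable(candidates))
```

At solve time the surrogate replaces the expensive score computation entirely, which is where the speedup over exact strong branching comes from.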
... Multi-agent credit assignment aims to correctly allocate the reward signal to each individual agent for better group coordination (Chang, Ho, and Kaelbling 2003). One popular class of solutions is value decomposition, which decomposes the team value function into agent-wise value functions in an online fashion under the framework of the Bellman equation (Sunehag et al. 2018; Rashid et al. 2018; Yang et al. 2020b; Li et al. 2021). Different from these works, in this paper we focus on explicitly decomposing the total reward into individual rewards in an offline fashion under a regression framework, and the decomposed rewards are then used to reconstruct the offline prioritized dataset. ...
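The online value-decomposition baseline cited here (VDN, Sunehag et al. 2018) takes the team value as the sum of per-agent utilities, so greedy individual action choices jointly maximize the team value. A toy sketch of that additive structure (utilities and names are illustrative, not from any cited paper):

```python
import numpy as np

def vdn_joint_q(per_agent_q, joint_action):
    """VDN-style decomposition: the team value is the sum of each
    agent's utility for its own action, so per-agent argmax selection
    also maximizes the joint value."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))

# Two agents, three actions each (toy per-agent utilities).
q1 = np.array([0.2, 1.0, 0.3])
q2 = np.array([0.5, 0.1, 0.9])
greedy = (int(np.argmax(q1)), int(np.argmax(q2)))
print(greedy, vdn_joint_q([q1, q2], greedy))  # → (1, 2) 1.9
```

The offline regression approach described in the excerpt instead decomposes the observed total reward directly, rather than decomposing a learned value function during online Bellman updates.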