Andrew Delong’s research while affiliated with University of Toronto and other places


Publications (14)


Local Submodularization for Binary Pairwise Energies
  • Article

November 2016 · 32 Reads · 17 Citations

IEEE Transactions on Pattern Analysis and Machine Intelligence

Yuri Boykov · [...] · Andrew Delong

Many computer vision problems require optimization of binary non-submodular energies. We propose a general optimization framework based on local submodular approximations (LSA). Unlike standard LP relaxation methods that linearize the whole energy globally, our approach iteratively approximates the energy locally. On the other hand, unlike standard local optimization methods (e.g., gradient descent or projection techniques) we use non-linear submodular approximations and optimize them without leaving the domain of integer solutions. We discuss two specific LSA algorithms based on trust region and auxiliary function principles, LSA-TR and LSA-AUX. The proposed methods obtain state-of-the-art results on a wide range of applications such as binary deconvolution, curvature regularization, inpainting, segmentation with repulsion and two types of shape priors. Finally, we discuss a move-making extension to the LSA-TR approach. While our paper is focused on pairwise energies, our ideas extend to higher-order problems. The code is available online.
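To make the term "non-submodular" concrete, here is a minimal sketch of the standard submodularity test for one pairwise term over two binary variables: the term is submodular when f(0,0) + f(1,1) ≤ f(0,1) + f(1,0). The function name and the example costs are illustrative assumptions and are not taken from the paper or its code.

```python
def is_submodular(f00, f01, f10, f11):
    """Standard test for a pairwise term over binary variables x, y.

    The four arguments are the costs f(0,0), f(0,1), f(1,0), f(1,1).
    (Hypothetical helper for illustration, not part of the authors' code.)
    """
    return f00 + f11 <= f01 + f10


# A Potts-style smoothness term (penalize disagreement) is submodular.
print(is_submodular(0, 1, 1, 0))      # True

# A term of the form alpha * x * y with alpha > 0 fails the test,
# so an energy containing it needs a method that handles such terms.
alpha = 2.0
print(is_submodular(0, 0, 0, alpha))  # False
```

Energies whose pairwise terms all pass this test can be minimized exactly with graph cuts; the abstract above concerns energies where some terms fail it.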


Figure 2. Local linearization of supermodular pairwise potential f(x, y) = α · xy for α > 0. This potential defines four costs f(0, 0) = f(0, 1) = f(1, 0) = 0 and f(1, 1) = α at four distinct configurations of binary variables x, y ∈ {0, 1}. These costs can be plotted as four 3D points A, B, C, D in (a-c). We need to approximate supermodular potential f with a linear function v · x + w · y + const (plane or unary potentials). LSA-TR: one way to derive a local linear approximation is to take the Taylor expansion of f(x, y) = α · xy over relaxed variables x, y ∈ [0, 1], see the continuous plot in (a). At first, this idea may sound strange since there are infinitely many other continuous functions that agree with A, B, C, D but have completely different derivatives, e.g. g(x, y) = α · x²√y. However, Taylor expansions of the bilinear function f(x, y) = α · xy can be motivated geometrically. As shown in (b), the Taylor-based local linear approximation of f at any fixed integer configuration (i, j) (e.g. blue plane at A, green at B, orange at C, and striped at D) coincides with the discrete pairwise potential f not only at point (i, j) but also at the two other closest integer configurations. Overall, each of those planes passes exactly through three out of four points A, B, C, D. LSA-AUX: another approach to justify a local linear approximation for a non-submodular pairwise potential f could be based on upper bounds passing through the current configuration. For example, the green or orange planes in (b) are the tightest linear upper bounds at configurations (0, 1) and (1, 0), respectively. When the current configuration is either (0, 0) or (1, 1), one can choose either the orange or the green plane in (b), or anything in between, e.g. the purple plane passing through A and D in (c).
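The geometric claim in the Figure 2 caption can be checked numerically. The sketch below builds the first-order Taylor expansion of f(x, y) = α · xy at each integer configuration (i, j) and lists the binary configurations where the resulting plane equals f. The value of `ALPHA` and all function names are illustrative assumptions; this is not the authors' LSA-TR implementation.

```python
from itertools import product

ALPHA = 2.0  # any alpha > 0 works; the value is an arbitrary choice


def f(x, y, alpha=ALPHA):
    """Supermodular pairwise potential f(x, y) = alpha * x * y."""
    return alpha * x * y


def taylor_plane(i, j, alpha=ALPHA):
    """Coefficients (v, w, c) of the plane v*x + w*y + c obtained from the
    first-order Taylor expansion of f at (i, j).

    The gradient of alpha*x*y at (i, j) is (alpha*j, alpha*i).
    """
    v = alpha * j
    w = alpha * i
    c = f(i, j, alpha) - v * i - w * j
    return v, w, c


def agreement(i, j):
    """Binary configurations where the Taylor plane at (i, j) equals f."""
    v, w, c = taylor_plane(i, j)
    return [(x, y) for x, y in product((0, 1), repeat=2)
            if v * x + w * y + c == f(x, y)]


for i, j in product((0, 1), repeat=2):
    print((i, j), "plane:", taylor_plane(i, j), "agrees at:", agreement(i, j))
```

Each plane matches f at its expansion point and at the two neighboring configurations, missing only the diagonally opposite one, which is the "three out of four points" property described in the caption.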
Figure 3. Binary deconvolution of an image created with a uniform 3 × 3 filter and additive Gaussian noise (σ ∈ {0.05, 0.1, 0.15, 0.2}). No length regularization was used. We report mean energy (±2 std.) and time as a function of noise level σ. TRWS and LBP are run for 5000 iterations.
Submodularization for Binary Pairwise Energies
  • Conference Paper

November 2014 · 326 Reads · 38 Citations

Many computer vision problems require optimization of binary non-submodular energies. We propose a general optimization framework based on local submodular approximations (LSA). Unlike standard LP relaxation methods that linearize the whole energy globally, our approach iteratively approximates the energies locally. On the other hand, unlike standard local optimization methods (e.g. gradient descent or projection techniques) we use non-linear submodular approximations and optimize them without leaving the domain of integer solutions. We discuss two specific LSA algorithms based on trust region and auxiliary function principles, LSA-TR and LSA-AUX. These methods obtain state-of-the-art results on a wide range of applications outperforming many standard techniques such as LBP, QPBO, and TRWS. While our paper is focused on pairwise energies, our ideas extend to higher-order problems. The code is available online.


Segmentation with Non-linear Regional Constraints via Line-Search Cuts

October 2012 · 23 Reads · 22 Citations

Lecture Notes in Computer Science

This paper is concerned with energy-based image segmentation problems. We introduce a general class of regional functionals defined as an arbitrary non-linear combination of regional unary terms. Such (high-order) functionals are very useful in vision and medical applications and some special cases appear in prior art. For example, our general class of functionals includes but is not restricted to soft constraints on segment volume, its appearance histogram, or shape. Our overall segmentation energy combines regional functionals with standard length-based regularizers and/or other submodular terms. In general, regional functionals make the corresponding energy minimization NP-hard. We propose a new greedy algorithm based on iterative line search. A parametric max-flow technique efficiently explores all solutions along the direction (line) of the steepest descent of the energy. We compute the best “step size”, i.e. the globally optimal solution along the line. This algorithm can make large moves escaping weak local minima, as demonstrated on many real images.


Fast Fusion Moves for Multi-model Estimation

October 2012 · 29 Reads · 10 Citations

Lecture Notes in Computer Science

We develop a fast, effective algorithm for minimizing a well-known objective function for robust multi-model estimation. Our work introduces a combinatorial step belonging to a family of powerful move-making methods like α-expansion and fusion. We also show that our subproblem can be quickly transformed into a comparatively small instance of minimum-weighted vertex-cover. In practice, these vertex-cover subproblems are almost always bipartite and can be solved exactly by specialized network flow algorithms. Experiments indicate that our approach achieves the robustness of methods like affinity propagation, whilst providing the speed of fast greedy heuristics.


Minimizing Energies with Hierarchical Costs

October 2012 · 27 Reads · 45 Citations

International Journal of Computer Vision

Computer vision is full of problems elegantly expressed in terms of energy minimization. We characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm. Hierarchical costs are natural for modeling an array of difficult problems. For example, in semantic segmentation one could rule out unlikely object combinations via hierarchical context. In geometric model estimation, one could penalize the number of unique model families in a solution, not just the number of models—a kind of hierarchical MDL criterion. Hierarchical fusion uses the well-known α-expansion algorithm as a subroutine, and offers a much better approximation bound in important cases.


Minimizing sparse high-order energies by submodular vertex-cover

January 2012 · 5 Reads · 15 Citations

Advances in Neural Information Processing Systems

Inference in high-order graphical models has become important in recent years. Several approaches are based, for example, on generalized message-passing, or on transformation to a pairwise model with extra 'auxiliary' variables. We focus on a special case where a much more efficient transformation is possible. Instead of adding variables, we transform the original problem into a comparatively small instance of submodular vertex-cover. These vertex-cover instances can then be attacked by existing algorithms (e.g. belief propagation, QPBO), where they often run 4-15 times faster and find better solutions than when applied to the original problem. We evaluate our approach on synthetic data, then we show applications within a fast hierarchical clustering and model-fitting framework.


Fig. 1 Motion segmentation on the 1RT2RCR sequence (Tron and Vidal 2007). Energy (1) finds 3 dominant motions (a) but labels many points incorrectly. Energy (2) gives coherent segmentations (b) but finds redundant motions. Our energy combines the best of both (c) 
Fig. 10 Each row shows how GMM algorithms behave on a particular example. This table is for illustrative purposes, and is not meant to be a state-of-the-art comparison. (a) If models do not overlap then all algorithms work. (b) Most algorithms can handle uniform outliers by fitting an extra model. (c) EM finds overlapping models thanks to soft assignment; hard assignment has bias towards isolated models. (d) Basic EM (19) may easily get stuck in local minima with only a little more ambiguity in the data. But, EM with sparsity prior (20) can avoid 
Fig. 16 Comparing mean-shift results (a), (c) versus optimization of energy (38) using UFL heuristics (b), (d) 
Fast Approximate Energy Minimization with Label Costs

January 2012 · 397 Reads · 389 Citations

International Journal of Computer Vision

The α-expansion algorithm has had a significant impact in computer vision due to its generality, effectiveness, and speed. It is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main algorithmic contribution is an extension of α-expansion that also optimizes “label costs” with well-characterized optimality bounds. Label costs penalize a solution based on the set of labels that appear in it, for example by simply penalizing the number of labels in the solution. Our energy has a natural interpretation as minimizing description length (MDL) and sheds light on classical algorithms like K-means and expectation-maximization (EM). Label costs are useful for multi-model fitting and we demonstrate several such applications: homography detection, motion segmentation, image segmentation, and compression. Our C++ and MATLAB code is publicly available at http://vision.csd.uwo.ca/code/
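As a rough illustration of what an energy with label costs looks like, the sketch below evaluates a labeling under three terms: per-point unary costs, a Potts pairwise penalty on neighboring points with different labels, and a cost charged once for each label that appears in the solution. The uniform per-label cost, the Potts pairwise term, the function name, and the toy data are all simplifying assumptions for illustration; this only evaluates the energy and is not the paper's α-expansion extension or its released code.

```python
def energy_with_label_costs(labels, unary, edges, smooth_weight, label_cost):
    """Evaluate a labeling under unary + Potts pairwise + per-label costs.

    labels        : list with one label index per point
    unary         : unary[p][l] is the cost of giving point p label l
    edges         : list of (p, q) neighbor pairs
    smooth_weight : Potts penalty for each edge whose endpoints differ
    label_cost    : cost charged once per distinct label used (assumed uniform)
    """
    data = sum(unary[p][l] for p, l in enumerate(labels))
    smooth = smooth_weight * sum(labels[p] != labels[q] for p, q in edges)
    used = len(set(labels))
    return data + smooth + label_cost * used


# Toy problem: four points on a chain, two candidate labels.
unary = [[0, 2], [0, 2], [1, 0], [1, 0]]
edges = [(0, 1), (1, 2), (2, 3)]

two_models = [0, 0, 1, 1]
one_model = [0, 0, 0, 0]

# Without label costs the two-label solution is cheaper.
print(energy_with_label_costs(two_models, unary, edges, 1, 0))  # 1
print(energy_with_label_costs(one_model, unary, edges, 1, 0))   # 2

# With a label cost of 3 the single-label solution wins.
print(energy_with_label_costs(two_models, unary, edges, 1, 3))  # 7
print(energy_with_label_costs(one_model, unary, edges, 1, 3))   # 5
```

The toy numbers show the trade-off the abstract describes: a label cost makes a slightly worse data fit acceptable when it lets the solution use fewer labels.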



Recursive MDL via graph cuts: Application to segmentation

November 2011 · 70 Reads · 8 Citations

Proceedings / IEEE International Conference on Computer Vision

We propose a novel patch-based image representation that is useful because it (1) inherently detects regions with repetitive structure at multiple scales and (2) yields a parameterless hierarchical segmentation. We describe an image by breaking it into coherent regions where each region is well-described (easily reconstructed) by repeatedly instantiating a patch using a set of simple transformations. In other words, a good segment is one that has sufficient repetition of some pattern, and a patch is useful if it contains a pattern that is repeated in the image.


Interactive Segmentation with Super-Labels

July 2011 · 179 Reads · 29 Citations

Lecture Notes in Computer Science

In interactive segmentation, the most common way to model object appearance is by GMM or histogram, while MRFs are used to encourage spatial coherence among the object labels. This makes the strong assumption that pixels within each object are i.i.d. when in fact most objects have multiple distinct appearances and exhibit strong spatial correlation among their pixels. At the very least, this calls for an MRF-based appearance model within each object itself and yet, to the best of our knowledge, such a “two-level MRF” has never been proposed. We propose a novel segmentation energy that can model complex appearance. We represent the appearance of each object by a set of distinct spatially coherent models. This results in a two-level MRF with “super-labels” at the top level that are partitioned into “sub-labels” at the bottom. We introduce the hierarchical Potts (hPotts) prior to govern spatial coherence within each level. Finally, we introduce a novel algorithm with EM-style alternation of proposal, α-expansion and re-estimation steps. Our experiments demonstrate the conceptual and qualitative improvement that a two-level MRF can provide. We show applications in binary segmentation, multi-class segmentation, and interactive co-segmentation. Finally, our energy and algorithm have interesting interpretations in terms of semi-supervised learning.


Citations (14)


... Graph-based algorithms represent other class of segmentation (Ayed et al., 2010;Gorelick et al., 2012). These algorithms are used for computing the accurate global minima without level set representation. ...

Reference:

Ray-based segmentation algorithm for medical imaging
Segmentation with Non-linear Regional Constraints via Line-Search Cuts
  • Citing Book
  • January 2012

... Receiver operating characteristics (ROC) curve and the area under ROC (AUC) were reported for both STO accuracy and RISI accuracy using four methods, and DeLong's test ( Delong et al., 2012) with the "pROC" R package ( Robin et al., 2011) was used to determine the significance of difference in AUC between all three deep learning methods and ALOHA. ...

Fast Approximate Energy Minimization with Label Costs

International Journal of Computer Vision

... Many semantic segmentation or deep learning algorithms, such as PointNet and RBM (restricted Boltzmann machine), are based on energy functions. Through energy clustering with the given threshold setting, objects with different semantic information are assigned to various categories [28]. ...

Local Submodularization for Binary Pairwise Energies
  • Citing Article
  • November 2016

IEEE Transactions on Pattern Analysis and Machine Intelligence

... Note that the constraints (i.e., k-system) considered in previous studies may be so general that the proposed algorithms may not perform well under some important subclasses of these constraints. In this paper, we consider the (not necessarily monotone) submodular maximization problem under the intersection of k-matroid and m-knapsack constraints, which arises in numerous applications, e.g., vertex cover (Delong et al. 2012), weighted max-cut (Feldman, Harshaw, and Karbasi 2017;Haba et al. 2020), video summarization (Gygli, Grabner, and Gool 2015;Feldman, Karbasi, and Kazemi 2018), image summarization and revenue maximization (Mirzasoleiman, Badanidiyuru, and Karbasi 2016). We propose a Simultaneous and Partial enumeRation cOnstrained sUbmodular maximizaTion algorithm, called Algorithm Approximation ...

Minimizing sparse high-order energies by submodular vertex-cover
  • Citing Article
  • January 2012

Advances in Neural Information Processing Systems

... This leads to the optimization of a cost function composed by two terms, mimicking the classical MAP-MRF objectives: a modelling error (customers-facility distances) which can be interpreted as a likelihood term, and a penalty term (the cost to open the facilities) that encodes model complexity. Some authors solves it with ILP [32][33][34][35], while others propose different combinatorial optimization techniques [23,[36][37][38]. Although Set Cover and Facility Location are related (the first can be rephrased as a special case of the second) and ILP has been used to solve both, ILP-RansaCov differs from previous work based on Facility Location in many respects. ...

Fast Fusion Moves for Multi-model Estimation
  • Citing Conference Paper
  • October 2012

Lecture Notes in Computer Science

... The energy minimization of (13) is an integer quadratic optimization problem. However, because E H is a nonsubmodular energy function [70]- [72], the minimization problem (13) can not be solved by the traditional graph cuts (such as min-cut/max-flow algorithm [73]), which are designed for minimizing submodular energy functions. To solve (13), the quadratic pseudo-Boolean optimization (QPBO) [72], [74] and the local submodular approximation (LSA) based method [70] can be used, which iteratively uses nonlinear submodular approximations and optimizes them without leaving the domain of integer solutions. ...

Submodularization for Binary Pairwise Energies

... Dense monocular motion segmentation methods can generally be divided into two groups: (1) Intensity based methods [8]- [10], [13], [15], [16] and (2) deep learning based methods [2]- [5], [7], [17]- [21]. Besides these two groups of methods which aim to generate a dense segmentation mask, there are also other motion segmentation methods that perform motion segmentation by clustering pre-computed point trajectories into different motion groups [22]- [30]. These methods do not belong to dense motion segmentation, but since they are relevant to our proposed method, we will briefly introduce them too. ...

Fast Approximate Energy Minimization with Label Costs

International Journal of Computer Vision