Yuri Boykov’s research while affiliated with University of Waterloo and other places


Publications (126)


Sparse Non-Local CRF With Applications
  • Article
  • October 2024 · 3 Reads
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Olga Veksler · Yuri Boykov

CRFs model spatial coherence in classical and deep-learning computer vision. The most common CRF is pairwise, as it connects pairs of pixels. There are two types of pairwise CRF: sparse and dense. A sparse CRF connects only nearby pixels, giving a number of connections linear in the image size. A dense CRF connects all pixel pairs, giving a quadratic number of connections. While the dense CRF is a more general model, it is much less efficient than the sparse CRF; in practice, only the Gaussian-edge dense CRF is used, and even then with approximations. We propose a new pairwise CRF, which we call the sparse non-local CRF. Like the dense CRF, it has non-local connections and is therefore more general than the sparse CRF. Like the sparse CRF, its number of connections is linear, so our model is efficient. Besides efficiency, another advantage is that our edge weights are unrestricted. We show that our sparse non-local CRF models properties similar to those of the Gaussian dense CRF. We also discuss connections to other CRF models. We demonstrate the usefulness of our model on classical and deep-learning applications, for both two and multiple labels.
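The sparse-versus-dense connectivity gap the abstract describes is easy to make concrete. Below is an illustrative sketch (not the paper's model): a standard 4-connected sparse CRF on a grid, with a generic pairwise Potts energy standing in for the pairwise term, and a count of how many edges a dense CRF would need on the same image.

```python
def sparse_grid_edges(h, w):
    """4-connected sparse CRF on an h*w grid: O(n) edges."""
    idx = lambda y, x: y * w + x
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((idx(y, x), idx(y, x + 1)))  # horizontal neighbor
            if y + 1 < h:
                edges.append((idx(y, x), idx(y + 1, x)))  # vertical neighbor
    return edges

def potts_energy(labels, edges, weights):
    """Generic pairwise Potts energy: pay w(p,q) wherever neighbors disagree."""
    return sum(w for (p, q), w in zip(edges, weights) if labels[p] != labels[q])

h = w = 64
n = h * w
sparse_edges = sparse_grid_edges(h, w)     # 2*h*w - h - w edges, linear in n
dense_edge_count = n * (n - 1) // 2        # a dense CRF connects every pair
```

For a 64x64 image the sparse model has 8,064 edges while a dense CRF would have over 8.3 million, which is why the abstract stresses that linear connectivity keeps the model efficient.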


Collision Cross-entropy and EM Algorithm for Self-labeled Classification

  • March 2023 · 8 Reads

We propose "collision cross-entropy" as a robust alternative to Shannon's cross-entropy in the context of self-labeled classification with posterior models. Assuming unlabeled data, self-labeling works by estimating latent pseudo-labels, categorical distributions y, that optimize some discriminative clustering criteria, e.g. "decisiveness" and "fairness". All existing self-labeled losses incorporate a Shannon cross-entropy term targeting the model prediction, softmax, at the estimated distribution y; in effect, the softmax is trained to mimic the uncertainty in y exactly. Instead, we propose the negative log-likelihood of "collision", which maximizes the probability of equality between two random variables represented by the distributions softmax and y. We show that our loss satisfies some properties of a generalized cross-entropy. Interestingly, it agrees with Shannon's cross-entropy for one-hot pseudo-labels y, but the training signal weakens for softer labels. For example, if y is the uniform distribution at some data point, that point contributes nothing to the training. Our self-labeling loss, combining collision cross-entropy with basic clustering criteria, is convex w.r.t. the pseudo-labels, but non-trivial to optimize over the probability simplex. We derive a practical EM algorithm that optimizes the pseudo-labels y significantly faster than generic methods, e.g. projected gradient descent. Collision cross-entropy consistently improves the results on multiple self-labeled clustering examples using different DNNs.
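The two properties the abstract highlights can be checked numerically. The sketch below assumes the definition implied by the abstract: collision cross-entropy is the negative log-probability that two independent draws from the distributions y and softmax coincide, i.e. the negative log of their inner product.

```python
import numpy as np

def shannon_ce(y, s):
    """Standard Shannon cross-entropy of prediction s against pseudo-label y."""
    return -np.sum(y * np.log(s))

def collision_ce(y, s):
    """Negative log-probability that draws from y and s are equal ("collide")."""
    return -np.log(np.sum(y * s))

s = np.array([0.7, 0.2, 0.1])          # a softmax prediction
one_hot = np.array([1.0, 0.0, 0.0])    # hard pseudo-label: losses agree
uniform = np.ones(3) / 3               # uniform pseudo-label
```

For a one-hot y both losses reduce to -log(0.7). For a uniform y, collision_ce equals log(3) no matter what the softmax predicts, so that point exerts no pull on the model, matching the abstract's claim that maximally uncertain pseudo-labels contribute nothing to training.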


Revisiting Discriminative Entropy Clustering and its relation to K-means

  • January 2023 · 3 Reads

Maximization of mutual information between the model's input and output is formally related to "decisiveness" and "fairness" of the softmax predictions, motivating such unsupervised entropy-based losses for discriminative neural networks. Recent self-labeling methods based on such losses represent the state of the art in deep clustering. However, some important properties of entropy clustering are not well-known, or even misunderstood. For example, we provide a counterexample to prior claims about equivalence to variance clustering (K-means) and point out technical mistakes in such theories. We discuss the fundamental differences between these discriminative and generative clustering approaches. Moreover, we show the susceptibility of standard entropy clustering to narrow margins and motivate an explicit margin maximization term. We also propose an improved self-labeling loss; it is robust to pseudo-labeling errors and enforces stronger fairness. We develop an EM algorithm for our loss that is significantly faster than the standard alternatives. Our results improve the state-of-the-art on standard benchmarks.
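The mutual-information objective behind "decisiveness" and "fairness" can be sketched in a few lines. This is an illustrative decomposition (MI between the input index and the predicted class equals the entropy of the average prediction minus the average entropy of individual predictions), not the paper's improved loss.

```python
import numpy as np

def entropy(p, axis=-1):
    # Shannon entropy with a small epsilon for numerical stability
    return -np.sum(p * np.log(p + 1e-12), axis=axis)

def entropy_clustering_loss(softmax_outputs):
    """Negative mutual information between data points and predicted clusters.

    decisiveness: average entropy of individual predictions (want it low)
    fairness:     entropy of the average prediction (want it high, i.e.
                  balanced cluster sizes)
    """
    decisiveness = entropy(softmax_outputs).mean()
    fairness = entropy(softmax_outputs.mean(axis=0))
    return decisiveness - fairness  # minimizing this maximizes MI

confident = np.array([[1.0, 0.0], [0.0, 1.0]])  # decisive and fair
undecided = np.full((2, 2), 0.5)                # zero mutual information
```

Confident, balanced predictions drive the loss toward its minimum of -log K, while uniformly undecided predictions give zero mutual information, which is why unsupervised training with this loss pushes the network toward decisive yet balanced clusterings.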


Robust Trust Region for Weakly Supervised Segmentation

  • April 2021 · 19 Reads

Acquiring training data for standard semantic segmentation is expensive when every pixel must be labeled. Yet current methods deteriorate significantly in weakly supervised settings, e.g. where only a fraction of pixels is labeled or only image-level tags are available. It has been shown that regularized losses - originally developed for unsupervised low-level segmentation and representing geometric priors on pixel labels - can considerably improve the quality of weakly supervised training. However, many common priors require optimization stronger than gradient descent, so such regularizers have had limited applicability in deep learning. We propose a new robust trust-region approach for regularized losses that improves the state-of-the-art results. Our approach can be seen as a higher-order generalization of the classic chain rule. It allows neural network training to use strong low-level solvers, including discrete ones, for the corresponding regularizers.
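The alternation the abstract describes (gradient steps on the network interleaved with a strong low-level solver for the regularizer) might be sketched as below. The smoothing solver here is a hypothetical stand-in for the discrete solvers (e.g. graph cuts) the paper uses; only the two-step structure is the point.

```python
import numpy as np

def solve_regularizer(pred, lam=1.0, iters=200):
    """Hypothetical low-level solver for the proximal subproblem
        u* = argmin_u  R(u) + ||u - pred||^2 / (2*lam).
    A simple 1D quadratic-smoothness R stands in for strong discrete
    solvers such as graph cuts."""
    u = pred.copy()
    for _ in range(iters):
        # Jacobi-style update toward the smoothness fixed point
        u[1:-1] = (pred[1:-1] + lam * (u[:-2] + u[2:])) / (1 + 2 * lam)
    return u

def trust_region_step(pred, lam=1.0):
    # 1) hidden targets u from the low-level solver (the trust-region move)
    u = solve_regularizer(pred, lam)
    # 2) the network would then take a gradient step on a divergence
    #    between its predictions and u (no network in this toy sketch)
    return u

noisy = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
smoothed = trust_region_step(noisy)
```

The key design point from the abstract is step 1: because the subproblem is handed to a dedicated solver rather than to gradient descent, priors that gradient descent handles poorly become usable during network training.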


Confluent Vessel Trees with Accurate Bifurcations

  • March 2021 · 29 Reads

We are interested in unsupervised reconstruction of complex near-capillary vasculature with thousands of bifurcations where supervision and learning are infeasible. Unsupervised methods can use many structural constraints, e.g. topology, geometry, physics. Common techniques use variants of MST on geodesic tubular graphs minimizing symmetric pairwise costs, i.e. distances. We show limitations of such standard undirected tubular graphs producing typical errors at bifurcations where flow "directedness" is critical. We introduce a new general concept of confluence for continuous oriented curves forming vessel trees and show how to enforce it on discrete tubular graphs. While confluence is a high-order property, we present an efficient practical algorithm for reconstructing confluent vessel trees using minimum arborescence on a directed graph enforcing confluence via simple flow-extrapolating arc construction. Empirical tests on large near-capillary sub-voxel vasculature volumes demonstrate significantly improved reconstruction accuracy at bifurcations. Our code has also been made publicly available.
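The shift from an undirected MST to a minimum arborescence is the core structural change the abstract describes. Below is a toy brute-force arborescence on a directed graph with asymmetric arc costs, which stand in for the paper's flow-extrapolating arcs; the paper's own algorithm is efficient, while this enumeration is only illustrative for tiny graphs.

```python
import itertools

def min_arborescence(nodes, arcs, root):
    """Brute-force minimum arborescence: every non-root node picks exactly
    one incoming arc; keep the cheapest choice whose parent chains all
    reach the root (i.e. no cycles)."""
    others = [v for v in nodes if v != root]
    choices = [[(u, v, w) for (u, v, w) in arcs if v == t] for t in others]
    best, best_cost = None, float("inf")
    for combo in itertools.product(*choices):
        parent = {v: u for (u, v, _) in combo}
        ok = True
        for v in others:  # walk each parent chain, detecting cycles
            seen, cur = set(), v
            while cur != root:
                if cur in seen or cur not in parent:
                    ok = False
                    break
                seen.add(cur)
                cur = parent[cur]
            if not ok:
                break
        cost = sum(w for (_, _, w) in combo)
        if ok and cost < best_cost:
            best, best_cost = combo, cost
    return best, best_cost

# asymmetric directed costs: going "with the flow" a->b is cheap, b->a is not
nodes = ["r", "a", "b"]
arcs = [("r", "a", 1.0), ("r", "b", 5.0), ("a", "b", 1.0), ("b", "a", 10.0)]
tree, cost = min_arborescence(nodes, arcs, "r")
```

With symmetric (undirected) costs the two directions of an edge are interchangeable, which is exactly what loses flow "directedness" at bifurcations; the directed formulation picks r->a->b here because the reverse arcs are expensive.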


Image Segmentation Using Deep Learning: A Survey

  • February 2021 · 1,285 Reads · 3,370 Citations
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Yuri Y. Boykov · [...] · Demetri Terzopoulos

Image segmentation is a key task in computer vision and image processing with important applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among others, and numerous segmentation algorithms are found in the literature. Against this backdrop, the broad success of Deep Learning (DL) has prompted the development of new image segmentation approaches leveraging DL models. We provide a comprehensive review of this recent literature, covering the spectrum of pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the relationships, strengths, and challenges of these DL-based segmentation models, examine the widely used datasets, compare performances, and discuss promising research directions.


Citations (76)


... We present two failure cases in Figure 8. In the left image, the predicted polygon fails to fit concave contours since the pairwise loss prefers the shorter length [73,6]. The right image shows that our model faces challenges distinguishing similar parts from different instances, as it is difficult to reason about object ownership based on color alone. ...

Reference:

BoxSnake: Polygonal Instance Segmentation with Box Supervision
Sparse Non-local CRF
  • Citing Conference Paper
  • June 2022

... The balance between precision, recall, and other metrics like BF-score becomes pivotal in evaluating the effectiveness of a method in real-world scenarios. Intriguingly, the Scribble Labels Supervision methods, including [29], [30], [31], [32], and [33], demonstrate diverse strengths and weaknesses across the metrics. [29] particularly stands out with the highest recall, emphasizing its proficiency in capturing a larger portion of true positive instances. ...

Robust Trust Region for Weakly Supervised Segmentation
  • Citing Conference Paper
  • October 2021

... Image segmentation is widely acknowledged as a crucial and fundamental task in image analysis [7], [8]. It serves as the initial step in extracting significant information from images. ...

Image Segmentation Using Deep Learning: A Survey
  • Citing Article
  • February 2021

IEEE Transactions on Pattern Analysis and Machine Intelligence

... Likewise, [5] proposed the LV model, which segments short-axis echocardiographic views and requires only 5% of the memory of a standard U-Net. Additionally, [19] applied adaptive downsampling using CNNs to refine semantic boundary representation. A common strategy to minimize resource usage and training times involves reducing spatial input dimensions. ...

Efficient Segmentation: Learning Downsampling Near Semantic Boundaries
  • Citing Conference Paper
  • October 2019

... Therefore, other detection methods based on heatmap regression are used. In references [24-26], the heatmap detection method is used to obtain the probability distribution and location information of the joint points. This method usually has higher prediction accuracy than coordinate regression. ...

RePose: Learning Deep Kinematic Priors for Fast Human Pose Estimation
  • Citing Preprint
  • February 2020

... In computer vision tasks, image segmentation is a crucial step. It is the process of splitting an image into distinct segments (objects) [46]. This process involves a DL model capable of processing a dataset of images and understanding image features. ...

Image Segmentation Using Deep Learning: A Survey
  • Citing Preprint
  • File available
  • January 2020

... Yang et al. [23] presented a method comprising a graph-model-based scheme, i.e., graph cuts [2], and a noisy learning paradigm (abbreviated to GMBM-DLM) for weakly-supervised instrument segmentation. Some studies [18,14] investigated penalization terms to regularize training. ...

Beyond Gradient Descent for Regularized Segmentation Losses