-
ABSTRACT: We present new MCMC algorithms for computing the posterior distributions and
expectations of the unknown variables in undirected graphical models with
regular structure. For demonstration purposes, we focus on Markov Random Fields
(MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to
compute the posterior distribution of a particular tree exactly by conditioning
on the remaining tree. These exact solutions allow us to construct efficient
blocked and Rao-Blackwellised MCMC algorithms. We show empirically that tree
sampling is considerably more efficient than other partitioned sampling schemes
and the naive Gibbs sampler, even in cases where loopy belief propagation fails
to converge. We prove that tree sampling exhibits lower variance than the naive
Gibbs sampler and other naive partitioning schemes using the theoretical
measure of maximal correlation. We also construct new information theory tools
for comparing different MCMC schemes and show that, under these, tree sampling
is more efficient.
07/2012;
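The exact step that tree sampling relies on can be illustrated on the simplest tree, a chain. The sketch below is not the authors' implementation. It assumes spins in {-1, +1}, a single coupling strength, and a list of local fields that is taken to already absorb the effect of conditioning on the other tree. The names `sample_chain`, `fields` and `coupling` are hypothetical. It draws one exact joint sample by forward filtering followed by backward sampling.

```python
import math
import random

SPINS = (-1, 1)

def sample_chain(fields, coupling, rng):
    """Draw one exact sample from p(s) ~ exp(sum_i f_i s_i + J sum_i s_i s_{i+1})."""
    n = len(fields)
    # Forward pass: alpha[i][k] is proportional to the marginal weight of
    # spin i taking value SPINS[k], summed over all spins to its left.
    alpha = []
    for i in range(n):
        row = []
        for s in SPINS:
            local = math.exp(fields[i] * s)
            if i == 0:
                row.append(local)
            else:
                incoming = sum(
                    alpha[i - 1][k] * math.exp(coupling * t * s)
                    for k, t in enumerate(SPINS)
                )
                row.append(local * incoming)
        total = sum(row)
        alpha.append([v / total for v in row])  # normalise to avoid overflow
    # Backward pass: sample the last spin, then each earlier spin
    # conditioned on the already sampled neighbour to its right.
    spins = [0] * n
    for i in range(n - 1, -1, -1):
        if i == n - 1:
            weights = alpha[i]
        else:
            weights = [
                alpha[i][k] * math.exp(coupling * t * spins[i + 1])
                for k, t in enumerate(SPINS)
            ]
        p_up = weights[1] / (weights[0] + weights[1])
        spins[i] = 1 if rng.random() < p_up else -1
    return spins

rng = random.Random(0)
sample = sample_chain([0.2, -0.1, 0.0, 0.3, -0.4], 0.8, rng)
```

In a blocked Gibbs sampler of the kind the abstract describes, a step like this would be applied to one tree at a time, with the fields recomputed from the current values of the other tree.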
-
ABSTRACT: We propose a new Monte Carlo algorithm for complex discrete distributions.
The algorithm is motivated by the N-Fold Way, which is an ingenious
event-driven MCMC sampler that avoids rejection moves at any specific state.
The N-Fold Way can, however, get "trapped" in cycles. We surmount this problem by
modifying the sampling process. This correction does introduce bias, but the
bias is subsequently corrected with a carefully engineered importance sampler.
06/2012;
-
ABSTRACT: This paper addresses the problem of sampling from binary distributions with
constraints. In particular, it proposes an MCMC method to draw samples from a
distribution of the set of all states at a specified distance from some
reference state. For example, when the reference state is the vector of zeros,
the algorithm can draw samples from a binary distribution with a constraint on
the number of active variables, say the number of 1's. We motivate the need for
this algorithm with examples from statistical physics and probabilistic
inference. Unlike previous algorithms proposed to sample from binary
distributions with these constraints, the new algorithm allows for large moves
in state space and tends to propose them such that they are energetically
favourable. The algorithm is demonstrated on three Boltzmann machines of
varying difficulty: a ferromagnetic Ising model (with positive potentials), a
restricted Boltzmann machine with learned Gabor-like filters as potentials, and
a challenging three-dimensional spin-glass (with positive and negative
potentials).
03/2012;
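To make the constraint concrete, the sketch below shows a simple baseline sampler that keeps the number of 1's fixed. It is not the algorithm proposed in the paper, which allows large moves. It is a plain Metropolis sampler whose only move swaps one active bit with one inactive bit, so the distance from the all-zeros state never changes. The energy function, the names `energy` and `swap_step`, and the toy couplings are assumptions made for illustration.

```python
import math
import random

def energy(state, J, h):
    """E(s) = -sum_{i<j} J[i][j] s_i s_j - sum_i h[i] s_i for s in {0,1}^n."""
    n = len(state)
    e = 0.0
    for i in range(n):
        e -= h[i] * state[i]
        for j in range(i + 1, n):
            e -= J[i][j] * state[i] * state[j]
    return e

def swap_step(state, J, h, beta, rng):
    """One Metropolis step that swaps a 1 with a 0, preserving the number of 1's."""
    ones = [i for i, s in enumerate(state) if s == 1]
    zeros = [i for i, s in enumerate(state) if s == 0]
    if not ones or not zeros:
        return state  # no swap is possible
    i, j = rng.choice(ones), rng.choice(zeros)
    proposal = list(state)
    proposal[i], proposal[j] = 0, 1
    # The proposal is symmetric, so the plain Metropolis rule applies.
    delta = energy(proposal, J, h) - energy(state, J, h)
    if delta <= 0 or rng.random() < math.exp(-beta * delta):
        return proposal
    return state

# Toy example: a 6-variable chain with attractive nearest-neighbour couplings.
rng = random.Random(0)
n = 6
J = [[0.5 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
h = [0.0] * n
state = [1, 1, 0, 0, 0, 0]
for _ in range(200):
    state = swap_step(state, J, h, 1.0, rng)
```

Because each step changes only two bits, a sampler like this moves slowly through state space, which is the limitation the abstract says the new algorithm addresses.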
-
ABSTRACT: This paper introduces a new specialized algorithm for equilibrium Monte Carlo
sampling of binary-valued systems, which allows for large moves in the state
space. This is achieved by constructing self-avoiding walks (SAWs) in the state
space. As a consequence, many bits are flipped in a single MCMC step. We name
the algorithm SARDONICS, an acronym for Self-Avoiding Random Dynamics on
Integer Complex Systems. The algorithm has several free parameters, but we show
that Bayesian optimization can be used to automatically tune them. SARDONICS
performs remarkably well on a wide range of sampling tasks: toroidal
ferromagnetic and frustrated Ising models, 3D Ising models, restricted
Boltzmann machines and chimera graphs arising in the design of quantum
computers.
11/2011;
-
ABSTRACT: This paper proposes a new randomized strategy for adaptive MCMC using
Bayesian optimization. This approach applies to non-differentiable objective
functions and trades off exploration and exploitation to reduce the number of
potentially costly objective function evaluations. We demonstrate the strategy
in the complex setting of sampling from constrained, discrete and densely
connected probabilistic graphical models where, for each variation of the
problem, one needs to adjust the parameters of the proposal mechanism
automatically to ensure efficient mixing of the Markov chains.
10/2011;
-
ABSTRACT: Adiabatic quantum optimization offers a new method for solving hard
optimization problems. In this paper we calculate median adiabatic times (in
seconds) determined by the minimum gap during the adiabatic quantum
optimization for an NP-hard Ising spin glass instance class with up to 128
binary variables. Using parameters obtained from a realistic superconducting
adiabatic quantum processor, we extract the minimum gap and matrix elements
using high performance Quantum Monte Carlo simulations on a large-scale
Internet-based computing platform. We compare the median adiabatic times with
the median running times of two classical solvers and find that, for the
considered problem sizes, the adiabatic times for the simulated processor
architecture are about 4 and 6 orders of magnitude shorter than the two
classical solvers' times. This shows that if the adiabatic time scale were to
determine the computation time, adiabatic quantum optimization would be
significantly superior to those classical solvers for median spin glass
problems of at least up to 128 qubits. We also discuss important additional
constraints that affect the performance of a realistic system.
06/2010;
-
ABSTRACT: CUDA and OpenCL are two different frameworks for GPU programming. OpenCL is
an open standard that can be used to program CPUs, GPUs, and other devices from
different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL
promises a portable language for GPU programming, its generality may entail a
performance penalty. In this paper, we use complex, near-identical kernels from
a Quantum Monte Carlo application to compare the performance of CUDA and
OpenCL. We show that when using NVIDIA compiler tools, converting a CUDA kernel
to an OpenCL kernel involves minimal modifications. Making such a kernel
compile with ATI's build tools involves more modifications. Our performance
tests measure and compare data transfer times to and from the GPU, kernel
execution times, and end-to-end application execution times for both CUDA and
OpenCL.
05/2010;
-
ABSTRACT: This paper describes an algorithm for selecting parameter values (e.g. temperature values) at which to measure equilibrium properties with Parallel Tempering Monte Carlo simulation. Simple approaches to choosing parameter values can lead to poor equilibration of the simulation, especially for Ising spin systems that undergo first-order phase transitions. However, starting from an initial set of parameter values, the careful, iterative respacing of these values based on results with the previous set of values greatly improves equilibration. Example spin systems presented here appear in the context of Quantum Monte Carlo. Comment: Accepted in International Journal of Modern Physics C 2010, http://www.worldscinet.com/ijmpc
04/2010;
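The spacing of temperatures matters because of the standard replica-exchange acceptance rule, sketched below. This is the textbook Metropolis swap probability for Parallel Tempering, not the respacing algorithm of the paper, which the abstract does not specify. The function name `swap_acceptance` and the example energies are assumptions made for illustration.

```python
import math

def swap_acceptance(beta_a, beta_b, energy_a, energy_b):
    """Probability of exchanging the configurations of two replicas.

    The replicas sit at inverse temperatures beta_a and beta_b and currently
    hold configurations with energies energy_a and energy_b.
    """
    log_ratio = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

# Same energies, two different temperature gaps: a wider gap between the
# inverse temperatures lowers the chance that the swap is accepted.
p_close = swap_acceptance(1.0, 0.9, -10.0, -8.0)
p_wide = swap_acceptance(1.0, 0.5, -10.0, -8.0)
```

If neighbouring temperatures are too far apart, swaps are rarely accepted and the replicas stop exchanging information, which is one way poorly chosen parameter values can hurt equilibration.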
-
ABSTRACT: This paper presents two conceptually simple methods for parallelizing a
Parallel Tempering Monte Carlo simulation in a distributed volunteer computing
context, where computers belonging to the general public are used. The first
method uses conventional multi-threading. The second method uses CUDA, a
graphics card computing system. Parallel Tempering is described, and challenges
such as parallel random number generation and mapping of Monte Carlo chains to
different threads are explained. While conventional multi-threading on CPUs is
well-established, GPGPU programming techniques and technologies are still
developing and present several challenges, such as the effective use of a
relatively large number of threads. Having multiple chains in Parallel
Tempering allows parallelization in a manner that is similar to the serial
algorithm. Volunteer computing introduces important constraints to high
performance computing, and we show that both versions of the application are
able to adapt themselves to the varying and unpredictable computing resources
of volunteers' computers, while leaving the machines responsive enough to use.
We present experiments to show the scalable performance of these two
approaches, and indicate that the efficiency of the methods increases with
larger problem sizes.
03/2010;
-
UAI 2010, Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, July 8-11, 2010; 01/2010
-
Advances in Neural Information Processing Systems 18 [Neural Information Processing Systems, NIPS 2005, December 5-8, 2005, Vancouver, British Columbia, Canada]; 01/2005
-
ABSTRACT: Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on both central processing unit (CPU) and GPU implementations of a particular computationally intensive Metropolis Monte Carlo algorithm. Explicit vectorization on the CPU and the equivalent, explicit memory coalescing, on the GPU are found to be critical to achieving good performance of this algorithm in both environments. The fully-optimized CPU version achieves a 9× to 12× speedup over the original CPU version, in addition to speedup from multi-threading. This is 2× faster than the fully-optimized GPU version, indicating the importance of optimizing CPU implementations.
Journal of Computational Physics.
-
ABSTRACT: We propose efficient MCMC tree samplers for random fields and factor graphs. Our tree sampling approach combines elements of Monte Carlo simulation as well as exact belief propagation. It requires that the graph be partitioned into trees first. The partition can be generated by hand or automatically using a greedy graph algorithm. The tree partitions allow us to perform exact inference on each tree. This enables us to implement efficient Rao-Blackwellised blocked Gibbs samplers, where each tree is sampled by conditioning on the other trees. We use information theory tools to rank MCMC algorithms corresponding to different partitioning schemes.