Estimating expectations with respect to high-dimensional multimodal distributions is difficult. Here we describe an implementation of Hamiltonian annealed importance sampling in TensorFlow and compare it to other annealed importance sampling implementations.

A common problem in computer science is selecting the $k$ largest (or smallest) elements from an unsorted list containing $n$ elements.

This is a form of partition-based selection. For example, when computing k-nearest-neighbour distances, we first calculate all the pairwise distances between samples, then for each sample we select the $k$ closest distances. In R this is implemented too often as

sort(dists)[1:k]

which is correct but does not scale well. It sorts the entire vector dists before selecting the first $k$ elements. As the number of elements $n$ grows this is inefficient as the sort() call runs in $\mathcal{O}(n \log n)$ time. Partition-based selection algorithms do not sort the entire list nor the selected elements. They run in $\mathcal{O}(n + k \log k)$ resulting in savings of a factor of $\log n$.

The statistical programming language R has an inbuilt and under-appreciated partial sorting implementation that can help tremendously. We showcase, benchmark and discuss this functionality here.