One can sample from the uniform distribution on permutations of $n$ elements with $k$ cycles in expected time $O(n\sqrt{k})$.
For large $n$ and $k$ this may be more practical than
the method Herb Wilf refers to in his answer, which,
if I understand right, requires generating the Stirling cycle numbers $S(m,r)$
for $m\leq n$ and $r\leq k$.
The idea is as follows: consider a Poisson-Dirichlet($\theta$) partition of $[n]$. The
blocks of such a partition have the distribution of the cycles of a random permutation
of $[n]$ from the distribution which gives weight proportional to $\theta^r$ to any permutation with $r$ cycles. In particular, conditioned on the number of cycles, the permutation is uniformly distributed. (Once one has a partition into blocks corresponding to the cycles, one can just fill in the elements of $[n]$ into the positions in the cycles uniformly at random.)
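To make that last step concrete, here is a minimal Python sketch of the filling step (the function name blocks_to_permutation is just illustrative, assuming the cycle lengths are given as a list summing to $n$):

    import random

    def blocks_to_permutation(block_sizes):
        # Given cycle lengths summing to n, return a uniformly random
        # permutation of {0, ..., n-1} with exactly those cycle lengths:
        # shuffle the elements into the cycle positions uniformly at random.
        n = sum(block_sizes)
        elements = list(range(n))
        random.shuffle(elements)
        perm = [None] * n
        pos = 0
        for size in block_sizes:
            cycle = elements[pos:pos + size]
            for i in range(size):
                perm[cycle[i]] = cycle[(i + 1) % size]  # map each element to its cycle successor
            pos += size
        return perm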
Choose $\theta$ in such a way
that the mean number of cycles is around $k$.
One can sample a PD($\theta$) partition of $[n]$ in time $O(n)$ (see below).
Keep generating independent samples of the partition until you get one
with exactly $k$ blocks.
The variance of the number of cycles will be $O(k)$ (see below) and the
probability that the number of cycles is precisely $k$ will be on the order of $1/\sqrt{k}$, so
one will need to generate
$O(\sqrt{k})$ such samples before happening upon one with precisely $k$ cycles.
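In code, the rejection loop might look like the following sketch (sample_pd_block_sizes stands for either of the two PD($\theta$) samplers described below, blocks_to_permutation is the filling step sketched above, and a way to choose $\theta$ is sketched at the end; all these names are mine):

    def sample_permutation_with_k_cycles(n, k, theta):
        # Rejection sampling: draw PD(theta) partitions of [n] until one
        # has exactly k blocks, then fill in a uniform permutation with
        # those cycles; the expected number of draws is O(sqrt(k)).
        while True:
            sizes = sample_pd_block_sizes(n, theta)
            if len(sizes) == k:
                return blocks_to_permutation(sizes)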
So this is really not so far from what you suggested in the question
(generate random permutations until you find one that fits) with the twist that
instead of generating from the uniform distribution (which corresponds to PD(1))
you choose a better value of $\theta$ and generate from PD($\theta$).
Here are two nice ways to sample a PD($\theta$) partition of $[n]$:
(1) "Chinese restaurant process" of Dubins and Pitman.
We add elements to the partition one by one. Element 1 starts in a block on its own. Thereafter, when we add element $r+1$,
suppose there are currently $m$ blocks whose sizes are $n_1, n_2, \dots, n_m$. Add element $r+1$ to block $i$ with probability
$n_i/(r+\theta)$, for $1\leq i\leq m$, and put element $r+1$ into a new block on its own with probability $\theta/(r+\theta)$.
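Here is a Python sketch (names illustrative). Implemented naively, choosing a block with probability proportional to its size costs $O(m)$ per step; instead one can pick a uniformly random already-placed element and join its block, which gives the same distribution and keeps the whole pass $O(n)$:

    import random

    def crp_block_sizes(n, theta):
        # Chinese restaurant process: block sizes of a PD(theta) partition of [n].
        block_of = []   # block_of[j] = index of the block containing element j+1
        sizes = []
        for r in range(n):   # r elements placed so far; now place element r+1
            if random.random() < theta / (r + theta):
                block_of.append(len(sizes))    # new block, prob. theta/(r+theta)
                sizes.append(1)
            else:
                # join the block of a uniformly chosen placed element;
                # this selects block i with probability n_i/(r+theta)
                b = block_of[random.randrange(r)]
                block_of.append(b)
                sizes[b] += 1
        return sizes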
(2) "Feller representation". Generate independent Bernoulli random variables $B_1, dots, B_n$ with $P(B_i=1)=theta/(i-1+theta)$.
Write a string of length $n$ divided up into blocks, with the rule that we start a new block before position $i$ whenever $B_i=1$.
So for example if $n=10$ with $B_1=B_5=B_6=B_9=1$ and the other $B_i$ equal to 0, then the pattern is
(a b c d)(e)(f g h)(i j).
(Note that $B_1=1$ always, since $P(B_1=1)=\theta/(0+\theta)=1$.) Then assign the elements of $[n]$ to the positions in the blocks uniformly at random.
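In Python this might look as follows (again the names are illustrative); the gaps between successive 1s among the $B_i$ give the block sizes, which can then be fed to blocks_to_permutation from above:

    import random

    def feller_block_sizes(n, theta):
        # Feller representation: B_i = 1 with probability theta/(i-1+theta)
        # starts a new block at position i; runs of 0s extend the current block.
        sizes = []
        for i in range(1, n + 1):
            if random.random() < theta / (i - 1 + theta):
                sizes.append(1)        # B_i = 1: open a new block
            else:
                sizes[-1] += 1         # B_i = 0: current block grows
        return sizes                   # B_1 = 1 always, so sizes is never empty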
The expected number of blocks is $\sum_{i=1}^n \mathbb{E}B_i$, which is $\sum_{i=1}^n \theta/(i-1+\theta)$,
which is approximately $\theta(\log n-\log \theta)$. If this is equal to $k$ with $1 \ll k \ll n$, then
the number of blocks will be approximately normal with mean $k$ and variance $O(k)$.
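As for actually choosing $\theta$: the expected number of blocks $\sum_{i=1}^n \theta/(i-1+\theta)$ is increasing in $\theta$ (from 1 as $\theta\to 0$ up to $n$ as $\theta\to\infty$), so a simple bisection suffices. A sketch, assuming $1\leq k<n$ (the helper name is mine):

    def choose_theta(n, k):
        # Find theta with sum_{i=1}^n theta/(i-1+theta) approximately k;
        # the sum is increasing in theta, so bisection works.
        def expected_blocks(theta):
            return sum(theta / (i + theta) for i in range(n))  # i = 0, ..., n-1
        lo, hi = 1e-12, 1.0
        while expected_blocks(hi) < k:   # grow until we bracket k (needs k < n)
            hi *= 2
        for _ in range(100):             # bisect to ample precision
            mid = (lo + hi) / 2
            if expected_blocks(mid) < k:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2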
For details of the facts mentioned here about Poisson-Dirichlet partitions, random permutations, etc., see e.g. Pitman's lecture notes from
Saint-Flour: http://bibserver.berkeley.edu/csp/april05/bookcsp.pdf