You want $\frac15 = \sum_t |P_1(\text{count}=t) - P_2(\text{count}=t)|$, that is, a total variation distance of $\frac1{10}$,
where $P_1$ is a binomial distribution and $P_2$ is hypergeometric.
The difference between these distributions is shown in this Mathematica demonstration.
I believe both are reasonably well approximated by normal distributions. Both have mean $pk$. The variance of the binomial distribution is $kp(1-p)$, while that of the hypergeometric distribution is $\frac{n-k}{n-1}\,kp(1-p)$.
So the value of $k$ should be such that the normal distributions $N(0,1)$ and $N\left(0,\sqrt{\frac{n-k}{n-1}}\right)$ (the second parameter being the standard deviation) have total variation distance $\frac1{10}$. For large $n$, $\frac{n-k}{n-1} \approx 1 - \frac{k}{n}$, so that should be at about $k=(1-c)n$, where $N(0,1)$ and $N(0,\sqrt{c})$ are $\frac1{10}$ apart. Numerically, it seems that $c$ should be about $0.6605$, so $\sqrt{c}$ should be about $0.8127$, giving $k \approx 0.3395n$.
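One way to reproduce the numerical value of $c$ is sketched below. It is only an illustration, not necessarily how the figure above was obtained: the function names are made up for this example, and it relies on the fact that two centered normal densities with standard deviations $1$ and $\sigma<1$ cross at $\pm x_0$ with $x_0^2 = \frac{2\sigma^2\ln(1/\sigma)}{1-\sigma^2}$, so the total variation distance is $2\left(\Phi(x_0/\sigma)-\Phi(x_0)\right)$.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tv_normal_scale(sigma: float) -> float:
    """Total variation distance between N(0,1) and N(0,sigma).

    Here sigma is a standard deviation with 0 < sigma < 1.
    """
    # The two densities cross at +/- x0; between the crossings the
    # narrower density (sigma) lies above the standard normal one.
    x0 = math.sqrt(2.0 * sigma**2 * math.log(1.0 / sigma) / (1.0 - sigma**2))
    return 2.0 * (normal_cdf(x0 / sigma) - normal_cdf(x0))


def solve_c(target: float = 0.1, tol: float = 1e-10) -> float:
    """Bisection for the variance ratio c with TV(N(0,1), N(0,sqrt(c))) = target."""
    lo, hi = 1e-6, 1.0 - 1e-6  # the distance decreases as c grows toward 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tv_normal_scale(math.sqrt(mid)) > target:
            lo = mid  # still too far apart: c must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c = solve_c()
    print(f"c = {c:.4f}, sqrt(c) = {math.sqrt(c):.4f}, k/n = {1 - c:.4f}")
```

Bisection is used here only because the distance is monotone in $c$; any root finder would do.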
It appears this is not sensitive to the value of $p$.
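The normal approximation, and the apparent insensitivity to $p$, can be checked against the exact distributions for moderate sizes. The sketch below is an assumption-laden illustration: it takes a population of $n$ items of which $K = pn$ are marked (the name `K` and the function name are introduced here, not taken from the problem), and compares $k$ draws with replacement (binomial) to $k$ draws without replacement (hypergeometric).

```python
import math


def tv_binomial_vs_hypergeometric(n: int, K: int, k: int) -> float:
    """Exact total variation distance between Binomial(k, K/n) and
    Hypergeometric(population n, K marked items, k draws).

    Intended for moderate n and k only, since the binomial
    coefficients are converted to floating point.
    """
    p = K / n
    denom = math.comb(n, k)
    total = 0.0
    for t in range(k + 1):
        binom = math.comb(k, t) * p**t * (1.0 - p) ** (k - t)
        # math.comb returns 0 when the lower index exceeds the upper one,
        # so values of t outside the hypergeometric support contribute 0.
        hyper = math.comb(K, t) * math.comb(n - K, k - t) / denom
        total += abs(binom - hyper)
    return 0.5 * total


if __name__ == "__main__":
    n, k = 1000, 340  # k is close to 0.3395 * n
    for K in (100, 300, 500):
        tv = tv_binomial_vs_hypergeometric(n, K, k)
        print(f"p = {K / n:.1f}: total variation distance = {tv:.4f}")
```

If the approximation is good, the printed distances should all come out near $\frac1{10}$, whichever $p$ is used.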