GIVEN: Positive integers $n,m,L$ and probabilities $p_1, p_2, ldots, p_n$.
GOAL: Choose $L$ size-$m$ subsets $S_1, S_2, ldots, S_L$ of ${1,2,ldots,n}$ to maximize $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$, where the expectation is taken over (all $2^n$) $S subseteq {1,2,ldots,n}$ using the pdf $f$ defined as
[ f(S) = left( prod_{i in S} p_i right) left( prod_{i notin S} (1-p_i) right) . ]
(Equivalently, $S$ is chosen by examining each $i in {1,2,ldots,n}$ independently, choosing $i$ to be in $S$ with probability $p_i$.)
Note: the problem is trivial unless $m < n$ and $L < binom{n}{m}$.
A nice, closed-form expression for $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$ (in terms of $n,m,L$ and the $p_i$ and the $S_ell$) would be a good start.
EDIT:
WLOG, $p_1 geq p_2 geq cdots geq p_n$.
Special cases:
$L=1$: $S_1 = { 1, 2, ldots, m }$ maximizes $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$.
$m=1$: Denote by $x_i$ the single element of $S_i$ (i.e., $S_i = { x_i }$) and $X = { x_1, x_2, ldots, x_L } = bigcup S_i$. $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ] = sum_{S cap X neq emptyset} 1 cdot f(S) = 1 - sum_{S subseteq { 1, 2, ldots, n } setminus X} f(S) = 1 - prod_{i in X} (1-p_i)$, which is maximized by the greedy solution of $S_i = { i }$ (for $1 leq i leq L$).
Here's how I examined a few more special cases:
This example ($(n,m,L)=(4,1,2)$) falls under the already discussed $m=1$ case, but I present it to hopefully make the presentation of subsequent examples clearer:
$binom{binom{n}{m}}{L} = binom{binom{4}{1}}{2} = 6$ possibilities for $(S_1,S_2)$:
1100 1ooo,o1oo, p1+p2 - p1p2
1010 1ooo,oo1o, p1+ p3 - p1p3
1001 1ooo,ooo1, p1+ p4 - p1p4
0110 o1oo,oo1o, p2+p3 - p2p3
0101 o1oo,ooo1, p2+ p4 - p2p4
0011 oo1o,ooo1, p3+p4 - p3p4
- Column 1: the sum of the subsequent $L=2$ columns (I used "o" for "0" in those for readability)
- Column 2: represents $S_1$ (the 1st bit is 1 iff $1 in S$ (0 otherwise), the 2nd bit is 1 iff $2 in S$, etc.)
- Column 3: represents $S_2$
- Column 4: $displaystyle mathbb{E}[ max_{1leq ell leq L} |S cap S_ell| ]$ (formatted to put the degree-1 terms all together first, then the degree-2 terms, etc., with spacing so you can look up and down and see which other rows share those terms).
$binom{binom{n}{m}}{L} = binom{binom{4}{2}}{2} = 15$ possibilities for $(S_1,S_2)$:
2110 11oo,1o1o, p1+p2+p3 - p2p3
2101 11oo,1oo1, p1+p2+ p4 - p2p4
1210 11oo,o11o, p1+p2+p3 - p1p3
1201 11oo,o1o1, p1+p2+ p4 - p1p4
2011 1o1o,1oo1, p1+ p3+p4 - p3p4
1120 1o1o,o11o, p1+p2+p3 - p1p2
1021 1o1o,oo11, p1+ p3+p4 - p1p4
1102 1oo1,o1o1, p1+p2+ p4 - p1p2
1012 1oo1,oo11, p1+ p3+p4 - p1p3
0211 011o,o1o1, p2+p3+p4 - p3p4
0121 011o,oo11, p2+p3+p4 - p2p4
0112 01o1,oo11, p2+p3+p4 - p2p3
1111 11oo,oo11, p1+p2+p3+p4 - p1p3-p1p4-p2p3-p2p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4
1111 1o1o,o1o1, p1+p2+p3+p4 - p1p2- p1p4-p2p3- p3p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4
1111 1oo1,o11o, p1+p2+p3+p4 - p1p2-p1p3- p2p4-p3p4 + p1p2p3+p1p2p4+p1p3p4+p2p3p4 - 2p1p2p3p4
I separated the rows into two groups; two rows are in the same group if their first-column entries are permutations of each other.
Notes: the coefficient of 2 in the degree-4 terms of the last three rows seems interesting to me. The sign of a term is the negation of $(-1)$ to the degree of the term.
$binom{binom{n}{m}}{L} = binom{binom{4}{2}}{3} = 20$ possibilities for $(S_1,S_2,S_3)$:
3111 11oo,1o1o,1oo1, p1+p2+p3+p4 - p2p3-p2p4-p3p4 + p2p3p4
1311 11oo,o11o,o1o1, p1+p2+p3+p4 - p1p3-p1p4- p3p4 + p1p3p4
1131 1o1o,o11o,oo11, p1+p2+p3+p4 - p1p2- p1p4- p2p4 + p1p2p4
1113 1oo1,o1o1,oo11, p1+p2+p3+p4 - p1p2-p1p3- p2p3 + p1p2p3
2220 11oo,1o1o,o11o, p1+p2+p3 - p1p2p3
2202 11oo,1oo1,o1o1, p1+p2+ p4 - p1p2p4
2022 1o1o,1oo1,oo11, p1+ p3+p4 - p1p3p4
0222 o11o,o1o1,oo11, p2+p3+p4 - p2p3p4
2211 11oo,1o1o,o1o1, p1+p2+p3+p4 - p1p4-p2p3- p3p4 + p1p3p4+p2p3p4 - p1p2p3p4
2121 11oo,1o1o,oo11, p1+p2+p3+p4 - p1p4-p2p3-p2p4 + p1p2p4+ p2p3p4 - p1p2p3p4
2211 11oo,1oo1,o11o, p1+p2+p3+p4 - p1p3- p2p4-p3p4 + p1p3p4+p2p3p4 - p1p2p3p4
2112 11oo,1oo1,oo11, p1+p2+p3+p4 - p1p3- p2p3–p2p4 + p1p2p3+ p2p3p4 - p1p2p3p4
1221 11oo,o11o,oo11, p1+p2+p3+p4 - p1p3-p1p4- p2p4 + p1p2p4+p1p3p4 - p1p2p3p4
1212 11oo,o1o1,oo11, p1+p2+p3+p4 - p1p3–p1p4-p2p3 + p1p2p3+ p1p3p4 - p1p2p3p4
2121 1o1o,1oo1,o11o, p1+p2+p3+p4 - p1p2- p2p4–p3p4 + p1p2p4+ p2p3p4 - p1p2p3p4
2112 1o1o,1oo1,o1o1, p1+p2+p3+p4 - p1p2- p2p3- p3p4 + p1p2p3+ p2p3p4 - p1p2p3p4
1221 1o1o,o11o,o1o1, p1+p2+p3+p4 - p1p2- p1p4- p3p4 + p1p2p4+p1p3p4 - p1p2p3p4
1122 1o1o,o1o1,oo11, p1+p2+p3+p4 - p1p2- p1p4-p2p3 + p1p2p3+p1p2p4 - p1p2p3p4
1212 1oo1,o11o,o1o1, p1+p2+p3+p4 - p1p2-p1p3- p3p4 + p1p2p3+ p1p3p4 - p1p2p3p4
1122 1oo1,o11o,oo11, p1+p2+p3+p4 - p1p2-p1p3- p2p4 + p1p2p3+p1p2p4 - p1p2p3p4
Note: the sign of a term is the negation of $(-1)$ to the parity of the degree of the term only in the first and third block of rows — the degree-3 terms in the 4 rows in the middle block are all negative.
No comments:
Post a Comment