Dharmath Vader: array

Showing posts with label array. Show all posts

Sunday, March 27, 2016

Various double counting problems

These problems can all be solved by the double counting principle, that is, to count the same object two different ways and assert that those must match.

Suppose $n > 4$. Let $A_1, A_2, \dots ,A_k$ each be a set of $n$ integers, where $k \leq 4^{n-1} / 3^n$. Show that we can color each integer with one of the four colors so that each set contains integers of 4 different colors.

In a party attended by $n$ people, there were a number of handshakes. A group of 10 persons is called "tight" if each person shook hands with the other 9. It is known that we cannot add another handshake (two people who didn't shake hands now shaking hands) without increasing the number of tight groups. Prove that the number of handshakes is at least $8n-36$

In a class with many students, we choose 2016 committees among them. Each committee has 11 students, and each student may belong to multiple committees. Prove that there is a way to choose some of the students so that each committee has at least one chosen and one unchosen student.

Tuesday, February 9, 2016

Paintings and colors

In a museum there are $n$ paintings, each of which was painted with at least 3 colors. The total number of colors from all paintings is $m << n$. The average number of colors per painting is $s$.

If we choose $k$ paintings at random, what's the expected number of colors that we get? What's the maximum and minimum value of this expectation, for a fixed color-painting configuration, but for varying choices of paintings?

If we choose $l$ colors at random, what's the expected number of paintings that can be replicated with those colors? (It means each painting is painted only with those colors). Again, what's the maximum and minimum value of this expectation?

Remark: by choosing $m,n,s,k,l$ judiciously we can write problems such as: show that we can choose $k$ paintings such that the number of colors among those paintings is at most / at least $x$.

Solution

There are $N = {n \choose k}$ ways to choose paintings. For each choice of paintings, we tally up the total number of colors produced, and then we sum it over all possible choice of paintings. Let $S$ be this sum of total number of colors. The expected number of colors is then $S / N$.

Suppose the color $c$ is used by $p_c$ paintings. We wish to count the contribution of $c$ towards $S$. The color contributes one point to $S$ for each choice of paintings that includes at least one of the $p_c$, and zero otherwise. That means now the question is, how many ways can we choose $k$ out of $n$ paintings such that at least one of them comes from the $p_c$ paintings? There are $N$ ways total to choose, and there are ${n-p_c \choose k}$ ways to choose such that none of the $p_c$ paintings are chosen, so the total number is: $${n \choose k} - {n-p_c \choose k}$$ So: $$S = \sum_c {n \choose k} - {n-p_c \choose k}$$ where the sum runs over all $m$ colors. This expression is exact. What follows from this point on is an attempted to upper/lower bound $S$, and it may not be exact, although we will give examples to show that the bound is tight.

Now consider the function $f(t) = {n \choose k} - {n-t \choose k}$, defined over $t \in [1,m]$ (the color must be used in at least 1 painting, and at most $m$ paintings obviously). We are given that $m << n$ so it's safe to assume that ${n-m \choose k} > 0$. The convexity and concavity analysis depends on if $k$ is even or odd. For now, assume $k$ is odd. The analysis if $k$ is even is very similar.

The function $(n-t)(n-t-1)\dots (n-t-k+1)$ is convex in the interval $(0,n-k-1)$, and since $m << n$, it is convex in the interval $[1,m]$. That means $f(t)$ is concave. On the other hand, we know that the average number of colors per painting is $s$, so there are $sn$ distinct painting-color pairs, so $\sum_c p_c = sn$.

By Jensen, the maximum of $\sum_c f(p_c)$ is achieved if all the $p_c$ is the same, that is $sn/m$. In that case, $$\frac{S}{N} \leq \frac{\sum_c {n \choose k} - {n-p_c \choose k}}{n \choose k} = \frac{ m({n \choose k} - {n-\frac{sn}{m} \choose k}) }{n \choose k}$$ For example, for $n=50,k=3,m=25,s=10$, we have $S/N = 19.8$. So we can write a problem such as: show that we can choose 3 paintings such that between them we get at most 19 colors.

In order to find the lower bound, by majorization theorem, the minimum of $\sum_c f(p_c)$ is achieved by $[p_c] = [m,m,m,\dots,m,r,1,\dots,1]$ because that's the configuration that majorizes everything else. The number of $m$s, and $1$s and the value of $r$ (if needed) will have to depend on the particular problem setup, to make sure the following two conditions are met:

1. $\sum_c p_c = sn$ 2. The number of elements in $[p_c]$ is $m$.

For example, in the $n=50,k=3,m=25,s=10$ case above, we would have $[p_c] = [25,25,\dots,25,20,1,1,\dots,1]$ where there are 19 25s and 6 1s. Then: $$S \geq 25 \times {50 \choose 3} - 19 \times {50-25 \choose 3} - 1 \times {50-20 \choose 3} - 6 \times {50-1 \choose 3}$$ $$\frac{S}{N} \geq 16.9$$ So we can say: show that we can choose 3 paintings such that between them we get at least 17 colors.

Dancing partners

In a party with $n$ boys and $n$ girls, each boy dances with some girls and vice versa. (No pair of boy and girl dance more than once). There are $n^2-n+1$ dances that occurred throughout the night. Show that we can pair each boy with each girl so that everyone is paired with someone he/she danced with.

Solution

The key fact is because the number of dances is more than $n^2-n$. If it were exactly $n^2-n$, we could have a situation where $n-1$ boys danced with all the girls, and one boy never danced, and this effectively blocks that boy from being paired correctly.

Suppose that there is no way to pair them correctly. That is, for every given pairing, there's always a pair that did not dance. For each pairing, we define the score of said pairing to be the number of pairs that actually danced. This means a legal pairing is the one with score $n$.

Now let $S$ be the sum of all the scores of all possible pairings. Our contrapositive hypothesis is that all pairings have a score of at most $n-1$, so $S \leq (n-1)n!$ (because there are $n!$ possible pairings).

On the other hand, for each dance that happened throughout the night, we want to count how many points it contributes to $S$. Suppose the dance happens between boy $B$ and girl $G$. Then it contributes one point of score whenever the pairing pairs $B$ and $G$. In other words, there are $(n-1)!$ such pairings (because one pair is fixed and we have complete freedom to pair the other $n-1$ boys and $n-1$ girls. So the total number of score is: $$S = (n^2-n+1)(n-1)! > (n^2-n)(n-1)! = (n-1)n!$$ A contradiction.

Thursday, January 28, 2016

Tournament, number of dominated players

In a tournament with $n$ people, each player plays against each other player exactly once, with each game results in a win or a loss (no draw).

If we choose $A$ a set of 5 players at random, let $X_3$ denote the number of players not in $A$ that lose to exactly 3 players in $A$. What's $E[X_3]$? What about $E[X_4]$?

Remark: by choosing $n$ carefully, we can write problems such as: Show that we can choose $A$ and $B$ disjoint groups containing 5 players each so that each player in $B$ loses to exactly 3 players in $A$. The number 5 can be replaced by other numbers too.

Solution

There are $N = {n \choose 5}$ ways to choose set $A$. Of all those choices, we sum up the total number each player loses to exactly 3 players in $A$.

That is, for a given player $x$ that loses $p_x$ times, he contributes to ${p_x \choose 3} {n-p_x-1 \choose 2}$ to the total sum. This is because we can choose ${p_x \choose 3}$ ways to form the part of $A$ that beats $x$, and ${n-p_x-1 \choose 2}$ ways to form the part that doesn't beat $x$, and these are mutually exclusive and independent. Note that if $p_x < 3$ that number is zero as expected.

So the expected value is: $$E[X_3] = \frac{\sum_x {p_x \choose 3} {n-p_x-1 \choose 2}}{n \choose 5}$$ Because $f(t) = {p_x \choose 3} {n-p_x-1 \choose 2} - \frac{t(t-1)(t-2)(n-t-1)(n-t-2)}{3! 2!}$ is a convex function in $t$, we can bound this number by Jensen, noting that $\sum_x p_x = n(n-1)/2$ $$E[X_3] > \frac{{(n-1)/2 \choose 3} {n-(n-1)/2-1 \choose 2}}{n \choose 5}$$ By choosing $n$ carefully, or replacing the numbers 5 and 3 accordingly, we can write problems like: in such and such tournament, show that we can choose 5 players into group $A$ and 5 other players into group $B$ such that each player in group $B$ loses the majority of his games against players in group $A$, or each player in group $A$ wins the majority of his games against players in group $B$.

But finding $E[X_4]$ is a wholly different issue. We can still use the fact that $$E[X_4] = \frac{\sum_x {p_x \choose 4} {n-p_x-1 \choose 1}}{n \choose 5} = \frac{\sum_x {p_x \choose 4} (n-p_x-1)}{n \choose 5}$$ but unfortunately now $f(t) = {t \choose 4} (n-t-1)$ is a concave function, so we can't use Jensen. We can still use majorization theorem however, noting that $\sum_x p_x = n(n-1)/2$. If we want to find a sequence of $p_i$ that's absolutely majorized, we can say that: $$(n-1, n-2, \dots, 1, 0) \succ (p_1, p_2, \dots, p_n)$$ which must be true in the most extreme example that each player has an absolute skill level and each victory is transitive. However, computing $E_4$ in that case is difficult. $$E[X_4] > \frac{1}{n \choose 5} \sum_{k=0}^{n-1} {k \choose 4} (n-k-1)$$ which can be done analytically because the sums of $\sum k, \sum k^2, dots \sum k^5$ can all be computed via telescoping arguments.

If you want to be lazy, you can also say that $(n(n-1)/2,0,\dots,0)$ majorizes everything, but that lower bound is probably lame.

Why did I mention this? Because this type of analysis eventually opens up to problems like:

Show that we can choose $A$ and $B$ such that each member of $B$ loses at least 3 times against $A$, or that players from $A$ wins at least $22$ times against $B$, and so on.

What about the number of victories from $A$ to $B$ though? If we label this quantity $V$, then $E[V] = E[X_1 + 2X_2 + 3X_3 + 4X_4 + 5X_5]$, each of which can be evaluated but the even terms are ugly. $E[X_2]$ is not that bad but $E[X_4]$ is definitely not something you want to do too often.

However, there's a faster way to calculate $E[V]$. By symmetry we know it would be $25/2$, because for every choice of $A$ and $B$, there's an equal probability of choosing $B$ and $A$, and the sum of those victories must be 25.

A related problem from Iran Team Selection Test 2008/6.)

Suppose 799 teams participate in a tournament in which every pair of teams plays against each other exactly once. Prove that there exist two disjoint groups $A$ and $B$ of 7 teams each such that every team from $A$ defeated every team from $B$.

Thursday, February 6, 2014

Searching on a chess board

Alice fills each cell of an $n \times n$ chess board with real numbers, such that there is exactly one zero somewhere on that board. The rest of the cells could be filled with positive or negative numbers, so that all numbers are distinct. Furthermore, she filled it such that each row is sorted from left to right and each column is sorted from top to bottom.

Now she asks Bob to find the location of the zero via a guessing game. Bob is told about the sortedness of the board and that all numbers are distinct, but he doesn't know the numbers themselves. He is allowed to make a series of guesses (of locations), and after each guess, he would be told the number on that cell. The game ends when Bob discovers the zero.

Show that Bob can devise a strategy that guarantees discovery after $2n-1$ guesses.

Show that there is no strategy that guarantees discovery with less than $2n-1$ guesses.

Solution to First Problem

Let cell $(i,j)$ refers to the cell on the $i$-th row and $j$-th column, with $i,j \geq 1$ (usual matrix notation, NOT the usual computer science notation). Bob can follow the following algorithm:

First start by guessing $(n,1)$. If that number is positive, go up 1 cell, and if it's negative, go right one cell.

In order to show that this strategy works, we have to prove three things: termination, correctness, and running time.

Proof of termination and running time

If $(i,j)$ represents the cell of our last guess, let $S = i-j$. Note that in the beginning, $S = n-1$. And for each move, whether that's 1 up or 1 right, $S$ decreases by one. Also that the minimum number of $S$ is attained when $i$ is minimum and $j$ is maximum, namely on cell $(1,n)$, which means $S = 1-n$. That means that after $(n-1)-(1-n)+1 = 2n-1$ moves we are guaranteed to terminate.

Proof of correctness

Let $A$ be the event that we guessed the column (but not necessarily the row) of the zero correctly, and $B$ denotes the event that we guessed the row of the zero correctly. It's easy to see that after $A$ or $B$ is reached, we are guaranteed to reach the zero. For example, once $A$ happens, our guess will keep moving up until it reaches the zero, and vice versa.

Now to show that $A$ or $B$ is ever reached, suppose the zero is at a location $(i_0, j_0)$. That means $A$ happens after $(i_0-1)$ right moves, and $B$ happens after $(n-j_0)$ up moves. Every move is either a right move or an up move, so after many enough moves, either $A$ or $B$ will happen.

Solution to Second Problem

Define the "major" diagonal to be cells $(i,j)$ such that $i+j = n+1$. That is, bottom left to top right. Define the "minor" diagonal to be cells $(i,j)$ such that $i+j = n$. That is, cells above (or to the left of) major diagonal. Now suppose Alice populates the board in the following way: populate everything above the minor diagonal with numbers less than -1, and populate everything below major diagonal with numbers > 1, all while still satisfying the sortedness criteria.

As for the diagonals themselves, note that if the major diagonal is populated with numbers between 0 and 1, and the minor diagonal populated with numbers between -1 and 0, and if the zero is anywhere on the two diagonals, the sortedness criteria is still satisfied. Between cells in the major diagonals themselves (say $(n,1)$ versus $(n-1,2)$) there is no constraint, and likewise there is no constraint between cells in the minor diagonals.

Now suppose Alice never decides on the content of the major and minor diagonals until Bob asks for the content of those cells. In other words, even Alice herself doesn't know where the zero will be. As Bob executes his strategy, if he asks for a cell on the major diagonal, Alice just generates a random number between zero and one. If he asks for a cell on the minor diagonal, she generates a random number between -1 and zero. Note that at any given point, even if Bob knows the content of all the other non-diagonal cells, these numbers are still consistent with the sortedness of the board.

Alice reveals zero only after all the other diagonal cells are exhausted. In other words, Alice only reveals zero if that is the last diagonal cell (major or minor) that Bob hasn't asked. Because there are $2n-1$ diagonal cells, any strategy that Bob has can never guarantee discovery with less than $2n-1$ guesses.

Thursday, November 19, 2009

3 Sequence Inequality

If $a_i, b_i, c_i$ are sequences of positive numbers for $i = 1,2,\cdots,n$, prove the following inequality:

$\sum (a_i+b_i+c_i) \sum \frac{a_ib_i + b_ic_i+ c_ia_i}{a_i+b_i+c_i} \sum \frac{a_ib_ic_i}{a_ib_i + b_ic_i+ c_ia_i} \leq \sum a_i \sum b_i \sum c_i$

where all summations are taken from $i=1$ to $i=n$. When does equality happen?

Dharmath Vader

Pages

Sunday, March 27, 2016

Various double counting problems

Tuesday, February 9, 2016

Paintings and colors

Dancing partners

Thursday, January 28, 2016

Tournament, number of dominated players

Thursday, February 6, 2014

Searching on a chess board

Solution to First Problem

Proof of termination and running time

Proof of correctness

Solution to Second Problem

Thursday, November 19, 2009

3 Sequence Inequality

Links

Followers

Blog Archive

latex

Labels