University of Tokyo Graduate School of Engineering Mathematics 2020 Problem 3

📢 This article was translated by gemini-3-flash-preview

Introduction

Even if I write a Japanese title, this post isn’t primarily in Japanese. Might add more if I have time.

Problem 1: https://blog.yexca.net/en/archives/183

Problem 2: https://blog.yexca.net/en/archives/187

Problem 3: This article

My overall impression is that since these math problems are shared across multiple majors, they seem designed to guide candidates toward proving a specific conclusion or understanding a specific concept. Calculus covered arc-length parameterization, and Linear Algebra (or Probability here) covers the 37% rule. If you already know these conclusions, the problems are much easier. If you’ve never studied them and still solve them from scratch—kudos to you. If I had to start from zero, I’d definitely fail.

Also, since these are established theories, the final answers should be correct. As for the derivation process… well, who knows? XD

Problem

Source: https://www.i.u-tokyo.ac.jp/edu/entra/examarchive.shtml

The copyright of the problem belongs to the University of Tokyo. It is cited here for convenience and non-profit educational purposes.

Suppose you are interviewing $n$ candidates for a part-time job and want to hire the best one. Let $n \ge 2$. There is an absolute ranking from 1 to $n$ among the candidates. For candidates already interviewed, their relative ranking is known. Interviews are conducted one by one in a random order that is not known in advance. The hiring decision is based on the relative ranking of candidates already interviewed, subject to the following conditions:

  • A decision to hire or reject must be made immediately after each interview.
  • The hiring process ends as soon as a candidate is hired.
  • A rejected candidate cannot be hired later.
  • If no candidate is hired by the $(n-1)$-th interview, the $n$-th candidate is hired automatically.

The following strategy is used for hiring, where $1 < r \le n$:

  • Automatically reject the first $r-1$ candidates.
  • For subsequent candidates, hire the first one who is better than the best candidate among the first $r-1$ (relative rank 1).

Let $P_{n}(r)$ be the probability of hiring the candidate with absolute rank 1 using this strategy. Answer the following questions:

(1) Calculate $P_{4}(2)$.

(2) Show that $P_{10}(3)=\frac{2}{10}(\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{9})$.

(3) For $n$ candidates, find the probability that the candidate with absolute rank 1 is hired at the $k$-th interview, where $r \le k \le n$.

(4) In the following recurrence relation, find the expressions for $A$ and $B$:

$$ P_{n}(r)=A+B\times P_{n}(r+1) $$

$A$ and $B$ should be expressions involving $n, r$, and constants.

(5) Let $q=r/n$. Explain that $P_{n}(r)$ can be approximated by $-q\ln(q)$ when $n$ is sufficiently large. Furthermore, find the value of $q\in (0,1]$ that maximizes $-q\ln(q)$. Note that $\ln$ denotes the natural logarithm.

Solution

Background: This problem describes the “37% Rule” (The Secretary Problem). If you’ve seen math popularization videos on optimal stopping, this should look familiar.

Question 1

Calculate $P_{4}(2)$.

$$ \begin{align} P_{4}(2) &=\frac{1}{4} + \frac{1}{4} \times \frac{1}{2} + \frac{1}{4} \times \frac{1}{3} \\ &=\frac{1}{4} + \frac{1}{8} + \frac{1}{12} \\ &=\frac{11}{24} \end{align} $$

The calculation sums the probability that the best candidate is selected if they appear in the 2nd, 3rd, or 4th position. To select the best candidate, no one appearing between the $r$-th person and the best candidate can be better than the best of the first $r-1$ people. Let’s break it down:

If the best candidate is in the 1st position, they are automatically rejected. Probability = 0.

If the best candidate is in the 2nd position ($k=2$), since we start hiring from the 2nd person, and no one can be better than the “best” (absolute rank 1), the selection probability is 1.

If the best candidate is in the 3rd position ($k=3$), there are two cases for the first two people:

  1. If the 2nd person is better than the 1st, the 2nd person is hired. The best (3rd) is missed.
  2. If the 1st person is better than the 2nd, the 3rd person (the best) will be hired.

The probability of hiring the best here is $1/2$.

If the best candidate is in the 4th position ($k=4$), the best candidate is hired only if the best among the first 3 people is located within the first $r-1$ (the first person). This happens with a probability of $1/3$.

Summing these up: $P = \frac{1}{4} \times 0 + \frac{1}{4} \times 1 + \frac{1}{4} \times \frac{1}{2} + \frac{1}{4} \times \frac{1}{3}$.

Question 2

Show that $P_{10}(3)=\frac{2}{10}(\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{9})$.

Assume the best candidate is at position $m$ ($3 \le m \le 10$). To hire them, the best candidate among the first $m-1$ must be within the first 2 positions ($r-1=2$). The probability of this is $\frac{2}{m-1}$. Thus, the probability of selecting the best candidate given they are at position $m$ is:

$$ \begin{align} P =&\frac{1}{10} \times \frac{2}{2}+\frac{1}{10} \times \frac{2}{3}+\frac{1}{10} \times \frac{2}{4}+\frac{1}{10} \times \frac{2}{5}+\\ &\frac{1}{10} \times \frac{2}{6}+\frac{1}{10} \times \frac{2}{7}+\frac{1}{10} \times \frac{2}{8}+\frac{1}{10} \times \frac{2}{9} \\ =&\frac{1}{10}(\frac{2}{2}+\frac{2}{3}+\frac{2}{4}+\frac{2}{5}+\frac{2}{6}+\frac{2}{7}+\frac{2}{8}+\frac{2}{9}) \\ =&\frac{2}{10}(\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8}+\frac{1}{9}) \end{align} $$

Since the best candidate cannot be hired if they are in the first two positions:

$$ \begin{align} P_{10}(3) &=\frac{1}{10} \times 0 + \frac{1}{10} \times 0 + P \\ &=P \end{align} $$

The proposition is proven.

Question 3

Find the probability of hiring the best candidate at the $k$-th interview ($r \le k \le n$).

For the best candidate to be hired at the $k$-th interview, the best candidate must be at position $k$ (probability $1/n$). Additionally, the best person among the first $k-1$ candidates must be within the first $r-1$ candidates (so that no one is hired before $k$). The probability of this is $\frac{r-1}{k-1}$. Therefore:

$$ P = \frac{1}{n} \times \frac{r-1}{k-1} $$

Question 4

Find $A$ and $B$ for the recurrence $P_{n}(r)=A+B\times P_{n}(r+1)$.

$$ \begin{align} P_{n}(r) &= \frac{1}{n}\sum_{k=r}^{n} \frac{r-1}{k-1} \\ &= \frac{1}{n}(\frac{r-1}{r-1}+\frac{r-1}{r}+\frac{r-1}{r+1}+\cdots+\frac{r-1}{n-1}) \\ &= \frac{r-1}{n}\sum_{i=r-1}^{n-1}\frac{1}{i} \end{align} $$

Similarly:

$$ P_{n}(r+1)=\frac{r}{n}\sum_{i=r}^{n-1}\frac{1}{i} $$$$ \frac{r}{n} \times B = \frac{r-1}{n} \implies B = \frac{r-1}{r} $$

To find $A$: The sum for $P_{n}(r)$ has an extra term for $i=r-1$ compared to the sum in $P_{n}(r+1)$. The term is $\frac{r-1}{n} \times \frac{1}{r-1} = \frac{1}{n}$.

Thus, $A = \frac{1}{n}$ and $B = \frac{r-1}{r}$.

Question 5

Explain the approximation $P_{n}(r) \approx -q\ln(q)$ for large $n$ and find the max.

Using the formula from above, when $n$ is large:

$$ \sum_{i=r-1}^{n-1}\frac{1}{i} \approx \int_{r}^{n} \frac{1}{x} dx = \ln(n)-\ln(r)= \ln\left(\frac{n}{r}\right) $$

Thus:

$$ P_{n}(r) \approx \frac{r}{n}\ln\left(\frac{n}{r}\right) = q\ln\left(\frac{1}{q}\right) = -q\ln(q) $$

Let $y = -q\ln(q)$. To find the maximum, take the derivative: $\frac{\mathrm{d} y}{\mathrm{d} q} = -\ln(q) - 1$. Setting $\frac{\mathrm{d} y}{\mathrm{d} q} = 0$ gives $\ln(q) = -1$, so $q = \frac{1}{e}$.

So, the maximum value of $-q\ln(q)$ occurs at $q = 1/e$ (approx 0.37).


Wrote with ChatGPT