Papers Published
Abstract:
In the secretary problem of Cayley (1875) and Moser (1956), $n$ non-negative, independent, random variables with common distribution are sequentially
presented to a decision maker who decides when to stop and collect the most
recent realization. The goal is to maximize the expected value of the collected
element. In the $k$-choice variant, the decision maker is allowed to make $k
\leq n$ selections to maximize the expected total value of the selected
elements. Assuming that the values are drawn from a known distribution with
finite support, we prove that the best regret---the expected gap between the
optimal online policy and its offline counterpart in which all $n$ values are
made visible at time $0$---is uniformly bounded in the the number of candidates
$n$ and the budget $k$. Our proof is constructive: we develop an adaptive
Budget-Ratio policy that achieves this performance. The policy selects or skips
values depending on where the ratio of the residual budget to the remaining
time stands relative to multiple thresholds that correspond to middle points of
the distribution. We also prove that being adaptive is crucial: in general, the
minimal regret among non-adaptive policies grows like the square root of $n$.
The difference is the value of adaptiveness.