Hello Ludwig. Yes, the traditional model (the 37% rule, 1/e ≈ 0.37) doesn't always reflect real-world hiring or decision-making, where companies settle for "one of the best" because that is often more practical and desirable than finding "the absolute best."
I actually thought about this a couple of years ago, and I agree it is an important concept in real-world decision making. Many companies, including mine, deal with a huge number of applicants and simply don't have time to interview them all. We need a threshold that makes us as confident as possible of selecting someone in the top X% of applicants, not just the absolute top. In that case, the percentage of applicants to interview before committing is much lower than 37%, and in the real world selecting an applicant in the top X% works almost as well as selecting the absolute top. This saves company resources. Competition is high and there are a lot of good applicants. Please see below for more explanation.
To address your question, the traditional problem's structure and solution need some adjustments. Please note that there's no one-size-fits-all algorithm for these scenarios, because the definitions of "one of the best" and "top X%" vary.
To maximize the probability of selecting a candidate within the top X% of all candidates, the approach can be adjusted as follows (a simplified example for the top X%):
Define Success Criteria (this is where it could vary by definitions)
- Decide what qualifies as a successful outcome (e.g., landing a candidate in the top 5%).
Adjust the Observation Phase
- For targeting the top X%, you might adjust this observation phase based on simulations. Instead of passing on the first 37% of candidates, a different percentage needs to be calculated. This percentage isn't straightforward and depends on different variables and definitions. More explanation below.
Modify Decision Rule
- Develop a decision rule that accounts for the adjusted criteria. This could mean selecting the first candidate who ranks above a certain threshold observed during the initial phase.
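As a rough illustration of the decision rule described above, here is a minimal Python sketch. The function name `pick_candidate`, the use of numeric interview scores (higher is better), and the choice to return `None` when nobody beats the benchmark are all assumptions made for this example, not part of the original problem statement.

```python
from typing import Optional, Sequence


def pick_candidate(scores: Sequence[float], n_observe: int) -> Optional[int]:
    """Return the index of the chosen candidate, or None if nobody is chosen.

    scores are listed in interview order; higher is better (assumption).
    n_observe is the number of candidates interviewed but never hired.
    """
    # Benchmark: the best score seen during the observation phase.
    # With an empty observation phase, any candidate beats the benchmark.
    benchmark = max(scores[:n_observe], default=float("-inf"))
    for i in range(n_observe, len(scores)):
        if scores[i] > benchmark:
            return i  # first candidate to beat the benchmark
    return None  # assumption: nobody beat the benchmark, so no hire


if __name__ == "__main__":
    scores = [62, 71, 55, 80, 68, 90, 74, 85, 95, 60]  # made-up scores
    chosen = pick_candidate(scores, n_observe=3)
    print(chosen, scores[chosen])
```

With these made-up scores, the benchmark after three interviews is 71, so the rule commits to the fourth candidate (score 80) and never sees the later 90 or 95. That is the trade-off the observation length has to balance.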
Exact algorithms for these variants are more complex and not universally applicable. They often combine simulation and optimization techniques to find an optimal stopping rule tailored to the specific goal (e.g., top 5%). For these modified goals, there is no simple, universal percentage like the 37% rule: both the length of the observation phase and the selection criteria have to be adjusted to the specific goal and the total number of candidates. I will provide an example below.
A simplified example of adapting the Secretary Problem to select a candidate within the top 5%:
Take N = 100 candidates and aim for the top P = 5%.
The process can include simulating the selection process under various observation percentages and determining which yields the highest success rate of landing a top 5% candidate.
Iterate over a range of observation percentages (from 0% to 100% in suitable increments, such as 1%).
For each observation percentage, simulate the scenario multiple times to average out randomness and get a reliable success rate.
Assume candidate quality is uniformly distributed for simplicity, with each candidate having a unique rank from 1 to 100. Ranks are randomly shuffled for each simulation run to ensure unbiased outcomes.
The strategy after the observation phase is to select the first candidate who ranks higher than any observed candidate thus far and hope they are in the top 5.
A selection is considered successful if the chosen candidate is within the top 5 ranks.
Compare success rates across all observation percentages to identify the optimal one for selecting a top 5% candidate.
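The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names `run_once`, `success_rate` and `sweep` are made up for this example, rank 1 is taken to be the best candidate, and a run in which nobody beats the benchmark is counted as a failure (the description above does not say how that case is handled).

```python
import random


def run_once(n: int, n_observe: int, top_k: int, rng: random.Random) -> bool:
    """Simulate one hiring round; True if the hire is among the top_k."""
    ranks = list(range(1, n + 1))  # rank 1 is the best candidate
    rng.shuffle(ranks)             # random interview order
    # Best (lowest) rank seen during observation; n + 1 if nothing was observed.
    benchmark = min(ranks[:n_observe], default=n + 1)
    for rank in ranks[n_observe:]:
        if rank < benchmark:
            return rank <= top_k   # first candidate to beat the benchmark
    return False                   # assumption: no hire counts as a failure


def success_rate(n, n_observe, top_k, trials, rng):
    """Fraction of simulated rounds that end with a top_k hire."""
    wins = sum(run_once(n, n_observe, top_k, rng) for _ in range(trials))
    return wins / trials


def sweep(n=100, top_k=5, trials=1000, seed=0):
    """Try every observation length from 0 to n and keep the best one."""
    rng = random.Random(seed)
    rates = {m: success_rate(n, m, top_k, trials, rng) for m in range(n + 1)}
    best = max(rates, key=rates.get)
    return best, rates[best], rates


if __name__ == "__main__":
    best, rate, _ = sweep(trials=500)
    print(f"best observation length: {best} of 100, estimated rate: {rate:.3f}")
```

Because N = 100 here, the observation length in candidates equals the observation percentage. The estimates are noisy, and picking the maximum over many noisy estimates tends to overstate the best rate a little, so raising `trials` or repeating with different seeds gives a steadier picture.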
Running a simulation like the one above, with thousands of iterations per observation percentage to average out the randomness, and then comparing the average success rates across all percentages, we find that the optimal observation percentage for selecting a candidate within the top 5% out of 100 candidates is 19%, with a success rate of 71.5%. Both figures are simulation estimates, so they can shift slightly from run to run.
So, basically, by observing the first 19% of candidates you gather enough information to establish a meaningful benchmark of quality, which then guides your decisions on the remaining candidates.
This 19% represents the point at which observing more candidates begins to significantly reduce the likelihood of still having a top 5% candidate available to select.
The success rate changes with the observation percentage, which shows how the length of the observation phase affects the probability of picking a top-tier candidate. As the observation percentage increases, the pool of candidates still available for selection shrinks. For example, when you observe 99% of the candidates, only 1% remain to choose from. This dramatically reduces the chance that the remaining pool contains a top 5% candidate, because most of the top candidates were likely passed over during the observation phase. Thanks.
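For this particular rule (hire the first candidate who beats everyone seen during observation), the success rate can also be computed exactly instead of estimated, which gives a cross-check free of sampling noise. The sketch below assumes, as an illustration only, that a round with no hire counts as a failure. The function names `exact_success` and `best_observation` are made up for this example. The idea is to sum, over every position i after the observation phase, the chance that the rule stops at i times the chance that the best of the first i candidates belongs to the overall top k.

```python
from math import comb


def exact_success(n: int, n_observe: int, top_k: int) -> float:
    """Exact chance that the rule hires someone among the overall top_k."""
    if n_observe == 0:
        return top_k / n  # no benchmark: the first candidate is always hired
    total = 0.0
    for i in range(n_observe + 1, n + 1):
        # Chance the rule stops at position i: the best of the first i - 1
        # candidates sat in the observation phase, and candidate i is the
        # best of the first i.
        p_stop = n_observe / (i * (i - 1))
        # Chance that the best of i randomly drawn candidates is in the top_k
        # (that is, at least one of the i belongs to the top_k).
        p_top = 1 - comb(n - top_k, i) / comb(n, i)
        total += p_stop * p_top
    return total


def best_observation(n: int, top_k: int):
    """Observation length with the highest exact success rate, and that rate."""
    best = max(range(n + 1), key=lambda m: exact_success(n, m, top_k))
    return best, exact_success(n, best, top_k)


if __name__ == "__main__":
    print(best_observation(100, 1))  # classic "absolute best" problem
    print(best_observation(100, 5))  # top 5 out of 100
```

For the classic case (top 1 of 100) this reproduces the familiar result of observing 37 candidates. For the top-5 case the exact optimum may differ somewhat from a simulated estimate, depending on sampling noise and on how the no-hire case is scored.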
The explanation above can easily be converted into an algorithm.
Here is the pseudo-algorithm:
Inputs: N (total number of candidates)
        P (target top percentile, e.g., 5%)
Output: Selected candidate (ideally within the top P% of candidates)
1. Define the success criteria:
   - Set the target percentile P (e.g., top 5%).
2. Initialize variables:
   - Set the optimal observation percentage O to 0.
   - Set the highest success rate S to 0.
   - Set the number of simulations per percentage to a large number for accuracy.
3. Determine the observation percentage O:
   - For each trial percentage p from 0 to 100:
     - Run multiple simulations at this percentage.
     - For each simulation:
       - Randomly shuffle the ranks 1 to N.
       - Observe the first p% of candidates without selecting.
       - After the observation phase, select the first candidate who ranks higher than any observed so far.
       - If the selected candidate is within the top P%, mark the simulation as successful.
     - Calculate the success rate for this percentage.
     - If the success rate is higher than the current highest S, set O to p and S to this success rate.
4. Use the optimal observation percentage O for actual candidate selection:
   - Observe the first O% of candidates without selecting.
   - Select the first candidate after the observation phase who ranks higher than any observed so far.
5. Return the selected candidate.
I asked for an algorithm, but you derived an answer by simulation. That's not really a solution.