Probability puzzle: what are the odds of a bad password?

The other day I got an email about an interesting probability problem:

Hi Presh -

I’m a regular reader of your site and I came across a question which has bugged me the last couple of days.

The problem is as follows: A system has 100 accounts, two of which have bad passwords (let’s call these bad accounts). If someone could only test 20 accounts, what are the chances that one will net a bad account?

It was inspired while reading the following post on StackOverflow regarding best practices for site authentication.

Thanks for any insights!

I spent a few minutes on the problem, and I was happy that I could solve it.

Can you figure it out?

I gave the issue more thought, and I came up with a few harder questions you can give a try.

Extensions:

1. What is the probability of netting both bad accounts in the sample of 20? What about exactly one bad account?

2. What is the probability of netting a bad account if you have k bad accounts, there are N total accounts, and you can sample n accounts at one time?

3. [Edit 8-15]: Go back to the problem with 100 accounts, and 2 bad accounts] Suppose you can vary how many accounts you can sample. If you want a 50 percent chance of netting a bad account, what’s the minimum sample size needed?

As usual, post your ideas/questions below.

[The only hint I will give is that I used numerical methods (namely WolframAlpha) to solve this.]

I will provide a solution on Wednesday in the comment section.



Share this post:

| More

Previous post:

Next post:



  • Sebastian

    P(0bad)=((98Choose20)*(2Choose0))/(100Choose20)=316/495
    P(1bad)=((98Choose19)*(2Choose1))/(100Choose20)=32/99
    P(2bad)=((98Choose18)*(2Choose2))/(100Choose20)=19/495
    -> at least one bad: 179/495 =36%. Scarry!
    Or in general, if there are N total, k are special, n things are drawn at once and we want exactly 1 special one:
    P(X=1)=((N-k Choose n-1)*(k Choose 1))/(N Choose n)
    It’s basically just P(Event)= #OfCombinationsInThisEvent/#OfAllPossibleCombinations.
    In the example, to have P(X>1)=P(X=1)+P(X=2)>0.5 (chance of at least one bad account)
    I did some algebra (just write out the factorials and cancel stuff) and got x>=30.
    If you meant P(X=1)>0.5 (chance of exactly one bad)
    then 45=<x=<55.
    It would be interesting to do these calculations for the 500million+ Facebook accounts out there!

  • http://lovemeow.com eric

    Thanks for giving it away sebastian :)

  • Nino
  • Sebastian

    Sorry for spoiling it Eric. :S Maybe Presh should have the comment section hidden under a link? Or include spoiler alerts?
    Also I don’t actually understand WHY
    (n choose k) = n!/((n-k)!*k!) gives the number of coimbinations? Similarly why does it work for Permutations, i.e. when the order is important?
    Any insights/good links?

  • http://www.mindyourdecisions.com/blog/ Presh Talwalkar

    Sorry for the late post of the solution.

    The answer can be found using the hypergeometric distributrion.

    The same problem can be restated as follows: if you are drawing 20 balls from an urn of 98 white balls and 2 black balls, what are the chances of drawing a black
    ball?

    The way I calculated this is to find the chance of drawing only white balls and finding the complement event. Thus the chance is:

    1 – (2 choose 0)(98 choose 20)/(100 choose 20) = 36 percent.

    1. What is the probability of netting both bad accounts in the sample of 20? What about exactly one bad account?

    Two bad accounts is:
    (2 choose 2)(98 choose 18)/(100 choose 20) = 19/495 or about 4 percent

    One bad accountis:
    (2 choose 1)(98 choose 19)/(100 choose 20) = 32/99 or about 32 percent

    2. What is the probability of netting a bad account if you have k bad accounts, there are N total accounts, and you can sample n accounts at one time?

    You find the chance of getting no bad accounts and then find the complement.

    This is:
    1 – (k choose 0)([N-k] choose n)/(N choose n)

    3. Go back to the problem with 100 accounts, and 2 bad accounts] Suppose you can vary how many accounts you can sample. If you want a 50 percent chance of netting a bad account, what’s the minimum sample size needed?

    I used a numerical method to get to 30.

    It’s interesting that you only need to sample about a third of the population to have a better than even chance of finding both bad accounts!

Previous post:

Next post: