What is Probabilistic Analysis?

Those algorithms which we want to take the analysis of algorithm with help of probability this type of analysis is called Probabilistic Analysis. Probabilistic analysis goals to measure the probability that a given problem or algorithm fulfils a needed property. It is most commonly used in such algorithms model where random input is matter.

“In analysis of algorithms, probabilistic analysis of algorithms is an approach to estimate the computational complexity of an algorithm or a computational problem. It starts from an assumption about a probabilistic distribution of the set of all possible inputs. This assumption is then used to design an efficient algorithm or to derive the complexity of a known algorithm.” 1

To achieve a probabilistic analysis, we must use knowledge of making assumptions about, the distribution or scattering of the inputs. When we analyze our algorithm, calculate an average-case running time, where we take the average for the circulation of our different possible inputs. Thus, we are, in effect, averaging the running time over all possible inputs. When reporting such a running time, we will refer state it as the average-case running time.

The Problems where we can Apply Probabilistic Analysis:

We must be very cautious in deciding on the distribution of inputs. For some problems, we may wisely assume something about the set of all possible inputs, and then we can use probabilistic analysis as a technique for constructing an ef?cient or fast algorithm and as a means for gaining understanding into a problem. For other problems, we cannot describe a sensible input distribution, and in these cases, we can’t deal with the probabilistic analysis.

Through this analysis we can solve many algorithms that are based on the probability or selection in the form of yes or no.

Randomized Algorithm

An algorithm that uses random input to adopt what to do next everywhere in its logic is called Randomized Algorithm e.g., in Randomized Quick Sort, we use random input to pick the next pivot in the array.

“A randomized algorithm is an algorithm that hires a degree of randomness as part of its logic.”2

Randomized Algorithm

To use probabilistic analysis, we need to know something about the distribution of the inputs. In many instances, we know very little about the input distribution. Even if we do know something about the distribution, we may not be able to model this knowledge computationally. Yet we often can use probability and randomness as a tool for algorithm design and analysis, by making the behavior of part of the algorithm random.

A probabilistic analysis can guide us for the development of a randomized algorithm through which we can solve many problems.

Hiring Problem

Hiring problem is such a problem in which a company hire a best candidate with the help of employment agency, the agency sends you a one candidate each day. And our duty is that to interview the candidate and decide the candidate is suitable for the company or not. And at each interview we need to hire a best candidate from the available list to fire the fire previous one. Our main Purpose is that to calculate or estimate the hiring cost of a best candidate towards the employment agency.

There is different algorithm to solve the Hiring Problem. But the most common and standard algorithm to solve the hiring problem is randomized algorithm. A randomized algorithm is one that receives, in addition to its input data, a stream of random bits that it can use for making random choices. By using this we can analysis the best, average and worst case of hiring through which will hire a such a candidate who have higher best probability according to interview standards.

The Method HIRE CANDIDATE is explain below through a pseudo code. Company want to hire a candidate best and we initialize the 0 the most less qualified dummy candidate. The method assumes that company can, after interviewing candidate i, determine whether candidate i is the best candidate you have seen so far. To initialize, the procedure creates a dummy candidate, numbered 0, who is less quali?ed than each of the other candidates.

The algorithm is give below

HIRE CANDIDATE (n)

best = 0

for i = 1 to n

interview candidate i

if candidate i is better than candidate best

best =i

hire candidate i

Best Case Analysis

Interviewing has a low cost, say ci, whereas hiring is expensive, costing ch. Letting m be the number of people hired, the total cost associated with this algorithm is O(cin + chm). No matter how many people we hire, we always interview n candidates and thus always incur the cost cin associated with interviewing. We therefore concentrate on analyzing chm, the hiring cost. This quantity varies with each run of the algorithm.

Worst Case Analysis

In the worst case, we hire every candidate that we interview. This situation occurs if the candidates come in strictly increasing order of quality, in which case we hire n times, for a total hiring cost of O(chn).

Analysis of the hiring problem using indicator random variables

Returning to the hiring problem, we now wish to compute the expected number of times that we hire a new of?ce assistant. To use a probabilistic analysis, we assume that the candidates arrive in a random order. Let X be the random variable whose value equals the number of times we hire a new of?ce assistant.

We must compute the following equation which give us total hiring time.

EXPr{X=x}

De?ning one variable associated with the number of times we hire a new of?ce assistant, we de?ne n variables related to whether or not each particular candidate is hired. We let Xi be the indicator random variable associated with the event in which the ith candidate is hired. Thus,

Xi =If {candidate i is hired}

= 1 if candidate i is hired; 0 if candidate i is not hired;

And

X = X1+ X2 +…+Xn

EXi= Pr{candidate i is hired}

We have assumed that the candidates arrive in a random order, the ?rst i candidates have appeared in a random order. Any one of these ?rst i candidates is equally likely to be the best-quali?ed so far. Candidate i has a probability of 1/i of being better quali?ed than candidates 1 through i -1 and thus a probability of 1/i of being hired.

So the probability of EXi is,

EXi =1/i

Now we can compute EX

Even though we interview n people, we hire only approximately ln(n) of them, on average.

Applications

There are many applications of this algorithm but here are two explained.

1) Birthday Paradox

2) Balls and Pins

Birthday Paradox

What is Paradox?

A paradox statement is a statement which seems to contradict itself but actually may be true.This is the start of the end or Social media usage is a disease both looks like contradict statements but later on may be proved true.

Below some definition are given abut paradox statement.

· “A seemingly absurd or contradictory statement or proposition which when investigated may prove to be well founded or true.”3

· “A paradox is a statement that, despite apparently sound reasoning from true premises, leads to an apparently self-contradictory or logically unacceptable conclusion. A paradox involves contradictory yet interrelated elements that exist simultaneously and persist over time.”4

Birthday Paradox:

Consider there is a party room which is initially empty. After a particular time how many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year? It looks like very easy when we are assuming less amount of people.

Initial claim:

If we check the probability that what is the chance of two or more persons having the birthday at same day. The answer will be ½ for each person. Mean there is 50% chance that both of the persons have the birthday on the same day. But if the number of persons in the room is more than two? The chance ratio will be same? It has dramatically effects on it.

Calculations:

Assumptions:

First of all we will ignore the issues like leap year. Our all computation will be made on the assumption that there are exactly 365 days in a year. n=365

Now we have to give an index number to all people in the room.

We index the people in the room with the integers (1,2…k) where k is the number of people in the room.

Let bi be the day of the year on which person i’s birthday falls, where (1?bi ?n.)

We also assume that birthdays are uniformly distributed across the n days of the year.

Probability=Pr= {bi =r} which is = 1/n (50%)

For (i=1,2…k) and (r=1,2…n).

We are also assuming from now that birthdays are independent.

The probability that two given people, say i and j, have matching birthdays depends on whether the random selection of birthdays is independent.

As we are assuming that birthdays of both persons are independent hence birthday for person I and birthday for person j will be fall on the same day r.

Probability=Pr={bi=r}

Probability =Pr={bj=r}

As we know that probability for 1 is 1/n hence probability for 2 will be

Pr={{bi=r}.{bj=r}}

=1/n2

Thus probability for both to fall on the same day birthday category is

=

=

=1/n

Analysis using indicator random variable:

Now to get an simpler and approximate analysis of Birthday paradox problem .For each pair which we are assuming from k peoples available in room we will define random indicator variable

Let X b the variable which counts the number of pairs in the room having birthday at same day.

We have

Xij it will be define for 1

After solving we will have

K=(?2n) +1

Where n=365

K=28

Now putting the value of K in K(k-1)/2n

Aprox.1.035

Which means if there are 28 people in the room then there is only 1 chance that 2 may have birthday at the same day.

Balls and bins:

The problem involves b balls and n bins.

Each time, a single ball is placed into one of the bins. After all balls are in the bins, we look at the number of balls in each bin.

We call this number the load on the bin and ask: what is the maximum load on a single bin?

This problem is also known as the coupon collector’s problem, which says that a person trying to collect each of b different coupons expects to acquire approximately b lnb randomly obtained coupons in order to succeed

Explanation of problem:

Suppose we are using identical balls and we are tossing them into the bins b.

Where b= (1,2….b).

Our second assumptions is the tosses are completely independent and at the end of the each toss there are equally likely chances that balls can be end up in any bin from 1,2…n

The probability for each bin for success means that ball will end up in that bin is 1/b.

· Now one of the important question to be answered is how many balls fall in a particular given bin b?

For n ball the probability for each bin to receive number of balls will be n/2.

· The second important question is how many balls must we toss, on the average, until a given bin contains a ball?

Its answer is simple keep tossing the ball into bin until the given bin receives a ball follows the geometric distribution with probability 1=b

The expected number of tosses until success will be 1/(probability of success) according to Bernoulli’s trials will be equal to 1/(1/b)=b.

· The third important question is how many balls must we toss until every bin contains at least one ball?

Let us start with a toss and if ball end up in the bin we will say it hit. Our aim is to find expected number of tosses n to get b hit.

At the i’th stage (i-1) hits will be have occurred. It means that all the bins already have balls expect the i’th bin.

Now for each toss in ith stage b-(i-1)

Probability for hit

b-i+1/b

ni is number of tosses in ith stage

n=

Expectations of geometric distribution

EX=1/p

As we know

P= b-i+1/b

Eni=1/( b-i+1)/b

Eni=b/ (b-i+1)

Now

En=E

En=

En=

En=b -> 1/ (b-1+1)+ 1/ (b-2+1)+ 1/ (b-3+1).. 1/ (b-b+1)

En=b

By nth harmonic

b(lnb+O(1))

Hence there will b(lnb) tosses before we will expect that every bin has a ball.

Advantages

There are some advantages toward the Randomized algorithm.

Efficiency:

In this algorithm several elements for the identification are commonly repeated and this algorithm solve hiring problem with ease and efficiently.

Simplicity:

In the literature the randomized algorithm commonly use and it is very simple than any other algorithm.

Complexity:

This algorithm is free from any kind of complexity. This algorithm is also easy to practice.

Drawbacks

Time:

This type of algorithm works in the loop and due to repetition of the several process it take the many time slots.

Hardware:

By continuous running this type of algorithm we face hardware failure problem.

Space:

As the algorithm works the different process as many time again and again then it requires a large storage space.

PacMan Game is the best example of this algorithm and also face these limitations of algorithms.