Introduction

Over the past five decades, the efficient market hypothesis (EMH) has been a topic of intense debate and rigorous academic research. According to the EMH, markets operate efficiently and stock prices reflect all available information instantly. Since all participants have the same information, prices fluctuate unpredictably and respond immediately to new information. Therefore, in an efficient market it is not possible to earn above-average returns without taking on additional risk.

The efficient market hypothesis is an investment theory that is traced back to two individuals in the 1960s, Eugene F. Fama and Paul A. Samuelson, who independently developed the same basic notion of market efficiency. According to the theory, it is impossible to beat the market, since market efficiency causes existing share prices to always reflect relevant information. The EMH considers how much information about a company and its stock price is readily available to investors. The more information prices are taken to reflect, the stronger the form of the EMH.

There are three forms of the EMH: weak, semi-strong and strong. This paper focuses on the weak form. In the weak form, only past market trading information, such as trading volume, stock prices and interest, is considered. Weak-form efficiency does not accept technical analysis as an accurate means of predicting future prices, and it further asserts that even fundamental analysis can sometimes be wrong. If the market is not efficient in the weak form, however, then it is possible to beat the market with technical analysis.

Data and methodology

The sample for our study spans November 1997 to November 2017. The data comprise 5032 observations of the daily closing price and the daily volume of the S&P 500.
We perform several tests on these data to check for the weak form of the EMH.

Methodology

Under the random walk model, market efficiency implies that past prices cannot be used to predict future prices, because price changes are independently and identically distributed. We used econometric methods to test for the weak form of the efficient market hypothesis, and hence to find out whether prices are predictable. We conducted the augmented Dickey-Fuller (ADF) test, the runs test and the Ljung-Box test. The tests include both parametric and non-parametric methods. We use them to check whether the market is efficient in the weak form, whether it is possible to beat the market, and whether price changes are independent and random.

Random walk

A random walk refers to price changes that are independent of each other and have the same distribution. This implies that past stock trends and movements cannot be used to predict future movements.

Runs test

The runs test is a non-parametric test designed to examine whether an observed sequence is random. The number of sequences of consecutive positive and negative returns is tabulated and compared against its sampling distribution under the random walk hypothesis (RWH). A run is an uninterrupted sequence of the same value of a variable. It is indexed by two parameters: the type of the run and the length of the run. Price runs can be positive, negative or show no change. The length is how many times the run type occurs in succession.

The hypotheses are written as

$H_0: \rho = 0$, the data are independent and random
$H_1: \rho < 0$, the data are not independent and serial correlation might exist

Here $R$ is the observed number of runs, $\sigma_R$ is the standard deviation of the number of runs, and $\bar{R}$ is the expected number of runs. To implement the runs test, we use the standard normal Z statistic to check whether the actual number of runs is consistent with the hypothesis of independence.
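As a rough illustration of the runs count described above, the following Python sketch tabulates sign runs in a return series and forms a z statistic. The names `runs_test` and `RunsResult`, the decision to drop zero returns, the Wald-Wolfowitz expressions for the expected number and standard deviation of runs, and the half-unit continuity correction are all assumptions of this sketch, not details taken from the study.

```python
import math
from typing import NamedTuple, Sequence


class RunsResult(NamedTuple):
    runs: int        # observed number of runs
    expected: float  # expected number of runs under randomness
    std: float       # standard deviation of the number of runs
    z: float         # standard score
    p_value: float   # two-sided p-value from the standard normal


def runs_test(returns: Sequence[float], continuity: float = 0.5) -> RunsResult:
    """Sign-based runs test (hypothetical helper, not the study's code)."""
    # Assumption: zero returns are dropped, leaving positive/negative runs only.
    signs = [r > 0 for r in returns if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative return")

    # A new run starts every time the sign changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    # Assumption: standard Wald-Wolfowitz moments under independence.
    n = n_pos + n_neg
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)
                / (n * n * (n - 1)))
    if variance <= 0:
        raise ValueError("sample too small to compute a standard deviation")
    std = math.sqrt(variance)

    # Continuity correction moves the run count toward its expected value.
    if runs > expected:
        adjusted = runs - continuity
    elif runs < expected:
        adjusted = runs + continuity
    else:
        adjusted = float(runs)

    z = (adjusted - expected) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return RunsResult(runs, expected, std, z, p_value)
```

Calling `runs_test(daily_returns)` on a list of daily returns gives the run count alongside the z statistic. A large positive z indicates more sign changes than independence would suggest, and a large negative z indicates fewer.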
The standard score is given by

\[ Z = \frac{r \pm 0.5 - \mu}{\sigma} \]

where $r$ is the actual number of runs, $\mu$ is the expected number of runs, $\sigma$ is the standard deviation of the number of runs, and $\pm 0.5$ is the continuity correction.

ADF test (augmented Dickey-Fuller test)

The augmented Dickey-Fuller test is a parametric test and a type of unit root test. Its null hypothesis is framed in the opposite direction to that of the Ljung-Box test. We conduct the test by estimating the following equation:

\[ \Delta Y_t = \delta Y_{t-1} + \varepsilon_t \]

where $\delta = \rho - 1$ and $\Delta Y_t = Y_t - Y_{t-1}$. The hypotheses are written as

$H_0: \delta = 0$, the series has a unit root; the data are not distributed randomly and show serial correlation
$H_1: \delta < 0$, the data are distributed independently; the correlations in the population from which the sample is taken are 0, so that any observed correlations in the data result from the randomness of the sampling process

Ljung-Box test

The Ljung-Box test is a parametric test for randomness. One common method of testing randomness is the autocorrelation plot. The Ljung-Box test is based on the autocorrelation plot, but instead of testing randomness at each distinct lag, it tests overall randomness across a chosen number of lags.

$H_0: \rho = 0$, the data are randomly and independently distributed
$H_1: \rho < 0$, the data are not randomly distributed

Under the null hypothesis the data are independently distributed: the correlations in the population from which the sample is taken are 0, so that any observed correlations in the data result from the randomness of the sampling process. The test statistic is

\[ Q = N(N+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{N-k} \]

where $\hat{\rho}_k$ is the sample autocorrelation at lag $k$ and $h$ is the number of lags tested. Here we test only the lag-1 autocorrelation ($h = 1$), so the formula reduces to

\[ Q = N(N+2) \frac{\hat{\rho}_1^2}{N-1} \]

The lag-1 sample autocorrelation $\hat{\rho}_1$ can be computed using the formula

\[ \hat{\rho}_1 = \frac{\sum_{t=2}^{N} (y_t - \bar{y})(y_{t-1} - \bar{y})}{\sum_{t=1}^{N} (y_t - \bar{y})^2} \]

We then express this as

\[ \hat{\rho}_1 = \frac{\operatorname{cov}(y_t, y_{t-1})}{\operatorname{Var}(y_t)} \]

where $\operatorname{cov}(y_t, y_{t-1})$ is the covariance between $y_t$ and $y_{t-1}$, and $\operatorname{Var}(y_t)$ is the variance of $y_t$.
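The lag-1 statistic can be illustrated with a short Python sketch. The function names are hypothetical, the denominator of the autocorrelation is summed over all N observations (the usual convention), and the p-value relies on the fact that a chi-squared variable with one degree of freedom is a squared standard normal. None of this is code from the study.

```python
import math
from typing import Sequence, Tuple


def lag1_autocorrelation(y: Sequence[float]) -> float:
    """Sample autocorrelation at lag 1 (hypothetical helper)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(y) / n
    # Numerator pairs each observation with its predecessor (t = 2..N).
    num = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, n))
    # Assumption: denominator uses all N observations.
    den = sum((v - mean) ** 2 for v in y)
    if den == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return num / den


def ljung_box_lag1(y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (rho_1, Q, p_value) for the Ljung-Box test with one lag."""
    n = len(y)
    rho1 = lag1_autocorrelation(y)
    q = n * (n + 2) * rho1 ** 2 / (n - 1)
    # Upper-tail probability of chi-squared with 1 degree of freedom:
    # P(chi2_1 > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return rho1, q, p_value
```

For example, `ljung_box_lag1(daily_returns)` returns the lag-1 autocorrelation together with Q and its p-value. A p-value below the chosen significance level leads to rejecting the null hypothesis of independence.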
In the lag-1 autocorrelation, $y_{t-1}$ is the original series lagged by one period, so for $t = 2, \dots, N$ it runs over the series with the last observation removed. The degrees of freedom of the Q statistic equal the number of lags tested, so for a single lag, df = 1.

Results

Descriptive statistics (quantitative data):

Statistic                    Daily return
Nbr. of observations         5031
Minimum                      -0.104
Maximum                      0.099
1st Quartile                 -0.006
Median                       -0.001
3rd Quartile                 0.005
Mean                         0.000
Variance (n-1)               0.000
Standard deviation (n-1)     0.012

We start with the basic descriptive statistics, which are calculated from the daily returns. The mean return is zero to three decimal places, so it is not positive. We take this to mean that the peaks are not higher than expected and that future prices can be predicted to an extent.

The table shows the results obtained from the Ljung-Box test. The p-value is $1 - P(\chi^2_1 \le Q)$, which is equal to 1 - 1 = 0. Since the p-value is less than 0.05, we reject the null hypothesis in favour of the alternative hypothesis, which implies that the data are not independent and that serial autocorrelation could exist in the data. This implies that prices are not random and that autocorrelation exists.

The table shows the results of the augmented Dickey-Fuller test on the data. We obtain a p-value of 1, so we fail to reject the null hypothesis, since the p-value is greater than the significance level. We interpret this as indicating that the data are not independently distributed and that serial correlation exists. On this reading, the data display predictable behavior and follow a pattern.

The results shown above are those of the runs test on the data, which shows 2652 runs altogether. Because the p-value is less than the significance level, we reject the null hypothesis that the order of the data is random in favour of the alternative hypothesis that the data are not random.
The results from all three tests indicate that the data are not random over the period of the study and that serial correlation exists.

Conclusion

Even after thorough research on the efficient market hypothesis, it is sometimes difficult to prove the theory correct when famous investors such as Warren Buffett and others consistently beat the market. This paper tests for weak-form efficiency using daily observations of the S&P 500 closing price and volume. We conducted both parametric and non-parametric tests to check the randomness and independence of the data. Both the runs test and the Ljung-Box test reject the null hypothesis that the data are random, and the augmented Dickey-Fuller test does not reject its null hypothesis, under which the data are not independent and not random.

According to the tests we conducted, price movement in the US market is not completely random and can be predicted to a certain extent. This tells us that the market is not efficient in the weak form in this case. The results also go against the random walk theory. This could be why Warren Buffett and some others were able to beat the market. The serial correlation results reject the presence of a random walk, and the runs test and the unit root test both also indicate that the market is not weak-form efficient. Further studies and tests will help examine whether market efficiency has improved over time in these markets.