In today’s world, we encounter situations every day in which Statistics can be applied. In general, Statistics is the science of collecting, organizing, and analyzing numerical data. Statistical techniques are important to the work of many professions, so sound preparation in and a theoretical understanding of Statistics is valuable for many career paths. Marketing campaigns, gambling, professional sports, business and economics, politics, education, and forecasting are all areas that rely fundamentally on Statistics.
Statistics is a broad subject that branches into several categories. In particular, Inferential Statistics contains two central topics: estimation theory and hypothesis testing. The goal of estimation theory is to arrive at an estimator of a parameter that can be used in one’s research. To obtain this estimator, statisticians must first determine a model that describes the process being studied. Once the model is determined, statisticians must identify any theoretical limits on how well an estimator can perform.
These limits can be found through the Cramér-Rao lower bound. Under smoothness conditions, the Cramér-Rao lower bound gives a formula for the smallest variance that any unbiased estimator can achieve. Once the estimator is developed, its variance is compared against this bound to see how well it performs relative to the model. Lastly, experiments are run using the estimator to test its performance. From real data, statisticians are able to decide whether the estimator is inadequate, in which case they can go back and find a new estimator.
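For reference, the bound can be written out explicitly. The statement below is the standard scalar form for n independent, identically distributed observations with density f(x; θ). The symbols θ̂ (an unbiased estimator of θ), f, n, and I(θ) (the Fisher information of a single observation) are notation introduced here for illustration and do not appear in the discussion above.

```latex
\operatorname{Var}\bigl(\hat{\theta}\bigr) \;\ge\; \frac{1}{n\, I(\theta)},
\qquad
I(\theta) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial \theta} \log f(X;\theta)\right)^{2}\right]
```

An unbiased estimator whose variance equals the right-hand side cannot be improved upon, in the sense that no other unbiased estimator has smaller variance under the same model.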
It is important for an estimator to achieve a minimum average error (i.e., to be a minimum-variance unbiased estimator). For an unbiased estimator the average squared error is exactly the variance, and an unbiased estimator whose variance attains the Cramér-Rao lower bound is called an efficient estimator. Other performance measures for estimators include bias and consistency. An estimator is said to be unbiased if the expected value of the estimator equals the true value of the parameter. An estimator is said to be consistent if it converges in probability to the true parameter value as the number of observations becomes large; a mean-squared error that tends to zero is sufficient for this.
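These properties can be checked empirically. The following is a minimal simulation sketch, assuming normally distributed observations and using the sample mean as the estimator; the function name `simulate_sample_mean`, the parameter values, and the seed are illustrative choices rather than anything prescribed above. It estimates the bias and the mean-squared error over many repeated samples, so that one can watch the bias stay near zero and the mean-squared error shrink as the sample size grows.

```python
import random
import statistics


def simulate_sample_mean(true_mean, true_sd, n, trials, seed=0):
    """Return (bias, mse) of the sample mean, estimated over repeated samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = [
        statistics.fmean(rng.gauss(true_mean, true_sd) for _ in range(n))
        for _ in range(trials)
    ]
    # Bias: average estimate minus the true parameter value.
    bias = statistics.fmean(estimates) - true_mean
    # Mean-squared error: average squared distance from the true value.
    mse = statistics.fmean((e - true_mean) ** 2 for e in estimates)
    return bias, mse


if __name__ == "__main__":
    # Hypothetical population: mean 5.0, standard deviation 2.0.
    for n in (10, 100, 1000):
        bias, mse = simulate_sample_mean(5.0, 2.0, n=n, trials=500)
        print(f"n={n:5d}  bias={bias:+.4f}  mse={mse:.5f}")
```

For the sample mean the theoretical mean-squared error is the variance divided by the sample size, so the simulated values should fall roughly in proportion to 1/n.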
As mentioned previously, the use of statistics is relevant to everyday life all over the world. One example is the democratic voting process. In politics, it is useful to estimate the proportion of voters who support a certain candidate for election or a certain piece of legislation. This proportion is unobservable until after Election Day, by which time an estimate is no longer needed. To estimate the proportion, statisticians can construct an estimator from a random sample of voters under a particular model.
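As a rough sketch of the polling example, suppose each sampled voter independently supports the candidate with the same unknown probability (a simple Bernoulli model, which is an assumption made here). The natural estimator is then the sample proportion, and its standard error indicates how far the estimate is likely to be from the true proportion. The function name `estimate_proportion` and the simulated poll are hypothetical.

```python
import math
import random


def estimate_proportion(responses):
    """Return (p_hat, standard_error) for a list of 0/1 responses."""
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one response")
    p_hat = sum(responses) / n  # sample proportion of supporters
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    return p_hat, std_err


if __name__ == "__main__":
    rng = random.Random(42)
    true_support = 0.54  # hypothetical value; unknown in a real poll
    sample = [1 if rng.random() < true_support else 0 for _ in range(1000)]
    p_hat, std_err = estimate_proportion(sample)
    print(f"estimated support: {p_hat:.3f} (standard error {std_err:.3f})")
```

Under this model the sample proportion is unbiased, and its variance p(1 - p)/n matches the Cramér-Rao lower bound, which is one reason it is the usual choice.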
Another real-life example is weather forecasting, specifically natural disaster tracking (e.g., hurricanes). It is beneficial for statisticians to develop models that forecast hurricanes, using historical data to monitor trends along with the overall climate. In the world of finance, the systematic risk of a stock (beta), earnings expectations, future growth, and interest rates must all be estimated in order for portfolio managers and analysts (using fundamental analysis) to forecast future occurrences and offer their clients valuable investments.
These statistical techniques are beneficial because, when applied correctly, they can forecast future results and their probabilities of occurrence with useful accuracy, thus informing the general public. Clients pay large sums of money for analysts who can model trends within the stock market (within reason) and place their savings in securities that maximize the probability of an increase in an individual’s net worth. Hypothesis testing weighs a null hypothesis against an alternative hypothesis, a claim that is supported when the null hypothesis is rejected.
During a test, it is important to consider any assumptions being made, such as the form of the distribution of the observations or their statistical independence. Once we have stated a claim and acknowledged the assumptions, we must compute a relevant test statistic. The distribution of the test statistic under the null hypothesis is derived from the assumptions identified previously. Common test statistics follow the Normal, Student’s t, or Chi-Square distribution.
This distribution separates the possible values of the test statistic into two categories: values for which the null hypothesis is rejected and values for which it is not. The region in which we reject the null hypothesis is called the critical region (or rejection region). The complementary region, in which we fail to reject the null hypothesis, is the acceptance region, and the area underneath the curve that corresponds to the acceptance region is known as the level of confidence. Hence, we can develop an interval, closely related to a confidence interval, whose endpoints are the lowest and highest points of the acceptance region.
Any observed sample mean that lies outside this interval (that is, inside the critical region) would cause us to reject the null hypothesis in favor of the alternative hypothesis. The area of the critical region is known as the level of significance and represents the type I error rate (alpha), the probability that a true null hypothesis is rejected (as opposed to the type II error rate, beta, the probability of failing to reject a false null hypothesis). Essentially, hypothesis testing calls for comparing a test statistic to its critical value.
If the test statistic is more extreme than the critical value (greater than it for an upper-tailed test, or greater in absolute value for a two-sided test), we reject the null hypothesis in favor of the alternative hypothesis. Otherwise, we fail to reject the null hypothesis and conclude that the results are statistically insignificant. In the world of business, the management team of virtually every well-managed firm makes claims that involve quantitative aspects and can be tested using hypothesis-testing techniques.
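The comparison of a test statistic with its critical value can be sketched in a few lines. This is a minimal illustration that assumes the test statistic follows a standard normal distribution under the null hypothesis; the function name `z_test_decision` and its parameters are hypothetical, and a Student’s t or Chi-Square statistic would need the critical value of that distribution instead.

```python
from statistics import NormalDist


def z_test_decision(z_stat, alpha=0.05, two_sided=True):
    """Return (critical_value, reject) for a standard normal test statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if two_sided:
        # Split alpha evenly between the two tails.
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_stat) > critical
    else:
        # Upper-tailed test: reject only for large positive values.
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z_stat > critical
    return critical, reject


if __name__ == "__main__":
    for z in (1.0, 1.7, 2.5):  # hypothetical observed test statistics
        critical, reject = z_test_decision(z, alpha=0.05)
        verdict = "reject" if reject else "fail to reject"
        print(f"z={z:.2f}  critical={critical:.3f}  -> {verdict} the null hypothesis")
```

The same statistic can lead to different decisions for one-sided and two-sided alternatives, which is why the form of the alternative hypothesis should be fixed before looking at the data.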
Furthermore, it is important for companies and consumers to recognize the error associated with these claims. Type I error is known as producer’s risk; in business, it is the risk that the company assumes when making claims regarding its products or services. On the other hand, type II error is known as consumer’s risk, the risk the consumer assumes by accepting a claim. One example is a bottling company claiming that its bottled drinks contain twelve fluid ounces. Many filler systems dispense volumes that are approximately normally distributed.
Thus, a statistician can gather data by taking a large enough sample, calculate the sample mean and sample standard deviation, and then carry out the test. Managers typically strive to be 95% confident, assuming no more than 5% risk. For consumers or lawyers to deem a claim bogus (and thus instigate a legal proceeding), the test must yield an observed significance level below 5%. Hypothesis-testing techniques can also be used to compare two or more samples with one another. For example, a BC student can compare one dining facility (McElroy) to another dining facility (Lower).
Upon taking a large enough sample, this student can compute a sample mean rating and a sample standard deviation for each dining facility, and then carry out the test with the null hypothesis: “there is no difference between the mean ratings of McElroy and Lower dining halls.” In conclusion, estimation theory is the branch of Statistics that deals with estimating parameters from empirically observed data. These parameters describe an underlying concept or phenomenon being studied, in such a way that the values of the parameters determine the distribution of the observed data.
Thus, we obtain estimators that approximate unknown parameters. Hypothesis testing is the branch of Statistics that uses experimental data and statistical methods for decision-making. By and large, the goal of performing a test is to determine whether a result is statistically significant. Through the use of a null hypothesis, we ask whether an outcome is unlikely to have occurred by chance (that is, whether it is statistically significant). These two topics are the cornerstones of statistical inference and lay the foundation for quantitative analysis.
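As a closing illustration, the bottling and dining hall examples discussed earlier can be sketched with large-sample z tests that report an observed significance level (p-value). This is a sketch under stated assumptions, not a definitive procedure: the function names `one_sample_z` and `two_sample_z` are hypothetical, the data are simulated, and with small samples a Student’s t test would be the more appropriate choice.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def one_sample_z(sample, claimed_mean):
    """Two-sided large-sample z test of H0: population mean == claimed_mean."""
    std_err = stdev(sample) / math.sqrt(len(sample))
    z = (fmean(sample) - claimed_mean) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def two_sample_z(sample_a, sample_b):
    """Two-sided large-sample z test of H0: the two population means are equal."""
    std_err = math.sqrt(
        stdev(sample_a) ** 2 / len(sample_a) + stdev(sample_b) ** 2 / len(sample_b)
    )
    z = (fmean(sample_a) - fmean(sample_b)) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    rng = random.Random(7)
    # Simulated fill volumes in fluid ounces (hypothetical filler settings).
    fills = [rng.gauss(11.95, 0.2) for _ in range(100)]
    z, p = one_sample_z(fills, claimed_mean=12.0)
    print(f"bottling claim: z={z:.2f}, p-value={p:.4f}")

    # Simulated ratings on a 1-10 scale for the two dining halls (hypothetical).
    mcelroy = [rng.gauss(6.8, 1.5) for _ in range(80)]
    lower = [rng.gauss(7.2, 1.5) for _ in range(80)]
    z, p = two_sample_z(mcelroy, lower)
    print(f"dining halls:   z={z:.2f}, p-value={p:.4f}")
```

In both cases the null hypothesis is rejected at the 5% level only when the reported p-value falls below 0.05.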