Hypothesis testing compares sample data to a known characteristic of the population from which the data are drawn. It is a method of inferential statistics, and the data must be properly sampled for the results of the test to be valid.
The researcher proposes a hypothesis about a population characteristic and conducts a study to discover whether it is reasonable, or acceptable. The proposed hypothesis is called the alternative hypothesis and is labelled Ha.
The known characteristic is a value such as a mean, a proportion, or a variance that is accepted as "true." This value is called a parameter. The null hypothesis states what the parameter is, and is labelled Ho.
The alternative hypothesis claims that the population characteristic is different from the stated parameter: it has increased, decreased, or possibly changed in either direction.
The standard notation for these hypotheses is:
Ho: ε ≤ #
Ha: ε > # (an increase)
Ho: ε ≥ #
Ha: ε < # (a decrease)
Ho: ε = #
Ha: ε ≠ # (either increase or decrease)
- where ε is a placeholder for the parameter's symbol.
For example, a study of the mean value would show a μ symbol:
Ho: μ = #
Ha: μ ≠ #
The researcher will measure the sample's characteristic and use it to calculate a test statistic. There are a number of different test-statistic formulas, depending upon what data are used and which parameter is being tested.
The test is based upon an assumed distribution of the population, and the null hypothesis is stated under this assumption. According to that distribution, the test statistic has a certain likelihood of occurring. When this likelihood is small, either the sample data come from an unusual sample, or the distribution of the population actually differs from the one assumed. If the sample is properly drawn, there is small risk that the sample is unusual, so it is safer to conclude that the distribution differs from the assumption. This allows the conclusion that the null hypothesis may be wrong and that the alternative hypothesis might be accepted instead. This conclusion leads the researcher to "reject" the null hypothesis.
The likelihood that is small "enough" to reject the null is a subjective threshold that the researcher sets before the test is conducted. This likelihood is called alpha (α). Common practice sets alpha to .01, .05, or .10. The corresponding area of the graphed distribution is called the rejection region.
An alternative approach to using alpha alone is to calculate a p-value, which is thought to bring more flexibility to the conclusion.
When the sample data are unusual but the underlying population actually fits the assumed distribution, the null hypothesis is rejected in error. This is called Type I error. The probability of making this error equals alpha; when a p-value is reported instead, it is compared against alpha to draw the same conclusion.
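The steps above can be sketched in code. This is a minimal two-tailed z-test with made-up numbers (the sample mean, sigma, and n are all hypothetical), assuming the population standard deviation is known:

```python
from statistics import NormalDist

# Hypothetical example: test Ho: mu = 100 against Ha: mu != 100.
# Assumes the population standard deviation (sigma) is known.
sample_mean = 103.2   # measured from the sample (made-up value)
mu0 = 100             # parameter value stated by the null hypothesis
sigma = 15            # known population standard deviation
n = 50                # sample size
alpha = 0.05          # chosen before the test is conducted

std_err = sigma / n ** 0.5
z = (sample_mean - mu0) / std_err        # the test statistic

# Two-tailed p-value: likelihood of a result at least this extreme
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

if p_value < alpha:
    print(f"z = {z:.2f}, p = {p_value:.4f}: reject the null hypothesis")
else:
    print(f"z = {z:.2f}, p = {p_value:.4f}: fail to reject the null hypothesis")
```

With these particular numbers the p-value is larger than alpha, so the null hypothesis is not rejected.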
Thursday, October 14, 2010
Tuesday, September 14, 2010
Sample Size
The size of a sample influences the cost of a study, as well as the usefulness of the results. A sample that is too small can miss information; one that is too large is costly and cumbersome.
Often, researchers need to know the smallest sample that can be taken and yet still have estimates that are accurate.
Decision-makers first agree to the amount of error they will tolerate from the results. This is called the margin of error (E).
Along with margin of error, researchers also assign a critical value (C.V.) that is based upon the probability for extreme values in the population.
These two factors are combined with knowledge of the population's standard deviation (sigma) to reach a recommended sample size.
n = [(C.V. * sigma) / E]^2
In order to apply the Central Limit Theorem, the common rule of thumb is a minimum sample size of 30. However, if the population is bell-shaped, it can be smaller.
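The formula above can be applied directly. A minimal sketch, using hypothetical values for the critical value, sigma, and margin of error:

```python
import math

# Hypothetical example: sample size needed to estimate a mean to within
# E = 10 dollars, at 95% confidence (C.V. = 1.96), with sigma = 40.
crit_val = 1.96
sigma = 40
E = 10

n = (crit_val * sigma / E) ** 2     # n = [(C.V. * sigma) / E]^2
n_required = math.ceil(n)           # round up so the error bound is still met
print(n, n_required)
```

Rounding up rather than to the nearest whole number is the safe choice: a slightly larger sample keeps the margin of error at or below E.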
Friday, July 23, 2010
Margin of Error
Margin of Error (E) is the error that can be tolerated when estimating a value.
For confidence intervals, it is calculated as the critical value multiplied by the standard error -
E = Crit Val * Std Err
First, you look up the critical value from the probability table (t or z), then you calculate the standard error, and multiply the two together.
Margin of Error tells you how much 'cushion' to place on your estimated value.
This cushion will be larger or smaller depending on the critical value that the researcher has chosen.
However, to determine sample size (n), the margin of error is chosen, not calculated.
For example, a buyer wants to know the sample size needed to estimate the average cost of shoes. He needs the estimate to be within ten dollars of the true population mean.
In this case, you will use E=10 in the formula for solving sample size.
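Put together, the calculation looks like this. A minimal sketch with hypothetical inputs (95% confidence, an assumed known sigma, and a made-up sample size):

```python
# Hypothetical example: margin of error for a 95% confidence interval.
crit_val = 1.96            # z critical value looked up for 95% confidence
sigma = 40                 # population standard deviation (assumed known)
n = 62                     # sample size

std_err = sigma / n ** 0.5 # standard error of the mean
E = crit_val * std_err     # E = Crit Val * Std Err
print(round(E, 2))
```

Here E comes out just under ten dollars, which is the 'cushion' placed on either side of the estimated value.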
Alpha
Alpha is chosen by the researcher and represents the level of error that can be tolerated. Alpha is the probability of rejecting a true null hypothesis. Its area on the graphed distribution is also referred to as the rejection region.
Alpha corresponds with a critical value. It is graphically defined as a 'tail' region - that is, the diminishing area under a bell-shaped curve, that extends either left of a negative critical value, or right of a positive critical value. See an image at: rejection region.
Assuming that the hypothesis is true, sample measurements are not expected to fall in this tail region, since its area is small. When such a measurement does occur, it is unlikely and therefore indicates that the hypothesis could be wrong. Researchers will reject a hypothesis if the measurement falls into this alpha region.
However, these unlikely values do occur, even if rarely. When the hypothesis is rejected due to an unlikely sample measurement when, in fact, the hypothesis is true, this is called "Type I error."
Popular alpha values are .01, .05, and .10.
If an alpha value of .10 is used, then Type I error has a 10% chance of occurring whenever the null hypothesis is actually true.
The terms type I error and alpha are sometimes used synonymously, depending on context.
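The link between alpha and its critical value can be computed rather than looked up in a table. A minimal sketch for the standard normal (z) distribution, using alpha = .05:

```python
from statistics import NormalDist

# The z critical value that puts alpha = .05 entirely in the upper tail,
# and the two-tailed version that splits alpha between both tails.
alpha = 0.05
z_one_tail = NormalDist().inv_cdf(1 - alpha)       # right-tail cutoff
z_two_tail = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cutoff
print(round(z_one_tail, 3), round(z_two_tail, 3))
```

These reproduce the familiar table values: about 1.645 one-tailed and 1.96 two-tailed.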
Critical Value
A critical value (C.V.) is a number that is used to make estimates and test hypotheses. Critical values always correspond to a probability.
This number represents a distance from the center of a bell-shaped graph, either the z or t distribution. The area between the center and the critical value represents the probability associated with the C.V.
For example, using the z distribution, the area between the center and 1.96 is 47.5% of the total. When -1.96 is also included, the area doubles.
Alpha and Confidence Level are probabilities that correspond to critical values.
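The 1.96 example can be checked directly. A small sketch that measures the area under the standard normal curve:

```python
from statistics import NormalDist

# Area between the center (0) and the critical value 1.96,
# and the doubled area between -1.96 and +1.96.
area_center_to_cv = NormalDist().cdf(1.96) - NormalDist().cdf(0)
area_both_sides = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)
print(round(area_center_to_cv, 3), round(area_both_sides, 3))
```

The first area is 47.5%, and including both sides doubles it to 95%, which is why 1.96 is the critical value for a 95% confidence level.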