Statistical analysis is the science of analyzing patterns and trends in collected data and deriving inferences from them.
Analysis of Variance (ANOVA) is a statistical technique that tests whether the means of two or more groups differ significantly. When used for two groups, ANOVA gives the same conclusion as the independent two-sample t-test with equal variances assumed (the F statistic equals the square of the t statistic). A t-test is an inferential statistical test used to determine whether the means of two unrelated groups differ. In contrast, ANOVA is predominantly used when three or more groups are to be compared. ANOVA is one of the methods for determining the significance of experimental results.
As a simple example, consider a manufacturing plant that uses three different methods to package its finished product. ANOVA can be used to test whether the methods differ in average performance, which is the first step in deciding which one works best.
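The relationship between the t-test and ANOVA for two groups can be checked numerically. The following is a minimal sketch using only the Python standard library; the packaging times and the function names are hypothetical and chosen purely for illustration. It assumes the equal-variance (pooled) form of the t-test.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent groups, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f_statistic(*groups):
    """One-way ANOVA F statistic (MSB / MSW) for k independent groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))


# Hypothetical packaging times in seconds for two methods (illustrative data only)
method_a = [12.1, 11.8, 12.5, 12.0, 11.9]
method_b = [12.9, 13.1, 12.6, 13.0, 12.8]

t_stat = pooled_t_statistic(method_a, method_b)
f_stat = one_way_f_statistic(method_a, method_b)
print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

With two groups the two printed values agree up to floating-point error, which is why the two tests lead to the same decision in that case.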
There are two types of ANOVA tests, depending on the number of independent variables: one-way and two-way. In the one-way test, there is one independent variable with two or more groups/levels. In the two-way test, there are two independent variables, each of which can have multiple groups/levels. The two-way test can be performed with or without replication, that is, with several observations or with a single observation for each combination of levels.
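To make the two-way case concrete, here is a rough sketch of a two-way ANOVA without replication, written with the Python standard library only. The data (three packaging methods observed across four shifts) and the function name are assumptions made up for this illustration, not taken from any real study.

```python
from statistics import mean


def two_way_anova_no_replication(table):
    """table[i][j] holds the single observation for row level i and column level j."""
    r, c = len(table), len(table[0])
    grand = mean(x for row in table for x in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(row[j] for row in table) for j in range(c)]

    # Partition the total variation into rows, columns, and residual error
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,
        "F_cols": (ss_cols / df_cols) / ms_error,
        "df": (df_rows, df_cols, df_error),
    }


# Hypothetical output per hour: rows = packaging methods, columns = shifts
observations = [
    [20, 22, 19, 21],
    [25, 27, 26, 24],
    [30, 29, 31, 32],
]

result = two_way_anova_no_replication(observations)
print(result)
```

Each F value is compared against the F distribution with the corresponding degrees of freedom. With replication, an additional interaction term would be estimated, which this sketch does not cover.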
The computations of the ANOVA test statistic are arranged in the ANOVA table below, which lists the Sum of Squares (SS), degrees of freedom (df), Mean Square (MS), and F-ratio for each source of variation.
Source of Variation | SS | df | MS | F-ratio
Between Samples | SSB | k - 1 | MSB = SSB/(k - 1) | F = MSB/MSW
Within Samples | SSW | n - k | MSW = SSW/(n - k) | -
Total | SST = SSB + SSW | n - 1 | - | -
Where,
SSB = sum of squares between samples
SSW = sum of squares within samples
MSB = mean square between samples
MSW = mean square within samples
n = total sample size (sum of each of the sample sizes)
k = number of groups or treatments being compared (number of independent samples)
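The quantities in the table can be computed directly from raw data. The sketch below does this for the three-method packaging example using only the Python standard library; the measurements and the function name are hypothetical and serve only to illustrate the formulas.

```python
from statistics import mean


def one_way_anova_table(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    k = len(groups)                      # number of groups (treatments)
    n = sum(len(g) for g in groups)      # total sample size
    grand_mean = sum(sum(g) for g in groups) / n

    # SSB: variation of the group means around the grand mean
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSW: variation of the observations around their own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
        "df_between": k - 1, "df_within": n - k, "df_total": n - 1,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


# Hypothetical units packaged per hour for three packaging methods
method_1 = [48, 50, 52, 49, 51]
method_2 = [55, 57, 54, 56, 58]
method_3 = [50, 49, 53, 51, 52]

anova = one_way_anova_table([method_1, method_2, method_3])
for name, value in anova.items():
    print(f"{name}: {value:.4f}")
```

The resulting F is then compared with the critical value of the F distribution with (k - 1, n - k) degrees of freedom; a large F suggests that at least one group mean differs from the others.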
Normality tests in statistics are used to determine whether a dataset is well modeled by a normal distribution and to assess how likely it is that the variable underlying the dataset is normally distributed.
In descriptive statistics, the goodness of fit of a normal model to the data is measured; if the fit is poor, the data are not well modeled by a normal distribution.
In statistical hypothesis testing, the data are tested against the null hypothesis that they come from a normal distribution, with the alternative hypothesis that they follow some other probability distribution. A sufficiently extreme test result leads to the rejection of the null hypothesis.
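As an illustration of the hypothesis-testing view, the sketch below implements the Jarque-Bera statistic, which is one of several normality tests and is chosen here only because it can be written with the Python standard library; the text above does not prescribe a particular test. The samples are simulated, the function name is hypothetical, and the p-value relies on a large-sample chi-square approximation with 2 degrees of freedom, so it should be treated with caution for small samples.

```python
import random
from math import exp


def jarque_bera(sample):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(sample)
    m = sum(sample) / n
    # Central moments of order 2, 3, and 4
    m2 = sum((x - m) ** 2 for x in sample) / n
    m3 = sum((x - m) ** 3 for x in sample) / n
    m4 = sum((x - m) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # For a chi-square distribution with 2 df, P(X > x) = exp(-x / 2)
    p_value = exp(-jb / 2)
    return jb, p_value


random.seed(0)  # fixed seed so the simulated samples are reproducible
normal_like = [random.gauss(0, 1) for _ in range(500)]
skewed = [random.expovariate(1.0) for _ in range(500)]

jb_normal, p_normal = jarque_bera(normal_like)
jb_skewed, p_skewed = jarque_bera(skewed)
print(f"normal-like sample: JB = {jb_normal:.3f}, p = {p_normal:.4f}")
print(f"skewed sample:      JB = {jb_skewed:.3f}, p = {p_skewed:.4g}")
```

A small p-value is evidence against the null hypothesis of normality; a large p-value means only that the test found no evidence of departure from normality, not that the data are proven normal.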