Shapiro-Wilk Test

Description	The Shapiro-Wilk test is a statistical test of normality. It is used to determine whether a simple random sample of a variable’s values was drawn from a normal distribution.
Why to use	To test whether data is normally distributed.
When to use	To find out whether a random sample was drawn from a normal distribution.
When not to use	On data other than numerical data.
Prerequisites	The input variable should be of numerical type. Note that with very large samples the test is sensitive to even minor departures from normality.
Input	Any dataset that contains numerical data.
Output	W Statistic, p-Value, alpha (α)
Statistical Methods used	NA
Limitations	It can be used only on numerical data. Whether the data is treated as normally distributed depends on the user’s assessment or requirements, for example the chosen alpha (α). For sample size > 5000, the p-value may not be reliable, so the normality test result can be inferred only from the W Statistic value.

The p-value is the probability of obtaining results at least as extreme as those observed in a statistical hypothesis test, assuming that the null hypothesis is true.

The null hypothesis of the Shapiro-Wilk test is that the input data comes from a normal distribution. The alternative hypothesis is that the input data does not come from a normal distribution.

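The Shapiro-Wilk test rejects the null hypothesis of normality when the p-value is less than or equal to the chosen significance level α (commonly 0.05). Failing the normality test means that, at the 5% significance level, the data shows a statistically significant departure from normality. It does not mean that there is a 95% probability that the data is non-normal. Passing the normality test enables you to declare that no significant departure from normality was found, which is not proof that the data is normally distributed.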
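
The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name interpret_shapiro and its return strings are hypothetical and are not part of any product or library, and the default alpha of 0.05 is taken from the text above.

```python
def interpret_shapiro(p_value: float, alpha: float = 0.05) -> str:
    """Apply the Shapiro-Wilk decision rule to a p-value.

    Hypothetical helper for illustration only.
    H0: the data comes from a normal distribution.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    # Reject H0 when the p-value is less than or equal to alpha.
    if p_value <= alpha:
        return "reject H0: significant departure from normality"
    return "fail to reject H0: no significant departure from normality found"


print(interpret_shapiro(0.012))  # p-value below alpha
print(interpret_shapiro(0.340))  # p-value above alpha
```

Passing a different alpha, for example interpret_shapiro(0.03, alpha=0.01), shows how the conclusion depends on the significance level the user chooses.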

The test generates a W Statistic value, which depends on the ordered sample values and on constants derived from the means, variances, and covariances of the order statistics of a sample from a standard normal distribution. W lies between 0 and 1, and values close to 1 are consistent with normality. If the W Statistic value is small enough that the p-value falls at or below α, the null hypothesis is rejected, and it can be concluded that the random sample is not normally distributed.
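
To make the W Statistic concrete, the following sketch computes W as the squared weighted sum of the ordered sample values divided by the sum of squared deviations from the sample mean. The function name shapiro_w is hypothetical, and the caller is assumed to supply the constants. Only the constants for a sample of size 3 are shown here, because they have a simple closed form; for other sample sizes they must be taken from published tables or computed by a statistics library.

```python
import math


def shapiro_w(sample, coeffs):
    """Compute W = (sum of a_i * x_(i))^2 / sum of (x_i - mean)^2.

    Hypothetical helper for illustration only. `coeffs` holds the
    constants a_i for this sample size, in the same order as the
    sorted sample (smallest value first).
    """
    x = sorted(sample)
    n = len(x)
    if n != len(coeffs):
        raise ValueError("need exactly one coefficient per observation")
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)
    if ss == 0:
        raise ValueError("W is undefined when all values are identical")
    b = sum(a * v for a, v in zip(coeffs, x))
    return b * b / ss


# Constants for n = 3: (-sqrt(1/2), 0, +sqrt(1/2)).
A3 = (-math.sqrt(0.5), 0.0, math.sqrt(0.5))

print(shapiro_w([1.0, 2.0, 3.0], A3))  # evenly spaced values, W close to 1
print(shapiro_w([1.0, 1.0, 4.0], A3))  # lopsided values, W close to 0.75
```

In practice the constants and the p-value are usually obtained from a library routine such as scipy.stats.shapiro in SciPy, rather than computed by hand.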

Because the power of the test grows with sample size, a sufficiently large sample can produce a significant result even when the departure from normality is too small to matter in practice.
