Introduction and Objective:

The Nkye shoe company has data on SHOE SIZE, HEIGHT and GENDER, and is considering changing its business model to “include only one size of shoes – regardless of height or gender of the wearer”. Based on this objective, we define the following hypotheses for gender and height:

For gender:

Null Hypothesis (Ho): There is no difference in the population mean shoe size between males and females.

Alternative Hypothesis (Ha): There is a difference in the population mean shoe size between males and females.

For height:

Null Hypothesis (Ho): There is no difference in the population mean shoe size among the height categories.

Alternative Hypothesis (Ha): There is a difference in the population mean shoe size among the height categories; that is, at least one category has a mean different from the others.

Data and variables:

We have three variables: SHOE SIZE, HEIGHT and GENDER.

SHOE SIZE is a continuous variable.

GENDER is a categorical variable.

HEIGHT is also a continuous variable.

We have 35 observations in these three variables and there is no missing data.

Since our objective is to see whether the population mean shoe size differs across wearers of different heights, we need to convert HEIGHT into a categorical variable so that the test can be performed. The HEIGHT variable is categorized as follows:

60-65 -> Group 1

65-70 -> Group 2

70-75 -> Group 3

75-80 -> Group 4
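The categorization above can be sketched in a few lines of Python. This is only an illustration: the report does not say which group a boundary value such as 65 belongs to, so the function below assumes it goes into the higher group and that 80 is included in Group 4. The function name `height_group` and the example heights are hypothetical and are not taken from the data set.

```python
def height_group(height):
    """Return the group number (1-4) for a height between 60 and 80.

    Assumption (not stated in the report): a boundary value such as 65
    is placed in the higher group, and 80 is included in Group 4.
    """
    if not 60 <= height <= 80:
        raise ValueError(f"height {height} is outside the 60-80 range")
    if height < 65:
        return 1
    if height < 70:
        return 2
    if height < 75:
        return 3
    return 4


# Hypothetical heights, for illustration only (not taken from the data set).
example_heights = [61, 65, 69.5, 72, 80]
example_groups = [height_group(h) for h in example_heights]
print(example_groups)
```

If boundary values should instead fall into the lower group, the comparisons would change from `<` to `<=`.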

Note: We will test the above hypotheses at the 5% level of significance.

Method of Analysis:

We need to compare population means for two groups as well as for more than two groups. We will use a t test assuming unequal variances and a one-way ANOVA (analysis of variance) to test the above hypotheses.

An independent-samples t test will be performed for gender, as there are only two groups: male and female.

A one-way ANOVA will be performed for height, as there are more than two groups, which are listed again below.

60-65 -> Group 1

65-70 -> Group 2

70-75 -> Group 3

75-80 -> Group 4

Modified data and output:

Data for gender is given below:

Male Shoe Size Female Shoe Size

7 5

11 7.5

12 9

14 7

10.5 7.5

11 8

10 6.5

12 7

10.5 7.5

12 6.5

9.5 6

11.5 6.5

14 10

13.5 6.5

9.5 7

13 6

11 7

7.5

Output for t test is given below:

t-Test: Two-Sample Assuming Unequal Variances

Male Shoe Size Female Shoe Size

Mean 11.29412 7.111111

Variance 3.251838 1.281046

Observations 17 18

Hypothesized Mean Difference 0

df 27

t Stat 8.165111

P(T<=t) one-tail 4.53E-09

t Critical one-tail 1.703288

P(T<=t) two-tail 9.06E-09

t Critical two-tail 2.05183
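As a cross-check, the t statistic and degrees of freedom in the output above can be recomputed from the gender data using only the Python standard library. This is a minimal sketch, not the procedure used to produce the output; the helper name `welch_t` is hypothetical. It assumes that the final unpaired value in the data (7.5) belongs to the female column, which is consistent with the reported counts of 17 and 18.

```python
import math
import statistics

# Shoe sizes copied from the gender data above.
# Assumption: the final unpaired value (7.5) is the 18th female observation.
male = [7, 11, 12, 14, 10.5, 11, 10, 12, 10.5, 12, 9.5, 11.5, 14, 13.5,
        9.5, 13, 11]
female = [5, 7.5, 9, 7, 7.5, 8, 6.5, 7, 7.5, 6.5, 6, 6.5, 10, 6.5, 7, 6,
          7, 7.5]


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Sample variance divided by sample size: squared standard error of each mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(male, female)
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

The Welch-Satterthwaite formula generally gives a non-integer number of degrees of freedom; here it rounds to the 27 shown in the output.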

Data for height is given below:

Group 1: 6, 6, 5, 6.5, 7, 6.5

Group 2: 6.5, 6.5, 7, 7, 7, 8, 7, 9.5, 10, 12, 9.5

Group 3: 7.5, 9, 7.5, 11.5, 7.5, 7.5, 10.5, 11, 11, 12, 10.5, 10, 13, 12, 11

Group 4: 14, 14, 13.5

Output for ANOVA test is given below:

Anova: Single Factor

SUMMARY

Groups Count Sum Average Variance

Group 1 6 37 6.166667 0.466667

Group 2 11 90 8.181818 3.263636

Group 3 14 140.5 10.03571 3.671703

Group 4 3 41.5 13.83333 0.083333

ANOVA

Source of Variation SS df MS F P-value F crit

Between Groups 140.3668 3 46.78893 16.9385 1.27E-06 2.922277

Within Groups 82.86851 30 2.762284

Total 223.2353 33
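The arithmetic behind the ANOVA table can be checked from the group counts, sums and variances reported in the summary part of the output. The sketch below uses only those summary figures (not the raw height data) and plain Python; the variable names are illustrative.

```python
# (count, sum, variance) for each group, copied from the ANOVA summary output.
summary = {
    "Group 1": (6, 37.0, 0.466667),
    "Group 2": (11, 90.0, 3.263636),
    "Group 3": (14, 140.5, 3.671703),
    "Group 4": (3, 41.5, 0.083333),
}

n_total = sum(n for n, _, _ in summary.values())
grand_mean = sum(s for _, s, _ in summary.values()) / n_total

# Between-groups SS: weighted squared distance of each group mean from the grand mean.
ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in summary.values())
# Within-groups SS: each sample variance multiplied by its degrees of freedom.
ss_within = sum((n - 1) * v for n, _, v in summary.values())

df_between = len(summary) - 1
df_within = n_total - len(summary)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SS between = {ss_between:.4f}, SS within = {ss_within:.4f}")
print(f"df = ({df_between}, {df_within}), F = {f_stat:.4f}")
```

The F statistic is the ratio of the between-groups mean square to the within-groups mean square, so a large value indicates that the group means differ by more than the spread within groups would explain.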

Interpretation of results:

We can see from the t test output that the two-tailed p-value (9.06E-09) is less than 0.05, so we reject the null hypothesis at the 5% level of significance. This means that there is a statistically significant difference in the population mean shoe size between males and females.

From the ANOVA table we can see that the p-value is 1.27E-06 (0.000 to three decimal places), which is less than 0.05, so we also reject the null hypothesis at the 5% level of significance. This means that there is a statistically significant difference in the population mean shoe size among the height categories; at least one category has a mean different from the others.

Conclusion and Recommendation:

We can conclude from the above analysis that shoe size differs significantly by gender as well as by height. The Nkye company needs to make shoes in different sizes; a single size is not feasible, because shoe size varies with both gender and height.