Last Friday I was working on some sample data, testing whether it was really worth spending more time on. Someone had requested an analysis that I have already revised many times, and one of my recurring frustrations is translating statistical tests into business language. That is another topic I need to rant about some other time.
Anyway, as I mentioned, two separate sets of data were collected. You could think of these as pre- and post-tests in a sense, but the background is that the same measurement was simply taken again after two weeks. I will start by encoding these in R.
# Load the ggplot2 package. Install it first if necessary:
# install.packages("ggplot2")
library(ggplot2)

# Create a data frame of the paired data
test.data <- data.frame(
  Test = as.character(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                        2, 2, 2, 2, 2, 2, 2, 2, 2, 2)),
  Score = c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605,
            0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502)
)
As usual, I am a fan of the subset function. I could use square brackets instead, but I am very comfortable with subset; it gets the job done.
# Subset the data by test
test1 <- subset(test.data, Test == 1)
test2 <- subset(test.data, Test == 2)
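For comparison, here is a minimal sketch of what the square-bracket alternative might look like. The names `test1.b` and `test2.b` are made up for this illustration, and the data frame is rebuilt with `rep()` only so that the snippet runs on its own.

```r
# Rebuild the same data frame so this snippet is self-contained
test.data <- data.frame(
  Test = as.character(rep(1:2, each = 10)),
  Score = c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605,
            0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502)
)

# subset() version, as used in this post
test1 <- subset(test.data, Test == 1)
test2 <- subset(test.data, Test == 2)

# Square-bracket version (hypothetical names): pick rows with a logical
# condition and leave the column position empty to keep all columns
test1.b <- test.data[test.data$Test == "1", ]
test2.b <- test.data[test.data$Test == "2", ]
```

Both approaches should return the same rows; subset is just a little easier on the eyes.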
Now that we have subset the data, let us look at how far apart the two tests are. Many people are intimidated by boxplots. I will not go into how to read and interpret them here, but you can see the difference between the mean, which is the small dot inside each box, and the median, the straight line across each box.
My question is: is the difference between these two data sets statistically significant?
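# Boxplot of scores per test, with the mean of each test drawn as a point
ggplot(data = test.data, aes(x = Test, y = Score)) +
  stat_boxplot(geom = "errorbar") +
  geom_boxplot(aes(fill = Test)) +
  # fun replaces the deprecated fun.y argument (ggplot2 3.3.0 and later)
  stat_summary(fun = mean, geom = "point", aes(group = 1)) +
  ylab("Scores") +
  xlab("Test") +
  theme(legend.position = "none")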
# Paired, two-sided t-test at a 95% confidence level
t.test(x = test1$Score,
       y = test2$Score,
       alternative = "two.sided",
       paired = TRUE,
       conf.level = 0.95)
##
##  Paired t-test
##
## data:  test1$Score and test2$Score
## t = 2.018, df = 9, p-value = 0.07432
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.00528  0.09268
## sample estimates:
## mean of the differences
##                  0.0437
If you need me to compute these manually, I would love to: starting from the standard deviation of the paired differences, to the standard error of the mean difference, then the degrees of freedom, until we arrive at the p-value for the t value. If I plot this on a t distribution with 9 degrees of freedom, the area in the two tails beyond a t value of ±2.018 is about 0.07.
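A rough sketch of that manual computation follows. The scores are copied from the data frame above into plain vectors so the snippet runs on its own, and the variable names (`d`, `se.d`, `t.value`, and so on) are my own choices for this illustration, not anything t.test exposes.

```r
# Scores copied from the data frame defined earlier in the post
test1 <- c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605)
test2 <- c(0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502)

d      <- test1 - test2        # paired differences
n      <- length(d)            # number of pairs
mean.d <- mean(d)              # mean of the differences
sd.d   <- sd(d)                # standard deviation of the differences
se.d   <- sd.d / sqrt(n)       # standard error of the mean difference
df     <- n - 1                # degrees of freedom

t.value <- mean.d / se.d                     # t statistic
p.value <- 2 * pt(-abs(t.value), df = df)    # two-sided p-value

# 95% confidence interval for the mean difference
t.crit <- qt(0.975, df = df)
ci     <- mean.d + c(-1, 1) * t.crit * se.d

print(c(t = t.value, df = df, p = p.value))
print(ci)
```

The printed values should line up with the t.test output above: the same t value, degrees of freedom, p-value and confidence interval.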
Even at a 95% confidence level, I cannot say that they are different from each other, because the p-value is greater than 0.05. Why? Let’s construct the hypothesis statements first.
H0: Test 1 = Test 2. The mean scores of Test 1 and Test 2 are equal to each other.
HA: Test 1 ≠ Test 2. The mean scores of Test 1 and Test 2 are not equal to each other (two-sided).
Given the p-value of 0.07 against a significance level cut-off of 0.05, we retain the null hypothesis. Strictly speaking, that means we do not have enough evidence to say the two tests differ, so for practical purposes we treat them as equal. With all of this jargon, what does it really mean?
If you look at the means of the two data sets, they are different: 0.6044 and 0.5607, respectively. But since that difference is not statistically significant, I can say that the request I am working on is not worth drilling into at a lower level. This is where decision errors come into play: what are the implications if I keep looking for answers, versus deciding not to because it is not worth it? Decision errors are another topic, maybe for one of the next few posts.