Statistical Tests

To statistically analyze our data, we used software called Fathom to run the Mann-Whitney U-test and the average difference t-test.

Average Difference T-Tests

Both an alternative hypothesis (HA) and null hypothesis (HO) were written. For example:

HA: The Bullpasture River will have a greater nitrate content than the Greenbrier River.

HO: The Bullpasture River will NOT have a greater nitrate content than the Greenbrier River.

To complete this test, the average difference (avediff) of the collected samples was calculated as avediff = mean(Set A) - mean(Set B), and the sample data were graphed. The data were then scrambled (“shuffle and deal”) and a new avediff was measured; this was repeated 100 times. Finally, the shuffled avediff values were graphed and our original avediff was plotted on the same graph to see where it fell.

If there were no difference between the two data sets, the avediff would be zero, so the avediff values from the scrambled data should form a roughly normal curve centered on zero. A difference due only to chance would place our original avediff near the center of that curve. If the original avediff fell outside the bell curve, the alternative hypothesis was accepted. We also calculated the p-value, the percentage of all shuffled avediff values greater than the original avediff, which indicates the probability of observing a sample statistic as extreme as the test statistic if the null hypothesis is true. We used an alpha value of 0.05, so a p-value below 0.05 was considered significant.
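The shuffle-and-deal procedure can be sketched in Python as a rough illustration. This is not the Fathom workflow we used; the function names, the fixed random seed, and the sample numbers below are all assumptions made up for the example, and the sketch counts shuffled values greater than or equal to the original avediff.

```python
import random


def avediff(set_a, set_b):
    """Average difference: mean(Set A) - mean(Set B)."""
    return sum(set_a) / len(set_a) - sum(set_b) / len(set_b)


def shuffle_test(set_a, set_b, n_shuffles=100, seed=0):
    """Shuffle the pooled data n_shuffles times and re-measure avediff.

    Returns the original avediff, the list of shuffled avediff values,
    and the fraction of shuffled values at least as large as the original.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    observed = avediff(set_a, set_b)
    pooled = list(set_a) + list(set_b)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)              # "shuffle"
        new_a = pooled[:len(set_a)]      # "deal" back into two groups
        new_b = pooled[len(set_a):]
        shuffled.append(avediff(new_a, new_b))
    p_value = sum(d >= observed for d in shuffled) / n_shuffles
    return observed, shuffled, p_value


# Hypothetical measurements, invented for illustration only.
river_a = [2.0, 3.0, 2.5, 3.5, 3.0]
river_b = [1.0, 1.5, 2.0, 1.0, 1.5]
observed, shuffled, p_value = shuffle_test(river_a, river_b)
print("original avediff:", observed)
print("p-value:", p_value, "significant at alpha = 0.05:", p_value < 0.05)
```

With only 100 shuffles the p-value can take steps of 0.01 at best, so more shuffles would give a finer estimate.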

Mann-Whitney U-Test

A Mann-Whitney U-test is another test for comparing two data sets. Unlike the t-test, it does not require the assumption of normal distributions, and it is nearly as efficient as the t-test on normal distributions.

Both an alternative hypothesis (HA) and null hypothesis (HO) were written.  For example:

HA: The month you are born is greater if you were born in VA.

HO: The month you are born is NOT greater if you were born in VA.

 

First, the data values from both categories are ranked together from lowest to highest, with ranks assigned starting from 1. If several data points share the same value, the ranks they would have occupied are averaged, and each of them receives that average rank.
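The ranking step, including the averaging of tied ranks, can be sketched in Python as follows. The function name average_ranks and the example values are hypothetical and chosen only for illustration.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) to highest, averaging tied ranks."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover every value tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Positions i..j would have had ranks i+1..j+1; use their average.
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


# The two 20s would have been ranks 2 and 3, so each gets 2.5.
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```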

 

Calculate the mean of the ranks and U value using the formula below:

Here Na and Nb are the numbers of data points in the two categories, and Ra is the sum of the ranks for category A.

Find the lower of the two U values and compare it with the table of critical values to determine whether or not the result is significant.
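As a rough sketch of the calculation, the Python below assumes the common textbook form Ua = Na*Nb + Na(Na + 1)/2 - Ra (and the same with a and b swapped for Ub), which uses the symbols defined above but may not be written exactly as in our original formula. The function name and the example data are hypothetical. The critical value still has to be looked up in a table for the given Na, Nb, and alpha.

```python
def mann_whitney_u(set_a, set_b):
    """Return (Ua, Ub, smaller U) for two data sets.

    Assumed formula: Ua = Na*Nb + Na*(Na + 1)/2 - Ra, likewise for Ub.
    """
    pooled = sorted(list(set_a) + list(set_b))
    # Map each distinct value to its rank, averaging ranks across ties.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = ((i + 1) + (j + 1)) / 2
        i = j + 1
    na, nb = len(set_a), len(set_b)
    ra = sum(rank_of[x] for x in set_a)  # rank sum for category A
    rb = sum(rank_of[x] for x in set_b)  # rank sum for category B
    ua = na * nb + na * (na + 1) / 2 - ra
    ub = na * nb + nb * (nb + 1) / 2 - rb
    return ua, ub, min(ua, ub)


# Hypothetical data, invented for illustration only.
ua, ub, u = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(ua, ub, u)  # 9.0 0.0 0.0
```

A useful check is that Ua + Ub always equals Na*Nb. Some textbooks define U as Ra - Na(Na + 1)/2 instead; that swaps which value is labeled Ua and which Ub, but the lower of the two is the same either way.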
