A Goodness of Fit test is used to determine if values from a single categorical variable come from a specified distribution.

Exploratory Data Analysis

Start by summarizing the data in a frequency table and displaying it in a bar chart. Once you are convinced that the Goodness of Fit test is appropriate, the chisq.test() function will do all of the necessary calculations for you.

For this example suppose that the single categorical variable of interest has three categories, A, B, and C. The values and frequencies are stored in vals.table.

## my.vals
##  A  B  C 
## 41 49 60
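
The page does not show how vals.table was built, so the following is only a minimal sketch of one way to produce the table and bar chart described above. The raw vector my.vals is reconstructed here with rep() purely for illustration (an assumption; the real data would normally come from a data set), and the axis labels are arbitrary choices.

```r
# Hypothetical raw data: 150 observations of a three-level categorical
# variable, rebuilt from the counts shown in the table (assumption).
my.vals <- rep(c("A", "B", "C"), times = c(41, 49, 60))

# Tabulate the frequency of each category.
vals.table <- table(my.vals)
vals.table

# Bar chart of the observed counts.
barplot(vals.table, xlab = "Category", ylab = "Frequency")
```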

Goodness of Fit Test

By default, chisq.test() tests the null hypothesis that the proportions are equal across all categories. To test a different set of proportions, supply them through the p = argument of the function. For our example, the null hypothesis is \(H_0: p_A = 1/3,\ p_B = 1/3,\ p_C = 1/3\).

chisq.test(vals.table) # Calculates a test statistic and p-value.
## 
##  Chi-squared test for given probabilities
## 
## data:  vals.table
## X-squared = 3.64, df = 2, p-value = 0.162
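
To see where these numbers come from, the statistic can be reproduced by hand as the sum of (observed - expected)^2 / expected over the categories, compared against a chi-squared distribution with (number of categories - 1) degrees of freedom. The sketch below assumes the counts from the table above; the variable names (observed, p.null, x.squared, deg.free) are illustrative only.

```r
# Observed counts from the example table.
observed <- c(A = 41, B = 49, C = 60)

# Null hypothesis: equal proportions in all three categories.
p.null <- rep(1 / 3, 3)

# Expected counts under the null hypothesis.
expected <- sum(observed) * p.null

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
x.squared <- sum((observed - expected)^2 / expected)

# Degrees of freedom: number of categories minus one.
deg.free <- length(observed) - 1

# Upper-tail probability of the chi-squared distribution.
p.value <- pchisq(x.squared, df = deg.free, lower.tail = FALSE)

round(c(x.squared = x.squared, df = deg.free, p.value = p.value), 3)
```

The results should agree with the X-squared, df, and p-value reported by chisq.test() above.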

Expected Values

You can also extract individual components of the test result with the $ operator. For example, chisq.test(vals.table)$expected returns the expected counts, that is, the frequencies we would expect in each category if the null hypothesis were true. With 150 observations and 3 equally likely groups, each group has an expected count of 50.

chisq.test(vals.table)$expected
##  A  B  C 
## 50 50 50
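
The same approach works when the null hypothesis specifies unequal proportions through the p = argument. The sketch below uses made-up null proportions of 0.25, 0.35, and 0.40 (an assumption chosen only for illustration, not part of the original example) and stores the result so that several components can be extracted with $.

```r
# Observed counts from the example table.
observed <- c(A = 41, B = 49, C = 60)

# Hypothetical null proportions (illustrative only); they must sum to 1.
p.null <- c(0.25, 0.35, 0.40)

# Store the test result so its components can be reused.
result <- chisq.test(observed, p = p.null)

result$expected   # expected counts under the null: 37.5, 52.5, 60
result$statistic  # chi-squared test statistic
result$p.value    # p-value of the test
```

Storing the result once avoids rerunning the test for every component you want to inspect.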

Mathematics, Computer Science, and Statistics Department, Gustavus Adolphus College