A Chi-squared (\(\chi^2\)) Test for Association, sometimes called a Test for Independence, is used to determine whether two categorical variables are associated or independent.

Exploratory Data Analysis

You will want to summarize the data in a contingency table and display it in stacked bar charts or in side-by-side bar charts. Once you are convinced that the Test for Association is appropriate, the chisq.test() function will do all of the necessary calculations for you.

Suppose a new drug is being tested. A natural question is whether the new drug produces "successes" at a rate similar to that of the old drug. There are several ways to approach this problem, but the most common is a Chi-squared Test for Association.

tbl <- with(trt, table(Treatment, Outcome))
tbl
##          Outcome
## Treatment Failure Success
##       New      30      20
##       Old      30      20
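
The data frame trt is not shown above, so the following is only a sketch of the exploratory step. It assumes trt has one row per patient and two columns, Treatment and Outcome, and it rebuilds a hypothetical version with counts chosen to match the table. From there it computes row proportions and draws the stacked and side-by-side bar charts.

```r
# Hypothetical reconstruction of `trt`: one row per patient, with counts
# chosen to match the table above (the original data set is not shown).
trt <- data.frame(
  Treatment = rep(c("New", "Old"), each = 50),
  Outcome   = rep(rep(c("Failure", "Success"), times = c(30, 20)), times = 2)
)

tbl <- with(trt, table(Treatment, Outcome))

# Row proportions: the share of failures and successes within each treatment
props <- prop.table(tbl, margin = 1)
props

# Stacked bars: one bar per treatment, split by outcome.
# t() puts treatments on the x-axis, since barplot() draws one bar per column.
barplot(t(tbl), legend.text = TRUE, xlab = "Treatment", ylab = "Count")

# Side-by-side bars: outcomes drawn next to each other within each treatment
barplot(t(tbl), beside = TRUE, legend.text = TRUE,
        xlab = "Treatment", ylab = "Count")
```

If the two treatments behave alike, the bars for New and Old will have the same shape, which is what these counts show.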

Chi-squared Test for Association

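The null hypothesis is that there is no association between these two variables; that is, the success rate is the same for the new drug and the old drug. The argument correct=FALSE turns off the continuity correction that chisq.test() applies by default to 2-by-2 tables.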

chisq.test(tbl, correct=FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  tbl
## X-squared = 0, df = 1, p-value = 1
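
The statistic is 0 and the p-value is 1 because the observed counts match the expected counts exactly, so the data give no evidence against the null hypothesis. As a check on what chisq.test() is doing, the sketch below recomputes the Pearson statistic \(X^2 = \sum (O - E)^2 / E\) by hand. The counts are typed in directly rather than taken from trt, and the names expected, x2, df, and p are chosen here for illustration.

```r
# Observed counts, entered directly so the example is self-contained
tbl <- matrix(c(30, 30, 20, 20), nrow = 2,
              dimnames = list(Treatment = c("New", "Old"),
                              Outcome   = c("Failure", "Success")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(tbl), colSums(tbl)) / sum(tbl)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected
x2 <- sum((tbl - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(tbl) - 1) * (ncol(tbl) - 1)

# p-value: upper-tail area of the chi-squared distribution
p <- pchisq(x2, df = df, lower.tail = FALSE)

c(statistic = x2, df = df, p.value = p)
```

These values should agree with the output of chisq.test(tbl, correct=FALSE) above.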

Expected Values

You can also extract individual components of the test result by using the $ operator. For example, chisq.test(tbl)$expected returns the expected counts, that is, the cell counts you would expect to see if the null hypothesis were true.

chisq.test(tbl)$expected
##          Outcome
## Treatment Failure Success
##       New      30      20
##       Old      30      20
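
Other components can be pulled out the same way. The sketch below stores the result in a variable (res, a name chosen here) and extracts several pieces. It also applies a common rule of thumb, stated here as an assumption rather than something from the text above: the chi-squared approximation is usually considered reasonable when every expected count is at least 5.

```r
# Observed counts, entered directly so the example is self-contained
tbl <- matrix(c(30, 30, 20, 20), nrow = 2,
              dimnames = list(Treatment = c("New", "Old"),
                              Outcome   = c("Failure", "Success")))

res <- chisq.test(tbl, correct = FALSE)

names(res)      # every component stored in the result
res$statistic   # the X-squared value
res$parameter   # degrees of freedom
res$p.value     # the p-value
res$observed    # the observed counts (the table passed in)
res$expected    # expected counts under the null hypothesis
res$residuals   # Pearson residuals: (observed - expected) / sqrt(expected)

# Rule-of-thumb check (an assumption, see above): all expected counts >= 5
all(res$expected >= 5)
```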

Mathematics, Computer Science, and Statistics Department, Gustavus Adolphus College