In general, the functions used to calculate summary statistics for a single group are the same functions used to calculate summary statistics for multiple groups.
The examples below use data from the mtcars data set.
The aggregate() function can be used to calculate a statistic for many groups. It is part of the stats package, which is loaded automatically when R starts.
with(mtcars, aggregate(mpg ~ gear, FUN = mean))
## gear mpg
## 1 3 16.10667
## 2 4 24.53333
## 3 5 21.38000
with(mtcars, aggregate(mpg ~ gear, FUN = median))
## gear mpg
## 1 3 15.5
## 2 4 22.8
## 3 5 19.7
with(mtcars, aggregate(mpg ~ gear, FUN = sd))
## gear mpg
## 1 3 3.371618
## 2 4 5.276764
## 3 5 6.658979
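Because FUN can be any function that returns a vector, the three calls above can also be combined into one. The following is a minimal sketch, not the only way to do this; mpg_stats is a hypothetical helper name introduced here for illustration, and the data = argument is used in place of with().

```r
# mpg_stats is a hypothetical helper defined only for this illustration
mpg_stats <- function(x) c(mean = mean(x), median = median(x), sd = sd(x))

# One row per value of gear; the statistics come back as a matrix column named mpg
result <- aggregate(mpg ~ gear, data = mtcars, FUN = mpg_stats)
result
```

The individual statistics can then be pulled out by name, for example result$mpg[, "mean"].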
The dplyr package makes calculating statistics for multiple groups easy. The process is the same as calculating summary statistics for a single group, with one additional step: group_by(). See the dplyr section of the summary statistics page for details.
library(dplyr)
mtcars %>%
  group_by(gear) %>%
  summarize(
    Min = min(mpg),
    Q1 = quantile(mpg, .25),
    Avg_MPG = mean(mpg),
    Q3 = quantile(mpg, .75),
    Max = max(mpg)
  )
## # A tibble: 3 x 6
## gear Min Q1 Avg_MPG Q3 Max
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 3 10.4 14.5 16.1 18.4 21.5
## 2 4 17.8 21 24.5 28.1 33.9
## 3 5 15 15.8 21.4 26 30.4
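To make the "one additional step" concrete, the sketch below runs the same summarize() call with and without group_by(). The column names SD_MPG and N and the object names overall and by_gear are chosen here for illustration and do not come from the page above.

```r
library(dplyr)

# Single group: summarize() on the whole data set returns one row
overall <- mtcars %>%
  summarize(Avg_MPG = mean(mpg), SD_MPG = sd(mpg), N = n())

# Multiple groups: the only change is the added group_by() step,
# which yields one row per value of gear
by_gear <- mtcars %>%
  group_by(gear) %>%
  summarize(Avg_MPG = mean(mpg), SD_MPG = sd(mpg), N = n())

overall
by_gear
```

Here n() counts the rows in each group, which is a quick check that the grouping behaved as expected.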
The mosaic package extends some of the basic functions in R so that they accept the ~ (formula) operator for calculating summary statistics by group. It also modifies these functions to better handle missing data.
library(mosaic) # Loads the mosaic package
favstats(mpg ~ gear, data = mtcars)
## gear min Q1 median Q3 max mean sd n missing
## 1 3 10.4 14.5 15.5 18.400 21.5 16.10667 3.371618 15 0
## 2 4 17.8 21.0 22.8 28.075 33.9 24.53333 5.276764 12 0
## 3 5 15.0 15.8 19.7 26.000 30.4 21.38000 6.658979 5 0
library(mosaic) # Loads the mosaic package
mean(mtcars$mpg ~ mtcars$gear)
## 3 4 5
## 16.10667 24.53333 21.38000
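The mosaic versions of these functions also accept a data = argument, as favstats() does, which avoids repeating mtcars$ in the formula. A short sketch, assuming the mosaic package is installed; the object names on the left are illustrative only.

```r
library(mosaic)  # Loads the mosaic package

# Same formula interface as favstats(): response ~ grouping variable, plus data =
mean_by_gear   <- mean(mpg ~ gear, data = mtcars)
median_by_gear <- median(mpg ~ gear, data = mtcars)
sd_by_gear     <- sd(mpg ~ gear, data = mtcars)

mean_by_gear
```

Each result is a named numeric vector with one element per value of gear.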
Mathematics, Computer Science, and Statistics Department Gustavus Adolphus College