Manipulating data frames with dplyr
Using base R data frame subsetting, create from star_wars…
name homeworld height weight
2 Padme Naboo 1.6 45
3 Luke Tatooine 1.7 77
You can form a logical vector through a logical comparison:
[1] TRUE FALSE TRUE FALSE
[1] TRUE TRUE
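As a rough sketch of how a logical comparison drives base R subsetting: the star_wars data frame below is rebuilt from the printed output on these slides, and the particular comparison (homeworld equal to "Tatooine") and the name from_tatooine are illustrative choices, not necessarily the ones used to produce the output above.

```r
# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# A comparison returns one TRUE/FALSE value per row.
from_tatooine <- star_wars$homeworld == "Tatooine"
from_tatooine
#> [1]  TRUE FALSE  TRUE FALSE

# In the row position of [ , ], the logical vector keeps only the TRUE rows.
tatooine_rows <- star_wars[from_tatooine, ]
tatooine_rows
```

Leaving the column position empty after the comma keeps every column.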

Four Principles:
|>, slice(), select(), filter(), mutate(), arrange(), summarize(), group_by()
Load the package.
slice(): Isolates particular rows of a data frame by row number.
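A minimal sketch of slice(), assuming dplyr is installed and using a star_wars data frame rebuilt from the printed output on these slides. The row numbers chosen (1 and 4) are arbitrary.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# Keep rows 1 and 4 by position.
first_and_last <- slice(star_wars, 1, 4)
first_and_last

# The same call written with the pipe: the data frame becomes the first argument.
piped <- star_wars |> slice(c(1, 4))
```

Both forms should return the rows for Anakin and JarJar.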
select(): Selects variables by name or number.
name homeworld height weight
1 Anakin Tatooine 1.8 84
2 Padme Naboo 1.6 45
3 Luke Tatooine 1.7 77
4 JarJar Naboo 1.9 90
homeworld
1 Tatooine
2 Naboo
3 Tatooine
4 Naboo
select(): Selects variables by name or number.
name homeworld height weight
1 Anakin Tatooine 1.8 84
2 Padme Naboo 1.6 45
3 Luke Tatooine 1.7 77
4 JarJar Naboo 1.9 90
homeworld name
1 Tatooine Anakin
2 Naboo Padme
3 Tatooine Luke
4 Naboo JarJar
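The two outputs above are consistent with calls like the following. This is a sketch, assuming dplyr is installed and a star_wars data frame rebuilt from the printed output; the result names (one_column, two_columns, by_number) are made up for illustration.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# Select a single column by name.
one_column <- select(star_wars, homeworld)

# Select two columns; they come back in the order listed.
two_columns <- select(star_wars, homeworld, name)

# Select by position: column 2 is homeworld.
by_number <- select(star_wars, 2)
```

Note that select() returns a data frame even when only one column is chosen.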
filter(): Returns rows that meet certain criteria.
You can add multiple conditions separated by commas (,); a row is returned only if it meets all of them.
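A short sketch of filter() with one and then two conditions, assuming dplyr is installed and a star_wars data frame rebuilt from the printed output. The conditions themselves are illustrative.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# One condition: rows whose homeworld is Tatooine (Anakin and Luke).
tatooine <- filter(star_wars, homeworld == "Tatooine")

# Two conditions separated by a comma: both must hold (Anakin only).
tall_tatooine <- filter(star_wars, homeworld == "Tatooine", height > 1.75)
tall_tatooine
```

Each condition is a logical comparison on a column, written without the star_wars$ prefix.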
mutate(): Adds a new variable that can be a function of existing variables.
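One way this might look, assuming dplyr is installed and a star_wars data frame rebuilt from the printed output. The new column bmi (weight divided by height squared) is a hypothetical example, not a variable from the slides.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# Add a column computed row by row from two existing columns.
# bmi is an illustrative name chosen for this sketch.
with_bmi <- mutate(star_wars, bmi = weight / height^2)
with_bmi
```

mutate() returns a new data frame; star_wars itself is unchanged unless the result is assigned back to it.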
arrange(): Sorts the rows of a data frame by the values of variables.
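A brief sketch of arrange(), assuming dplyr is installed and a star_wars data frame rebuilt from the printed output. The sort columns are illustrative choices.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# Ascending by height: Padme, Luke, Anakin, JarJar.
shortest_first <- arrange(star_wars, height)

# desc() reverses the order: JarJar, Anakin, Luke, Padme.
tallest_first <- arrange(star_wars, desc(height))

# Several variables: sort by homeworld, then by descending weight within each homeworld.
by_world <- arrange(star_wars, homeworld, desc(weight))
```

Later sort variables only break ties left by the earlier ones.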
summarize(): Summarizes a variable with a statistic.
name homeworld height weight
1 Anakin Tatooine 1.8 84
2 Padme Naboo 1.6 45
3 Luke Tatooine 1.7 77
4 JarJar Naboo 1.9 90
avg_height
1 1.75
summarize(): Summarizes a variable with a statistic.
name homeworld height weight
1 Anakin Tatooine 1.8 84
2 Padme Naboo 1.6 45
3 Luke Tatooine 1.7 77
4 JarJar Naboo 1.9 90
avg_height sd_height
1 1.75 0.1290994
You can calculate multiple statistics separated by commas (,).
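The avg_height and sd_height values shown above are consistent with a call like this one. It is a sketch, assuming dplyr is installed and a star_wars data frame rebuilt from the printed output; the result name height_stats is made up for illustration.

```r
library(dplyr)

# star_wars is reconstructed from the printed output so the example runs on its own.
star_wars <- data.frame(
  name      = c("Anakin", "Padme", "Luke", "JarJar"),
  homeworld = c("Tatooine", "Naboo", "Tatooine", "Naboo"),
  height    = c(1.8, 1.6, 1.7, 1.9),
  weight    = c(84, 45, 77, 90)
)

# Each name = statistic pair becomes one column of a one-row result.
height_stats <- summarize(
  star_wars,
  avg_height = mean(height),
  sd_height  = sd(height)
)
height_stats
```

The result is a data frame with one row and one column per statistic.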
Using dplyr, create from star_wars…

library(tibble)

# Example data: two observations per weekday.
day_at_cal <- tibble(
  # Ordered factor; Thu and Fri are valid levels that do not appear in the data.
  weekday = factor(
    c("Mon", "Mon", "Tue", "Tue", "Wed", "Wed"),
    levels = c("Mon", "Tue", "Wed", "Thu", "Fri"),
    ordered = TRUE
  ),
  study_spot = factor(c("Moffitt", "Doe", "Cory", "MLK", "CITRIS", "Soda")),
  ai_hours = c(1.2, 0.4, 2.1, 1.0, 0.8, 1.6),
  steps_k = c(8.5, 6.2, 9.1, 7.0, 5.8, 10.3)
)