+ - 0:00:00
Notes for current slide
Notes for next slide

FundRmentals 05

Dr Danielle Evans | University of Sussex

1 / 11

Setup & Q&A

  1. Open RStudio

  2. Open/create your fundRmentals R Project (click the blue cube in the top right corner)

    • To create one, go to File, New Project, New Directory, New Project, give your project a name, then click Create
    • Then create data & r_docs folders (in the Files pane click New Folder)
  3. Download the worksheet here... & save it to your r_docs folder

  4. Run the code chunks where we load tidyverse & read in the data



Any questions from the last tutorial?

2 / 11

Overview

  • tidyverse

  • Wrangling data

  • The pipe %>%

  • Next steps

3 / 11

tidyverse & dplyr

  • The tidyverse is a collection of packages that all share a similar design philosophy, grammar, & data structure

  • When we load tidyverse, it loads several different packages automatically:

    • readr, dplyr, forcats, ggplot2, tidyr, purrr, tibble, & stringr
  • dplyr (like pliers..) is a useful package for wrangling/manipulating data & includes functions to:

    • select(),

    • filter(),

    • mutate() &

    • rename() variables

4 / 11

dplyr::select()

  • dplyr::select() is for selecting or removing specific columns in a dataset, follows this structure:
object_name <- dplyr::select(.data = object_name, column_name) # for selecting
object_name <- dplyr::select(.data = object_name, -column_name) # for removing

Assignment <-

  • Last week we learned that we can save the output of commands by creating an object with the assignment operator <-

  • When manipulating our data sometimes it might be useful to create a new object (& keep the original) by using a different object name, and other times it might be more useful to overwrite the original object by using the same object name

Task: remove the Experience column & create a new object by calling it emp_data_2

5 / 11

dplyr::filter()

  • dplyr::filter() is for keeping rows of data based on some conditions like these:

To keep rows equal to some text:

object_name <- dplyr::filter(.data = object_name, column_name == "some text")

To keep rows not equal to some text:

object_name <- dplyr::filter(.data = object_name, column_name != "some text")

To keep rows where value is smaller than 50:

object_name <- dplyr::filter(.data = object_name, column_name < 50)

To keep rows that meet BOTH conditions;

object_name <- dplyr::filter(.data = object_name, column_name_1 < 2 & column_name_2 == FALSE)

Task: filter the emp_data_2 object by keeping participants above Age 29

6 / 11

dplyr::mutate()

  • dplyr::mutate() is for creating new columns/overwriting columns from existing ones by mutating them in some way

Creates a new column by multiplying the old values by 500:

object_name <- dplyr::mutate(.data = object_name, new_column = old_column*500)

Creates a new column by squaring the old values:

object_name <- dplyr::mutate(.data = object_name, new_column = old_column^2)

Overwrites the old column by dividing the old values by 2:

object_name <- dplyr::mutate(.data = object_name, old_column = old_column/2)


Task: using the emp_data_2 object create a new column called Attractiveness from the Q15620 column, by multiplying the scores by 100

7 / 11

dplyr::rename()

  • dplyr::rename() is for renaming columns

To rename one variable:

object_name <- dplyr::rename(.data = object_name, new_column_name = old_column_name)

To rename multiple variables:

object_name <- dplyr::rename(.data = object_name,
new_column_name = old_column_name,
new_column_name_2 = old_column_name_2,
new_column_name_3 = old_column_name_3)





Task: using the emp_data_2 object rename the Q14097 column to be called: Empathy

8 / 11

The Pipe %>%

  • The pipe %>% is part of the magrittr package & loads with library(tidyverse) or library(magrittr)

  • We can use the pipe %>% to chain multiple commands together

  • It pipes the output from the left into the functions/commands on the right:

a_function(object_name) %>% another_function()
  • It helps us avoid confusing nested functions which makes code easy to read & less prone to errors

  • To insert a pipe %>% we can use the keyboard shortcut of Ctrl + Shift + M or Command + Shift + M

9 / 11

The Pipe %>%

Nested example ()

am_routine <- leave_house(get_dressed(get_ready(wake_up(person = me, time = "too_early"),
existential_crisis = TRUE, breakfast = FALSE, coffee_cups = 50), clothing = "pyjamas",
footwear = "fluffy_slippers"), university = FALSE, zoomiversity = TRUE)


Piped example %>%

am_routine <- me %>%
wake_up(person = ., time = "too_early") %>%
get_ready(person = ., existential_crisis = TRUE, breakfast = FALSE, coffee_cups = 50) %>%
get_dressed(person = ., clothing = "pyjamas", footwear = "fluffy_slippers") %>%
leave_house(person = ., university = FALSE, zoomiversity = TRUE)
  • Whenever you see . or ., it's just a placeholder for the function argument that's being piped through

Task: try putting all the previous steps (select, filter, mutate, rename) into one pipeline!

10 / 11

Next Steps

  • Save the RMarkdown document from today Ctrl + S or Command + S

  • For next week, work through the discovr_05 tutorial by pasting the below code into the Console in RStudio & pressing Enter.

learnr::run_tutorial("discovr_05", package = "discovr")
  • This tutorial is pretty long, and generally visualising data isn't something we learn by heart (or need to), but you should aim to try go through the ggplot2 section as a minimum, then ideally complete the boxplots, plotting means, and scatterplots sections (you can ignore the faceting tasks if you want to)

  • There's also an empty code chunk for you to have a go at creating a plot from the emp_data in the worksheet from today if you want to!

11 / 11

Setup & Q&A

  1. Open RStudio

  2. Open/create your fundRmentals R Project (click the blue cube in the top right corner)

    • To create one, go to File, New Project, New Directory, New Project, give your project a name, then click Create
    • Then create data & r_docs folders (in the Files pane click New Folder)
  3. Download the worksheet here... & save it to your r_docs folder

  4. Run the code chunks where we load tidyverse & read in the data



Any questions from the last tutorial?

2 / 11
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow