class: center, middle, inverse, title-slide

# FundRmentals 03

### Dr Danielle Evans | University of Sussex

---

# Overview

- Q&A
- Session setup
- Packages, functions, & objects
- Importing data
- Inspecting data
- Next steps

---

class: middle, center

# Any questions from the last tutorial?

---

# Setup

1. Open RStudio
1. Create an R Project for the fundRmentals course or open the project we made last week
  + To create one, go to **File**, **New Project**, **New Directory**, **New Project**, give your project a name, then click **Create**
  + Then create **data** & **r_docs** folders (in the **Files pane** click **New Folder**)
  + To open your project from last week, first check whether you're already in it by looking at the project name in the top right corner of RStudio. If you're not, click on the dropdown arrow and select it from the list
1. Create a new RMarkdown document
  + Go to **File**, **New File**, **RMarkdown**, name it **week_03**, then click **Create**
1. Delete everything in the RMarkdown document that we don't need (from line 11 down), then save your document to the **r_docs** folder

---

# Packages

- Packages are essentially just add-ons for R
- Last week we learned that we can use **install.packages()** to install most packages we want to use, by putting the name of the package (in quotes) inside the brackets
- If you weren't here last week, you need to install **"tidyverse"** on your device
- To do that, we can run the line below in the **Console**

```r
install.packages("tidyverse")
```

---

# Loading packages

- To be able to use the contents of a package within RStudio, we need to load it
- We can do this by using the **library()** command
- We should load a package once at the top of our RMarkdown document in a code chunk like this:

```r
library("tidyverse")
```

- If we don't load a package before we try to use it, we'll get an error, so it's best to do it right at the beginning

<br>

### Task

1. Load tidyverse in a new code chunk (**Ctrl + Alt + I** or **Cmd + Alt + I** to insert a new chunk) - don't forget to run it!

---

# Functions & assignment <-

- We've used two functions already today: **install.packages()** & **library()**
- Functions are a way of performing tasks in R & they follow this type of structure:

```r
package_name::function_name(argument_1 = object_name, argument_2 = 1, argument_3 = TRUE)
```

- Inside the brackets, we specify different arguments/options. These could be the names of any data/objects, numbers, logical values (TRUE/FALSE) etc. The function then gives us some type of output
- To save the output of a command, we can use the assignment operator **<-**
- We use **<-** as a way of giving a name to some sort of **object**, whether that's the output of code, a dataset, a string of text, a table, a plot etc., like this:

.tiny[
```r
object_name <- package_name::function_name(argument_1 = object_name, argument_2 = 1, argument_3 = TRUE)
```
]

???

--

```r
my_name <- "danielle"
```

---

# Importing data with `tidyverse`

- The **read_csv()** function from the **readr** package is one function we can use to import .csv data files
- The **read_sav()** function from the **haven** package is one function we can use to import .sav data files
- They both have one main argument, which is the path to the file:

```r
object_name <- readr::read_csv("./directions/to/the/data/file/go/here.csv")
```

```r
object_name <- haven::read_sav("./directions/to/the/data/file/go/here.sav")
```

### Task

1. Create a new code chunk (**Ctrl + Alt + I** or **Cmd + Alt + I**), open the datafile at [this link](https://raw.githubusercontent.com/de84sussex/sussex_fundrmentals/master/docs/empathy_data.csv), copy the web address (in quotes) into the **readr::read_csv()** function, name the object **emp_data** & run it

---

# Inspecting data

- Below are some useful functions to explore your data. You should either run these only in the console or make sure to delete them from your RMarkdown file after you've seen the output
- All you need to do is put the name of your **object** into the brackets of the function

.pull-left[
To see your raw data:

.tiny[
```r
object_name
```

```r
print(object_name)
```

```r
View(object_name)
```

```r
head(object_name)
```

```r
tail(object_name)
```
]
]

.pull-right[
To see characteristics of your data:

.tiny[
```r
str(object_name)
```

```r
summary(object_name)
```

```r
names(object_name)
```

```r
ncol(object_name)
```

```r
nrow(object_name)
```
]
]

**Task**: try out these functions on our **emp_data**

---

# Sub$etting

- Sometimes we'll want to access a single column in our dataset. We can do this by subsetting with the **$**, like this:

```r
object_name$column_name
```

- By subsetting, we can apply functions to a single column
- Two useful functions for inspecting columns are **table()** & **class()**

```r
table(object_name$column_name)
```

```r
class(object_name$column_name)
```

**Task**: try out these functions on the **Harm** variable of our **emp_data**

---

# Next Steps

- For next week you should complete the **Summarizing data (1)** & **Creating a summary table (2)** sections of the **discovr_02** tutorial. To open it, paste the code below into the **Console** in RStudio, press **Enter** & skip to the relevant sections

```r
learnr::run_tutorial("discovr_02", package = "discovr")
```

- I'm estimating 20 minutes for this tutorial
- Next week we'll cover the pipe **%>%**, data wrangling & summarising
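
---

# Recap: import, inspect, subset

- The sketch below strings together this session's steps (assign with **<-**, import with **readr::read_csv()**, inspect, then subset with **$**) in one place
- It is only an illustration: the file is a small made-up .csv written to a temporary location, and the names **toy_path**, **toy_data**, **group** & **score** are invented for this example. They are not part of the empathy dataset
- It assumes **readr** is installed (it comes with tidyverse)

```r
# Toy data invented for illustration; this is NOT the empathy dataset
toy_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,group,score",
    "1,control,12",
    "2,treatment,15",
    "3,treatment,9",
    "4,control,11"),
  toy_path
)

# Import: the main argument is the path to the file,
# and <- saves the result as an object called toy_data
toy_data <- readr::read_csv(toy_path)

# Inspect the whole object
head(toy_data)    # first rows of the raw data
nrow(toy_data)    # number of rows
names(toy_data)   # column names

# Subset a single column with $ and apply a function to it
table(toy_data$group)   # how many rows fall in each group
class(toy_data$score)   # what type of data the column holds
```

- To try the same steps on real data, swap **toy_path** for the path (or web address) of your own file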