Lab 6: R Packages & Metrics for Classification Models

Project 6

Agenda

  1. Ice Breaker
  2. Review R Packages
  3. Project 6 Part 1
  4. Classification Model Metrics for Project 6
  5. Worksheet
  6. Work time for Project 5

R Packages

devtools Functions:

  • load_all(): loads all functions and data.
  • document(): regenerates documentation from roxygen2 comments.
  • check(): runs R CMD check, which tests whether the package meets CRAN’s guidelines.
  • test(): runs unit tests (not required for project).

usethis Functions:

  • use_r(): creates an R script in the R/ folder for a function.
  • use_data(): saves a data object in the data/ folder.
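
A rough sketch of how these functions might fit together in one development cycle. It assumes the working directory is the root of a package project and that devtools and usethis are installed; the file name "add" and the object example_scores are hypothetical, made up for illustration. The guard means the calls are skipped anywhere else.

```r
# Assumption: run from the root of a package project (the folder with DESCRIPTION).
in_package <- file.exists("DESCRIPTION")
have_tools <- requireNamespace("devtools", quietly = TRUE) &&
  requireNamespace("usethis", quietly = TRUE)

if (in_package && have_tools) {
  # Create the script R/add.R for a function (hypothetical name "add").
  usethis::use_r("add")

  # Save a small made-up data frame into the data/ folder.
  example_scores <- data.frame(id = 1:3, score = c(0.2, 0.7, 0.9))
  usethis::use_data(example_scores, overwrite = TRUE)

  devtools::load_all()   # load all functions and data
  devtools::document()   # regenerate documentation from roxygen2 comments
  devtools::check()      # run R CMD check on the package
}
```

In practice these calls are usually typed one at a time in the console rather than run as a script.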

Roxygen2 Documentation

  • Added using special comments that start with #'.
  • Placed directly above function definition.
  • Fills Rd files in man/ folder when devtools::document() is run.
    • These files are used to create help pages.
#' Add together two numbers
#' 
#' @param x A number.
#' @param y A number.
#' @returns A numeric vector.
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

Project 6 Part 1

Work time for Project 6 Part 1!

15:00

TP, TN, FP, FN

Example: Classifying tumors.

Positive class: “malignant” (cancerous)

Negative class: “benign” (not cancerous)

  • True Positives (TP): Model predicts malignant, reality is malignant.
  • True Negatives (TN): Model predicts benign, reality is benign.
  • False Positives (FP): Model predicts malignant, reality is benign.
  • False Negatives (FN): Model predicts benign, reality is malignant.
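
The four counts above can be tallied directly from a vector of predictions and a vector of true labels. The sketch below uses base R only; the six labels are made up for illustration and are not real data.

```r
# Hypothetical labels for six tumors (made-up data).
actual    <- c("malignant", "malignant", "benign", "benign",    "malignant", "benign")
predicted <- c("malignant", "benign",    "benign", "malignant", "malignant", "benign")

positive <- "malignant"  # the positive class

# Each count is the number of cases matching one prediction/reality pairing.
tp <- sum(predicted == positive & actual == positive)  # predicted malignant, is malignant
tn <- sum(predicted != positive & actual != positive)  # predicted benign, is benign
fp <- sum(predicted == positive & actual != positive)  # predicted malignant, is benign
fn <- sum(predicted != positive & actual == positive)  # predicted benign, is malignant

counts <- c(TP = tp, TN = tn, FP = fp, FN = fn)
print(counts)  # TP = 2, TN = 2, FP = 1, FN = 1

# The same information as a 2x2 table (a confusion matrix).
print(table(Predicted = predicted, Actual = actual))
```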

Which of these is the most important to minimize (in this context)?

Accuracy, Precision, and Recall

\[
\begin{aligned}
\text{Accuracy} \quad &= \quad \frac{TP + TN}{TP + TN + FP + FN} &&= \quad \mathbb{P}[\text{correct prediction}], \\
\text{Precision} \quad &= \quad \frac{TP}{TP + FP} &&= \quad \mathbb{P}[(+ \text{ in reality}) \text{ given } (\text{test } +)], \\
\text{Recall} \quad &= \quad \frac{TP}{TP + FN} &&= \quad \mathbb{P}[(\text{test } +) \text{ given } (+ \text{ in reality})].
\end{aligned}
\]
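
As a minimal sketch, the three formulas translate into one-line R functions. The function names and the counts used here are hypothetical, chosen only so the arithmetic is easy to check by hand.

```r
# Each function takes confusion-matrix counts and returns a proportion.
accuracy  <- function(tp, tn, fp, fn) (tp + tn) / (tp + tn + fp + fn)
precision <- function(tp, fp) tp / (tp + fp)
recall    <- function(tp, fn) tp / (tp + fn)

# Hypothetical counts for 100 tumors (made-up numbers).
tp <- 40; tn <- 45; fp <- 5; fn <- 10

acc  <- accuracy(tp, tn, fp, fn)  # (40 + 45) / 100 = 0.85
prec <- precision(tp, fp)         # 40 / 45, about 0.889
rec  <- recall(tp, fn)            # 40 / 50 = 0.8

print(c(accuracy = acc, precision = prec, recall = rec))
```

With these counts the model is right 85% of the time overall, yet it still misses 10 of the 50 malignant tumors (recall of 0.8). Note that a zero denominator, for example a model that never predicts the positive class, gives NaN in R, so a package function may want to handle that case explicitly.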

Worksheet

Work on the worksheet for Project 6!

15:00

Project 5

Work time for Project 5.

30:00