Introduction to normality and one-sample tests

Objective

To introduce students to hypothesis testing for continuous data from one population. By the end of the lesson, you should be able to:

  • define and recognize a normal distribution
  • explain how tests for continuous data differ in their assumptions and when each may be appropriate, including
    • Z tests
    • t tests
    • Wilcoxon signed-rank tests
    • sign tests
  • carry out the noted tests in R
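
As a rough preview of how the four tests might look in R, here is a minimal sketch using base R only. The data are simulated for illustration (they do not come from the course notes), and the names x, mu0, and sigma are hypothetical. Base R has no built-in Z test, so the Z statistic is computed by hand here, assuming the population standard deviation is known. The sign test is run as a binomial test on the number of observations above the hypothesized value.

```r
# Simulated data for illustration only; not taken from the course materials.
set.seed(42)
x <- rnorm(25, mean = 10.5, sd = 2)  # hypothetical sample of 25 measurements
mu0 <- 10                            # hypothetical value under the null hypothesis

# Z test: assumes a normal sampling distribution and a KNOWN population sd.
sigma <- 2                           # assumed known for this sketch
z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
p_z <- 2 * pnorm(-abs(z))            # two-sided p-value

# t test: assumes normality but estimates the sd from the sample.
t_res <- t.test(x, mu = mu0)

# Wilcoxon signed-rank test: assumes the data are symmetric around the median.
w_res <- wilcox.test(x, mu = mu0)

# Sign test: uses only whether each value falls above or below mu0,
# so it reduces to a binomial test with p = 0.5 (values equal to mu0 are dropped).
above <- sum(x > mu0)
below <- sum(x < mu0)
s_res <- binom.test(above, above + below, p = 0.5)

# Collect the four two-sided p-values for comparison.
p_values <- c(z = p_z,
              t = t_res$p.value,
              wilcoxon = w_res$p.value,
              sign = s_res$p.value)
print(p_values)
```

Moving down the list, each test assumes less about the data and, in general, has less power when the stronger assumptions actually hold. The course notes and swirl lesson cover when each choice is appropriate.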

Background reading

The course notes linked for background reading also contain the code used to produce the R output shown in the slides.

Lecture slides (click to open in Google Slides!)

Connected swirl lesson

Swirl is an R package that provides guided lessons to help you learn and review material. These lessons should serve as a bridge between the code provided in the slides and background reading and the key functions and concepts from each lesson. The full course (all lessons combined) can be downloaded using the following instructions.

THIS IS ONE OF THE FEW TIMES I RECOMMEND WORKING DIRECTLY IN THE CONSOLE! THERE IS NO NEED TO DEVELOP A SCRIPT FOR THESE INTERACTIVE SESSIONS, THOUGH YOU CAN!

  • install the “swirl” package (for example, by running install.packages("swirl") once)

  • run the following code once on your computer to install the course

    library(swirl)
    install_course_github("jsgosnell", "JSG_swirl_lessons")
  • start swirl!

    swirl()
  • then follow the on-screen prompts to select the JSG_swirl_lessons course and the lessons you want

    • Here we will focus on the Tests for continuous data from one sample lesson
  • TIP: If you are seeing duplicate courses (or odd versions of each), you can clear all courses and then re-download the courses by

    • exiting swirl using the Escape key or the bye() function

      bye()
    • uninstalling and reinstalling courses

      uninstall_all_courses()
      install_course_github("jsgosnell", "JSG_swirl_lessons")
    • when you restart swirl with swirl(), you may need to select

      • No. Let me start something new

Connected assignment (click here)

Using these skills and applying concepts correctly to interpret data sets may seem easy when you read about them or listen during class, but practice is key to ensuring you understand the material. Practice problems are provided for each lesson. The link above points you to the appropriate page in the course notes. You can make a copy (technically a fork, since you can’t directly edit the original) of the entire course notes website on GitHub at https://github.com/jsgosnell/cuny_biostats_book and work from there. The benefit is that this lets you see updates to the site (if you sync your fork). The downside is that you have to work interactively or build the entire site when you render a changed file. This is doable but may take more time than students need (and may lead to merge issues!).

Alternatively, your instructor may use a different delivery method (like GitHub Classroom) or provide alternative problems.

In general, you should only edit .qmd files! Everything else is produced during the session and should not be edited. All files can be uploaded to GitHub, though.

Solutions are also provided for all problems via the course notes, but try them before you look at the answers!

Extra material

Data referenced in class

  • Crossley et al. (2020)

  • Sandidge (2003)

References

Crossley, Michael S., Amanda R. Meier, Emily M. Baldwin, Lauren L. Berry, Leah C. Crenshaw, Glen L. Hartman, Doris Lagos-Kutz, et al. 2020. “No Net Insect Abundance and Diversity Declines Across US Long Term Ecological Research Sites.” Nature Ecology & Evolution 4 (10): 1368–76. https://doi.org/10.1038/s41559-020-1269-4.
Hunt, Michael. n.d. “QQ-Plots.” RPubs. https://rpubs.com/mbh038/725314.
Sandidge, Jamel S. 2003. “Scavenging by Brown Recluse Spiders.” Nature 426 (6962): 30–30. https://doi.org/10.1038/426030a.