Joshua Korenblat will cover cleaning datasets using R packages. If you would like to follow along, please install R and the Tidyverse.
R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. R is an integrated suite of software facilities for data manipulation, calculation and graphical display and includes
- an effective data handling and storage facility,
- a suite of operators for calculations on arrays, in particular matrices,
- a large, coherent, integrated collection of intermediate tools for data analysis,
- graphical facilities for data analysis and display either on-screen or on hardcopy, and
- a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
Lightning Talks
- Currently looking for volunteers. Sign up if you’d like to give one.