Reproducible Data Analysis
The ability to analyze data using code is increasingly in demand across academic disciplines, from the sciences to the humanities, as well as in industry.
Using code instead of point-and-click software means that analyses can be reproducible, which allows them to be re-used and trusted.
In this course, students will learn how to use the R programming language, along with other tools, for reproducible data analysis. Emphasis will be placed on practical usage and best practices to ensure reproducibility. No prior programming experience is required, but it will be helpful if students have a topic that they are interested in analyzing for the final report. All students are expected to have access to a modern computer (laptop) on which they can install and run R, RStudio, Git, and Quarto.
Goals
The goal of this course is to learn how to conduct reproducible data analysis using R.
By the end of the course, students should be able to:
- Load, clean, and visualize data using R
- Track changes to code using Git and GitHub
- Write a reproducible report using Quarto
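As a rough illustration of the first goal, the sketch below loads, cleans, and plots a tiny dataset. It uses only base R so that it runs without installing packages; the course itself may use other packages. The dataset, column names, and object names (`raw`, `clean`, `mean_height`) are made up for this example.

```r
# Load: a small made-up dataset (values and column names are illustrative only)
raw <- read.csv(text = c(
  "species,height_cm",
  "fern,12.5",
  "fern,NA",
  "moss,3.1",
  "moss,2.9"
))

# Clean: drop rows with a missing measurement
clean <- raw[!is.na(raw$height_cm), ]

# Summarize: mean height per species
mean_height <- tapply(clean$height_cm, clean$species, mean)

# Visualize: bar plot of the per-species means
barplot(mean_height, xlab = "Species", ylab = "Mean height (cm)")
```

Because every step is written as code, anyone with the same input can re-run the script and obtain the same figure.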
Slides
Recordings
For other lecture recordings, see Moodle.
Materials
Important Deadlines
2025-06-25 11:59PM: Day 2 homework due
2025-07-02 11:59PM: Day 3 homework due
2025-07-09 11:59PM: Day 4 homework due
2025-07-16 11:59PM: Day 5 homework due
2025-07-30 11:59PM: Final paper due
About homework sets
All homework assignments (except Day 2) are submitted as R scripts.
Your code must run without errors.
The easiest way to check this is to restart RStudio by closing and reopening it, then open the R script and press the “Source” button. Your code should run without errors, and any expected objects, such as answer_1, should show up in the Environment panel.
The process for the final paper is similar, but instead of clicking “Source”, click “Render”.
Please check this before submitting your assignments!
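The same check can be roughly approximated in code. The sketch below is only an illustration, not an official grading procedure: it writes a hypothetical two-line homework script to a temporary file, sources it into a new environment, and confirms that the expected objects were created. The script contents and the name `answer_2` are assumptions made for the example.

```r
# Hypothetical homework script, written to a temporary file for illustration
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "answer_1 <- mean(c(1, 2, 3))",
  "answer_2 <- toupper('done')"
), script_path)

# Source the script into a new environment so that the objects it creates
# are kept separate from whatever is already in the workspace
check_env <- new.env()
source(script_path, local = check_env)

# Confirm that the expected objects were created by the script itself
expected <- c("answer_1", "answer_2")
missing_objects <- setdiff(expected, ls(check_env))
if (length(missing_objects) > 0) {
  stop("Missing objects: ", paste(missing_objects, collapse = ", "))
}
```

This does not fully replace restarting RStudio, because code sourced this way can still see objects and packages already loaded in the current session. Running the script from a terminal with `Rscript` starts a fresh R process and is a stricter check.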
Extra Practice
The following are interactive R tutorials, written in Japanese, that you can run in your web browser. They were originally designed for the 情報処理演習 course. Although they are used as homework in that course, they are NOT required assignments for this course. They are provided here only for extra practice and review. Please note that capacity is limited, so you may be unable to connect when more than a few people are using the tutorials at the same time.
Office Hours
By appointment. Please send an email to joelnitta@chiba-u.jp to arrange a time.