Allen Shen

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2021-122

May 14, 2021

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-122.pdf

The rapid rise in popularity of data science courses has created a need for scalable infrastructure to support them. The most essential part of this infrastructure is a grading system that lets instructors automatically grade student submissions without having to examine each submission individually at length. In this work, we discuss two autograding systems that help accomplish this task by autograding Jupyter notebook assignments and Java-based assignments. We expand on the former by highlighting a workflow for distributing and autograding Jupyter notebook assignments via Otter Grader. We then describe the pedagogy and infrastructure that enable data systems courses to support hundreds of students, and finally we discuss tools for learning data visualization, with an emphasis on the Lux Jupyter notebook widget. We hope that this work helps lay the groundwork for future tools and methods that enable learning at scale in data science courses.

Advisor: Joshua Hug


BibTeX citation:

@mastersthesis{Shen:EECS-2021-122,
    Author= {Shen, Allen},
    Title= {Pedagogy and Infrastructure for Upper-Division Data Science Courses},
    School= {EECS Department, University of California, Berkeley},
    Year= {2021},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-122.html},
    Number= {UCB/EECS-2021-122},
    Abstract= {The rapid rise in popularity of data science courses has created a need for scalable infrastructure to support them. The most essential part of this infrastructure is a grading system that lets instructors automatically grade student submissions without having to examine each submission individually at length. In this work, we discuss two autograding systems that help accomplish this task by autograding Jupyter notebook assignments and Java-based assignments. We expand on the former by highlighting a workflow for distributing and autograding Jupyter notebook assignments via Otter Grader. We then describe the pedagogy and infrastructure that enable data systems courses to support hundreds of students, and finally we discuss tools for learning data visualization, with an emphasis on the Lux Jupyter notebook widget. We hope that this work helps lay the groundwork for future tools and methods that enable learning at scale in data science courses.},
}

EndNote citation:

%0 Thesis
%A Shen, Allen 
%T Pedagogy and Infrastructure for Upper-Division Data Science Courses
%I EECS Department, University of California, Berkeley
%D 2021
%8 May 14
%@ UCB/EECS-2021-122
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-122.html
%F Shen:EECS-2021-122