Pedagogy, Infrastructure, and Analytics for Data Science Education at Scale
Vinitra Swamy
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2018-81
May 19, 2018
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-81.pdf
This report presents an educational computing environment for data science education at scale, as used at the University of California, Berkeley. With the rise of online learners in massive open online courses (MOOCs), we present a technical case study of the decisions made in converting an introductory undergraduate data science course into a series of data science MOOCs on edX. The study focuses on student and instructor workflow, distributed system infrastructure, cost analysis, cloud resource allocation, and autograding integration in the scaling process. We also implement an analytics pipeline for collecting data from Jupyter notebooks and propose a modification of Deep Knowledge Tracing to model student progress on coding assignments.
Advisor: David E. Culler
BibTeX citation:
@mastersthesis{Swamy:EECS-2018-81,
    Author = {Swamy, Vinitra},
    Title = {Pedagogy, Infrastructure, and Analytics for Data Science Education at Scale},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-81.html},
    Number = {UCB/EECS-2018-81},
    Abstract = {This report presents an educational computing environment for data science education at scale, as used at the University of California, Berkeley. With the rise of online learners in massive open online courses (MOOCs), we present a technical case study of the decisions made in converting an introductory undergraduate data science course into a series of data science MOOCs on edX. The study focuses on student and instructor workflow, distributed system infrastructure, cost analysis, cloud resource allocation, and autograding integration in the scaling process. We also implement an analytics pipeline for collecting data from Jupyter notebooks and propose a modification of Deep Knowledge Tracing to model student progress on coding assignments.}
}
EndNote citation:
%0 Thesis
%A Swamy, Vinitra
%T Pedagogy, Infrastructure, and Analytics for Data Science Education at Scale
%I EECS Department, University of California, Berkeley
%D 2018
%8 May 19
%@ UCB/EECS-2018-81
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-81.html
%F Swamy:EECS-2018-81