All assignments for the class will be listed here.

  • There will be approximately 4-5 homework assignments, each with a number of programming problems.
  • There may be additional sets of practice problems assigned to help reinforce various topics.
  • There will be a midterm “tutorial” assignment, in which you will write a short tutorial on a data science subject.
  • There will be a final project, done in groups, on a data science problem of your choosing.

All assignments are due at 11:59 pm ET (the end of the day) on the due date.

You are expected to know and adhere to the course policies, which govern late days, submissions, and collaboration.

Assignment dates

Due dates are tentative for any assignments that haven’t been released yet.

| Assignment | Due date | Files (zipped tarball) | Colab version |
|---|---|---|---|
| Homework 1 | Feb 8 | hw1_get_started.tar.gz, hw1_scraper.tar.gz, hw1_xml_parser.tar.gz | hw1_get_started, hw1_scraper, hw1_xml_parser |
| Homework 2 | Feb 28 | hw2_relational_data.tar.gz, hw2_time_series.tar.gz, hw2_graph_library.tar.gz | hw2_relational_data, hw2_time_series, hw2_graph_library |
| Homework 3 | Mar 28 | hw3_linear.tar.gz, hw3_text.tar.gz, hw3_nlp.tar.gz | hw3_linear, hw3_text, hw3_nlp |
| Tutorial | Mar 16 (Proposal), Apr 6 (Submission), Apr 13 (Peer evaluation) | | |
| Homework 4 | Apr 25 | hw4_bayes.tar.gz, hw4_unsupervised.tar.gz, hw4_cf.tar.gz | hw4_bayes, hw4_unsupervised, hw4_cf |
| Project | Apr 15 (Proposal), May 4 (Video), May 9 (Report) | | |

Homework

Homeworks are distributed as Jupyter notebooks (we will also link Colab notebooks shortly) and are submitted for grading using code in the notebook itself (we will post a description of this process along with the first homework). To submit the assignments, sign up for an account (with your Andrew email) on the autograding site https://mugrade.datasciencecourse.org.
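Since the homework files in the table above are gzipped tarballs, the following is a minimal sketch of one way to unpack an archive with Python's standard library before opening the notebook. The helper name `extract_homework` is hypothetical, and the sketch assumes (the page does not state this) that each archive contains at least one `.ipynb` file.

```python
import tarfile
from pathlib import Path


def extract_homework(archive_path, dest_dir="."):
    """Unpack a .tar.gz archive into dest_dir and return the notebook paths found.

    Hypothetical helper: the layout of the course archives is an assumption.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest, **kwargs)
        names = archive.getnames()
    # Report only Jupyter notebooks; other files are extracted but not listed.
    return sorted(dest / name for name in names if name.endswith(".ipynb"))


if __name__ == "__main__":
    archive = Path("hw1_get_started.tar.gz")  # file name taken from the table
    if archive.exists():
        for notebook in extract_homework(archive, "hw1"):
            print(notebook)
    else:
        print(f"{archive} not found; download it first")
```

Running `tar -xzf hw1_get_started.tar.gz` in a shell does the same job; the Python version is only convenient if you are already working inside a notebook environment.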

Tutorial

In lieu of a midterm exam, students will write a tutorial on a data science topic of their choosing. More information may be found here.

Again, no late days are permitted on the tutorial, and failure to submit by the deadline will result in zero points for the proposal component.

Final project

The final project of the course will consist of a large data science project done in teams of 2-3 people (single-person or four-person teams will be considered on an individual basis). The final report for this project will be a Jupyter notebook detailing the data collection, analysis, and results. In addition to the report, each team will prepare a short video to be shown during a final project video session.