All assignments for the class will be listed here.

  • Four homework assignments, each with 2-3 programming problems.
  • A midterm “tutorial” assignment where you will write up a short tutorial on a data science subject.
  • A final project, done in groups, on a data science problem of your choosing.

All assignments will be released by 11:59 PM ET on the release date and are due at 11:59 PM ET (just before midnight) on the due date.

You are expected to know and adhere to the course policies, which govern late days, submissions, and collaboration.

Assignment dates

We may occasionally modify assignment dates and scopes. If we do, there will be an announcement in class and an update here.

| Assignment | Release date | Due date | Notebooks | Colab |
|---|---|---|---|---|
| Homework 1 | Feb 9 | Feb 23 | hw1_get_started, hw1_scraper, hw1_xml_parser | hw1_get_started, hw1_scraper, hw1_xml_parser |
| Homework 2 | Feb 28 | Mar 16 | hw2_relational_data, hw2_time_series, hw2_graph_library | hw2_relational_data, hw2_time_series, hw2_graph_library |
| Tutorial | Mar 15 | Mar 26 (proposal), Apr 8 (submission), Apr 15 (evaluations) | | |
| Homework 3 | Mar 16 | Apr 1 | hw3_linear, hw3_text, hw3_nlp | hw3_linear, hw3_text, hw3_nlp |
| Project | Apr 12 | Apr 22 (proposal), May 14 (video), May 17 (report) | | |
| Homework 4 | Apr 15 | Apr 29 | hw4_bayes, hw4_unsupervised, hw4_cf | hw4_bayes, hw4_unsupervised, hw4_cf |

TAs may not be available to answer questions about an assignment after its due date; keep this in mind before deciding to use your grace days.

Homework

Homeworks are distributed as Jupyter notebooks (we will also link Colab notebooks shortly) and are submitted for grading using code in the notebook itself (we will post a description of this process along with the first homework). To submit the assignments, sign up for an account (with your Andrew email) on the autograding site https://mugrade.datasciencecourse.org.

Tutorial

In lieu of a midterm exam, students will write a tutorial on a data science topic of their choosing. More information will be posted here when the assignment is released. No late days are permitted on the tutorial, and failure to submit by the deadline will result in zero points.

Final project

The final project of the course will consist of a large data science project done in teams of 2-3 people (single-person or four-person teams will be considered on an individual basis). The final report for this project will be a Jupyter notebook detailing the data collection, analysis, and results. In addition to the report, teams will prepare a short video to be shown during a final project video session.

No late days are permitted on the final project, and failure to submit by the deadline will result in zero points.