Course Information

  • Instructor: Genya Ishigaki
  • Office Hours:
    • Mondays 1:15 PM - 3:00 PM (Zoom)
      • You do NOT need to make an appointment for these Zoom office hours. You can join the Zoom meeting at any time during this period.
      • Please expect some waiting time. You will be admitted when your turn comes.
    • By appointment
  • Class Days/Time: Mondays & Wednesdays 12:00 PM - 1:15 PM
  • Class mode: Hybrid
    • Unless otherwise specified, classes will be conducted on Zoom.
    • In-person sessions will be specified in the course schedule. The classroom for the in-person sessions is MacQuarrie Hall 225.
  • Prerequisites: CS 157A. Limited to MSCS, MSBI, and MSDS students.

Course Description

Introduction to reinforcement learning, deep reinforcement learning, other online learning algorithms, and their applications.

Course Learning Outcomes (CLO)

Upon successful completion of this course, students will be able to:

  • Distinguish different types of reinforcement learning algorithms and when to use them.
  • Describe the benefits and potential challenges of deep reinforcement learning.
  • Apply reinforcement learning algorithms to real-world problems.
  • Analyze and evaluate the performance of reinforcement learning algorithms.
  • Create a machine learning project to solve a social or technical issue.

Textbook

  • Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction (Second Edition), MIT Press, 2018.
    • This book is available online for free on the authors’ page.
    • We do not cover all the topics in the book, as it is a comprehensive textbook. The relevant sections will be indicated in the syllabus and in class.
  • OpenAI, Spinning Up in Deep RL
    • While the page says “Deep” RL, many of its resources explain the basics of RL itself.
  • (Optional) Yoav Shoham and Kevin Leyton-Brown, Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, Cambridge University Press, 2009.

Other Equipment

Grading

Exams, Assignments, and Projects

  • This course is designed as a research-oriented course so that students can experience the process of a machine learning project: problem formulation, modeling, method selection, and development.
    • The project requires students to apply (deep) reinforcement learning to a practical problem.
    • It is recommended to form a group of TWO students. I may approve exceptions (an individual project or a group of three) for a valid reason.
    • Some example topics will be presented and discussed in class, but students may choose any topic that they find interesting.
    • A major programming contribution from EVERY group member is required for a passing grade. Details will be explained in class.
  • All exams are planned to be conducted during regular class hours.
    • [Note for Spring 2022] The exam format may be changed to take-home, depending on the COVID situation. Any change will be announced in class and on Canvas.
  • Assignments may include both theoretical and programming questions.
    • You may also use programs to answer the theoretical questions.

Item | % in Final Grade
Exam 1 | 16%
Exam 2 | 16%
Assignment 1 | 13%
Assignment 2 | 13%
Assignment 3 | 13%
Project Idea/Proposal Presentations | 5%
Project Final Presentation | 8%
Project Paper | 16%

Grading Table

Total Grade | Letter Grade
97% and above | A plus
92% to 96% | A
90% to 91% | A minus
87% to 89% | B plus
82% to 86% | B
80% to 81% | B minus
77% to 79% | C plus
72% to 76% | C
70% to 71% | C minus
67% to 69% | D plus
62% to 66% | D
60% to 61% | D minus
59% and below | F
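
As a worked illustration of how the weights and the grading table combine, the following Python sketch computes a weighted total and looks up the letter grade. The weights and cutoffs are copied from the tables in this syllabus. The function names, the sample scores, and the choice to compare an unrounded total against each band's lower bound are assumptions, since the syllabus does not say how fractional totals (for example, 96.5%) are handled.

```python
# Weights copied from the "% in Final Grade" table; they sum to 100.
WEIGHTS = {
    "Exam 1": 16,
    "Exam 2": 16,
    "Assignment 1": 13,
    "Assignment 2": 13,
    "Assignment 3": 13,
    "Project Idea/Proposal Presentations": 5,
    "Project Final Presentation": 8,
    "Project Paper": 16,
}

# Lower bounds copied from the grading table, highest first.
CUTOFFS = [
    (97, "A plus"), (92, "A"), (90, "A minus"),
    (87, "B plus"), (82, "B"), (80, "B minus"),
    (77, "C plus"), (72, "C"), (70, "C minus"),
    (67, "D plus"), (62, "D"), (60, "D minus"),
]


def total_grade(scores):
    """Weighted total in percent; scores maps item name -> percent earned."""
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100


def letter_grade(total):
    """Assumption: the unrounded total is compared to each band's lower bound."""
    for lower, letter in CUTOFFS:
        if total >= lower:
            return letter
    return "F"


# Hypothetical student who earns 90% on every item.
sample = {item: 90 for item in WEIGHTS}
total = total_grade(sample)
print(total, letter_grade(total))
```

Extra-credit points from the exams would simply raise the corresponding exam percentages before the weighted sum is taken.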

Extra Credit and Rework

The exams will include some extra-credit problems that let you earn additional points toward the total grade. No other extra-credit or rework opportunities will be given.

Late Submission

No late submissions will be accepted. Please make sure you submit the assignments before the deadlines. There are no exceptions.

Attendance

I will take attendance in some randomly chosen classes, solely to check study progress in the online learning setting. Attendance will NOT be reflected in the grade.

Students who do not attend either of the first two classes will be dropped to make room for students on the waiting list. Attempting to get marked as present (by having someone else attend in your place or by using technological deception) will be considered academic dishonesty and will, at a minimum, result in your being dropped from the course.

Grading Policy

The University Policy S16-9, Course Syllabi (http://www.sjsu.edu/senate/docs/S16-9.pdf) requires the following language to be included in the syllabus:

“Success in this course is based on the expectation that students will spend, for each unit of credit, a minimum of 45 hours over the length of the course (normally three hours per unit per week) for instruction, preparation/studying, or course related activities, including but not limited to internships, labs, and clinical practica. Other course structures will have equivalent workload expectations as described in the syllabus.”

University Policies

Per University Policy S16-9, university-wide policy information relevant to all courses, such as academic integrity, accommodations, etc., is available on the Office of Graduate and Undergraduate Programs’ Syllabus Information web page at http://www.sjsu.edu/gup/syllabusinfo/. Make sure to review these policies and resources.

Tentative Schedule and Topics

Week | Date | Topic | Reference | Note
1 | 1/26 | Overview | |
2 | 1/31 | What is Learning? | Shoham & Leyton-Brown Chap 7; Paper |
2 | 2/2 | MDP | Sutton & Barto Chap 3 |
3 | 2/7 | Policies and Value Functions | Sutton & Barto Chap 3 |
3 | 2/9 | Dynamic Programming | Sutton & Barto Chap 4 |
4 | 2/14 | Dynamic Programming | |
4 | 2/16 | Model-free prediction | Sutton & Barto Chap 5 | Assignment 1 due
5 | 2/21 | Model-free prediction | |
5 | 2/23 | Model-free control | Sutton & Barto Chap 6 |
6 | 2/28 | Approximation | Sutton & Barto Chap 9 |
6 | 3/2 | Taxonomy and Review | Spinning Up in Deep RL: Taxonomy | Assignment 2 due
7 | 3/7 | Exam 1 | | In-Person
7 | 3/9 | Approximation Example | Spinning Up in Deep RL |
8 | 3/14 | Deep RL | | Project Pair due
8 | 3/16 | Deep RL | |
9 | 3/21 | Project explanation & Implementation | |
9 | 3/23 | MAB and Regret | Sutton & Barto Chap 2; Shoham & Leyton-Brown Chap 7 |
10 | 4/4 | Application of RL | |
10 | 4/6 | Integrating Learning and Planning | Sutton & Barto Chap 8 |
11 | 4/11 | Project Discussion | | Project Idea Slides due
11 | 4/13 | Policy Gradient Methods | Sutton & Barto Chap 13 |
12 | 4/18 | Policy Gradient Methods | |
12 | 4/20 | Actor-Critic Methods | |
13 | 4/25 | Proposal Presentation | | Project Proposal Slides due
13 | 4/27 | Review | Sutton & Barto Chap 13 | Assignment 3 due
14 | 5/2 | Exam 2 | | In-Person
14 | 5/4 | Explainable RL | |
15 | 5/9 | Distributed and Federated RL | |
15 | 5/11 | Final presentation | | Final Presentation Slides due
16 | 5/16 | Final presentation | |
| 5/20 | | | Project Paper due
  • If you do not have the right equipment (laptop, etc.)
  • If you want to talk to someone
    • “Whether you are struggling with stress, depression, anxiety or relationship problems, Counseling and Psychological Services is here to provide the support you need to succeed at SJSU. In our current state of remote online instruction, CAPS is providing all of its services through confidential telehealth sessions.”
    • https://www.sjsu.edu/counseling/
  • If you need additional accommodation for your learning
    • “The Accessible Education Center (AEC) proudly presents its vision of redefining ability at San Jose State University by providing comprehensive services in support of the educational development and success of students with disabilities.”
    • https://www.sjsu.edu/aec/
  • If you face a financial challenge
    • “SJSU Cares is here to provide assistance when you need it most. We provide resources and services for SJSU students facing an unforeseen financial crisis. If you’re having trouble paying for food, housing or other bills, face homelessness, food insecurity, etc.”
    • https://www.sjsu.edu/sjsucares/