CSCI 7000: Systems for Machine Learning (Fall 2025)
The recent surge in AI capabilities has largely been fueled by the large-scale computing infrastructure that trains massive machine learning models and serves them to millions of users. This course provides an overview of the systems powering modern AI applications such as ChatGPT, with a focus on designing performant, efficient, and scalable systems for ML training and inference. Topics include core systems techniques deployed in modern ML infrastructure, ML data management systems, and large language model (LLM) systems. Students will study real-world ML systems, discuss cutting-edge research, and collaborate on an open-ended research project.
This course is a graduate-level Topics class focused on research in systems infrastructure for machine learning. Students will (a) learn how to train and deploy large-scale machine learning models on real-world infrastructure, (b) deploy state-of-the-art systems techniques to improve the scalability and efficiency of ML systems, and (c) understand and synthesize recent advances in ML systems research. A large portion of the course will focus on modern large language model (LLM) systems. The course is designed for students interested in pursuing research in machine learning systems or planning to work in machine learning infrastructure and engineering. A solid background in computer systems is required. Prior experience with machine learning is recommended but not mandatory. Advanced undergraduate students interested in taking the course should reach out to the instructor.
Lectures: Tuesday/Thursday, 12:30-1:45PM, Muenzinger Psyc & Biopsych E432
Communications: Piazza
Assignment Submission: Gradescope
Course Notes: Canvas
Instructor
Mark Zhao
Office Hours: Thursdays 2:00PM-3:00PM, or by appointment.
ECCR 1B26, Engineering Center.
Contact: You can reach me at myzhao@colorado.edu.
BASIL Research Group