This course aims to give students a strong foundation for understanding modern computer system architecture and for applying these insights and principles to future computer designs. While well suited to students directly interested in computer engineering, the course also provides a foundation for students interested in performance programming, compilers, and operating systems, as well as system-level context for students interested in emerging technologies and digital circuits.

The course is structured around the three primary building blocks of general-purpose computing systems: processors, memories, and networks. The first half of the course focuses on the fundamentals of each building block and will enable students to understand the design of a basic message-passing multicore system. Topics include processor microcoding and pipelining; cache microarchitecture and optimization; and network topology, routing, and flow control. The second half of the course delves into more advanced techniques and will enable students to understand how these three building blocks can be integrated to build a modern shared-memory multicore system. Topics include complex pipelining, out-of-order execution, register renaming, memory disambiguation, VLIW, vector processing, multithreading, synchronization, consistency, coherence, address translation and protection, virtual memory, flit-based flow control, and virtual channels. Students will learn how to evaluate design decisions in the context of past, current, and future application requirements and technology constraints.

The course includes a significant project decomposed into seven lab assignments. Over the semester, students will gradually design, implement, test, and evaluate a complete register-transfer-level multicore system capable of running real parallel applications.