BlackWidow High-Radix Clos Inter-Chip Network

S. Scott, D. Abts, J. Kim, and W.J. Dally. "The BlackWidow High-Radix Clos Network." International Symposium on Computer Architecture (ISCA), June 2006.

This paper discusses the network used in the Cray X2 (BlackWidow) supercomputer. There are actually two different networks: the system-level network that connects processors together, and the router-level on-chip network that implements a single-chip high-radix router (YARC). What topology, routing, and flow control are used in each network? Students should try to understand each network's application requirements and technology constraints, since these factors ultimately result in two very different network architectures. The system-level network topology diagrams in Figures 1 and 2 are a little difficult to understand; students will probably need to read the text in Section 2.2 carefully and sketch their own diagrams. Look at Figure 4 and identify the router stages we have discussed in class. Why is such a long router latency acceptable? Students should spend some time trying to understand how the YARC microarchitecture actually works. It might be interesting to consider how the YARC microarchitecture relates to the on-chip flattened butterfly MICRO'07 paper by a similar set of authors. Students should also try to understand the system-level routing algorithm used by the Cray X2. Why does it use an oblivious deterministic routing algorithm for requests but an adaptive routing algorithm for responses? How can the request routing algorithm, which is oblivious and deterministic, still provide good global load balancing? Section 7 provides a nice discussion of various aspects of both the system-level and router-level networks. The related ISCA'05 paper introduces the general concept of implementing high-radix routers in this style, and the related SC'07 paper describes the Cray X2 supercomputer that actually uses this network.
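
As a starting point for the load-balancing question, here is a minimal sketch of one general technique: choosing the uplink in a folded-Clos network by hashing packet header fields. The choice is oblivious (it ignores network state) and deterministic (the same source, destination, and cache line always take the same path, which preserves ordering), yet different addresses spread across different uplinks. Everything in it is a hypothetical illustration and is not taken from the paper: the names `select_uplink`, `uplink_histogram`, and `NUM_UPLINKS`, the 64-byte line size, and the mixing constants are all assumptions. Compare it against the scheme the paper actually describes.

```python
from collections import Counter

NUM_UPLINKS = 8          # hypothetical number of uplinks per router
LINE_OFFSET_BITS = 6     # assumes 64-byte cache lines


def select_uplink(src: int, dst: int, addr: int,
                  num_uplinks: int = NUM_UPLINKS) -> int:
    """Pick an uplink from header fields only (no network state is consulted)."""
    cache_line = addr >> LINE_OFFSET_BITS  # all bytes of a line share one path
    # Arbitrary odd multipliers used purely to mix the fields together.
    key = (src * 0x9E3779B1) ^ (dst * 0x85EBCA6B) ^ (cache_line * 0xC2B2AE35)
    key &= 0xFFFFFFFF      # keep the hash to 32 bits
    key ^= key >> 16       # fold high bits into low bits
    return key % num_uplinks


def uplink_histogram(src: int, dst: int, addrs,
                     num_uplinks: int = NUM_UPLINKS) -> list:
    """Count how many of the given addresses map to each uplink."""
    counts = Counter(select_uplink(src, dst, a, num_uplinks) for a in addrs)
    return [counts.get(i, 0) for i in range(num_uplinks)]


if __name__ == "__main__":
    # A strided request stream from node 3 to node 17, one request per line.
    stream = [0x10000 + 64 * i for i in range(1024)]
    print(uplink_histogram(3, 17, stream))
```

Printing the histogram for a few different strides and source/destination pairs shows how evenly (or unevenly) a given hash spreads traffic. Then ask why the same trick is unnecessary for responses.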