We are starting to hear about “thin” and “fat” rails in AI fabrics. A rail is just another way to say a connection from a compute node into the fabric. A fat rail has sufficient capacity for the traffic it carries. A thin rail does not.
A thin rail gives the node what is effectively a single path into the fabric. That may be one NIC into one switch plane. It can also be two NICs that both land behind the same oversubscribed uplink, so the second NIC adds little real capacity.
A fat rail gives the node more usable bandwidth into the fabric. That may come from faster NICs, multiple fabric NICs, or separate switch planes that stay on diverse switches instead of converging on a shared uplink. The useful part is not the number of cables by itself. The useful part is the amount of bandwidth available for node-to-node traffic.
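One rough way to picture "usable" bandwidth versus cable count is the sketch below. It is only an illustration: the function name `usable_bandwidth_gbps`, the plane labels, and the per-plane uplink share figures are all made up for the example, and real fabrics have more moving parts than a simple minimum.

```python
from collections import defaultdict

def usable_bandwidth_gbps(nics, uplink_share_gbps):
    """Estimate usable fabric bandwidth for one node.

    nics: list of (plane, speed_gbps) pairs, one per fabric NIC.
    uplink_share_gbps: plane -> Gb/s of uplink this node can count on
                       (hypothetical figure, set by oversubscription).
    """
    per_plane = defaultdict(float)
    for plane, speed in nics:
        per_plane[plane] += speed
    # Each plane is capped by whatever its uplink can actually deliver.
    return sum(min(total, uplink_share_gbps[plane])
               for plane, total in per_plane.items())

# Two 400G NICs behind the same uplink, which can only give this node 400G.
same_uplink = usable_bandwidth_gbps([("A", 400), ("A", 400)], {"A": 400})

# Two 400G NICs on separate planes, each with a full 400G share.
diverse = usable_bandwidth_gbps([("A", 400), ("B", 400)], {"A": 400, "B": 400})

print(same_uplink, diverse)
```

Both nodes have two cables, but only the second one ends up with twice the usable path.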
A single 400G rail can be “fat” enough for one cluster and too thin for another. The difference depends on what the job is doing. If nodes pass small messages and generate little east-west traffic, the rail may have room to spare. If the job moves large data sets between node groups, the uplink may saturate and the same rail becomes “thin”.
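The same idea can be written as a quick comparison of rail capacity against job demand. This is a minimal sketch, not a sizing tool: `classify_rail` is a hypothetical helper, and the demand numbers are invented for illustration rather than measured from any real workload.

```python
def classify_rail(usable_gbps, peak_demand_gbps):
    """Label a rail 'fat' or 'thin' relative to one job's peak demand."""
    if usable_gbps <= 0:
        raise ValueError("usable_gbps must be positive")
    utilization = peak_demand_gbps / usable_gbps
    label = "fat" if utilization <= 1.0 else "thin"
    return label, utilization

RAIL_GBPS = 400  # the single 400G rail from the text

# Hypothetical job A: small messages, little east-west traffic.
label_a, util_a = classify_rail(RAIL_GBPS, peak_demand_gbps=60)

# Hypothetical job B: large data sets moving between node groups.
label_b, util_b = classify_rail(RAIL_GBPS, peak_demand_gbps=650)

print(label_a, f"{util_a:.0%}")
print(label_b, f"{util_b:.0%}")
```

The rail never changes in this example. Only the job does, and that is what flips the label.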
Think of this as just slang for a connection that is either too small or the right size for the type of AI job it is carrying.
j2networks family of sites
https://j2sw.com
https://startawisp.info
https://indycolo.net
#packetsdownrange #routethelight