Infragraph Blueprints
Infragraph provides users with multiple blueprints that help define the foundational components of an infrastructure. These blueprints cover both devices and fabrics, enabling flexible and extensible modeling of network and compute infrastructure.
Each blueprint is implemented as Python source code, allowing users to create classes and objects that inherit from the generated SDK. This approach makes it possible to define more realistic and customizable representations of devices and fabrics while maintaining compatibility with Infragraph’s schema and data models.
All available blueprints can be found in the following directory:
Device Blueprints
Device blueprints enable you to instantiate specific device types using the Infragraph schema format. Each blueprint encapsulates attributes, component definitions, and configuration logic that represent actual hardware or node types within a network topology.
All device blueprints are organized under:
Each Infragraph device can expose multiple variants, allowing the same blueprint to represent different SKUs or generations of hardware. For example, a QSFP device supports the following variants:
- qsfp_plus_40g
- qsfp28_100g
- qsfp56_200g
- qsfp_dd_400g
- qsfp_dd_800g
Infragraph devices support variant selection at initialization time, enabling end users to choose the exact hardware variant they want to model when creating a device object.
The example below shows the QSFP class definition and its initialization with a specific variant, demonstrating how variants are applied in practice.
Creating a QSFP Transceiver Blueprint
The QSFP class defines a device blueprint for QSFP-family pluggable transceivers in InfraGraph.
It inherits from the base Device class and models the internal structure, ports, and signal binding of a QSFP module in a consistent and reusable way.
All QSFP variants share the same internal topology:
- One electrical host-facing interface
- One optical media-facing interface
- A fixed one-to-one electrical ↔ optical binding
Only the capabilities (form factor, speed, and lane count) differ across variants, and these are strictly controlled using a variant catalog.
Example: qsfp.py
Location: /src/infragraph/blueprints/devices/common/transceiver
QSFP device definition using OpenApiArt generated classes
from typing import Literal, Dict

from infragraph import *

QsfpVariant = Literal[
    "qsfp_plus_40g",
    "qsfp28_100g",
    "qsfp56_200g",
    "qsfp_dd_400g",
    "qsfp_dd_800g"
]

QSFP_VARIANT_CATALOG: Dict[QsfpVariant, dict] = {
    "qsfp_plus_40g": {
        "form_factor": "QSFP_PLUS",
        "speed": "40Gbps",
        "lanes": 4,
    },
    "qsfp28_100g": {
        "form_factor": "QSFP28",
        "speed": "100Gbps",
        "lanes": 4,
    },
    "qsfp56_200g": {
        "form_factor": "QSFP56",
        "speed": "200Gbps",
        "lanes": 4,
    },
    "qsfp_dd_400g": {
        "form_factor": "QSFP-DD",
        "speed": "400Gbps",
        "lanes": 8,
    },
    "qsfp_dd_800g": {
        "form_factor": "QSFP-DD",
        "speed": "800Gbps",
        "lanes": 8,
    },
}


class QSFP(Device):
    """
    InfraGraph model of a QSFP-family pluggable transceiver.

    This class represents QSFP+, QSFP28, QSFP56, QSFP-DD 400 & 800G
    optical modules. All variants share the same internal topology:
    a one-to-one binding between an electrical host interface and
    an optical media interface.

    Variants are profile-locked via an authoritative catalog and
    differ only in bandwidth, signaling lanes, and form factor.
    """

    def __init__(self, variant: QsfpVariant = "qsfp28_100g"):
        """
        Initialize a QSFP transceiver model.

        Args:
            variant (QsfpVariant, optional):
                QSFP variant to instantiate. Determines form factor,
                total bandwidth, and number of electrical lanes.
                Defaults to "qsfp28_100g".

        Raises:
            ValueError:
                If an unsupported QSFP variant is specified.
        """
        super(Device, self).__init__()
        self._validate_variant(variant)
        self.variant = variant
        self.cfg = QSFP_VARIANT_CATALOG[variant]
        self.name = self.cfg["form_factor"].lower().replace("-", "_")
        self.description = f"{self.cfg['form_factor']} {self.cfg['speed']} Transceiver"
        self.electrical_port = self._add_electrical_port()
        self.optical_port = self._add_optical_port()
        self.binding = self._add_links()
        self._wire_internal()

    def _validate_variant(self, variant: QsfpVariant):
        if variant not in QSFP_VARIANT_CATALOG:
            raise ValueError(f"Unsupported QSFP variant: {variant}")

    def _add_electrical_port(self):
        port = self.components.add(
            name="electrical_port",
            description=f"Electrical host interface ({self.cfg['lanes']} lanes)",
            count=1,
        )
        port.choice = Component.PORT
        return port

    def _add_optical_port(self):
        port = self.components.add(
            name="optical_port",
            description=f"Optical media interface ({self.cfg['speed']})",
            count=1,
        )
        port.choice = Component.PORT
        return port

    def _add_links(self):
        return self.links.add(
            name="internal_binding",
            description="Electrical-to-optical signal binding",
        )

    def _wire_internal(self):
        edge = self.edges.add(
            scheme=DeviceEdge.ONE2ONE,
            link=self.binding.name,
        )
        edge.ep1.component = "electrical_port"
        edge.ep2.component = "optical_port"


if __name__ == "__main__":
    print(QSFP("qsfp_dd_400g").serialize(encoding=Device.YAML))
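For illustration, the following sketch exercises the blueprint above: it instantiates two variants, reads the catalog-driven attributes, and shows the validation error raised for an unknown variant name (the string qsfp112_800g below is deliberately not in the catalog).

# Usage sketch for the QSFP blueprint defined above
qsfp_100g = QSFP()                # defaults to "qsfp28_100g"
qsfp_800g = QSFP("qsfp_dd_800g")  # explicit variant selection

print(qsfp_100g.description)      # QSFP28 100Gbps Transceiver
print(qsfp_800g.cfg["lanes"])     # 8 electrical lanes for QSFP-DD 800G

try:
    QSFP("qsfp112_800g")          # not present in QSFP_VARIANT_CATALOG
except ValueError as err:
    print(err)                    # Unsupported QSFP variant: qsfp112_800g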
Composability
Infragraph supports modeling a device inside another device, where the nested device behaves as a component of the parent device via the device component choice (Component.DEVICE).
For example, a CX5 NIC can be added to a DGX, and a QSFP transceiver can then be added to the CX5:
Example: Composing DGX-CX5-QSFP
DGX composed of a CX5, which in turn contains a QSFP device
qsfp = QSFP("qsfp28_100g")
cx5 = Cx5(variant="cx5_100g_single", transceiver=qsfp)
dgx = NvidiaDGX("dgx_h100", cx5)
infrastructure = Api().infrastructure()
# All composed devices must be appended to the infrastructure
infrastructure.devices.append(dgx).append(cx5).append(qsfp)
infrastructure.instances.add(name=dgx.name, device=dgx.name, count=1)
service = InfraGraphService()
service.set_graph(infrastructure)
g = service.get_networkx_graph()
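Once the graph has been built, it can be inspected with standard networkx calls. The sketch below assumes get_networkx_graph() returns an ordinary networkx graph object; the exact node identifiers are implementation details and are shown only for illustration.

# Inspect the expanded graph (sketch; node naming is an assumption)
print(g.number_of_nodes(), "nodes")
print(g.number_of_edges(), "edges")
for node in list(g.nodes)[:10]:
    print(node)  # e.g. expanded component instances of dgx_h100, cx5_100gbe, qsfp28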
This creates a single dgx_h100 variant composed of a cx5_100g_single NIC, which in turn contains a qsfp28_100g transceiver. The generated YAML is shown below:
DGX-CX5-QSFP composed device definition as yaml
devices:
- components:
  - choice: cpu
    count: 2
    description: AMD EPYC 9654 (Genoa)
    name: cpu
  - choice: xpu
    count: 8
    description: NVIDIA H100 / H200 SXM5
    name: xpu
  - choice: switch
    count: 4
    description: NVIDIA NVSwitch
    name: nvlsw
  - choice: switch
    count: 3
    description: Broadcom PCIe Gen5 Switch
    name: pciesw
  - choice: custom
    count: 8
    custom:
      type: pcie_slot
    description: PCIe Gen5 x16 slots (ConnectX / BlueField)
    name: pciesl
  - choice: device
    count: 8
    description: Mellanox ConnectX-5 100GbE NIC
    name: cx5_100gbe
  description: NVIDIA DGX System
  edges:
  - ep1:
      component: cpu
    ep2:
      component: cpu
    link: cpu_fabric
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesl[0:3]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesl[4:7]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesw[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[0]
    ep2:
      component: xpu[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: xpu[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: xpu[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: xpu[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: xpu[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: xpu[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: xpu[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: xpu[7]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: xpu[0:8]
    link: pcie
    scheme: many2many
  - ep1:
      component: pcie_endpoint[0]
      device: cx5_100gbe[0:8]
    ep2:
      component: pciesl[0:8]
    link: pcie
    scheme: one2one
  links:
  - description: infinity_fabric
    name: cpu_fabric
  - description: nvlink_4
    name: xpu_fabric
  - description: PCI Express PCIE_GEN5 x16
    name: pcie
  name: dgx_h100
- components:
  - choice: custom
    count: 1
    custom:
      type: pcie_endpoint
    description: PCI Express GEN3 x16 endpoint
    name: pcie_endpoint
  - choice: cpu
    count: 1
    description: ConnectX-5 network processing ASIC
    name: asic
  - choice: port
    count: 1
    description: Ethernet port (100GbE)
    name: port
  - choice: device
    count: 1
    description: QSFP transceiver device
    name: qsfp28
  description: Mellanox ConnectX-5 100GbE NIC
  edges:
  - ep1:
      component: pcie_endpoint
    ep2:
      component: asic
    link: pcie_internal
    scheme: one2one
  - ep1:
      component: asic
    ep2:
      component: port
    link: serdes
    scheme: many2many
  - ep1:
      component: port[0]
    ep2:
      component: electrical_port[0]
      device: qsfp28[0]
    link: electrical
    scheme: one2one
  - ep1:
      component: port[0]
    ep2:
      component: electrical_port[0]
      device: qsfp28[0]
    link: electrical
    scheme: one2one
  links:
  - description: Internal PCIe GEN3 fabric
    name: pcie_internal
  - description: High-speed SerDes lanes
    name: serdes
  - description: Electrical interface to transceiver
    name: electrical
  name: cx5_100gbe
- components:
  - choice: port
    count: 1
    description: Electrical host interface (4 lanes)
    name: electrical_port
  - choice: port
    count: 1
    description: Optical media interface (100Gbps)
    name: optical_port
  description: QSFP28 100Gbps Transceiver
  edges:
  - ep1:
      component: electrical_port
    ep2:
      component: optical_port
    link: internal_binding
    scheme: one2one
  links:
  - description: Electrical-to-optical signal binding
    name: internal_binding
  name: qsfp28
instances:
- count: 1
  device: dgx_h100
  name: dgx_h100
name: dgx_cx5_qsfp_composed
Example: DGX with CX5 as a NIC
In some scenarios, a user may prefer to model the CX5 as an abstracted NIC component within the DGX device rather than as a fully instantiated nested device.
To add the CX5 as a NIC component, initialize the DGX device by passing the CX5 variant name directly, as shown below:
DGX with CX5 as NIC
dgx_profile = "dgx_h100"
cx5_variant = "cx5_100g_single"
device = NvidiaDGX(dgx_profile, cx5_variant)
infrastructure = Api().infrastructure()
infrastructure.devices.append(device)
infrastructure.instances.add(name=device.name, device=device.name, count=1)
service = InfraGraphService()
service.set_graph(infrastructure)
g = service.get_networkx_graph()
This adds the CX5 as a nic component of the DGX device. The generated YAML is shown below:
DGX with CX5 NIC definition as yaml
devices:
- components:
  - choice: cpu
    count: 2
    description: AMD EPYC 9654 (Genoa)
    name: cpu
  - choice: xpu
    count: 8
    description: NVIDIA H100 / H200 SXM5
    name: xpu
  - choice: switch
    count: 4
    description: NVIDIA NVSwitch
    name: nvlsw
  - choice: switch
    count: 3
    description: Broadcom PCIe Gen5 Switch
    name: pciesw
  - choice: custom
    count: 8
    custom:
      type: pcie_slot
    description: PCIe Gen5 x16 slots (ConnectX / BlueField)
    name: pciesl
  - choice: nic
    count: 8
    description: cx5_100g_single
    name: cx5_100g_single
  description: NVIDIA DGX System
  edges:
  - ep1:
      component: cpu
    ep2:
      component: cpu
    link: cpu_fabric
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesl[0:3]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesl[4:7]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesw[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[0]
    ep2:
      component: xpu[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: xpu[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: xpu[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: xpu[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: xpu[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: xpu[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: xpu[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: xpu[7]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: xpu[0:8]
    link: pcie
    scheme: many2many
  - ep1:
      component: pciesl[0]
    ep2:
      component: cx5_100g_single[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: cx5_100g_single[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: cx5_100g_single[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: cx5_100g_single[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: cx5_100g_single[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: cx5_100g_single[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: cx5_100g_single[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: cx5_100g_single[7]
    link: pcie
    scheme: one2one
  links:
  - description: infinity_fabric
    name: cpu_fabric
  - description: nvlink_4
    name: xpu_fabric
  - description: PCI Express PCIE_GEN5 x16
    name: pcie
  name: dgx_h100
instances:
- count: 1
  device: dgx_h100
  name: dgx_h100
name: dgx_cx5_nic
Fabric Blueprints
Fabric blueprints allow users to define network fabric topologies by combining multiple devices and specifying their interconnections. They provide an intuitive way to model complex infrastructure setups such as datacenter tiers, clusters, or multi-device fabrics.
All fabric blueprints are located in:
Please note that fabric builders only work with non-composed devices.
Creating a Single Tier Fabric with Multiple DGX Hosts
The following example demonstrates how to use the SingleTierFabric class to create a simple fabric connecting two DGX devices via a generic switch:
from infragraph.blueprints.devices.nvidia.dgx import NvidiaDGX
from infragraph.blueprints.fabrics.single_tier_fabric import SingleTierFabric
# Instantiate a DGX device
dgx = NvidiaDGX()
# Create a single-tier fabric connecting two DGX devices via a single switch
fabric = SingleTierFabric(dgx, 2) # 2 DGX devices
# 'fabric' now contains the infrastructure graph with two DGX devices and the connecting switch
The SingleTierFabric blueprint returns an Infragraph object that includes two DGX devices along with a generic switch defined in the device blueprints, connected to form a simple topology.
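The returned object can then be loaded into InfraGraphService and converted to a networkx graph, exactly as in the composability examples above. A minimal sketch, assuming SingleTierFabric returns the infrastructure object directly:

# Load the single-tier fabric and expand it into a networkx graph
service = InfraGraphService()
service.set_graph(fabric)
g = service.get_networkx_graph()
print(g.number_of_nodes(), "nodes in the single-tier fabric")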
Creating a CLOS Fat Tree Fabric with DGX Hosts
The following example demonstrates how to use the ClosFatTreeFabric class to create a CLOS fat tree fabric connecting DGX devices via generic switches:
from infragraph.blueprints.fabrics.clos_fat_tree_fabric import ClosFatTreeFabric
from infragraph.blueprints.devices.nvidia.dgx import NvidiaDGX
from infragraph.blueprints.devices.generic.generic_switch import Switch
# Instantiate a DGX device
dgx = NvidiaDGX()
# Instantiate a Switch
switch = Switch(port_count=16)
# Create a CLOS fat tree fabric with a switch radix of 16 and two levels
clos_fat_tree = ClosFatTreeFabric(switch, dgx, 2, [])
# 'clos_fat_tree' now contains the infrastructure graph for a two-level CLOS fat tree fabric
CLOS Fat Tree Fabric Overview
The CLOS Fat Tree fabric builds a scalable network using multiple levels of identical switches connected in a tree pattern. It is defined by FT(k, L), where:
- k = number of ports per switch (switch radix)
- L = number of levels in the fabric (depth)
Inputs
- Switch device: Defines the switch type and port count (k).
- Host device: Defines the servers or hosts (e.g., DGX) connected to the fabric edge.
- Levels (L): Number of switch layers (tiers) in the network.
- Bandwidth array: Link speeds at each level, e.g., host-to-edge (tier_0), edge-to-aggregation (tier_0 -> tier_1), aggregation-to-spine (tier_1 -> tier_2).
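As a sketch of how these inputs fit together, the call below builds a three-level fabric and passes a bandwidth array; it reuses the imports from the example above, and the argument order, bandwidth values, and their string format are assumptions rather than a documented contract.

# Sketch: three-level CLOS fat tree with per-tier link speeds (values are illustrative assumptions)
switch = Switch(port_count=16)
dgx = NvidiaDGX()
bandwidths = ["400Gbps", "400Gbps", "800Gbps"]  # host->tier_0, tier_0->tier_1, tier_1->tier_2 (assumed format)
clos_fat_tree = ClosFatTreeFabric(switch, dgx, 3, bandwidths)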
Two-Level Fat-Tree: Network Sizing Computation
- switch_downlink = k / 2
- Number of Hosts = (2 * switch_downlink ^ Levels) / (total ports per host)
- tier_0 (rack switches), the switches facing the hosts = 2 * switch_downlink ^ (Levels - 1)
- tier_1 (spine switches), the uplink switches connecting the tier_0 switches = switch_downlink ^ (Levels - 1)
Connections: Hosts → tier_0 (rack) → tier_1 (spines).
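These formulas can be sanity-checked with a few lines of Python. The sketch below encodes the two-level sizing above and reproduces the FT(32, 2) numbers listed at the end of this section, assuming 8 NIC ports per DGX host (the eight ConnectX-5 ports in the DGX blueprint).

# Two-level fat-tree sizing, FT(k, L) with L = 2
k = 32                 # switch radix
levels = 2
host_ports = 8         # assumption: 8 NIC ports per DGX host

switch_downlink = k // 2                                # 16
hosts = (2 * switch_downlink ** levels) // host_ports   # 512 / 8 = 64
tier_0 = 2 * switch_downlink ** (levels - 1)            # 32 rack switches
tier_1 = switch_downlink ** (levels - 1)                # 16 spine switches

print(hosts, tier_0, tier_1)                            # 64 32 16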
Three-Level Fat-Tree Topology: Network Sizing Computation
- switch_downlink = k / 2
- Number of Hosts = (2 * switch_downlink ^ Levels) / (total ports per host)
- Tier_0 (rack switches) = 2 * switch_downlink ^ (Levels - 1)
- Tier_1 (aggregation switches): same count as tier_0 = 2 * switch_downlink ^ (Levels - 1)
- Pods: derived by dividing tier_0 by the downlinks per switch; each pod contains switch_downlink = k/2 tier_0 switches
- Spines (core switches) = switch_downlink ^ (Levels - 1)
- Spine Sets: spine switches grouped to connect to the tier_0 switches in each pod = (switch_downlink / 2) * (tier_0 to tier_1 bandwidth) / (tier_1 to tier_2 bandwidth)
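The three-level quantities follow the same pattern. The sketch below directly encodes the formulas above for an illustrative FT(8, 3) build; the radix, host port count, and bandwidth ratio are assumptions chosen only to show the arithmetic.

# Three-level fat-tree sizing, FT(k, L) with L = 3 (illustrative inputs)
k = 8
levels = 3
host_ports = 1               # assumption: single-port hosts
bw_t0_t1, bw_t1_t2 = 2, 1    # assumption: tier_0->tier_1 vs tier_1->tier_2 bandwidth ratio

switch_downlink = k // 2                                     # 4
hosts = (2 * switch_downlink ** levels) // host_ports        # 128
tier_0 = 2 * switch_downlink ** (levels - 1)                 # 32 rack switches
tier_1 = tier_0                                              # 32 aggregation switches
pods = tier_0 // switch_downlink                             # 8 pods of k/2 rack switches each
spines = switch_downlink ** (levels - 1)                     # 16 core switches
spine_sets = (switch_downlink // 2) * bw_t0_t1 // bw_t1_t2   # 4 spine sets

print(hosts, tier_0, tier_1, pods, spines, spine_sets)       # 128 32 32 8 16 4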
Example FT(32, 2):
- Switch Radix = 32
- Levels = 2
- Hosts per switch = 2
- Total Hosts = 64
- tier_0 switches = 32
- tier_1 switches = 16
- Pods = 2
- Spine sets = 16
Connections: Hosts → tier_0 → tier_1 (spine switches).