Infragraph Blueprints
Infragraph provides users with multiple blueprints that help define the foundational components of an infrastructure. These blueprints cover both devices and fabrics, enabling flexible and extensible modeling of network and compute infrastructure.
Each blueprint is implemented as Python source code, allowing users to create classes and objects that inherit from the generated SDK. This approach makes it possible to define more realistic and customizable representations of devices and fabrics while maintaining compatibility with Infragraph’s schema and data models.
All available blueprints can be found in the following directory:
Device Blueprints
Device blueprints enable you to instantiate specific device types using the Infragraph schema format. Each blueprint encapsulates attributes, component definitions, and configuration logic that represent actual hardware or node types within a network topology.
All device blueprints are organized under:
Each Infragraph device can expose multiple variants, allowing the same blueprint to represent different SKUs or generations of hardware. For example, a QSFP device supports the following variants:
- qsfp_plus_40g
- qsfp28_100g
- qsfp56_200g
- qsfp_dd_400g
- qsfp_dd_800g
Infragraph devices support variant selection at initialization time, enabling end users to choose the exact hardware variant they want to model when creating a device object.
The example below shows the QSFP class definition and its initialization with a specific variant, demonstrating how variants are applied in practice.
Creating a QSFP Transceiver Blueprint
The QSFP class defines a device blueprint for QSFP-family pluggable transceivers in InfraGraph.
It inherits from the base Device class and models the internal structure, ports, and signal binding of a QSFP module in a consistent and reusable way.
All QSFP variants share the same internal topology:
- One electrical host-facing interface
- One optical media-facing interface
- A fixed one-to-one electrical ↔ optical binding
Only the capabilities (form factor, speed, and lane count) differ across variants, and these are strictly controlled using a variant catalog.
Example: qsfp.py
Location: /src/infragraph/blueprints/devices/common/transceiver
QSFP device definition using OpenApiArt generated classes
from typing import Literal, Dict

from infragraph import *

QsfpVariant = Literal[
    "qsfp_plus_40g",
    "qsfp28_100g",
    "qsfp56_200g",
    "qsfp_dd_400g",
    "qsfp_dd_800g"
]

QSFP_VARIANT_CATALOG: Dict[QsfpVariant, dict] = {
    "qsfp_plus_40g": {
        "form_factor": "QSFP_PLUS",
        "speed": "40Gbps",
        "lanes": 4,
    },
    "qsfp28_100g": {
        "form_factor": "QSFP28",
        "speed": "100Gbps",
        "lanes": 4,
    },
    "qsfp56_200g": {
        "form_factor": "QSFP56",
        "speed": "200Gbps",
        "lanes": 4,
    },
    "qsfp_dd_400g": {
        "form_factor": "QSFP-DD",
        "speed": "400Gbps",
        "lanes": 8,
    },
    "qsfp_dd_800g": {
        "form_factor": "QSFP-DD",
        "speed": "800Gbps",
        "lanes": 8,
    },
}


class QSFP(Device):
    """
    InfraGraph model of a QSFP-family pluggable transceiver.

    This class represents QSFP+, QSFP28, QSFP56, QSFP-DD 400 & 800G
    optical modules. All variants share the same internal topology:
    a one-to-one binding between an electrical host interface and
    an optical media interface.

    Variants are profile-locked via an authoritative catalog and
    differ only in bandwidth, signaling lanes, and form factor.
    """

    def __init__(self, variant: QsfpVariant = "qsfp28_100g"):
        """
        Initialize a QSFP transceiver model.

        Args:
            variant (QsfpVariant, optional):
                QSFP variant to instantiate. Determines form factor,
                total bandwidth, and number of electrical lanes.
                Defaults to "qsfp28_100g".

        Raises:
            ValueError:
                If an unsupported QSFP variant is specified.
        """
        super(Device, self).__init__()
        self._validate_variant(variant)
        self.variant = variant
        self.cfg = QSFP_VARIANT_CATALOG[variant]
        self.name = self.cfg["form_factor"].lower().replace("-", "_")
        self.description = f"{self.cfg['form_factor']} {self.cfg['speed']} Transceiver"
        self.electrical_port = self._add_electrical_port()
        self.optical_port = self._add_optical_port()
        self.binding = self._add_links()
        self._wire_internal()

    def _validate_variant(self, variant: QsfpVariant):
        if variant not in QSFP_VARIANT_CATALOG:
            raise ValueError(f"Unsupported QSFP variant: {variant}")

    def _add_electrical_port(self):
        port = self.components.add(
            name="electrical_port",
            description=f"Electrical host interface ({self.cfg['lanes']} lanes)",
            count=1,
        )
        port.choice = Component.PORT
        return port

    def _add_optical_port(self):
        port = self.components.add(
            name="optical_port",
            description=f"Optical media interface ({self.cfg['speed']})",
            count=1,
        )
        port.choice = Component.PORT
        return port

    def _add_links(self):
        return self.links.add(
            name="internal_binding",
            description="Electrical-to-optical signal binding",
        )

    def _wire_internal(self):
        edge = self.edges.add(
            scheme=DeviceEdge.ONE2ONE,
            link=self.binding.name,
        )
        edge.ep1.component = "electrical_port"
        edge.ep2.component = "optical_port"


if __name__ == "__main__":
    print(QSFP("qsfp_dd_400g").serialize(encoding=Device.YAML))
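For illustration, the following sketch exercises the blueprint above: it instantiates two variants, reads the catalog-driven attributes, and shows the validation error raised for an unknown variant name (the string qsfp112_800g below is deliberately not in the catalog).

# Usage sketch for the QSFP blueprint defined above
qsfp_100g = QSFP()                # defaults to "qsfp28_100g"
qsfp_800g = QSFP("qsfp_dd_800g")  # explicit variant selection

print(qsfp_100g.description)      # QSFP28 100Gbps Transceiver
print(qsfp_800g.cfg["lanes"])     # 8 electrical lanes for QSFP-DD 800G

try:
    QSFP("qsfp112_800g")          # not present in QSFP_VARIANT_CATALOG
except ValueError as err:
    print(err)                    # Unsupported QSFP variant: qsfp112_800g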
Composability
Infragraph supports modeling a device inside another device, where the nested device behaves as a component of the parent device via the device component choice (Component.DEVICE).
For example, a CX5 NIC can be added to a DGX, and a QSFP transceiver can then be added to the CX5:
Example: Composing DGX-CX5-QSFP
DGX composed of a CX5, which in turn contains a QSFP device
qsfp = QSFP("qsfp28_100g")
cx5 = Cx5(variant="cx5_100g_single", transceiver=qsfp)
dgx = NvidiaDGX("dgx_h100", cx5)
infrastructure = Api().infrastructure()
# All composed devices must be appended to the infrastructure
infrastructure.devices.append(dgx).append(cx5).append(qsfp)
infrastructure.instances.add(name=dgx.name, device=dgx.name, count=1)
service = InfraGraphService()
service.set_graph(infrastructure)
g = service.get_networkx_graph()
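Once the graph has been built, it can be inspected with standard networkx calls. The sketch below assumes get_networkx_graph() returns an ordinary networkx graph object; the exact node identifiers are implementation details and are shown only for illustration.

# Inspect the expanded graph (sketch; node naming is an assumption)
print(g.number_of_nodes(), "nodes")
print(g.number_of_edges(), "edges")
for node in list(g.nodes)[:10]:
    print(node)  # e.g. expanded component instances of dgx_h100, cx5_100gbe, qsfp28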
This creates a single dgx_h100 variant composed of a cx5_100g_single NIC, which in turn contains a qsfp28_100g transceiver. The generated YAML is shown below:
DGX-CX5-QSFP composed device definition as yaml
devices:
- components:
  - choice: cpu
    count: 2
    description: AMD EPYC 9654 (Genoa)
    name: cpu
  - choice: xpu
    count: 8
    description: NVIDIA H100 / H200 SXM5
    name: xpu
  - choice: switch
    count: 4
    description: NVIDIA NVSwitch
    name: nvlsw
  - choice: switch
    count: 3
    description: Broadcom PCIe Gen5 Switch
    name: pciesw
  - choice: custom
    count: 8
    custom:
      type: pcie_slot
    description: PCIe Gen5 x16 slots (ConnectX / BlueField)
    name: pciesl
  - choice: device
    count: 8
    description: Mellanox ConnectX-5 100GbE NIC
    name: cx5_100gbe
  description: NVIDIA DGX System
  edges:
  - ep1:
      component: cpu
    ep2:
      component: cpu
    link: cpu_fabric
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesl[0:3]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesl[4:7]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesw[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[0]
    ep2:
      component: xpu[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: xpu[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: xpu[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: xpu[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: xpu[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: xpu[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: xpu[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: xpu[7]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: xpu[0:8]
    link: pcie
    scheme: many2many
  - ep1:
      component: pcie_endpoint[0]
      device: cx5_100gbe[0:8]
    ep2:
      component: pciesl[0:8]
    link: pcie
    scheme: one2one
  links:
  - description: infinity_fabric
    name: cpu_fabric
  - description: nvlink_4
    name: xpu_fabric
  - description: PCI Express PCIE_GEN5 x16
    name: pcie
  name: dgx_h100
- components:
  - choice: custom
    count: 1
    custom:
      type: pcie_endpoint
    description: PCI Express GEN3 x16 endpoint
    name: pcie_endpoint
  - choice: cpu
    count: 1
    description: ConnectX-5 network processing ASIC
    name: asic
  - choice: port
    count: 1
    description: Ethernet port (100GbE)
    name: port
  - choice: device
    count: 1
    description: QSFP transceiver device
    name: qsfp28
  description: Mellanox ConnectX-5 100GbE NIC
  edges:
  - ep1:
      component: pcie_endpoint
    ep2:
      component: asic
    link: pcie_internal
    scheme: one2one
  - ep1:
      component: asic
    ep2:
      component: port
    link: serdes
    scheme: many2many
  - ep1:
      component: port[0]
    ep2:
      component: electrical_port[0]
      device: qsfp28[0]
    link: electrical
    scheme: one2one
  - ep1:
      component: port[0]
    ep2:
      component: electrical_port[0]
      device: qsfp28[0]
    link: electrical
    scheme: one2one
  links:
  - description: Internal PCIe GEN3 fabric
    name: pcie_internal
  - description: High-speed SerDes lanes
    name: serdes
  - description: Electrical interface to transceiver
    name: electrical
  name: cx5_100gbe
- components:
  - choice: port
    count: 1
    description: Electrical host interface (4 lanes)
    name: electrical_port
  - choice: port
    count: 1
    description: Optical media interface (100Gbps)
    name: optical_port
  description: QSFP28 100Gbps Transceiver
  edges:
  - ep1:
      component: electrical_port
    ep2:
      component: optical_port
    link: internal_binding
    scheme: one2one
  links:
  - description: Electrical-to-optical signal binding
    name: internal_binding
  name: qsfp28
instances:
- count: 1
  device: dgx_h100
  name: dgx_h100
name: dgx_cx5_qsfp_composed
Example: DGX with CX5 as a NIC
In some scenarios, a user may prefer to model the CX5 as an abstracted NIC component within the DGX device rather than as a fully instantiated nested device.
To add the CX5 as a NIC component, initialize the DGX device by passing the CX5 variant name directly, as shown below:
DGX with CX5 as NIC
dgx_profile = "dgx_h100"
cx5_variant = "cx5_100g_single"
device = NvidiaDGX(dgx_profile, cx5_variant)
infrastructure = Api().infrastructure()
infrastructure.devices.append(device)
infrastructure.instances.add(name=device.name, device=device.name, count=1)
service = InfraGraphService()
service.set_graph(infrastructure)
g = service.get_networkx_graph()
This adds the CX5 as a nic component of the DGX device. The generated YAML is shown below:
DGX with CX5 NIC definition as yaml
devices:
- components:
  - choice: cpu
    count: 2
    description: AMD EPYC 9654 (Genoa)
    name: cpu
  - choice: xpu
    count: 8
    description: NVIDIA H100 / H200 SXM5
    name: xpu
  - choice: switch
    count: 4
    description: NVIDIA NVSwitch
    name: nvlsw
  - choice: switch
    count: 3
    description: Broadcom PCIe Gen5 Switch
    name: pciesw
  - choice: custom
    count: 8
    custom:
      type: pcie_slot
    description: PCIe Gen5 x16 slots (ConnectX / BlueField)
    name: pciesl
  - choice: nic
    count: 8
    description: cx5_100g_single
    name: cx5_100g_single
  description: NVIDIA DGX System
  edges:
  - ep1:
      component: cpu
    ep2:
      component: cpu
    link: cpu_fabric
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesl[0:3]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesl[4:7]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: cpu[1]
    ep2:
      component: pciesw[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[0]
    ep2:
      component: xpu[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: xpu[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: xpu[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: xpu[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: xpu[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: xpu[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: xpu[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: xpu[7]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: many2many
  - ep1:
      component: cpu[0]
    ep2:
      component: pciesw[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: nvlsw[0:4]
    ep2:
      component: xpu[0:8]
    link: pcie
    scheme: many2many
  - ep1:
      component: pciesl[0]
    ep2:
      component: cx5_100g_single[0]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[1]
    ep2:
      component: cx5_100g_single[1]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[2]
    ep2:
      component: cx5_100g_single[2]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[3]
    ep2:
      component: cx5_100g_single[3]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[4]
    ep2:
      component: cx5_100g_single[4]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[5]
    ep2:
      component: cx5_100g_single[5]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[6]
    ep2:
      component: cx5_100g_single[6]
    link: pcie
    scheme: one2one
  - ep1:
      component: pciesl[7]
    ep2:
      component: cx5_100g_single[7]
    link: pcie
    scheme: one2one
  links:
  - description: infinity_fabric
    name: cpu_fabric
  - description: nvlink_4
    name: xpu_fabric
  - description: PCI Express PCIE_GEN5 x16
    name: pcie
  name: dgx_h100
instances:
- count: 1
  device: dgx_h100
  name: dgx_h100
name: dgx_cx5_nic
Fabric Blueprints
Fabric blueprints allow users to define network fabric topologies by combining multiple devices and specifying their interconnections. They provide an intuitive way to model complex infrastructure setups such as datacenter tiers, clusters, or multi-device fabrics.
All fabric blueprints are located in:
Please note that fabric builders only work with non-composed devices.
Creating a Single Tier Fabric with Multiple DGX Hosts
The following example demonstrates how to use the SingleTierFabric class to create a simple fabric connecting two DGX devices via a generic switch:
from infragraph.blueprints.devices.nvidia.dgx import NvidiaDGX
from infragraph.blueprints.fabrics.single_tier_fabric import SingleTierFabric
# Instantiate a DGX device
dgx = NvidiaDGX()
# Create a single-tier fabric connecting two DGX devices via a single switch
fabric = SingleTierFabric(dgx, 2) # 2 DGX devices
# 'fabric' now contains the infrastructure graph with two DGX devices and the connecting switch
The SingleTierFabric blueprint returns an Infragraph object that includes two DGX devices along with a generic switch defined in the device blueprints, connected to form a simple topology.
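The returned object can then be loaded into InfraGraphService and converted to a networkx graph, exactly as in the composability examples above. A minimal sketch, assuming SingleTierFabric returns the infrastructure object directly:

# Load the single-tier fabric and expand it into a networkx graph
service = InfraGraphService()
service.set_graph(fabric)
g = service.get_networkx_graph()
print(g.number_of_nodes(), "nodes in the single-tier fabric")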
Creating a CLOS Fat Tree Fabric with DGX Hosts
The following example demonstrates how to use the ClosFatTreeFabric class to create a CLOS fat tree fabric connecting DGX devices via generic switches:
from infragraph.blueprints.fabrics.clos_fat_tree_fabric import ClosFatTreeFabric
from infragraph.blueprints.devices.nvidia.dgx import NvidiaDGX
from infragraph.blueprints.devices.generic.generic_switch import Switch
# Instantiate a DGX device
dgx = NvidiaDGX()
# Instantiate a Switch
switch = Switch(port_count=16)
# Create a CLOS fat tree fabric with a switch radix of 16 and two levels
clos_fat_tree = ClosFatTreeFabric(switch, dgx, 2, [])
# 'clos_fat_tree' now contains the infrastructure graph for a two-level CLOS fat tree fabric
CLOS Fat Tree Fabric Overview
The CLOS Fat Tree fabric builds a scalable network using multiple levels of identical switches connected in a tree pattern. It is defined by FT(k, L), where:
- k = number of ports per switch (switch radix)
- L = number of levels in the fabric (depth)
Inputs
- Switch device: Defines the switch type and port count (k).
- Host device: Defines the servers or hosts (e.g., DGX) connected to the fabric edge.
- Levels (L): Number of switch layers (tiers) in the network.
- Bandwidth array: Link speeds at each level, e.g., host-to-edge (tier_0), edge-to-aggregation (tier_0 -> tier_1), aggregation-to-spine (tier_1 -> tier_2).
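As a sketch of how these inputs fit together, the call below builds a three-level fabric and passes a bandwidth array; it reuses the imports from the example above, and the argument order, bandwidth values, and their string format are assumptions rather than a documented contract.

# Sketch: three-level CLOS fat tree with per-tier link speeds (values are illustrative assumptions)
switch = Switch(port_count=16)
dgx = NvidiaDGX()
bandwidths = ["400Gbps", "400Gbps", "800Gbps"]  # host->tier_0, tier_0->tier_1, tier_1->tier_2 (assumed format)
clos_fat_tree = ClosFatTreeFabric(switch, dgx, 3, bandwidths)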
Two-Level Fat-Tree: Network Sizing Computation
- switch_downlink = k / 2
- Number of Hosts = (2 * switch_downlink ^ Levels) / (total ports per host)
- tier_0 (rack switches), the switches facing the hosts = 2 * switch_downlink ^ (Levels - 1)
- tier_1 (spine switches), the uplink switches connecting the tier_0 switches = switch_downlink ^ (Levels - 1)
Connections: Hosts → tier_0 (rack) → tier_1 (spines).
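These formulas can be sanity-checked with a few lines of Python. The sketch below encodes the two-level sizing above and reproduces the FT(32, 2) numbers listed at the end of this section, assuming 8 NIC ports per DGX host (the eight ConnectX-5 ports in the DGX blueprint).

# Two-level fat-tree sizing, FT(k, L) with L = 2
k = 32                 # switch radix
levels = 2
host_ports = 8         # assumption: 8 NIC ports per DGX host

switch_downlink = k // 2                                # 16
hosts = (2 * switch_downlink ** levels) // host_ports   # 512 / 8 = 64
tier_0 = 2 * switch_downlink ** (levels - 1)            # 32 rack switches
tier_1 = switch_downlink ** (levels - 1)                # 16 spine switches

print(hosts, tier_0, tier_1)                            # 64 32 16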
Three-Level Fat-Tree Topology: Network Sizing Computation
- switch_downlink = k / 2
- Number of Hosts = (2 * switch_downlink ^ Levels) / (total ports per host)
- Tier_0 (rack switches) = 2 * switch_downlink ^ (Levels - 1)
- Tier_1 (aggregation switches): same count as tier_0 = 2 * switch_downlink ^ (Levels - 1)
- Pods: derived by dividing tier_0 by the downlinks per switch; each pod contains switch_downlink = k/2 tier_0 switches
- Spines (core switches) = switch_downlink ^ (Levels - 1)
- Spine Sets: spine switches grouped to connect to the tier_0 switches in each pod = (switch_downlink / 2) * (tier_0 to tier_1 bandwidth) / (tier_1 to tier_2 bandwidth)
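The three-level quantities follow the same pattern. The sketch below directly encodes the formulas above for an illustrative FT(8, 3) build; the radix, host port count, and bandwidth ratio are assumptions chosen only to show the arithmetic.

# Three-level fat-tree sizing, FT(k, L) with L = 3 (illustrative inputs)
k = 8
levels = 3
host_ports = 1               # assumption: single-port hosts
bw_t0_t1, bw_t1_t2 = 2, 1    # assumption: tier_0->tier_1 vs tier_1->tier_2 bandwidth ratio

switch_downlink = k // 2                                     # 4
hosts = (2 * switch_downlink ** levels) // host_ports        # 128
tier_0 = 2 * switch_downlink ** (levels - 1)                 # 32 rack switches
tier_1 = tier_0                                              # 32 aggregation switches
pods = tier_0 // switch_downlink                             # 8 pods of k/2 rack switches each
spines = switch_downlink ** (levels - 1)                     # 16 core switches
spine_sets = (switch_downlink // 2) * bw_t0_t1 // bw_t1_t2   # 4 spine sets

print(hosts, tier_0, tier_1, pods, spines, spine_sets)       # 128 32 32 8 16 4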
Example FT(32, 2):
- Switch Radix = 32
- Levels = 2
- Hosts per switch = 2
- Total Hosts = 64
- tier_0 switches = 32
- tier_1 switches = 16
- Pods = 2
- Spine sets = 16
Connections: Hosts → tier_0 → tier_1 (spine switches).