Add AI inference to an existing transcoding node with one hardware check, one aiModels.json file, and one startup command change.
This quickstart extends an existing transcoding node. It covers the minimum AI runner setup needed to load one model and verify local inference before moving to the fuller workload guides.

Prerequisites

Confirm these prerequisites before you change the node:
  • go-livepeer is installed and running as a transcoding orchestrator on Arbitrum mainnet
  • the node already serves video workloads successfully
  • Docker is installed with nvidia-container-toolkit enabled
  • the GPU has at least 4 GB of free VRAM for a small AI pipeline
  • the model directory is available at ~/.lpData/models
If you are building a node from scratch, start with Install go-livepeer instead.
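The prerequisites above can be spot-checked from the shell before touching anything. This is an illustrative sketch, not part of go-livepeer; the `check_dir` helper is a name introduced here, and the model-directory path matches the one listed above.

```shell
# Hypothetical pre-flight helper: report whether a required path exists.
check_dir() {
  if [ -d "$1" ]; then echo "present"; else echo "missing"; fi
}

# The AI runner expects model weights under this directory.
check_dir "${HOME}/.lpData/models"
```

`command -v docker` and `command -v nvidia-smi` cover the tooling checks the same way.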

Check your hardware

AI inference runs in a separate Docker container alongside the transcoding process. Shared GPUs divide VRAM between video work and AI workloads, so verify available memory first.
nvidia-smi --query-gpu=index,name,memory.total,memory.free --format=csv
Expected output looks like this:
index, name, memory.total [MiB], memory.free [MiB]
0, NVIDIA GeForce RTX 3090, 24576 MiB, 22000 MiB
Use these tiers as a quick routing rule:
  • 4 GB: image-to-text
  • 6 GB: segment-anything-2
  • 8 GB: llm
  • 12 GB: audio-to-text
  • 16 GB: image-to-video
  • 20 GB: image-to-image
  • 24 GB: text-to-image
Nodes without enough free VRAM to cover both transcoding and the selected AI pipeline will fail to start the AI runner container. Pick a lighter pipeline, dedicate a second GPU to AI, or pause transcoding on that GPU first.
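The VRAM check can be scripted against the tier table. A sketch, assuming the llm tier (8 GB) as the target; substitute the tier for your chosen pipeline, and note that `has_vram` is a helper name invented here:

```shell
# has_vram FREE_MIB NEEDED_MIB -> "ok" or "short"
has_vram() {
  if [ "$1" -ge "$2" ]; then echo "ok"; else echo "short"; fi
}

# Read free VRAM on GPU 0; falls back to 0 if nvidia-smi is unavailable.
FREE_MIB=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -n 1)
has_vram "${FREE_MIB:-0}" 8000
```

Run this per GPU index on multi-GPU hosts before deciding which GPU hosts the AI runner.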
For fuller workload planning, see Workload Options.

Pull the AI runner image

Pull the default AI runner image before changing the startup command:
docker pull livepeer/ai-runner:latest
If you plan to run segment-anything-2, also pull its pipeline-specific image:
docker pull livepeer/ai-runner:segment-anything-2
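To confirm both images landed locally, the `docker images` output can be filtered for ai-runner tags. A small sketch; `list_runner_tags` is just a grep wrapper named here for clarity:

```shell
# Print only the livepeer/ai-runner tags present in the local image cache.
list_runner_tags() {
  grep '^livepeer/ai-runner:' || true
}

docker images --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | list_runner_tags
```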

Configure aiModels.json

Create the file at ~/.lpData/aiModels.json:
touch ~/.lpData/aiModels.json
Add one warm model as the minimum working configuration:
[
  {
    "pipeline": "text-to-image",
    "model_id": "ByteDance/SDXL-Lightning",
    "price_per_unit": 4768371,
    "warm": true
  }
]
Use these fields:
  • pipeline: pipeline name such as text-to-image, audio-to-text, or llm
  • model_id: Hugging Face model identifier
  • price_per_unit: integer wei price or supported USD string
  • warm: loads the model into VRAM on startup
  • url: external runner endpoint when the model is hosted outside go-livepeer
  • token: bearer token for an authenticated external runner
Keep one warm model per GPU for the initial setup. Keep additional models cold until demand justifies them.
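A malformed aiModels.json keeps warm models from loading, so syntax-check the file before restarting the node. A sketch assuming python3 is on the host; `jq -e .` works the same way, and `validate_models_config` is a helper name introduced here:

```shell
# Report whether the AI models config parses as JSON.
validate_models_config() {
  if python3 -m json.tool "$1" > /dev/null 2>&1; then echo "valid"; else echo "invalid"; fi
}

validate_models_config "${HOME}/.lpData/aiModels.json"
```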
For more pricing and multi-model examples, see AI Inference Operations.

Update the startup command

Add three flags to the existing startup command:
  • -aiWorker
  • -aiModels
  • -aiModelsDir
Transcoding-only startup:
livepeer \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR>
Transcoding plus AI startup:
livepeer \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR> \
  -aiWorker \
  -aiModels ~/.lpData/aiModels.json \
  -aiModelsDir ~/.lpData/models
Docker deployments also need the Docker socket mount:
docker run \
  --name livepeer_orchestrator \
  -v ~/.lpData/:/root/.lpData/ \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --network host \
  --gpus all \
  livepeer/go-livepeer:master \
  -network arbitrum-one-mainnet \
  -ethUrl <ETH_URL> \
  -orchestrator \
  -transcoder \
  -nvidia 0 \
  -pricePerUnit <PRICE> \
  -serviceAddr <SERVICE_ADDR> \
  -aiWorker \
  -aiModels /root/.lpData/aiModels.json \
  -aiModelsDir ~/.lpData/models
-aiModelsDir must point at the host path. go-livepeer passes that path directly into the runner containers it spawns.

Verify AI is active

Within a few seconds of startup, warm models should trigger a managed-container log line:
2024/05/01 09:01:39 INFO Starting managed container gpu=0 name=text-to-image_ByteDance_SDXL-Lightning modelID=ByteDance/SDXL-Lightning
When the managed-container line is missing, check:
  • aiModels.json is valid JSON
  • the model weights exist under -aiModelsDir
  • the Docker socket is mounted in Docker deployments
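The three checks can be run directly from the shell. A sketch using the container name and paths from the Docker example earlier in this guide; `check_json` and `check_socket` are hypothetical helpers, not go-livepeer tooling:

```shell
# Checklist helpers: JSON syntax and Unix-socket presence.
check_json()   { if python3 -m json.tool "$1" > /dev/null 2>&1; then echo "ok"; else echo "bad"; fi; }
check_socket() { if [ -S "$1" ]; then echo "ok"; else echo "bad"; fi; }

check_json "${HOME}/.lpData/aiModels.json"
ls "${HOME}/.lpData/models" > /dev/null 2>&1 && echo "models: ok" || echo "models: missing"
# For Docker deployments, confirm the socket is visible inside the container.
docker exec livepeer_orchestrator test -S /var/run/docker.sock 2>/dev/null \
  && echo "socket: ok" || echo "socket: not mounted"
```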
Send a direct test request to the local AI runner:
curl -X POST "http://localhost:8000/text-to-image" \
  -H "Content-Type: application/json" \
  -d '{"model_id":"ByteDance/SDXL-Lightning","prompt":"A cool cat on the beach","width":512,"height":512}'
A successful response returns JSON with an images array. After on-chain capability advertisement is configured, the AI pipeline also appears on the node profile in the Livepeer Explorer.
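The shape of the response can be verified without reading raw JSON by counting the images array. A sketch assuming python3 on the host and the same endpoint as the curl above; `count_images` is a helper name introduced here:

```shell
# Print the number of entries in the "images" array of a JSON response.
count_images() {
  python3 -c 'import json,sys; print(len(json.load(sys.stdin).get("images", [])))'
}

curl -s -X POST "http://localhost:8000/text-to-image" \
  -H "Content-Type: application/json" \
  -d '{"model_id":"ByteDance/SDXL-Lightning","prompt":"A cool cat on the beach","width":512,"height":512}' \
  | count_images 2>/dev/null || echo "runner not reachable or returned non-JSON"
```

A count of zero on a successful HTTP response usually means the request hit an error path; check the runner container logs in that case.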

Choose your AI path

The AI runner is active. Choose the workload path you want to specialise in next.

Set up batch AI inference

Configure image, audio, and video generation pipelines with model downloads, pricing, and activation guidance.

Set up Cascade AI

Configure ComfyStream for persistent video processing with the GPU allocation needed for live workloads.
Last modified on March 16, 2026