

From Setup to Chat: Running NVIDIA’s Multi‑Agent Chatbot on Dell Pro Max with GB10
Building and running a multi‑agent AI chatbot no longer needs a data center or a complex cluster. With the Dell Pro Max with GB10, you can stand up a powerful, GPU‑accelerated chatbot locally in a matter of minutes—then actually see how your GPU is being used as you chat.
This article walks through the experience of deploying NVIDIA’s “Build and Deploy a Multi‑Agent Chatbot” playbook on a GB10, explains what’s happening under the hood in simple terms, and shows you how to clean up resources after the demo. It’s written for technically curious readers: you’re comfortable with basic commands and web UIs, but you don’t need to be a DevOps engineer.
What You’ll Build
By the end, you’ll have:
- A local web UI (Spark Chat) running on the GB10
- A multi‑agent chatbot using a local LLM, not a cloud API
- A view into GPU utilization so you can see how your hardware is being exercised in real time
- A clean teardown procedure that frees memory and resources when you’re done
Prerequisites: Getting the GB10 Ready
Before you start the chatbot playbook, make sure your GB10 environment is ready:
- GB10 setup and updates: The GB10 should be fully set up and updated via DGX Dashboard. This ensures you have the latest drivers and components the playbook expects.
- Internet connectivity (initially): The system needs Internet access only during the initial phase, to download the LLM models and to build and pull the necessary containers. After that, you can run the demo offline if you want.
- Local vs. remote access: The steps below assume you’re working directly on the GB10. If you’re connecting remotely from a Windows laptop using NVIDIA Sync, there’s a short section later that shows you how to tunnel the UI over SSH.
Step 1: Launch DGX Dashboard on the GB10
On the GB10 desktop:
- Click “Show apps” at the bottom‑left of the desktop, then click on DGX Dashboard
DGX Dashboard is a web-based interface for monitoring the GB10, installing updates, and launching Jupyter notebooks, all without needing to SSH in and type Linux commands.

Figure 1: DGX Dashboard home screen on the GB10 desktop.
Step 2: Open the Multi‑Agent Chatbot Playbook
- Navigate to https://build.nvidia.com/spark.
- Scroll down and look for the “Build and Deploy a Multi‑Agent Chatbot” Playbook.
- Click into it and switch to the Instructions tab.
This playbook is a guided set of steps for deploying a multi‑agent chatbot using containers, models, and configuration files curated by NVIDIA.

Figure 2: A view of the “Build and Deploy a Multi‑Agent Chatbot” Playbook instructions.
Step 3: Run the Playbook Commands on the GB10
On the GB10:
- Open a Terminal window.
- For each step in the playbook, copy‑paste the commands from the instructions into your terminal and run them in order.
Behind the scenes, these commands:
- Clone the playbook repository (if not already present)
- Fetch and prepare models needed for the chatbot
- Build and start containers that implement the various services and agents
Once the model download step is complete, you can verify the models on disk in:
~/dgx-spark-playbooks/nvidia/multi-agent-chatbot/assets/models
This directory will contain the LLM and related assets used by the chatbot.
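As a quick sanity check, a snippet like the following can confirm the download completed (a sketch; the path assumes the playbook was cloned into your home directory, so adjust it if yours lives elsewhere):

```shell
# Sanity-check the downloaded models (path is an assumption; adjust
# if the playbook was cloned somewhere other than your home directory).
MODELS_DIR="$HOME/dgx-spark-playbooks/nvidia/multi-agent-chatbot/assets/models"

if [ -d "$MODELS_DIR" ]; then
  ls -lh "$MODELS_DIR"    # list the model files and their sizes
  du -sh "$MODELS_DIR"    # total disk space the models occupy
else
  echo "Models directory not found: $MODELS_DIR"
fi
```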
Step 4: Wait for All Containers to Become Healthy
When you reach Step 4 of the playbook, you’ll run a watch command that continuously displays the status of the containers, something like:
watch 'docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"'
You’ll see a table listing container IDs, names, and statuses. At first, some containers may be starting or downloading images; give it time until they all show as “healthy” or “up” in the status column.
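If you’d rather not eyeball the table, a small helper like this can poll until nothing reports as starting or unhealthy (a sketch; the `wait_for_healthy` name is ours, and it assumes the playbook’s containers define Docker health checks, which is what makes “healthy” appear in the status column):

```shell
# Poll Docker until no container reports "starting" or "unhealthy".
wait_for_healthy() {
  while docker ps --format '{{.Status}}' | grep -Eq 'starting|unhealthy'; do
    sleep 5   # re-check every five seconds
  done
  echo "All containers report healthy"
}
```

Run `wait_for_healthy` in a second terminal; it returns once every container’s status line is clean.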

Figure 3: View of the status command output confirming that all services required for the chatbot are running and healthy.
Step 5: Open the Spark Chat UI
Once the containers are healthy, move to Step 5 of the playbook and open the UI:
- On the GB10, open a browser and go to: http://localhost:3000
You should see the Spark Chat UI, a web interface where you can interact with the multi‑agent chatbot and switch between different modes, including “Chat / Local LLM” and image processing.

Figure 4: Spark Chat UI home screen running locally on the GB10, served by the containers you just launched.
Optional: Accessing the UI Remotely from a Laptop
If you’re not sitting at the GB10 and instead are connected from a Windows laptop using NVIDIA Sync, you can still reach the same UI through SSH port forwarding.
On your Windows laptop:
- Open PowerShell and run:
ssh -L 3000:localhost:3000 -L 8000:localhost:8000 username@IP-address
- Replace username and IP-address with the GB10’s actual username and IP address.
If successful, your prompt will change to indicate you’re now logged into the GB10 over SSH (an example run is shown below).

Figure 5: PowerShell window on a remote laptop showing the SSH command with port forwarding to the GB10.
Now on your laptop’s browser, you can visit: http://localhost:3000
This forwards your local port 3000 to the GB10’s port 3000, so you see the same Spark Chat UI as if you were on the GB10 itself.
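If you connect to the GB10 regularly, the same tunnel can live in your SSH configuration so the whole command shrinks to `ssh gb10` (a sketch; the host alias `gb10` and the angle-bracket placeholders are ours, so substitute your GB10’s real address and username):

```
# ~/.ssh/config entry on the laptop (alias name is hypothetical)
Host gb10
    HostName <GB10-IP-address>
    User <username>
    LocalForward 3000 localhost:3000
    LocalForward 8000 localhost:8000
```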
Step 6: Chat with the Local LLM and Watch the GPU
Back in the Spark Chat UI:
- Click on “Chat / Local LLM”.
- Type a prompt—for example:
- “Explain how a multi‑agent chatbot works in simple terms.”
- While the chatbot responds, observe GPU utilization using your preferred monitoring method (such as DGX Dashboard or nvidia-smi).
You’ll see that as you send prompts and receive responses, the GB10’s GPU usage jumps, reflecting the LLM inference workload.
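For a terminal-side view of that jump, nvidia-smi can be refreshed continuously while you chat (a sketch; the `gpu_watch` helper name is ours, while the query fields are standard nvidia-smi options):

```shell
# Refresh GPU utilization and memory figures once a second while the
# chatbot answers (requires the NVIDIA driver's nvidia-smi tool).
gpu_watch() {
  watch -n 1 'nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv'
}
```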

Figure 6: Spark Chat interface in Chat / Local LLM mode, showing a prompt, the model’s response, and real‑time memory/GPU utilization.
You can also:
- Click the hamburger menu (three horizontal lines) at the top left to start a new chat or explore additional features.
- Switch between modes and explore how different agents or tools behave.

Figure 7: Spark Chat Main Menu Selections.
Image Processing: Another Way to See Multi‑Agent Power
The Spark Chat UI also includes an Image Processor mode. Here, the model analyzes images instead of (or in addition to) text prompts.
Try:
- Select Image Processor from the UI.
- Use the default prompt (or slightly customize it) to analyze an example image.

Figure 8: Image Processor view with a sample image loaded and a default analysis prompt.

Figure 9: Image Processor mode, where the model analyzes the content of an image.
This mode again drives GPU utilization and demonstrates how the same underlying infrastructure can support multi‑modal (text + image) use cases.
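The port-forwarding step earlier also exposed port 8000, which suggests the backend can be reached directly as well as through the UI. Purely as an illustration, assuming the stack serves an OpenAI-compatible chat endpoint on that port (the playbook does not confirm this, so treat the URL, path, and JSON shape as hypothetical), a raw request might look like:

```shell
# Hypothetical direct request to the backend on port 8000; the endpoint
# path and payload assume an OpenAI-compatible API, which this playbook
# does not guarantee. Adjust to whatever the running service exposes.
chat_request() {
  curl -s http://localhost:8000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"messages": [{"role": "user", "content": "Hello"}]}'
}
```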
Cleaning Up Containers After the Demo
Once you’re done with the demo, it’s good practice to clean up containers and free resources. This also helps if you plan to present the demo multiple times and want to start with a clean slate.
First, check what’s currently running:
docker ps
Then, from the assets directory:
cd dgx-spark-playbooks/nvidia/multi-agent-chatbot/assets
docker compose -f docker-compose.yml -f docker-compose-models.yml down
docker volume rm "$(basename "$PWD")_postgres_data"
These commands:
- Stop and remove the containers
- Remove the associated PostgreSQL volume (used for state)

Figure 10: Terminal showing the multi‑agent chatbot containers being brought down and the Postgres volume removed.
After this, docker ps should show no containers from the chatbot stack running.
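A quick way to double-check the teardown (a sketch; the `verify_cleanup` name is ours, and the volume filter matches the `postgres_data` suffix used by the rm command above):

```shell
# Confirm nothing from the chatbot stack is left running or stored.
verify_cleanup() {
  echo "Remaining containers (expect none from the chatbot stack):"
  docker ps --format 'table {{.Names}}\t{{.Status}}'
  echo "Matching volumes (expect no output):"
  docker volume ls --filter 'name=postgres_data' --format '{{.Name}}'
}
```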
To rerun the demo, simply repeat the same steps above.
Why This Matters
Running a multi‑agent chatbot locally on the GB10 isn’t just a cool demo; it’s a practical pattern:
- Data control: Keep prompts and responses on‑prem, which is critical for sensitive domains.
- Performance: Leverage the GB10’s GPU to get low‑latency responses without relying on cloud endpoints.
- Experimentation: Easily iterate on prompts, agents, and workflows, then tear down and rerun as needed.
- Education and enablement: This is a concrete way to show teams how containers, models, and orchestration come together to deliver AI applications.
Whether you’re preparing a live demo, testing local workloads, or just exploring what your GB10 can do, this playbook gives you a repeatable, GPU‑accelerated starting point.
Summary
We walked through:
- Preparing the GB10 and launching DGX Dashboard
- Using NVIDIA’s “Build and Deploy a Multi‑Agent Chatbot” playbook
- Monitoring container health and GPU usage
- Interacting with the chatbot in Chat / Local LLM and Image Processor modes
- Cleaning up containers and freeing resources
With these steps, your GB10 becomes a compact, powerful platform for hands‑on AI experimentation using multi‑agent workflows and local LLMs.
For on-demand hands-on labs, interactive demos, and video shorts specific to Dell Pro Max with GB10, feel free to access the Dell Demo Center Room at https://democenter.dell.com/demoroom/39915c64-b483-4add-8e5b-ac7d9f1052e0
Experience the setup and capabilities of Dell Pro Max with GB10
Jump into the Dell Pro Max with GB10 Demo Room on Dell Demo Center to see the technology in action, from unboxing and setup to exploring the Blueprints and capabilities for on-the-box developers, including how to run the multi-agent demo above.
About Dell Demo Center
Explore a product for the first time with a self-guided demo or step through complex scenarios in a fully guided virtual lab. Demo Center puts you in the driver’s seat to experience the latest products and solutions from Dell Technologies.
