Vantage Expands Kubernetes Agent GPU Support to Neocloud Providers

Today, Vantage is announcing expanded GPU support in the Vantage Kubernetes agent for neocloud providers, including DigitalOcean, CoreWeave, Nebius, and other emerging GPU-focused cloud platforms. Customers can now track, allocate, and optimize GPU-backed workloads running on neocloud Kubernetes clusters with the same visibility they have for traditional cloud providers.

Previously, Vantage's GPU cost monitoring capabilities were limited to major cloud providers, such as AWS and Azure. Customers running GPU workloads on neocloud platforms had to rely on provider-native billing, with no way to allocate costs at the cluster, namespace, or pod level. As a result, teams running training or inference workloads on neoclouds couldn't accurately attribute GPU costs to specific projects or teams.

Now, with expanded neocloud GPU support, Vantage customers can deploy the Kubernetes agent to clusters on DigitalOcean, CoreWeave, Nebius, and similar providers to monitor GPU consumption and perform cost allocation at the cluster, namespace, pod, or label level. This enables detailed reporting in Cost Reports, allocation via Virtual Tags, and utilization analysis in Kubernetes Efficiency Reports, including GPU memory utilization, idle costs, and rightsizing recommendations to optimize neocloud GPU spend.

Neocloud GPU support is available to all Vantage customers starting today. To get started, deploy the Vantage Kubernetes agent in your neocloud clusters, enable GPU metrics collection, and configure GPU rate annotations on your nodes. For detailed setup instructions, see the Neocloud GPU Support documentation.

Frequently Asked Questions

1. What is being launched today?

Vantage is launching expanded GPU support for the Kubernetes agent on neocloud providers, like DigitalOcean, CoreWeave, and Nebius, allowing customers to view and allocate GPU-backed workloads in the Vantage console.

2. Who is the customer?

Any customer running GPU-backed Kubernetes workloads on neocloud providers, particularly those with AI/ML training and inference workloads.

3. How much does this cost?

There is no additional cost for neocloud GPU cost monitoring.

4. How does it work?

Customers specify GPU cost rate annotations on each node in their Kubernetes cluster, similar to the on-premises approach, as shown in the example below:

metadata:
  annotations:
    gpu_gb_hourly_rate: "0.004"
    ram_gb_hourly_rate: "0.0012"
    vcpu_hourly_rate: "0.0025"

The Vantage Kubernetes agent ingests these annotations and applies them when calculating costs. The agent also needs to be configured to collect GPU metrics from the DCGM exporter, which many neocloud providers deploy natively. If your provider has a pre-deployed exporter, you can configure the agent to point to it using the exporter namespace, service name, port, and path settings.

5. How are GPU costs calculated?

When a node includes GPUs, the agent uses the gpu_gb_hourly_rate annotation to allocate costs. The number of GPUs requested by the pod dictates how much of the total GPU memory is allocated to the pod. Idle costs are calculated using the difference between allocated and used GPU memory from the DCGM exporter metrics.
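The arithmetic described above can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the function name, its parameters, and the assumption that gpu_gb_hourly_rate is a price per GB of GPU memory per hour are all hypothetical, and the figures in the example are made up.

```python
def gpu_pod_costs(gpus_requested, gpu_memory_gb_per_gpu, used_memory_gb,
                  gpu_gb_hourly_rate, hours=1.0):
    """Illustrative GPU cost split for one pod (hypothetical helper).

    Assumes the rate is priced per GB of GPU memory per hour and that a pod
    requesting N whole GPUs is allocated all of the memory on those GPUs.
    """
    # Memory allocated to the pod follows from the number of GPUs it requests.
    allocated_gb = gpus_requested * gpu_memory_gb_per_gpu
    # Idle memory is what was allocated but not used (never negative).
    idle_gb = max(allocated_gb - used_memory_gb, 0.0)
    return {
        "allocated_gb": allocated_gb,
        "total_cost": allocated_gb * gpu_gb_hourly_rate * hours,
        "idle_cost": idle_gb * gpu_gb_hourly_rate * hours,
    }


# Example: a pod requesting 2 GPUs with 80 GB each, using 100 GB in total,
# on a node annotated with gpu_gb_hourly_rate: "0.004".
costs = gpu_pod_costs(gpus_requested=2, gpu_memory_gb_per_gpu=80,
                      used_memory_gb=100, gpu_gb_hourly_rate=0.004)
print(costs)
```

In this example, 160 GB is allocated and 60 GB sits idle, so a little over a third of the pod's hourly GPU cost would be reported as idle.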

6. How do I calculate the hourly rates for my annotations?

If you know the total hourly cost of your neocloud host but need to derive per-resource rates (GPU, CPU, RAM), you can use a normalization formula that distributes the total cost proportionally across resources based on standard cloud pricing ratios. See the On-Premises Rate Calculation section in the documentation for step-by-step instructions and examples.
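As a rough sketch of what such a normalization can look like, the snippet below scales a set of reference per-unit rates so that they add up to a known host price. It is an assumption-laden illustration, not the formula from the documentation: the function name is hypothetical, the reference rates simply reuse the example annotation values from question 4 as placeholders, and the host size and price are invented.

```python
def derive_hourly_rates(total_hourly_cost, quantities, reference_rates):
    """Scale reference per-unit rates so they sum to the host's hourly cost.

    quantities:      units of each resource on the host, keyed by annotation name
    reference_rates: relative per-unit prices (placeholders, not real pricing)
    """
    # Cost the host would have if priced purely at the reference rates.
    weighted_total = sum(reference_rates[name] * quantities[name]
                         for name in quantities)
    if weighted_total <= 0:
        raise ValueError("reference rates and quantities must be positive")
    # One scale factor keeps the ratios between resources unchanged.
    scale = total_hourly_cost / weighted_total
    return {name: reference_rates[name] * scale for name in quantities}


# Hypothetical host: 8 GPUs x 80 GB, 128 vCPUs, 1024 GB RAM, $20.00 per hour.
quantities = {"gpu_gb_hourly_rate": 640, "vcpu_hourly_rate": 128,
              "ram_gb_hourly_rate": 1024}
reference_rates = {"gpu_gb_hourly_rate": 0.004, "vcpu_hourly_rate": 0.0025,
                   "ram_gb_hourly_rate": 0.0012}
rates = derive_hourly_rates(20.0, quantities, reference_rates)
for name, rate in rates.items():
    print(f'{name}: "{rate:.6f}"')
```

Because every rate is multiplied by the same factor, the derived rates times their quantities add back up to the host's hourly price, which is a useful sanity check before annotating nodes.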

7. Which neocloud providers are supported?

DigitalOcean, CoreWeave, Nebius, and any neocloud provider where you can deploy the Vantage Kubernetes agent, configure custom node annotations, and have access to a DCGM exporter (either provider-deployed or self-installed).

8. Which GPU manufacturers are supported?

At this time, only metrics exported by NVIDIA/dcgm-exporter are supported. If you have a different manufacturer you want to see supported, contact support@vantage.sh.

9. What needs to be installed on my cluster?

Usage data is collected via the NVIDIA/dcgm-exporter. Many neocloud providers deploy this natively, but it can also be installed as part of the NVIDIA/gpu-operator or independently. The agent scrapes the exporter directly. The exporter must expose the DCGM_FI_DEV_FB_USED and DCGM_FI_DEV_FB_FREE metrics. Optionally, if DCGM_FI_DEV_FB_TOTAL is available, it will be used; otherwise, the agent will calculate total memory using DCGM_FI_DEV_FB_FREE and DCGM_FI_DEV_FB_RESERVED as a fallback.
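The fallback described above can be pictured with a small sketch. The helper name and the dictionary input are hypothetical, and summing used, free, and reserved memory is an assumption about how the total is derived when DCGM_FI_DEV_FB_TOTAL is missing; the agent's actual logic may differ.

```python
def total_fb_memory(metrics):
    """Resolve total GPU framebuffer memory from DCGM exporter samples.

    `metrics` maps metric names to values for a single GPU, in whatever
    unit the exporter reports (hypothetical input shape).
    """
    # Prefer the explicit total when the exporter exposes it.
    total = metrics.get("DCGM_FI_DEV_FB_TOTAL")
    if total is not None:
        return total
    # Assumed fallback: total = used + free + reserved (reserved may be absent).
    used = metrics["DCGM_FI_DEV_FB_USED"]
    free = metrics["DCGM_FI_DEV_FB_FREE"]
    reserved = metrics.get("DCGM_FI_DEV_FB_RESERVED", 0)
    return used + free + reserved


# Made-up sample values for one GPU without DCGM_FI_DEV_FB_TOTAL.
sample = {
    "DCGM_FI_DEV_FB_USED": 30000,
    "DCGM_FI_DEV_FB_FREE": 50000,
    "DCGM_FI_DEV_FB_RESERVED": 1559,
}
print(total_fb_memory(sample))
```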

10. How can I see the GPU costs in Vantage?

Navigate to any Kubernetes Efficiency Report and set the Category filter equal to gpu. This will show GPU idle and total costs by cluster. You can add additional filters and group the report for more specific views. GPU costs also appear in Cost Reports and can be used for allocation via Virtual Tags.

11. Are there any GPU-specific settings I need to configure on the agent?

Yes. Enable GPU metrics collection by passing --set agent.gpu.usageMetrics=true when installing or upgrading the agent. If your provider has a pre-deployed DCGM exporter in a non-standard location, you may also need to configure the exporter namespace, service name, port name, and path. See the Neocloud GPU Support documentation for details.

12. What is the minimum version number required for GPU tracking?

You must be on Vantage Kubernetes agent version 1.2.0 (Helm Chart v1.4.0) or higher.

13. Can I combine neocloud GPU costs with my other cloud costs?

Yes, you can view all Kubernetes workloads across providers in both Kubernetes Efficiency Reports and Cost Reports for unified reporting. This allows you to see neocloud GPU costs alongside your other costs, such as AWS or Azure, in a single view.

14. Are fractional GPU requests supported?

At this time, the Vantage Kubernetes agent collects only whole GPU requests. If you use fractional GPUs and want those represented in cost monitoring, please contact support@vantage.sh.
