Vantage Launches LLM Token Allocation in Private Preview

Allocate model provider costs to metadata like team, user, or application for more granular tracking and allocation of AI costs.

Author: Vantage Team

Today, Vantage announces the private preview of LLM Token Allocation, a new capability that allocates model provider costs to metadata about their usage, such as Team, User, or Application, for more granular tracking and allocation of AI costs. Customers can connect token observability data, or upload their own token consumption data via CSV, and Vantage automatically enriches cost data with application-level metadata, making it available throughout Vantage, including in Cost Reports and Virtual Tags.

LLM Token Allocation enriches AI cost data with metadata such as Model and Team

As developers increasingly use AI-powered tools like Cursor, AWS Bedrock, Claude Code, and Codex, tokenized LLM usage is creating new, high-variance engineering spend that contributes not only to Cost of Goods Sold (COGS) but also to R&D budgets. To create transparency around these costs, engineering and FinOps teams need to surface token consumption by granular measures such as developer and feature, measure developer and agent ROI, and apply FinOps tooling to R&D spending. Without this visibility, organizations struggle to justify token spend and optimize for cost-effective developer productivity. The data needed for this allocation is scattered across billing systems, observability platforms, LLM gateways, and manual mappings, forcing teams to build custom data pipelines to piece together a comprehensive allocation strategy.

Now, with the launch of LLM Token Allocation, customers connect consumption data sources to Vantage, and costs are automatically allocated to attributes not available within provider billing data, such as user, purpose, team, or any other custom values. Customers add these enrichment data sources through the "Token Allocation" integration on the Integrations page, using either native observability connections (e.g., Datadog, Amazon CloudWatch logs) or a Custom Provider dataset uploaded via CSV. Vantage reads per-request LLM observability records, joins them to the corresponding cost rows, and enriches those rows with any metadata provided by the inference monitoring metrics. Enriched cost and usage data flows directly into Cost Reports, Virtual Tags, Segments, and Alerts, with no additional tooling required.

LLM Token Allocation is available today in private preview for OpenAI and AWS Bedrock integrations. More model providers will be added in the future. To request access, contact your Vantage account team.

Frequently Asked Questions

1. What is being launched today?

Vantage is launching LLM Token Allocation, which lets customers enrich AI billing data from any provider with the observability metrics needed for cost allocation and analysis. The feature is launching in private preview with support for OpenAI and AWS Bedrock cost allocation, with more providers to follow.

2. Who is the customer?

This feature is designed for organizations using AI at scale who need accurate cost allocation back to teams, applications, and users.

3. How much does LLM Token Allocation cost?

There is no additional cost for LLM Token Allocation.

4. How does LLM Token Allocation work?

Customers link an inference consumption data source (e.g., Datadog, CloudWatch logs) to their existing cost integration in Vantage, such as OpenAI or AWS (Bedrock). Vantage joins per-request token consumption data to the relevant cost rows, making the inference metadata available on those rows. This metadata then becomes available when viewing AI costs in Cost Reports, defining cost allocation rules with Virtual Tags, and using other Vantage reports.
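
To make the join concrete, here is a rough conceptual sketch in Python. The field names and the proportional token-based split are illustrative assumptions for exposition, not Vantage's actual schema or implementation.

    from collections import defaultdict

    def enrich_cost_rows(cost_rows, usage_records):
        """Conceptual sketch: attach usage metadata (e.g., team) to cost rows.

        cost_rows:     billing rows, e.g. {"date", "model", "cost"}
        usage_records: per-request observability rows, e.g.
                       {"date", "model", "tokens", "team"}
        """
        # Aggregate tokens per (date, model) and team so each cost row can
        # be split proportionally across the teams observed that day.
        tokens = defaultdict(lambda: defaultdict(int))
        for rec in usage_records:
            tokens[(rec["date"], rec["model"])][rec["team"]] += rec["tokens"]

        enriched = []
        for row in cost_rows:
            splits = tokens.get((row["date"], row["model"]))
            if not splits:
                enriched.append(row)  # no match: row stays unenriched
                continue
            total = sum(splits.values())
            for team, t in splits.items():
                enriched.append({**row, "cost": row["cost"] * t / total,
                                 "team": team})
        return enriched

The essential idea is simply that the per-request usage rows carry the metadata while the billing rows carry the cost, and the join brings the two together.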

5. What token consumption data sources are supported?

At launch, Vantage supports three ingestion methods:

  • Amazon CloudWatch logs stored in S3 (native integration): Customers using AWS Bedrock can connect directly to CloudWatch, where AWS publishes per-model invocation metrics. Model Invocation Logging must be enabled (a CLI sketch for enabling it follows this list). Vantage automatically pulls and joins this data without manual export.
  • Datadog (native integration): Customers using Datadog for observability can connect it directly by providing the query used to access the metrics.
  • Custom Provider upload: Customers can upload a structured CSV containing per-request token consumption (date, model, API key, cost, and optional metadata fields) via the existing Vantage Custom Provider integration.
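
For the CloudWatch path, here is a sketch of enabling Bedrock Model Invocation Logging with the AWS CLI, assuming you deliver logs to an S3 bucket. The bucket name and prefix are placeholders; confirm the full set of options against the AWS documentation.

    aws bedrock put-model-invocation-logging-configuration \
      --logging-config '{
        "s3Config": {
          "bucketName": "my-bedrock-invocation-logs",
          "keyPrefix": "bedrock/"
        }
      }'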

We also plan to launch support for LiteLLM and OpenRouter. If there are other AI Gateways you are interested in, please contact support@vantage.sh.

6. How do I create a connection with an S3 bucket that contains CloudWatch Logs?

Vantage consumes CloudWatch Logs through a Vantage-owned AWS IAM role that is granted access to the S3 bucket storing your CloudWatch Logs. If you already have an S3 bucket containing CloudWatch Logs, Vantage automatically detects and lists the buckets available for integration.

The integration must be completed for each AWS account that owns an S3 bucket containing CloudWatch Logs. You can complete this integration via the AWS CLI, AWS Management Console, or the Vantage Terraform provider.
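
As a rough illustration of the permission grant, a bucket policy along these lines gives a cross-account IAM role read access to the log bucket. The account ID, role name, and bucket name below are placeholders; use the Vantage-provided role from your integration settings.

    aws s3api put-bucket-policy \
      --bucket my-cloudwatch-logs-bucket \
      --policy '{
        "Version": "2012-10-17",
        "Statement": [{
          "Effect": "Allow",
          "Principal": {"AWS": "arn:aws:iam::123456789012:role/vantage-integration"},
          "Action": ["s3:GetObject", "s3:ListBucket"],
          "Resource": [
            "arn:aws:s3:::my-cloudwatch-logs-bucket",
            "arn:aws:s3:::my-cloudwatch-logs-bucket/*"
          ]
        }]
      }'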

7. How do I create a connection with Datadog logging metrics?

Vantage connects to Datadog using raw queries against your logs. You can copy the query by clicking the code icon in the top-right corner, or extract it from any dashboard graph in Datadog by clicking the graph's edit button and taking the metric query from there. More information about the structure of Datadog metric queries is available in Datadog's documentation.
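
As a hypothetical example, a metric query of the following shape would report token counts grouped by team and model. The metric name and tags depend entirely on how your own instrumentation emits them; they are not names Vantage requires.

    sum:llm.request.tokens{service:chat-api} by {team, model}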

8. What format should a CSV have to upload as a Custom Provider?

Custom Provider uploads should follow the FinOps FOCUS schema, with the associated inference metadata mapped as key-value tag pairs. Any matching fields from the FOCUS spec are used to filter to the corresponding rows from the provider. For instance, service, model, and project are matched for OpenAI rows if they have corresponding values in the uploaded FOCUS file.
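
A minimal illustrative upload might look like the following. The column subset and tag names here are assumptions; confirm the required fields against the FOCUS specification and your Vantage account team.

    ChargePeriodStart,ChargePeriodEnd,ServiceName,ResourceId,BilledCost,Tags
    2025-06-01,2025-06-02,OpenAI,gpt-4o,12.48,"{""team"": ""platform"", ""project"": ""chat-app""}"

Here the Tags column carries the key-value pairs (team, project) that become the enrichment metadata.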

9. Does this require changes to my existing OpenAI or AWS integrations?

No. LLM Token Allocation is an additive feature that does not impact existing integrations. Note, however, that OpenAI does not natively log inference metrics, so you will need to set up dedicated logging and storage of these metrics to feed them into Vantage (see the sketch below).
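
As one hypothetical way to produce that logging, the sketch below records per-request token usage from the OpenAI Python SDK to a local CSV, tagged with the metadata you want to allocate against. The tag names and file destination are illustrative; in practice you might ship the same fields to Datadog or CloudWatch instead.

    # Minimal sketch: capture per-request OpenAI token usage with metadata
    # for later export to Vantage. Tag names (team, user) are illustrative.
    import csv
    import datetime
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def chat_with_usage_log(messages, team, user, log_path="token_usage.csv"):
        response = client.chat.completions.create(model="gpt-4o",
                                                  messages=messages)
        usage = response.usage
        with open(log_path, "a", newline="") as f:
            csv.writer(f).writerow([
                datetime.date.today().isoformat(), "gpt-4o",
                usage.prompt_tokens, usage.completion_tokens,
                team, user,  # metadata the costs can be allocated to
            ])
        return response.choices[0].message.content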

10. What happens to costs that have no or partial matching consumption data?

Cost rows with no matching records in the token consumption source are left unenriched; no metadata is added. Likewise, any leftover tokens in the cost data that are not present in the enrichment data are left as-is.

11. How often is Token Allocation data updated?

Enrichment of cost data happens daily. At the time of processing, Vantage joins the most recent data available (through a native connection or a Custom Provider upload).

12. Does Vantage see my prompt content?

No. Enrichment uses only metadata from your data source; prompt and completion text are never collected or stored.
