Vantage Launches Ephemeral Workload Monitoring

by Vantage Team


Today, Vantage is launching Ephemeral Workload Monitoring, which lets customers configure custom polling intervals as short as 5 seconds for the Vantage Kubernetes Agent. Customers can now achieve more granular monitoring and improved visibility into short-lived Kubernetes workloads, such as data processing or pipeline build tasks, for use in chargeback and utilization reporting.

Customers use the Vantage Kubernetes Agent to gain detailed visibility into in-cluster costs, allowing them to monitor cost by container, service, namespace, and label, and to optimize resource allocation across their Kubernetes environments. Previously, the Vantage Kubernetes Agent polled cluster metrics at one-minute intervals. As a result, customers with short-lived workloads could not accurately monitor their Kubernetes usage, and a pod went entirely unmonitored if the agent did not poll during its lifecycle.

Now, with the launch of Ephemeral Workload Monitoring, customers can specify polling frequencies to capture shorter-lived job executions or ephemeral containers, and achieve finer-grained insights into resource utilization. Vantage customers can choose a polling interval between 5 and 60 seconds using the agent.pollingInterval parameter of the Helm chart, and the Vantage agent will send requests to the kubelet at the indicated interval.

To enable Ephemeral Workload Monitoring, Vantage customers can upgrade their Kubernetes Agent with image.tag=custom-poll and set their polling interval through agent.pollingInterval. Detailed instructions are available in the Vantage Kubernetes Agent documentation.

Frequently Asked Questions

1. What is being launched today?

Ephemeral Workload Monitoring allows the Vantage Kubernetes Agent to collect metrics at intervals as short as 5 seconds, providing more detailed monitoring of cluster activity.

2. Who is the customer?

The customer is any Vantage user who has the Vantage Kubernetes Agent deployed on a Kubernetes cluster.

3. How much does this cost?

There is no additional cost for using Ephemeral Workload Monitoring. It is included in the cost of the Vantage subscription, including for users on the free tier.

4. Do I need to update my existing agent to use this feature?

Currently, you will need to upgrade to the custom-poll image tag of the Kubernetes Agent. In the future, this feature will be available in the main upgrade path of the Vantage Kubernetes Agent. You can perform this upgrade using the following command:

helm repo update && helm upgrade -n vantage vka vantage/vantage-kubernetes-agent --set image.tag=custom-poll --reuse-values

See the Agent Upgrading documentation for additional instructions.

5. How do I configure my Kubernetes agent polling interval?

You can change the polling interval by setting the agent.pollingInterval parameter of the Helm chart with the --set flag. The value should be an integer giving the polling interval in seconds. You can set it at the same time as you upgrade your agent if you like. Refer to the documentation for additional instructions.

For example: helm upgrade -n vantage vka vantage/vantage-kubernetes-agent --set agent.pollingInterval=30 --reuse-values
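Since the interval can be set during the upgrade itself, the two commands in this FAQ can be combined. The following is a minimal sketch: the release name, namespace, chart, and flags are the ones used above, while the function name upgrade_with_interval is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: upgrade to the custom-poll image and set the polling interval in a
# single helm invocation. The release name (vka), namespace (vantage), chart,
# and flags come from the commands in this FAQ; the function name is
# hypothetical and not part of any Vantage tooling.
upgrade_with_interval() {
  local interval="$1"   # polling interval in seconds, e.g. 30
  helm repo update &&
    helm upgrade -n vantage vka vantage/vantage-kubernetes-agent \
      --set image.tag=custom-poll \
      --set agent.pollingInterval="${interval}" \
      --reuse-values
}
```

For instance, running upgrade_with_interval 30 would upgrade the agent and set a 30-second polling interval in one step.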

6. What values can I set my polling interval to?

The allowed polling intervals are 5, 10, 15, 30, and 60 seconds.

7. What happens if I enter a polling interval that is not in the list of allowed intervals?

The agent will fail to start and will return an error message.
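To catch an invalid value before it reaches the agent, you could check it locally first. This is a minimal sketch based on the allowed values listed above. The function name is_allowed_interval is hypothetical and is not something the agent or Helm chart provides.

```shell
#!/usr/bin/env bash
# Sketch: client-side pre-check against the allowed values listed in this FAQ
# (5, 10, 15, 30, and 60 seconds). The function name is hypothetical.
is_allowed_interval() {
  case "$1" in
    5|10|15|30|60) return 0 ;;   # value is in the allowed list
    *) return 1 ;;               # anything else would stop the agent from starting
  esac
}
```

A deployment script could call is_allowed_interval "$interval" and stop before running helm upgrade if the check fails.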

8. Are there any performance considerations I should be aware of?

Reducing the polling interval increases both the load on the Kubernetes API server and the memory consumption of the Vantage Kubernetes Agent. The agent provides the vantage_last_node_scrape_timestamp_seconds metric, which can tell users if a scrape takes longer than the polling interval. We recommend monitoring system performance and adjusting the interval as needed to balance granularity with resource usage.
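One possible way to use this metric is to compare its value with the current time. The sketch below assumes the metric value is a Unix timestamp in plain decimal seconds, as its name suggests. It does not cover how you read the metric from the agent, and the function name scrape_is_lagging is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag a lagging scrape by comparing the last scrape timestamp with the
# current time. Assumes the metric value is a Unix timestamp in plain decimal
# seconds (not scientific notation). The function name is hypothetical.
scrape_is_lagging() {
  local last_scrape_ts="${1%.*}"   # drop any fractional part
  local now_ts="$2"                # current Unix time, e.g. "$(date +%s)"
  local interval="$3"              # configured polling interval in seconds
  # Succeeds (exit status 0) when the last scrape is older than one interval.
  [ $(( now_ts - last_scrape_ts )) -gt "$interval" ]
}
```

If scrape_is_lagging succeeds repeatedly, that may be a sign the chosen interval is too short for the cluster.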

9. How should I determine the polling interval I should be using?

Your polling interval should be based on the shortest-lived task within your cluster.
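One way to apply this guidance is to pick the largest allowed interval that is no longer than your shortest-lived task, so that at least one poll should fall within its lifetime. The sketch below uses the allowed values from this FAQ. The selection rule and the function name pick_polling_interval are illustrative assumptions, not an official recommendation.

```shell
#!/usr/bin/env bash
# Sketch: choose the largest allowed polling interval (5, 10, 15, 30, 60) that
# does not exceed the duration of the shortest-lived task. The selection rule
# and the function name are illustrative assumptions.
pick_polling_interval() {
  local shortest_task_seconds="$1"
  local candidate
  for candidate in 60 30 15 10 5; do
    if [ "$shortest_task_seconds" -ge "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  # Tasks shorter than 5 seconds may still be missed; 5 is the minimum allowed.
  echo 5
}
```

For a shortest task of 45 seconds, pick_polling_interval 45 prints 30. Weigh the result against the performance considerations before applying it.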

10. How do I check what my polling interval is for a running agent?

You can view the polling interval using the kubectl describe pod/<pod_name> -n vantage command. If you are using our Helm chart, the polling interval is shown in the Environment section as the VANTAGE_POLLING_INTERVAL environment variable.
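To pull out just the value, the describe output can be filtered. This is a minimal sketch that assumes the Environment section prints the variable as a name followed by a colon and its value. The function name extract_polling_interval is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read `kubectl describe` output on stdin and print the value of
# VANTAGE_POLLING_INTERVAL. Assumes the Environment section lists it as
# "VANTAGE_POLLING_INTERVAL:  <value>". The function name is hypothetical.
extract_polling_interval() {
  awk '$1 == "VANTAGE_POLLING_INTERVAL:" { print $2 }'
}
```

Usage: kubectl describe pod/<pod_name> -n vantage | extract_polling_interval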

11. Can I set the polling intervals for specific Kubernetes namespaces or labels?

No. The polling interval applies to the entire cluster.

12. What permissions do I need to configure the polling interval for the Kubernetes agent?

To configure the Kubernetes agent polling interval, you need a Vantage API token with READ and WRITE scopes enabled (it’s recommended to use a service token rather than a personal access token). You must have Owner permissions to create an API token.

13. How quickly does a new polling interval take effect?

As soon as the new agent.pollingInterval value is applied to the agent, the agent starts using it. It is not possible to retroactively monitor past usage.

14. Can I configure my polling interval via API or Terraform?

No, the polling interval can only be configured as a runtime flag to the agent, via Helm or whichever tools you use when installing or upgrading the Vantage Kubernetes Agent.

15. How often are the results of the Kubernetes Agent updated in the Vantage console?

Agent reporting occurs once per hour at the start of the hour, regardless of polling interval.

16. Where were costs previously attributed if they were not collected by the Vantage Kubernetes Agent?

If a pod was previously unmonitored because the agent did not poll during its lifecycle, its costs were attributed to the __idle__ namespace.

17. Where can I learn more or get help?

You can read more in our Kubernetes Agent documentation, or reach out to support@vantage.sh if you require any troubleshooting.