Kubernetes DRA in v1.36: 5 Critical Upgrades for Smarter Resource Management

The Dynamic Resource Allocation (DRA) framework in Kubernetes has changed how platform administrators manage hardware accelerators and other specialized resources. The v1.36 release moves DRA forward with a set of feature graduations and usability enhancements that make resource scheduling more flexible, reliable, and hardware-agnostic. From prioritized fallback options to native support for partitionable devices, these updates address real-world cluster challenges, whether you’re handling GPU fleets, migrating from legacy extended resources, or taming hardware heterogeneity. Below, we break down the five most impactful features in this release.

1. Prioritized Lists: Smarter Fallback for Heterogeneous Hardware

One of the biggest pain points in managing diverse hardware is having to request an exact device model. The Prioritized list feature (now stable) lets you define an ordered set of preferences when requesting devices. Instead of hardcoding a requirement for a specific accelerator like an H100, you can create a fallback chain: “give me an H100 first; if none are available, fall back to an A100.” The scheduler evaluates the list in order and allocates the first alternative it can satisfy, which improves flexibility and cluster utilization. Workloads still get their preferred hardware when it is free, and capacity on other models is not left idle because of a rigid request.
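The fallback chain can be sketched in a few lines of Python. The dict below is modeled on the shape of a ResourceClaim that uses a `firstAvailable` list; the device class names are hypothetical, and `pick_subrequest` is an illustrative helper rather than scheduler code. It only shows the in-order evaluation and ignores selectors, counts, and node constraints.

```python
# Sketch of a ResourceClaim spec as a Python dict, mirroring the manifest shape.
# The class names "h100.example.com" and "a100.example.com" are hypothetical.
claim_spec = {
    "devices": {
        "requests": [
            {
                "name": "gpu",
                "firstAvailable": [
                    {"name": "h100", "deviceClassName": "h100.example.com"},
                    {"name": "a100", "deviceClassName": "a100.example.com"},
                ],
            }
        ]
    }
}


def pick_subrequest(request, free_devices):
    """Return the name of the first alternative with a free device, or None.

    free_devices maps a device class name to the number of unallocated
    devices. This toy model only illustrates the ordering.
    """
    for alternative in request["firstAvailable"]:
        if free_devices.get(alternative["deviceClassName"], 0) > 0:
            return alternative["name"]
    return None


request = claim_spec["devices"]["requests"][0]
# No H100 is free here, so the second entry in the list is chosen.
print(pick_subrequest(request, {"h100.example.com": 0, "a100.example.com": 3}))
```

Because the list is ordered, moving the A100 entry to the top would flip the preference without touching anything else in the claim.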

2. Extended Resource Support: Smooth Bridge to DRA Adoption

Migrating existing clusters to DRA can be challenging, especially when teams still rely on traditional extended resources. The Extended resource feature (now in beta) lets Pods keep requesting devices through the familiar extended resource syntax while the cluster gradually adopts the ResourceClaim model. This gives cluster operators the freedom to roll out DRA incrementally: application developers can keep using their current patterns until they’re ready to switch. The result is a lower-friction transition path toward a more dynamic, standardized resource allocation system.
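To make the bridge concrete, here is a minimal Python sketch of the mapping idea. It assumes a DeviceClass can advertise an extended resource name through an `extendedResourceName` field; the class name `gpu.example.com`, the resource name `example.com/gpu`, and the helper `implied_device_requests` are all hypothetical and only illustrate how an unchanged container spec could be translated into DRA requests.

```python
# Hypothetical DeviceClass that advertises an extended resource name.
device_class = {
    "metadata": {"name": "gpu.example.com"},
    "spec": {"extendedResourceName": "example.com/gpu"},
}

# A container that still uses the extended resource syntax it always used.
container = {
    "name": "trainer",
    "resources": {"limits": {"cpu": "4", "example.com/gpu": "2"}},
}


def implied_device_requests(container, device_classes):
    """Map extended resource limits to (device class name, count) pairs."""
    by_resource = {
        dc["spec"]["extendedResourceName"]: dc["metadata"]["name"]
        for dc in device_classes
        if "extendedResourceName" in dc.get("spec", {})
    }
    limits = container.get("resources", {}).get("limits", {})
    # Only limits whose name matches a DeviceClass are treated as DRA requests;
    # ordinary resources such as "cpu" are left alone.
    return [
        (by_resource[name], int(quantity))
        for name, quantity in limits.items()
        if name in by_resource
    ]


print(implied_device_requests(container, [device_class]))
```

The point of the sketch is that the Pod author changes nothing: the translation lives on the cluster side, which is what makes an incremental rollout possible.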

3. Partitionable Devices: Share Expensive Accelerators Safely

Modern hardware accelerators like GPUs often offer far more capacity than a single workload requires. The Partitionable devices feature (beta) brings native DRA support for dynamically carving physical hardware into smaller logical instances, such as Multi-Instance GPU (MIG) partitions or similar slicing techniques. A resource claim can request a fraction of a device, and the scheduler allocates only that portion while tracking what remains of the physical device. This lets multiple Pods share a costly accelerator, reducing waste and lowering total cost of ownership while keeping the isolation that the partitioning mechanism provides.
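The bookkeeping behind partitioning can be illustrated with a toy model. The sketch below assumes a physical device exposes a single shared counter (here called `memory_slices`) that each partition draws from; the class `PhysicalGPU` and its fields are hypothetical and stand in for whatever capacity information a real driver publishes.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalGPU:
    """Toy model of one physical device with a shared capacity counter."""

    memory_slices: int
    allocated: dict = field(default_factory=dict)  # partition name -> slices

    def free_slices(self):
        return self.memory_slices - sum(self.allocated.values())

    def allocate(self, partition, slices):
        """Reserve slices for a partition; return False if they do not fit."""
        if partition in self.allocated or slices > self.free_slices():
            return False
        self.allocated[partition] = slices
        return True


gpu = PhysicalGPU(memory_slices=8)
print(gpu.allocate("pod-a", 4))  # fits: 4 of 8 slices used
print(gpu.allocate("pod-b", 3))  # fits: 7 of 8 slices used
print(gpu.allocate("pod-c", 2))  # rejected: only 1 slice left
```

Tracking a shared counter instead of a fixed list of partitions is what allows differently sized partitions to coexist on the same card without being overcommitted.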

4. Device Taints: Precise Hardware Governance

Just as node taints control Pod placement on servers, Device taints and tolerations (beta) let you apply the same logic to individual DRA devices. If a GPU starts showing errors, you can taint it to prevent accidental allocation to production workloads. Likewise, you can reserve certain devices for dedicated teams or experimental workloads by requiring matching tolerations. This granular control helps cluster administrators manage hardware health, reserve devices for specific users, and separate critical jobs from less sensitive ones, all without changing application code.
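The matching rule can be sketched in Python. This is a simplified model that assumes device taints follow the same key, value, effect, and operator semantics as node taints; the taint key `example.com/unhealthy` and both helper functions are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint."""
    # A toleration that names an effect only matches taints with that effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key with Exists matches every taint key.
        return toleration.get("key") in (None, "", taint["key"])
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def device_allowed(taints, tolerations):
    """A device is usable only if every one of its taints is tolerated."""
    return all(any(tolerates(tol, t) for tol in tolerations) for t in taints)


taints = [
    {"key": "example.com/unhealthy", "value": "ecc-errors", "effect": "NoSchedule"}
]
debug_toleration = {
    "key": "example.com/unhealthy",
    "operator": "Exists",
    "effect": "NoSchedule",
}

print(device_allowed(taints, []))                  # production claim: blocked
print(device_allowed(taints, [debug_toleration]))  # diagnostic claim: allowed
```

In this model a claim without tolerations never lands on the tainted GPU, while a diagnostic workload that opts in explicitly can still reach it.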

5. Device Binding Conditions: Reliable Scheduling Under Uncertainty

Scheduling reliability suffers when a device looks allocatable but is not actually ready to use, for example hardware that must first be attached or prepared. The Device binding conditions feature (beta) addresses this by letting a device declare conditions that must be met before a Pod that uses it is bound to a node. This prevents premature bindings that would later fail at startup. By delaying the final binding decision until all prerequisites are verified, the scheduler reduces retries and wasted cycles, leading to higher success rates for resource-intensive workloads, especially in large, multi-tenant clusters with complex hardware profiles.
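The wait-then-bind idea reduces to a small decision function. The sketch below is an assumption-laden model, not scheduler code: it supposes a device lists condition types that must all be true before binding and condition types that signal failure, and the condition names used here are hypothetical.

```python
def binding_decision(binding_conditions, failure_conditions, status):
    """Decide whether to bind, keep waiting, or give up.

    status maps a condition type to True/False as last reported for the
    device. Missing entries count as "not yet reported".
    """
    # Any reported failure condition ends the attempt immediately.
    if any(status.get(c) is True for c in failure_conditions):
        return "fail"
    # Bind only once every required condition has been reported true.
    if all(status.get(c) is True for c in binding_conditions):
        return "bind"
    return "wait"


required = ["example.com/attached"]
failures = ["example.com/attach-failed"]

print(binding_decision(required, failures, {}))
print(binding_decision(required, failures, {"example.com/attached": True}))
print(binding_decision(required, failures, {"example.com/attach-failed": True}))
```

Checking failure conditions first means a device that reports both success and failure is treated as failed, which is the conservative choice for this sketch.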

These five enhancements mark a major step forward for DRA in Kubernetes v1.36. Whether you are just starting your DRA journey or managing a mature fleet of heterogeneous devices, these features provide the tools to optimize utilization, simplify migration, and enforce hardware governance at scale. Keep an eye on the growing driver ecosystem: support now extends beyond compute accelerators into networking and other specialized hardware, which positions DRA as a foundation for hardware-agnostic infrastructure.
