Location
https://www.kennesaw.edu/ccse/events/computing-showcase/fa25-cday-program.php
Document Type
Event
Start Date
24-11-2025 4:00 PM
Description
Learning-based schedulers such as Decima can optimize directed acyclic graph (DAG) workloads, yet their robustness under changing workload conditions is not well understood. This project evaluates how a Decima-trained policy transfers across different workload scenarios using an automated training and testing pipeline. Results show that the scheduler generalizes well to a workload with the same job scale, achieving a 1.9% improvement in average job completion time. Performance remains stable under a larger workload, but a shift in arrival pattern leads to an 83.7% increase in completion time and reduced fairness. These findings highlight both the potential and the limitations of learned scheduling policies, emphasizing the need for adaptive methods such as fine-tuning for reliable use in dynamic cluster environments.
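The two quantities reported above, the percentage change in average job completion time and fairness, can be illustrated with a minimal sketch. The abstract does not say how fairness was measured, so Jain's fairness index is used here purely as an assumption. The function names and the sample completion times are hypothetical and are not data from the project.

```python
from statistics import mean


def percent_change(baseline: float, observed: float) -> float:
    """Relative change of observed vs. baseline in percent (negative means faster)."""
    return (observed - baseline) / baseline * 100.0


def jain_index(values: list[float]) -> float:
    """Jain's fairness index: 1.0 when all values are equal, approaching 1/n when one dominates."""
    total = sum(values)
    return total * total / (len(values) * sum(v * v for v in values))


# Hypothetical per-job completion times in seconds (illustrative only).
train_scenario = [10.0, 12.0, 8.0, 10.0]
shifted_scenario = [14.0, 30.0, 9.0, 27.0]

# Change in average job completion time between the two scenarios.
change = percent_change(mean(train_scenario), mean(shifted_scenario))

# Fairness of each scenario under the assumed metric.
fair_train = jain_index(train_scenario)
fair_shifted = jain_index(shifted_scenario)

print(f"avg JCT change: {change:+.1f}%")
print(f"Jain index: {fair_train:.3f} -> {fair_shifted:.3f}")
```

Under this convention, a positive percentage means jobs finish later on average in the second scenario, and a lower index means completion times are spread more unevenly across jobs.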
Included in
GRP-1231 Evaluating Generalization and Adaptation of Learning-Based Schedulers for Directed Acyclic Graph Workloads