Enabling Parameter-Efficient Multi-Tasking via Multi-Head Architecture for Edge Device Deployment
Presentation Type
Article
Location
Kennesaw, Georgia
Start Date
1-4-2026 12:30 PM
End Date
1-4-2026 1:45 PM
Description
Deploying Deep Neural Networks (DNNs) on edge devices requires balancing task performance with strict memory and computational constraints. Conventional approaches to multi-task learning often rely on replicating heavy backbones, which is infeasible for resource-constrained environments. In this work, we propose and rigorously analyze a parameter-efficient multi-branch architecture based on a shared, ImageNet-pretrained ResNet-18 backbone. Instead of deploying independent models for distinct visual tasks, we consolidate feature extraction for multiple tasks into a single, unified framework. We first conduct an ablation study to determine the optimal depth of the task-specific fully connected (FC) classification heads, maximizing learning capacity while minimizing computational overhead. Subsequently, we systematically investigate architectural branching points, diverging the network at different depths to identify the split at which early feature sharing and late-stage task specialization best trade off accuracy against parameter reuse. Our empirical results demonstrate that an optimized branching strategy significantly reduces the total parameter count compared to independent baseline models while maintaining highly competitive task-specific accuracy.
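
For readers who want a concrete picture of the branching scheme, below is a minimal PyTorch sketch of the kind of architecture the abstract describes. The class name, the branch_after parameter, and the two-task setup are illustrative assumptions for this sketch, not the authors' actual implementation.

    import copy

    import torch
    import torch.nn as nn
    from torchvision import models

    class MultiHeadResNet(nn.Module):
        """Shared ResNet-18 trunk that branches into per-task tails and FC heads.

        Illustrative sketch: `branch_after` sets how many of the four residual
        stages (layer1..layer4) are shared before the network diverges.
        """

        def __init__(self, num_classes_per_task, branch_after=3):
            super().__init__()
            backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
            stages = [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]

            # Stem plus the first `branch_after` residual stages are shared,
            # so their parameters are counted once regardless of task count.
            self.shared = nn.Sequential(
                backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
                *stages[:branch_after],
            )

            # Each task gets its own copy of the remaining stages plus a
            # pooling layer and a single FC classification head.
            self.branches = nn.ModuleList(
                nn.Sequential(
                    copy.deepcopy(nn.Sequential(*stages[branch_after:])),
                    nn.AdaptiveAvgPool2d(1),
                    nn.Flatten(),
                    nn.Linear(512, n_classes),  # layer4 of ResNet-18 outputs 512 channels
                )
                for n_classes in num_classes_per_task
            )

        def forward(self, x):
            feats = self.shared(x)  # one pass through the shared trunk
            return [branch(feats) for branch in self.branches]

    # Example: two hypothetical tasks, branching after the third residual stage.
    model = MultiHeadResNet(num_classes_per_task=[10, 100], branch_after=3)
    logits_a, logits_b = model(torch.randn(2, 3, 224, 224))

Sweeping branch_after from 0 (only the stem shared) to 4 (fully shared trunk) reproduces the accuracy-versus-parameter-reuse trade-off the abstract investigates.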