Presentation Type

Article

Location

Kennesaw, Georgia

Start Date

April 1, 2026 12:30 PM

End Date

April 1, 2026 1:45 PM

Description

Deploying Deep Neural Networks (DNNs) on edge devices requires balancing task performance with strict memory and computational constraints. Conventional approaches to multi-task learning often rely on replicating heavy backbones, which is infeasible for resource-constrained environments. In this work, we propose and rigorously analyze a parameter-efficient multi-branch architecture based on a shared, ImageNet-pretrained ResNet-18 backbone. Instead of deploying independent models for distinct visual tasks, we consolidate feature extraction for multiple tasks into a single, unified framework. We first conduct an ablation study to determine the optimal depth of task-specific fully connected (FC) classification heads, maximizing learning capacity while minimizing computational overhead. Subsequently, we systematically investigate architectural branching points, diverging the network at different depths of the model to identify the balance between early feature sharing and late-stage task specialization that best trades off accuracy against parameter reuse. Our empirical results demonstrate that an optimized branching strategy significantly reduces the total parameter count compared to independent baseline models, while maintaining highly competitive task-specific accuracy.


Enabling Parameter-Efficient Multi-Tasking via Multi-Head Architecture for Edge Device Deployment
