Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Abstract
Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it requires several rounds of pruning and re-training to be run for each language. In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, yielding either sparse monolingual models or a sparse multilingual model (named Dynamic ASR Pathways). Our approach dynamically adapts the sub-network, avoiding premature decisions about a fixed sub-network structure. We show that our approach outperforms existing pruning methods when targeting sparse monolingual models. Further, we illustrate that Dynamic ASR Pathways jointly discovers and trains better sub-networks (pathways) of a single multilingual model by adapting from different sub-network initializations, thereby reducing the need for language-specific pruning.
Objective
The paper proposes an adaptive masking approach for efficiently pruning a multilingual automatic speech recognition (ASR) model, yielding either sparse monolingual models or a single sparse multilingual model (Dynamic ASR Pathways).
Insights
- Adaptive masking allows sub-networks to align better with the training data than fixed masking does (see the sketch after this list).
- For monolingual pruning, adapted masks achieve lower word error rates (WER) than fixed masks and multilingual models.
- For multilingual pruning, adaptive masking reduces the need for separate language-specific pruning.
- Initializing from mid-sparsity masks and adapting them gives better performance than using fixed high-sparsity masks.
- Adaptive masking converts language-agnostic masks into language-specific ones, improving performance.
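As a rough illustration of the difference between fixed and adaptive masking, the sketch below periodically re-derives a magnitude-based mask from the current weights instead of freezing the sub-network after pruning. This is a minimal sketch, not the paper's implementation: it assumes a PyTorch model whose batches can be consumed as `model(**batch).loss`, and the names `magnitude_mask`, `sparsity`, and `adapt_interval` are illustrative.

```python
import torch

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Boolean mask keeping the largest-magnitude entries; prunes a `sparsity` fraction."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight, dtype=torch.bool)
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight.abs() > threshold

def train_with_adaptive_masks(model, loader, optimizer, sparsity=0.7, adapt_interval=1000):
    # Initial masks from current magnitudes (only matrix-shaped weights are pruned here).
    masks = {n: magnitude_mask(p.data, sparsity)
             for n, p in model.named_parameters() if p.dim() > 1}
    for step, batch in enumerate(loader):
        # Apply the current masks so the forward/backward pass sees the sub-network.
        with torch.no_grad():
            for n, p in model.named_parameters():
                if n in masks:
                    p.mul_(masks[n])
        loss = model(**batch).loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()  # pruned weights still receive updates and can regrow
        # Mask adaptation: periodically re-evaluate the sub-network from the
        # current weight magnitudes rather than keeping the initial mask fixed.
        if (step + 1) % adapt_interval == 0:
            masks = {n: magnitude_mask(p.data, sparsity)
                     for n, p in model.named_parameters() if p.dim() > 1}
    return masks
```

Because pruned weights keep receiving gradient updates between adaptation steps, they can re-enter the mask at the next re-evaluation; this is what avoids committing prematurely to one fixed sub-network structure.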
Results
The proposed adaptive masking approach consistently outperforms existing pruning methods, achieving up to a 5.8% relative WER reduction in the multilingual pruning setting.
Implementation
- Initialize a dense multilingual ASR model.
- For monolingual pruning:
  - Train the model with monolingual data and apply iterative magnitude pruning (IMP) to obtain language-specific masks.
  - Introduce mask adaptation during training by periodically re-evaluating and adjusting the masks (as in the adaptive-masking sketch above).
- For multilingual pruning:
  - Obtain initial language-specific masks using existing pruning methods.
  - Train language-specific pathways using monolingual batches.
  - Introduce mask adaptation and pruning during training to adjust masks and promote parameter sharing (see the pathway-training sketch after this list).
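A hedged sketch of joint pathway training along these lines is shown below: each monolingual batch activates its language's mask, the gradient update is confined to that pathway, and the current language's mask is periodically re-adapted, which lets pathways share or release parameters. The `(lang, batch)` loader and `masks_by_lang` mapping are hypothetical stand-ins, and `magnitude_mask` is the helper from the earlier sketch.

```python
import torch

def train_pathways(model, loader, optimizer, masks_by_lang,
                   sparsity=0.7, adapt_interval=1000):
    """Jointly train language-specific sub-networks (pathways) of one dense model.

    Assumes `loader` yields (lang, batch) pairs of monolingual batches and
    `masks_by_lang[lang]` maps parameter names to boolean masks for that language.
    """
    for step, (lang, batch) in enumerate(loader):
        masks = masks_by_lang[lang]
        # Activate this language's pathway: zero weights outside its mask,
        # remembering the dense values so other pathways are not destroyed.
        saved = {}
        with torch.no_grad():
            for n, p in model.named_parameters():
                if n in masks:
                    saved[n] = p.detach().clone()
                    p.mul_(masks[n])
        loss = model(**batch).loss
        optimizer.zero_grad()
        loss.backward()
        with torch.no_grad():
            for n, p in model.named_parameters():
                if n in masks:
                    p.copy_(saved[n])      # restore out-of-pathway values
                    p.grad.mul_(masks[n])  # update only in-pathway weights
        optimizer.step()
        # Mask adaptation for the language just trained: re-evaluating from
        # the current magnitudes adjusts which parameters its pathway uses.
        if (step + 1) % adapt_interval == 0:
            masks_by_lang[lang] = {n: magnitude_mask(p.data, sparsity)
                                   for n, p in model.named_parameters() if n in masks}
```

Saving and restoring the dense values around each batch is one simple way to keep a single shared parameter store while each language trains only its own masked view of it; parameters selected by several languages' masks are naturally shared across pathways.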