arxiv:2309.13018

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Published on Sep 22, 2023
Submitted by akhaliq on Sep 25, 2023
#2 Paper of the day

Abstract

Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it entails several rounds of pruning and re-training that must be run for each language. In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, resulting in either sparse monolingual models or a sparse multilingual model (named Dynamic ASR Pathways). Our approach dynamically adapts the sub-network, avoiding premature decisions about a fixed sub-network structure. We show that our approach outperforms existing pruning methods when targeting sparse monolingual models. Further, we illustrate that Dynamic ASR Pathways jointly discovers and trains better sub-networks (pathways) of a single multilingual model by adapting from different sub-network initializations, thereby reducing the need for language-specific pruning.

Community

Objective
The paper proposes an adaptive masking approach for efficiently pruning a multilingual automatic speech recognition (ASR) model to obtain sparse monolingual models or a sparse multilingual model.

Insights

  • Adaptive masking lets sub-networks align better with the training data than fixed masking (a minimal sketch follows this list).
  • For monolingual pruning, adapted masks achieve a lower word error rate (WER) than both fixed masks and multilingual models.
  • For multilingual pruning, adaptive masking reduces the need for separate language-specific pruning runs.
  • Initializing from mid-sparsity masks and adapting them performs better than using fixed high-sparsity masks.
  • Adaptive masking converts language-agnostic masks into language-specific ones, improving performance.
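
The following minimal PyTorch sketch illustrates the core idea behind these insights. It is not the authors' code: the toy linear layer, the sparsity level, and the adaptation schedule are all illustrative assumptions. Instead of committing to a fixed pruning mask, the mask is periodically re-derived from the current weight magnitudes during training.

```python
import torch
import torch.nn as nn

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the (1 - sparsity) fraction of largest-magnitude weights."""
    k = max(1, int(weight.numel() * (1.0 - sparsity)))
    idx = torch.topk(weight.abs().flatten(), k).indices
    mask = torch.zeros(weight.numel(), device=weight.device)
    mask[idx] = 1.0
    return mask.view_as(weight)

# Toy layer and random batches stand in for the ASR model and monolingual data.
model = nn.Linear(64, 32)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
mask = magnitude_mask(model.weight.detach(), sparsity=0.7)  # initial sub-network
with torch.no_grad():
    model.weight.mul_(mask)             # apply the initial mask

for step in range(200):
    x, y = torch.randn(8, 64), torch.randn(8, 32)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    model.weight.grad.mul_(mask)        # update only the active sub-network
    opt.step()
    if (step + 1) % 50 == 0:            # adaptive step: re-evaluate the mask
        mask = magnitude_mask(model.weight.detach(), sparsity=0.7)
        with torch.no_grad():
            model.weight.mul_(mask)     # zero out newly pruned weights
```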

Results
The proposed adaptive masking approach consistently outperforms existing pruning methods, achieving up to a 5.8% relative WER reduction for multilingual pruning.

Implementation

  • Initialize a dense multilingual ASR model.
  • For monolingual pruning:
    • Train the model with monolingual data and apply iterative magnitude pruning to get language-specific masks.
    • Introduce mask adaptation during training by re-evaluating and adjusting the masks dynamically.
  • For multilingual pruning:
    • Obtain initial language-specific masks with a standard pruning method (e.g., iterative magnitude pruning, as above).
    • Train the language-specific pathways on monolingual batches.
    • Introduce mask adaptation and pruning during training to adjust the masks and promote parameter sharing across languages (see the sketch after this list).
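
Below is a minimal, self-contained sketch of the multilingual scenario. It is hypothetical, not the paper's implementation: the language list, the toy shared layer, the round-robin batching, and the adaptation interval are all assumptions. One dense model is shared; each language activates its own binary mask (its pathway) on monolingual batches, and each pathway is periodically re-pruned by magnitude so the masks can diverge from a common initialization while overlapping weights remain shared.

```python
import torch
import torch.nn as nn

LANGS = ["en", "fr", "de"]    # illustrative language set
SPARSITY = 0.7

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude weights at the given sparsity."""
    k = max(1, int(weight.numel() * (1.0 - sparsity)))
    idx = torch.topk(weight.abs().flatten(), k).indices
    mask = torch.zeros(weight.numel())
    mask[idx] = 1.0
    return mask.view_as(weight)

shared = nn.Linear(64, 32)    # stands in for the dense multilingual ASR model
opt = torch.optim.SGD(shared.parameters(), lr=1e-2)

# All pathways start from the same language-agnostic mask and are expected
# to become language-specific as they adapt.
masks = {lang: magnitude_mask(shared.weight.detach(), SPARSITY) for lang in LANGS}

for step in range(300):
    lang = LANGS[step % len(LANGS)]            # round-robin monolingual batches
    x, y = torch.randn(8, 64), torch.randn(8, 32)
    pathway_w = shared.weight * masks[lang]    # activate this language's pathway
    loss = nn.functional.mse_loss(
        nn.functional.linear(x, pathway_w, shared.bias), y)
    opt.zero_grad()
    loss.backward()                            # grads flow only through active weights
    opt.step()                                 # overlapping weights are shared
    if (step + 1) % 100 == 0:                  # adapt this language's mask
        masks[lang] = magnitude_mask(shared.weight.detach(), SPARSITY)
```

In a real setup, the toy layer would be replaced by every prunable weight matrix in the ASR model, and the random tensors by language-tagged training batches.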
