DSD Survey

둔비의 공부공간 (Doonby's Study Space)

Papers/Compression

Doonby 2023. 4. 12. 12:24

A brief summary of follow-up papers that cite the DSD (Dense-Sparse-Dense) paper.

Monarch: Expressive Structured Matrices for Efficient and Accurate Training

Existing methods for reducing compute and memory had several problems.
- (Methods)
  - Replace dense weight matrices with structured ones, such as sparse and low-rank matrices and the Fourier transform.
- An unfavorable efficiency vs. quality trade-off
- For dense-to-sparse fine-tuning, a lack of tractable algorithms for approximating a dense weight matrix
The paper proposes Monarch, a class of hardware-efficient and expressive matrices.
- Hardware-efficient: parameterized as products of two block-diagonal matrices for better hardware utilization
- Expressive: can represent many commonly used transforms
When a Monarch matrix is used to approximate a dense weight matrix, the approximation problem has an analytical optimal solution.
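
To make the "product of two block-diagonal matrices" idea concrete, here is a minimal pure-Python sketch of a Monarch-style matrix-vector product. It assumes the form M = P L P^T R with n = m * m, where L and R each hold m blocks of size m x m and P is the "reshape to m x m, transpose, flatten" permutation. The permutation and all function names (`block_diag_matvec`, `transpose_perm`, `monarch_matvec`, `monarch_param_count`) are illustrative assumptions, not taken from the notes above or from the paper's code.

```python
def block_diag_matvec(blocks, x):
    """Multiply a block-diagonal matrix by x.

    `blocks` is a list of m square blocks, each m x m (list of rows),
    so the full matrix is (m*m) x (m*m) but is never materialized.
    """
    m = len(blocks)
    out = []
    for b, block in enumerate(blocks):
        chunk = x[b * m:(b + 1) * m]  # the slice this block acts on
        out.extend(sum(w * v for w, v in zip(row, chunk)) for row in block)
    return out


def transpose_perm(x, m):
    """View x as an m x m row-major grid, transpose it, flatten again.

    This permutation is its own inverse, so it serves as both P and P^T.
    """
    return [x[j * m + i] for i in range(m) for j in range(m)]


def monarch_matvec(L_blocks, R_blocks, x):
    """Compute (P L P^T R) x without building the dense matrix (assumed form)."""
    m = len(L_blocks)
    y = block_diag_matvec(R_blocks, x)  # R x
    y = transpose_perm(y, m)            # P^T (R x)
    y = block_diag_matvec(L_blocks, y)  # L P^T R x
    return transpose_perm(y, m)         # P L P^T R x


def monarch_param_count(m):
    """Two factors, each with m blocks of m*m entries: 2 * m^3 parameters."""
    return 2 * m ** 3


if __name__ == "__main__":
    m = 4
    n = m * m
    print("dense parameters:  ", n * n)
    print("monarch parameters:", monarch_param_count(m))
```

With n = m * m, the two factors store 2 * n^1.5 parameters instead of n^2, and each block-diagonal product is a batch of small dense multiplications, which is the kind of workload the summary calls hardware-efficient.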

Sparse Double Descent: Where Network Pruning Aggravates Overfitting

Pruning On-the-Fly: A Recoverable Pruning Method without Fine-tuning

Proposes a retraining-free pruning method based on hyperspherical learning and loss penalty terms.
The loss is designed to push model weights away from 0 while driving unnecessary weights toward 0, so that those weights can be pruned safely and no retraining is needed.

HOW I LEARNED TO STOP WORRYING AND LOVE RETRAINING

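Most pruning pipelines alternate iteratively between training and pruning steps.
Performance drops at each pruning step and recovers during the following retraining step, and this cycle repeats.
Recent work applies a learning rate schedule during the retraining phase and proposes specific heuristics for choosing that schedule for IMP (iterative magnitude pruning).
This paper shows that even a simple linear learning rate schedule can effectively shorten the retraining phase.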
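
The prune-then-retrain cycle with a linear learning rate schedule can be sketched roughly as follows. This is a generic illustration, not the paper's exact recipe: the helper names (`linear_lr`, `magnitude_prune`, `imp_schedule`), the choice to restart at `lr_max` after every pruning step, and the decay to `lr_min = 0` are all assumptions made for the example.

```python
def linear_lr(step, total_steps, lr_max, lr_min=0.0):
    """Linearly decay the learning rate from lr_max (step 0) to lr_min (last step)."""
    frac = step / max(total_steps - 1, 1)
    return lr_max + (lr_min - lr_max) * frac


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def imp_schedule(num_cycles, retrain_steps, lr_max):
    """Yield (cycle, step, lr) for each retraining step.

    Assumption: the schedule restarts at lr_max after every pruning step
    and decays linearly over a short retraining budget.
    """
    for cycle in range(num_cycles):
        for step in range(retrain_steps):
            yield cycle, step, linear_lr(step, retrain_steps, lr_max)


if __name__ == "__main__":
    weights = [0.5, -0.1, 0.3, -0.9]
    print(magnitude_prune(weights, 0.5))
    for cycle, step, lr in imp_schedule(num_cycles=2, retrain_steps=3, lr_max=0.1):
        print(cycle, step, round(lr, 4))
```

In a real pipeline, each `(cycle, step, lr)` triple would drive one optimizer update on the pruned model, with the pruning mask reapplied so that removed weights stay at zero.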
