All posts (45)
둔비의 공부공간

https://github.com/alooow/fantastic_weights_paper
GitHub - alooow/fantastic_weights_paper: repository for the paper "Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training" (github.com)
NeurIPS 2023 Accepted. Abstract: Dynamic Sparse Training, during the training process, adapti..
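Since the preview only gestures at how dynamic sparse training adapts connectivity during training, here is a minimal sketch of the generic prune-and-regrow mask update that DST methods iterate, assuming PyTorch. The helper name prune_and_regrow, the magnitude pruning criterion, and random regrowth are illustrative assumptions (the paper itself compares different pruning criteria), not the repository's code.

import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, k: int) -> torch.Tensor:
    # One generic DST mask-update step (illustrative only): drop the k weakest
    # active weights by magnitude, then regrow k previously inactive connections.
    active = mask.bool().flatten()
    flat_mask = mask.clone().flatten()

    # Prune: among active weights, zero out the k with the smallest |w|.
    scores = weight.abs().flatten().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(scores, k, largest=False).indices
    flat_mask[drop_idx] = 0

    # Regrow: reactivate k positions chosen uniformly from the set that was
    # inactive before this step (random regrowth as in SET; gradient-based
    # regrowth as in RigL is another common choice).
    inactive_idx = (~active).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:k]]
    flat_mask[grow_idx] = 1

    return flat_mask.view_as(mask)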

https://arxiv.org/abs/2403.13512
Scale Decoupled Distillation (arxiv.org): "Logit knowledge distillation attracts increasing attention due to its practicality in recent studies. However, it often suffers inferior performance compared to the feature knowledge distillation. In this paper, we argue that existing logit-based methods m.."
Abstract: The paper argues that existing logit distillation methods suffer from a sub-optimality problem. - Various sem..
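For context on what "logit distillation" means here, below is a minimal sketch of the classic temperature-scaled logit KD loss (Hinton-style global logit matching), i.e., the kind of baseline the preview says is argued to be sub-optimal. The function name logit_kd_loss and the temperature value are illustrative assumptions, not the paper's Scale Decoupled Distillation method.

import torch.nn.functional as F

def logit_kd_loss(student_logits, teacher_logits, T: float = 4.0):
    # Classic temperature-scaled logit distillation loss (global logit matching).
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # KL(teacher || student) on softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)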

https://arxiv.org/abs/2307.08500
Cumulative Spatial Knowledge Distillation for Vision Transformers (arxiv.org): "Distilling knowledge from convolutional neural networks (CNNs) is a double-edged sword for vision transformers (ViTs). It boosts the performance since the image-friendly local-inductive bias of CNN helps ViT learn faster and better, but leading to two prob.."
By the same authors as DKD; the code does not appear to have been uploaded yet ..

https://arxiv.org/abs/2310.19444
One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation (arxiv.org): "Knowledge distillation (KD) has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme. However, most existing distillation methods are designed under the assumption that the teacher and student m.."
https://..