둔비의 공부공간

https://arxiv.org/abs/2006.10726 Tent: Fully Test-time Adaptation by Entropy Minimization
A model must adapt itself to generalize to new and different data during testing. In this setting of fully test-time adaptation, the model has only the test data and its own parameters. We propose to adapt by test entropy minimization (tent): we optimize the…
When a model is tested on new or different data…
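A minimal PyTorch-style sketch of the test-time entropy minimization idea above, assuming a pretrained model whose normalization-layer affine parameters are updated online on unlabeled test batches; the function and variable names here are illustrative, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

def softmax_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean Shannon entropy of the softmax predictions (the adaptation objective)."""
    probs = logits.softmax(dim=1)
    log_probs = logits.log_softmax(dim=1)
    return -(probs * log_probs).sum(dim=1).mean()

def collect_norm_params(model: nn.Module):
    """Gather only the affine parameters of normalization layers for adaptation."""
    params = []
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm)):
            for p in (module.weight, module.bias):
                if p is not None:
                    params.append(p)
    return params

@torch.enable_grad()
def adapt_on_batch(model: nn.Module, x: torch.Tensor,
                   optimizer: torch.optim.Optimizer) -> torch.Tensor:
    """One fully test-time step: predict, minimize prediction entropy, update."""
    logits = model(x)
    loss = softmax_entropy(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return logits.detach()

# Usage sketch: adapt a pretrained model with no labels, only test batches.
# model = ...  # pretrained network, kept in train() mode for batch-norm statistics
# optimizer = torch.optim.SGD(collect_norm_params(model), lr=1e-3)
# for x in test_loader:
#     preds = adapt_on_batch(model, x, optimizer)
```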

https://arxiv.org/abs/2109.14960 Prune Your Model Before Distill It
Knowledge distillation transfers the knowledge from a cumbersome teacher to a small student. Recent results suggest that the student-friendly teacher is more appropriate to distill since it provides more transferable knowledge. In this work, we propose the…
https://github.com/ososos888/prune-then-distill
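A rough sketch of the prune-then-distill idea suggested by the title and the linked repository: prune the teacher first (here with simple global magnitude pruning), then run standard logit distillation from the pruned teacher to the student. The helper names, pruning criterion, and hyperparameters are assumptions, not the repository's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

def magnitude_prune(teacher: nn.Module, amount: float = 0.5) -> nn.Module:
    """Globally prune the smallest-magnitude weights of all conv/linear layers."""
    to_prune = [(m, "weight") for m in teacher.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured,
                              amount=amount)
    return teacher

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.9) -> torch.Tensor:
    """Hinton-style KD: temperature-scaled KL term plus hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage sketch: prune the teacher once, then distill to the student as usual.
# teacher = magnitude_prune(pretrained_teacher, amount=0.5).eval()
# for x, y in train_loader:
#     with torch.no_grad():
#         t_logits = teacher(x)
#     loss = distillation_loss(student(x), t_logits, y)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```

In practice the pruned teacher is usually fine-tuned before distillation to recover accuracy; that step is omitted here for brevity.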

https://arxiv.org/abs/2305.15975 Triplet Knowledge Distillation
In Knowledge Distillation, the teacher is generally much larger than the student, making the solution of the teacher likely to be difficult for the student to learn. To ease the mimicking difficulty, we introduce a triplet knowledge distillation mechanism…
Xijun Wang et al. In KD, because the teacher is much larger than the student, the teacher's solution…