Residential College | false |
Status | Forthcoming |
Improving Knowledge Distillation via Regularizing Feature Direction and Norm | |
Wang, Yuzhu1; Cheng, Lechao2; Duan, Manni; Wang, Yongheng; Feng, Zunlei; Kong, Shu |
2025 | |
Conference Name | 18th European Conference on Computer Vision, ECCV 2024 |
Source Publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 15082 LNCS |
Pages | 20-37 |
Conference Date | 29 September 2024 through 4 October 2024 |
Conference Place | Milan |
Publisher | Springer Science and Business Media Deutschland GmbH |
Abstract | Knowledge distillation (KD) is a particular technique of model compression that exploits a large well-trained teacher neural network to train a small student network. Treating teacher’s feature as knowledge, prevailing methods train student by aligning its features with the teacher’s, e.g., by minimizing the KL-divergence or L2-distance between their (logits) features. While it is natural to assume that better feature alignment helps distill teacher’s knowledge, simply forcing this alignment does not directly contribute to the student’s performance, e.g., classification accuracy. For example, minimizing the L2 distance between the penultimate-layer features (used to compute logits for classification) does not necessarily help learn a better student classifier. We are motivated to regularize student features at the penultimate layer using teacher towards training a better student classifier. Specifically, we present a rather simple method that uses teacher’s class-mean features to align student features w.r.t their direction. Experiments show that this significantly improves KD performance. Moreover, we empirically find that student produces features that have notably smaller norms than teacher’s, motivating us to regularize student to produce large-norm features. Experiments show that doing so also yields better performance. Finally, we present a simple loss as our main technical contribution that regularizes student by simultaneously (1) aligning the direction of its features with the teacher class-mean feature, and (2) encouraging it to produce large-norm features. Experiments on standard benchmarks demonstrate that adopting our technique remarkably improves existing KD methods, achieving the state-of-the-art KD performance through the lens of image classification (on ImageNet and CIFAR100 datasets) and object detection (on the COCO dataset). |
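The regularizer described in the abstract can be sketched in a few lines of PyTorch. This is only an illustrative sketch, not the authors' released implementation: the function name direction_norm_kd_loss, the weights lambda_dir and lambda_norm, and the specific choices of cosine distance for the direction term and negative mean norm for the norm term are assumptions made for illustration; the paper's exact formulation may differ. In practice such a term would be added to the usual task loss and a standard KD objective (e.g., KL divergence on logits).

```python
import torch
import torch.nn.functional as F

def direction_norm_kd_loss(student_feat, teacher_class_means, labels,
                           lambda_dir=1.0, lambda_norm=0.1):
    """Hypothetical sketch of the direction + norm regularizer described
    in the abstract.

    student_feat:        (batch, dim) penultimate-layer student features
    teacher_class_means: (num_classes, dim) teacher's per-class mean features
    labels:              (batch,) ground-truth class indices
    """
    # (1) Direction alignment: pull each student feature toward the
    # teacher class-mean feature of its ground-truth class, measured by
    # cosine similarity (i.e., only the feature direction matters here).
    target_means = teacher_class_means[labels]                  # (batch, dim)
    cos_sim = F.cosine_similarity(student_feat, target_means, dim=1)
    loss_dir = (1.0 - cos_sim).mean()

    # (2) Norm regularization: encourage large-norm student features,
    # since students empirically produce smaller feature norms than teachers.
    loss_norm = -student_feat.norm(dim=1).mean()

    return lambda_dir * loss_dir + lambda_norm * loss_norm
```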
Keyword | Feature Direction ; Knowledge Distillation ; Large-norm |
DOI | 10.1007/978-3-031-72691-0_2 |
Indexed By | CPCI-S |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Computer Science, Theory & Methods |
WOS ID | WOS:001353689800002 |
Scopus ID | 2-s2.0-85208574332 |
Document Type | Conference paper |
Collection | INSTITUTE OF COLLABORATIVE INNOVATION |
Corresponding Author | Cheng, Lechao |
Affiliation | 1. Zhejiang Lab, Hangzhou, China; 2. Hefei University of Technology, Hefei, China; 3. Zhejiang University, Hangzhou, China; 4. University of Macau, Taipa, China; 5. Institute of Collaborative Innovation, Taipa, China; 6. Texas A&M University, College Station, United States |
Recommended Citation GB/T 7714 | Wang, Yuzhu, Cheng, Lechao, Duan, Manni, et al. Improving Knowledge Distillation via Regularizing Feature Direction and Norm[C]: Springer Science and Business Media Deutschland GmbH, 2025, 20-37. |
APA | Wang, Yuzhu., Cheng, Lechao., Duan, Manni., Wang, Yongheng., Feng, Zunlei., & Kong, Shu (2025). Improving Knowledge Distillation via Regularizing Feature Direction and Norm. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 15082 LNCS, 20-37. |
Files in This Item: | There are no files associated with this item. |