G-2020-23-EIW10
Uncertainty transfer with knowledge distillation
Knowledge distillation is a technique that consists of training a student network, usually of low capacity, to mimic the representation space and the performance of a pre-trained teacher network, which is often cumbersome, large, and of very high capacity. Starting from the observation that a student can learn about the teacher's ability to provide predictions, we examine the idea of transferring uncertainty from the teacher to the student network. We show that, through distillation, the distilled network not only mimics the teacher's performance but also captures, to some extent, the original network's uncertainty behavior. We provide experiments validating our hypothesis on the MNIST dataset.
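For readers unfamiliar with distillation, the sketch below shows the standard knowledge-distillation training loss (soft targets from a teacher combined with hard labels), not the uncertainty-transfer method proposed in the paper; the temperature T and mixing weight alpha are illustrative choices, assuming a PyTorch setup.

# Minimal sketch of a standard knowledge-distillation loss (hard-label
# cross-entropy mixed with a softened teacher/student KL term). This is a
# generic illustration, not the paper's uncertainty-transfer procedure.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Cross-entropy against the ground-truth labels (hard targets).
    hard_loss = F.cross_entropy(student_logits, labels)
    # KL divergence between the softened teacher and student distributions
    # (soft targets); the T**2 factor keeps gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T ** 2)
    return alpha * hard_loss + (1.0 - alpha) * soft_loss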
Published April 2020, 9 pages
Document
G2023-EIW10.pdf (400 KB)