DistilBERT

tags
Transformers, BERT, NLP
paper
(Sanh et al. 2020)

Architecture

DistilBERT is a smaller Transformer trained via knowledge distillation from BERT-base: it keeps the same hidden size (768) but halves the depth to 6 layers, and drops the token-type embeddings and the pooler. It retains about 97% of BERT's language-understanding performance (measured on GLUE) while being 40% smaller and 60% faster at inference (Sanh et al. 2020). The student is trained with a triple loss: a soft-target distillation loss against the teacher's output distribution, the usual masked language modeling loss, and a cosine loss aligning the student's and teacher's hidden states.
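The soft-target part of the training objective can be sketched in a few lines. This is a minimal, self-contained illustration of temperature-softened distillation loss, not DistilBERT's actual training code; the logits below are made-up values over a tiny hypothetical vocabulary.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-softened softmax: higher T flattens the distribution,
    # exposing more of the teacher's "dark knowledge" about non-top classes.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    # Soft-target cross-entropy: -sum_i t_i * log(s_i), where both the
    # teacher targets t and student predictions s use temperature T.
    t = softmax(teacher_logits, T)
    s = softmax(student_logits, T)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# Hypothetical logits for one masked position over a 3-token vocabulary.
teacher = [3.0, 1.0, 0.2]
student = [2.5, 1.2, 0.3]
loss = distill_loss(teacher, student)
```

By Gibbs' inequality the loss is minimized (down to the teacher's entropy, not zero) when the student's softened distribution matches the teacher's, which is exactly what pushes the student toward the teacher's behavior.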

Parameter count

66M
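The 66M figure can be sanity-checked from the architecture above. This is a back-of-the-envelope tally assuming the standard `distilbert-base-uncased` shape (vocab 30522, hidden size 768, 512 positions, 6 layers, FFN size 3072, layer norms with weight and bias); it is an estimate sketch, not a dump from the released checkpoint.

```python
# Assumed distilbert-base-uncased dimensions.
vocab, hidden, positions, layers, ffn = 30522, 768, 512, 6, 3072

# Embeddings: word + position embeddings, plus one layer norm.
# (DistilBERT has no token-type embeddings.)
embeddings = vocab * hidden + positions * hidden + 2 * hidden

# Self-attention: Q, K, V, and output projections, each with a bias.
attention = 4 * (hidden * hidden + hidden)

# Feed-forward: up-projection and down-projection, each with a bias.
feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)

# Each layer also carries two layer norms (weight + bias each).
per_layer = attention + feed_forward + 2 * (2 * hidden)

total = embeddings + layers * per_layer
print(f"{total:,}")  # roughly 66M
```

The tally lands within about half a percent of the quoted 66M, with roughly 24M parameters in the embeddings and 7M per Transformer layer.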

Bibliography

  1. Sanh, Victor, Lysandre Debut, Julien Chaumond, and Thomas Wolf. "DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter." February 29, 2020. arXiv:1910.01108.
