- tags
- Transformers, BERT, NLP
- paper
- (Sanh et al. 2020)
Architecture
DistilBERT is a smaller Transformer obtained from BERT via knowledge distillation: it keeps the same general architecture with half the layers, and the paper reports it is 40% smaller and 60% faster while retaining about 97% of BERT's language-understanding performance.
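The paper trains the student with a triple loss (masked-language-modeling loss, cosine embedding loss, and a soft-target distillation loss). A minimal sketch of the distillation term, i.e. cross-entropy against the teacher's temperature-softened output distribution (function names and logits are illustrative, not from the paper's code):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; T > 1 softens the distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # Cross-entropy between softened teacher and student distributions:
    # the soft-target term of the distillation objective.
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

# Toy logits: a student that mimics the teacher incurs a lower loss
# than one that prefers a different class.
teacher = np.array([[4.0, 1.0, 0.5]])
student_close = np.array([[3.8, 1.1, 0.4]])
student_far = np.array([[0.2, 3.0, 1.0]])
print(distillation_loss(teacher, student_close) < distillation_loss(teacher, student_far))
```

In the actual training setup this term is combined with the MLM loss and a cosine loss aligning teacher and student hidden states.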
Parameter count
66M
Bibliography
- Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf. "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter." February 29, 2020. arXiv:1910.01108.