- tags
- GPT, Transformers, NLP
- paper
- (Zhang et al. 2020)
Architecture
DialoGPT uses the GPT-2 architecture unchanged; only the training data differs. It is trained with the standard language-modeling objective on conversation-like exchanges extracted from Reddit comment chains.
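The data formatting can be sketched as follows. This is an illustrative snippet, not code from the paper: the function name is mine, but `<|endoftext|>` is GPT-2's actual end-of-text token, and the paper describes concatenating all turns of a dialogue into one long text so the standard LM objective applies directly.

```python
# GPT-2's end-of-text token, used by DialoGPT as the turn separator.
EOS = "<|endoftext|>"

def serialize_dialogue(turns):
    """Flatten a multi-turn dialogue into a single LM training sequence.

    Each turn is followed by the end-of-text token, so the model learns
    to emit EOS when a response is complete.
    """
    return EOS.join(turns) + EOS

example = serialize_dialogue(["Hi there!", "Hello, how can I help?"])
print(example)
# → Hi there!<|endoftext|>Hello, how can I help?<|endoftext|>
```

Because the serialized dialogue is just text, no architectural change to GPT-2 is needed; at inference time, generation is conditioned on the dialogue history and stops at the next end-of-text token.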
Parameter count
762M for the largest model (the paper releases three sizes: 117M, 345M, and 762M)
Bibliography
- Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan. "DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation." May 2, 2020.