SeeKeR

tags
Transformers, GPT
paper
(Shuster et al. 2022)

Architecture

This is an extension that can be applied to any Transformer model by training it to perform three modular tasks in sequence: a "search" module that generates a search query from the dialogue context, a "knowledge" module that extracts a relevant knowledge sentence from the retrieved results, and a "response" module that produces the final output conditioned on that knowledge. All three modules share the same underlying Transformer and are distinguished by module-specific control tokens. It has the same applications as the base model it extends.
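The three-module loop can be sketched as follows. This is a minimal illustration of the control flow only: `generate` and `retrieve` are hypothetical stand-ins (in the real system, `generate` is one shared Transformer conditioned on a module-specific prefix, and `retrieve` is an external search engine), here stubbed with canned strings so the pipeline is runnable.

```python
# Toy outputs standing in for a real model's generations (assumption).
CANNED = {
    "search":    "who introduced SeeKeR",
    "knowledge": "SeeKeR was introduced by Shuster et al. (2022).",
    "response":  "It was introduced by Shuster et al. in 2022.",
}

def generate(module: str, context: str) -> str:
    # In SeeKeR this is a single shared Transformer; the module is
    # selected via a control prefix. The stub ignores the context.
    return CANNED[module]

def retrieve(query: str) -> str:
    # Stand-in for an external search engine call (assumption).
    return "Result: SeeKeR paper, Shuster et al., 2022."

def seeker_turn(user_input: str) -> str:
    # 1. Search module: map the dialogue context to a search query.
    query = generate("search", user_input)
    docs = retrieve(query)
    # 2. Knowledge module: extract a relevant knowledge sentence
    #    from the retrieved documents.
    knowledge = generate("knowledge", user_input + "\n" + docs)
    # 3. Response module: produce the final reply conditioned on
    #    the context plus the extracted knowledge.
    return generate("response", user_input + "\n" + knowledge)

print(seeker_turn("Who introduced SeeKeR?"))
```

Because the modules are just differently-prompted calls to the same model, the extension adds no parameters beyond the base Transformer, which is why the parameter count below depends entirely on the base model.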

Parameter count

Depends on the base model being extended.

Bibliography

  1. Shuster, Kurt, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, and Jason Weston. "Language Models That Seek for Knowledge: Modular Search & Generation for Dialogue and Open-Domain Question Answering." arXiv preprint, March 29, 2022.
