Vision transformer

tags: Transformers, Computer vision, BERT
paper: (Dosovitskiy et al. 2021)

Architecture

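The Vision Transformer (ViT) applies a BERT-style Transformer encoder to images. The image is split into fixed-size patches, each patch is flattened and linearly projected, and the resulting sequence of patch embeddings (with a prepended classification token and added position embeddings) is fed to the encoder.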
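As a rough illustration of how an image becomes a token sequence, here is a minimal sketch in plain Python. The toy sizes (8x8 image, 4x4 patches, embedding width 6), the random weights, and the helper names `patchify`, `linear`, and `embed_patches` are assumptions made for the example; they are not taken from the paper's code.

```python
import random


def patchify(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    assert height % patch_size == 0 and width % patch_size == 0
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch row by row, pixel by pixel, channel by channel.
            patch = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            patches.append(patch)
    return patches


def linear(vector, weights):
    """Multiply a vector of length P by a P x D weight matrix (no bias)."""
    return [sum(v * w for v, w in zip(vector, column)) for column in zip(*weights)]


def embed_patches(image, patch_size, w_proj, cls_token, pos_embed):
    """Project patches, prepend the classification token, add position embeddings."""
    tokens = [linear(patch, w_proj) for patch in patchify(image, patch_size)]
    tokens = [cls_token] + tokens
    return [[t + p for t, p in zip(token, pos)] for token, pos in zip(tokens, pos_embed)]


# Toy configuration (hypothetical values chosen to keep the example fast).
random.seed(0)
height = width = 8
channels, patch_size, dim = 3, 4, 6
num_patches = (height // patch_size) * (width // patch_size)
patch_len = patch_size * patch_size * channels

image = [[[random.random() for _ in range(channels)] for _ in range(width)]
         for _ in range(height)]
w_proj = [[random.gauss(0, 0.02) for _ in range(dim)] for _ in range(patch_len)]
cls_token = [0.0] * dim
pos_embed = [[random.gauss(0, 0.02) for _ in range(dim)] for _ in range(num_patches + 1)]

tokens = embed_patches(image, patch_size, w_proj, cls_token, pos_embed)
print(len(tokens), len(tokens[0]))  # 5 6
```

The output sequence has one more token than there are patches because of the classification token. In a real model the projection, classification token, and position embeddings are learned, and the sequence is then passed through the Transformer encoder layers.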

Parameter count

86M (ViT-Base) to 632M (ViT-Huge) parameters
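The quoted range can be approximately reproduced by counting weights layer by layer. The sketch below assumes the Base and Huge configurations reported in the paper (12 layers with width 768 and MLP size 3072; 32 layers with width 1280 and MLP size 5120), a 224x224 RGB input, patch sizes 16 and 14, and no classification head. The function name `vit_param_count` is hypothetical, and the result is an estimate rather than an exact match to any particular implementation.

```python
def vit_param_count(layers, dim, mlp_dim, patch_size,
                    image_size=224, channels=3, num_classes=0):
    """Estimate the number of parameters of a ViT encoder (biases included)."""
    num_patches = (image_size // patch_size) ** 2

    # Linear projection of flattened patches, plus bias.
    patch_embed = patch_size * patch_size * channels * dim + dim
    # Classification token and one position embedding per token.
    cls_and_pos = dim + (num_patches + 1) * dim

    # Per encoder layer: Q, K, V and output projections, each dim x dim with bias.
    attention = 4 * (dim * dim + dim)
    # Two-layer MLP: dim -> mlp_dim -> dim, with biases.
    mlp = dim * mlp_dim + mlp_dim + mlp_dim * dim + dim
    # Two layer norms per layer, each with a scale and a shift vector.
    layer_norms = 2 * 2 * dim
    encoder = layers * (attention + mlp + layer_norms)

    final_norm = 2 * dim
    head = dim * num_classes + num_classes if num_classes else 0
    return patch_embed + cls_and_pos + encoder + final_norm + head


base = vit_param_count(layers=12, dim=768, mlp_dim=3072, patch_size=16)
huge = vit_param_count(layers=32, dim=1280, mlp_dim=5120, patch_size=14)
print(f"ViT-Base/16: {base / 1e6:.1f}M")
print(f"ViT-Huge/14: {huge / 1e6:.1f}M")
```

Under these assumptions the estimates land close to the two ends of the range, with almost all of the weights sitting in the encoder layers (about 12 x dim x dim per layer when the MLP is four times the width). The small remaining gap depends on choices such as whether a classification head is counted.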

Bibliography

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, et al. June 3, 2021. "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale".

Last changed 2022.07.27 | authored by Hugo Cisneros