Interactive Quiz

Test your knowledge!

1. What is the main difference between an autoregressive model (like GPT) and an encoder model (like BERT) in natural language processing?
2. What is the role of the Query (Q), Key (K), and Value (V) matrices in the self-attention mechanism of a transformer?
3. In the transformer architecture, what is the purpose of the residual connections around the attention and feed-forward layers?
4. In the implementation of a bigram language model, what is the main limitation that explains the poor quality of the generated text?
5. What is the main difference between the self-attention layer used in a transformer decoder and the one used in an encoder?
6. In the Vision Transformer (ViT), how are images processed before being passed to the transformer?
7. What is the purpose of the class token in the Vision Transformer?
8. What is the main innovation of the Swin Transformer compared to the Vision Transformer?
9. What is the advantage of relative position embeddings in the Swin Transformer?
10. What is the training objective of the CLIP model that associates text and images?
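As a refresher for the attention questions (2 and 3), here is a minimal NumPy sketch of scaled dot-product self-attention. All names, shapes, and the random weights are illustrative placeholders, not part of the quiz material:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sequence.

    X: (seq_len, d_model) token embeddings.
    Wq, Wk, Wv: (d_model, d_k) learned projections producing the
    Query, Key, and Value matrices respectively.
    """
    Q = X @ Wq  # what each token is looking for
    K = X @ Wk  # what each token offers for matching
    V = X @ Wv  # the content that actually gets mixed
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # each output token is a weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 4 tokens, d_model = 8 (toy sizes)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

In a full transformer block, this output would then pass through a residual connection and a feed-forward layer.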
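For question 5, the key contrast is the attention mask: a decoder applies a causal mask so tokens cannot attend to future positions, while an encoder attends over the whole sequence. A small sketch with placeholder (all-zero) attention scores:

```python
import numpy as np

seq_len = 5

# Encoder self-attention: no mask, every token can attend to every other.
encoder_mask = np.zeros((seq_len, seq_len))

# Decoder (causal) self-attention: future positions get -inf before the
# softmax, so token i only attends to tokens 0..i.
causal_mask = np.triu(np.full((seq_len, seq_len), -np.inf), k=1)

scores = np.zeros((seq_len, seq_len))  # placeholder attention scores
masked = scores + causal_mask
weights = np.exp(masked) / np.exp(masked).sum(axis=-1, keepdims=True)
print(np.round(weights, 2))  # lower-triangular: row i is uniform over 0..i
```

With uniform scores, row i of the decoder weights spreads 1/(i+1) over the visible positions and exactly 0 over the future ones.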
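For questions 6 and 7, this sketch shows the ViT preprocessing pipeline: split the image into non-overlapping patches, flatten and linearly project each patch into a token, and prepend a class token. The projection weights and class token are random placeholders here; in a real ViT they are learned parameters, and the sizes are illustrative:

```python
import numpy as np

def image_to_patch_tokens(img, patch=16, d_model=64, seed=0):
    """ViT-style tokenization: non-overlapping patches -> flattened
    vectors -> linear projection, with a [class] token prepended."""
    H, W, C = img.shape
    rng = np.random.default_rng(seed)
    # (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C) -> (n_patches, p*p*C)
    grid = img.reshape(H // patch, patch, W // patch, patch, C)
    flat = grid.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * C)
    W_proj = rng.normal(size=(patch * patch * C, d_model))  # placeholder
    tokens = flat @ W_proj                # one embedding per patch
    cls = rng.normal(size=(1, d_model))   # stands in for the learned [class] token
    return np.concatenate([cls, tokens], axis=0)

tokens = image_to_patch_tokens(np.zeros((224, 224, 3)))
print(tokens.shape)  # (197, 64): 1 class token + 14*14 = 196 patch tokens
```

After the transformer layers, the final state of the class token is typically the sequence-level representation used for classification.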