Transformer: Attention is all you need for Keras
Implementation of the Transformer architecture described by Vaswani et al. in “Attention Is All You Need” using the Keras Utility & Layer Collection (kulc).
This repository contains the code to create the model and to train and evaluate it. It also includes utility code to load an English-to-German translation dataset (en2de) for running the experiments.
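
For orientation, below is a minimal sketch of a single Transformer encoder block as described in the paper, written with standard Keras layers rather than the kulc layers this repository actually uses; the function name and hyperparameters are illustrative only, not part of this codebase.

```python
import tensorflow as tf
from tensorflow.keras import layers

def encoder_block(d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
    # Input: a batch of token representations of dimension d_model.
    inputs = layers.Input(shape=(None, d_model))

    # Multi-head self-attention, followed by a residual connection and layer norm.
    attn = layers.MultiHeadAttention(num_heads=num_heads,
                                     key_dim=d_model // num_heads)(inputs, inputs)
    attn = layers.Dropout(dropout)(attn)
    x = layers.LayerNormalization(epsilon=1e-6)(inputs + attn)

    # Position-wise feed-forward network, again with residual connection and layer norm.
    ff = layers.Dense(d_ff, activation="relu")(x)
    ff = layers.Dense(d_model)(ff)
    ff = layers.Dropout(dropout)(ff)
    outputs = layers.LayerNormalization(epsilon=1e-6)(x + ff)

    return tf.keras.Model(inputs, outputs)

# Example usage: build one block and inspect its layers.
model = encoder_block()
model.summary()
```

The full model in this repository stacks several such blocks (plus token embeddings, positional encodings, and a decoder with masked self-attention), following the architecture from "Attention Is All You Need".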