Transformer: Attention is all you need for Keras

Implementation of the Transformer architecture described by Vaswani et al. in “Attention Is All You Need” using the Keras Utility & Layer Collection (kulc).

This repository contains the code to create the model, train it, and evaluate it. It also contains utility code to load an English-to-German (en2de) translation dataset for running the experiments. A rough sketch of the kind of model involved is shown below.
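To illustrate the architecture the repository implements, here is a minimal sketch of a single Transformer encoder block built from standard `tf.keras` layers. This is an assumption-laden illustration only: the repository builds its model from the kulc layer collection, and the layer names, hyperparameters, and structure below are not taken from its code.

```python
# Illustrative sketch only -- not the repository's kulc-based implementation.
# One Transformer encoder block (Vaswani et al., "Attention Is All You Need"):
# multi-head self-attention and a position-wise feed-forward network, each
# wrapped in a residual connection followed by layer normalization.
import tensorflow as tf
from tensorflow.keras import layers

def encoder_block(d_model=512, num_heads=8, d_ff=2048, dropout=0.1):
    # Input: a batch of token representations of shape (batch, seq_len, d_model).
    inputs = layers.Input(shape=(None, d_model))

    # Multi-head self-attention sub-layer with residual connection + layer norm.
    attn = layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=d_model // num_heads
    )(inputs, inputs)
    attn = layers.Dropout(dropout)(attn)
    x = layers.LayerNormalization(epsilon=1e-6)(inputs + attn)

    # Position-wise feed-forward sub-layer with residual connection + layer norm.
    ffn = layers.Dense(d_ff, activation="relu")(x)
    ffn = layers.Dense(d_model)(ffn)
    ffn = layers.Dropout(dropout)(ffn)
    outputs = layers.LayerNormalization(epsilon=1e-6)(x + ffn)

    return tf.keras.Model(inputs, outputs)

# Example usage: build one block and print its layer summary.
block = encoder_block()
block.summary()
```

The hyperparameter defaults (d_model=512, 8 heads, d_ff=2048) follow the base configuration from the paper; a full model stacks several such blocks in both the encoder and decoder and adds embeddings plus positional encodings.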