Natural Language Processing with RNNs and Attention
An Encoder-Decoder Network for Machine Translation
Let's take a look at a simple neural machine translation model that will translate English sentences to French. In short, the English sentences are fed to the encoder, and the decoder outputs the French translations.
Note
The French translations are also used as inputs to the decoder during training, but shifted back by one step.
In other words, the decoder is given as input the word that it should have output at the previous step (regardless of what it actually output). For the very first word, it is given the start-of-sequence (SOS) token. The decoder is expected to end the sentence with an end-of-sequence (EOS) token.
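This input-shifting scheme (often called teacher forcing) can be sketched as follows. This is a minimal illustration, not the book's code: the token strings and the helper name are hypothetical, and a real pipeline would work with token IDs rather than strings.

```python
# Teacher forcing sketch: the decoder's input is the target sequence
# shifted back one step, starting with a start-of-sequence token, and
# the expected output sequence ends with an end-of-sequence token.
SOS, EOS = "<sos>", "<eos>"

def make_decoder_io(target_tokens):
    """Build the decoder input and expected output sequences
    from the target tokens (given without SOS/EOS)."""
    decoder_inputs = [SOS] + target_tokens   # word it *should* have output previously
    decoder_targets = target_tokens + [EOS]  # decoder must end with EOS
    return decoder_inputs, decoder_targets

inputs, targets = make_decoder_io(["je", "bois", "du", "lait"])
# inputs  -> ['<sos>', 'je', 'bois', 'du', 'lait']
# targets -> ['je', 'bois', 'du', 'lait', '<eos>']
```

At each time step, the decoder thus sees the correct previous word regardless of what it actually predicted, which stabilizes and speeds up training.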
Note that at inference time (after training), you will not have the target sentence to feed to the decoder. Instead, simply feed the decoder the word that it output at the previous step.