The global Luong variant computes a score between the current decoder state and each encoder state, normalises scores with softmax, and forms the context vector as a weighted sum of encoder states. The local variant first predicts a central source position and then attends only within a window around that position. The score function can be dot, general or concat.
Relative to Bahdanau's additive attention, Luong attention reduces cost and simplifies attention construction in seq2seq models, and its local variant further restricts the number of source positions considered at each step.
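As a rough illustration of the global variant and the three score functions, here is a minimal sketch in plain Python. The helper names (`luong_score`, `global_attention`) and the parameters `W_a` and `v_a` are chosen for this example only. A real model would use a tensor library with learned weights and batched inputs.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def luong_score(h_t, h_s, kind="dot", W_a=None, v_a=None):
    """Score one decoder state h_t against one encoder state h_s.

    W_a and v_a stand in for learned parameters (illustrative only).
    """
    if kind == "dot":
        return dot(h_t, h_s)                      # h_t^T h_s
    if kind == "general":
        return dot(h_t, matvec(W_a, h_s))         # h_t^T W_a h_s
    if kind == "concat":
        # v_a^T tanh(W_a [h_t; h_s]); inputs are lists, so + concatenates
        hidden = [math.tanh(x) for x in matvec(W_a, h_t + h_s)]
        return dot(v_a, hidden)
    raise ValueError(f"unknown score kind: {kind}")

def global_attention(h_t, encoder_states, **score_kwargs):
    """Return (weights, context) computed over all encoder states."""
    scores = [luong_score(h_t, h_s, **score_kwargs) for h_s in encoder_states]
    weights = softmax(scores)
    d = len(encoder_states[0])
    # Context vector: weighted sum of encoder states, one dimension at a time.
    context = [sum(w * h_s[i] for w, h_s in zip(weights, encoder_states))
               for i in range(d)]
    return weights, context
```

For example, `global_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])` puts more weight on the first encoder state, since it has the larger dot product with the decoder state.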
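Time complexity: O(T_x · T_y · d) for global attention, where T_x is the source length, T_y the target length and d the hidden size. With a local window of half-width D, the attention cost per decoder step depends on the window size 2D + 1 rather than on T_x.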
Global attention scores all T_x source positions at every decoder step; local attention scores only the positions inside a window around the predicted centre.
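The local variant can be sketched in the same style. This example assumes the predictive form from Luong et al., in which the centre is p_t = S · sigmoid(v_p^T tanh(W_p h_t)) and the softmax weights inside the window are scaled by a Gaussian with σ = D/2. The function names, the 0-based indexing and the use of the dot score are choices made for this illustration, not a reference implementation.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_position(h_t, W_p, v_p, src_len):
    """Predicted centre p_t = S * sigmoid(v_p^T tanh(W_p h_t)), in (0, S).

    W_p and v_p stand in for learned parameters (illustrative only).
    """
    hidden = [math.tanh(dot(row, h_t)) for row in W_p]
    return src_len / (1.0 + math.exp(-dot(v_p, hidden)))

def local_attention(h_t, encoder_states, p_t, D):
    """Attend only to positions within D of the (rounded) centre p_t.

    Returns ((lo, hi), weights, context), where the window is the
    half-open index range [lo, hi) clipped to the source sentence.
    """
    if D < 1:
        raise ValueError("D must be at least 1")
    centre = int(round(p_t))
    lo = max(0, centre - D)
    hi = min(len(encoder_states), centre + D + 1)
    window = encoder_states[lo:hi]

    # Dot-score softmax restricted to the window.
    weights = softmax([dot(h_t, h_s) for h_s in window])

    # Gaussian centred on p_t favours positions near the predicted centre.
    # The weights are not renormalised after scaling.
    sigma = D / 2.0
    weights = [w * math.exp(-((s - p_t) ** 2) / (2 * sigma ** 2))
               for s, w in zip(range(lo, hi), weights)]

    d = len(h_t)
    context = [sum(w * h_s[i] for w, h_s in zip(weights, window))
               for i in range(d)]
    return (lo, hi), weights, context
```

Each call scores at most 2D + 1 encoder states regardless of the source length, which is where the saving over the global variant comes from.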
Like Bahdanau attention, Luong attention usually runs inside an RNN decoder, so generation is sequential.