Abstract
Many real-world applications require the prediction of long sequence
time-series, such as electricity consumption planning. Long sequence
time-series forecasting (LSTF) demands a high prediction capacity of the model,
which is the ability to capture precise long-range dependency coupling between
output and input efficiently. Recent studies have shown the potential of
Transformer to increase the prediction capacity. However, there are several
severe issues with Transformer that prevent it from being directly applicable
to LSTF, including quadratic time complexity, high memory usage, and the
inherent limitations of the encoder-decoder architecture. To address these issues, we
design an efficient transformer-based model for LSTF, named Informer, with
three distinctive characteristics: (i) a $ProbSparse$ self-attention mechanism,
which achieves $O(L \log L)$ time complexity and memory usage and has
comparable performance on sequences' dependency alignment; (ii) self-attention
distilling, which highlights dominating attention by halving the cascading
layer input and efficiently handles extremely long input sequences; (iii) a
generative-style decoder, which, while conceptually simple, predicts long
time-series sequences in one forward operation rather than step by step,
drastically improving the inference speed of long-sequence predictions.
Extensive experiments on four large-scale datasets demonstrate that Informer
significantly outperforms existing methods and provides a new solution to the
LSTF problem.
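The core idea behind the ProbSparse mechanism described above can be sketched in a few lines: rank queries by a max-minus-mean "sparsity" score estimated on a random sample of keys, give only the top-scoring queries a full softmax attention pass, and let the remaining "lazy" queries default to the mean of the values. The sketch below is a hypothetical, simplified single-head NumPy illustration of that selection scheme, not the paper's implementation; the function name, the sampling sizes, and the mean-of-V fallback are all assumptions for illustration.

```python
import numpy as np

def probsparse_attention(Q, K, V, u=None, sample_k=None, rng=None):
    """Simplified sketch of ProbSparse-style self-attention.

    Only the top-u "active" queries, ranked by a max-minus-mean score
    over a random sample of keys, receive full softmax attention; the
    remaining lazy queries fall back to the mean of V.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    L_q, d = Q.shape
    L_k = K.shape[0]
    # O(log L) selection sizes, giving O(L log L) overall cost.
    u = max(1, int(np.ceil(np.log(L_q)))) if u is None else u
    sample_k = max(1, int(np.ceil(np.log(L_k)))) if sample_k is None else sample_k

    # Score each query on a sampled subset of keys: M = max - mean.
    idx = rng.choice(L_k, size=sample_k, replace=False)
    scores_sample = Q @ K[idx].T / np.sqrt(d)      # shape (L_q, sample_k)
    M = scores_sample.max(axis=1) - scores_sample.mean(axis=1)
    top = np.argsort(M)[-u:]                       # indices of active queries

    # Lazy queries output the mean of V; active queries get full attention.
    out = np.tile(V.mean(axis=0), (L_q, 1))
    s = Q[top] @ K.T / np.sqrt(d)                  # shape (u, L_k)
    w = np.exp(s - s.max(axis=1, keepdims=True))   # stable softmax
    w /= w.sum(axis=1, keepdims=True)
    out[top] = w @ V
    return out
```

Because only about $\log L$ queries attend to all keys while the rest take a constant-time fallback, the dominant cost drops from the full $O(L^2)$ of standard self-attention toward the $O(L \log L)$ regime the abstract refers to.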
Description
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Links and resources
Tags
community