Department

Statistics and Analytical Sciences

Document Type

Article

Submission Date

2019

Abstract

Long Short-Term Memory (LSTM) units are a family of Recurrent Neural Network (RNN) architectures that have proven incredibly effective at learning from sequence data. They are also extremely complex, making them expensive to train and difficult to understand. A recent trend towards simplification has produced the Gated Recurrent Unit (GRU) and the Minimal Gated Unit (MGU), both of which perform as well as the LSTM (or better) on a variety of tasks. The MGU is one of the simplest gated recurrent architectures at the moment. Our study demonstrates that it is possible to radically simplify the MGU without significant loss of performance for some tasks and datasets. For the gun violence data used here, an extraordinarily simple Forget Gate (FG) architecture (as well as many other simplified architectures) performs just as well as an MGU on the given task. While more complex architectures such as the MGU, GRU, or LSTM may be needed in some situations, they are likely overkill for many real-world datasets, and the marginal performance benefit may come with a very large price tag.

Included in

Artificial Intelligence and Robotics Commons, Data Science Commons, Theory and Algorithms Commons

Published and Grey Literature from PhD Candidates

Radically Simplifying Gated Recurrent Architectures Without Loss of Performance
