Vanishing/Exploding Gradient Problem
In deep neural networks, the gradients that guide learning can either shrink toward zero (vanishing) or grow uncontrollably large (exploding) as they propagate backward through many layers. Recurrent networks are effectively very deep when unrolled over time, so this makes it difficult for them to learn long-term dependencies in sequences.
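A minimal sketch of the effect: during backpropagation the gradient is repeatedly multiplied by each layer's Jacobian, so its norm tends to shrink or grow geometrically with depth depending on the scale of the weights. The depth, width, and scale values below are illustrative assumptions, not tied to any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)
depth = 50   # number of layers the gradient passes through
width = 64   # hidden size

def final_grad_norm(scale):
    """Norm of a gradient after backpropagating through `depth` random layers."""
    grad = rng.standard_normal(width)
    for _ in range(depth):
        # Random layer Jacobian; entries scaled so the typical per-layer
        # gain on the gradient norm is roughly `scale`.
        W = scale * rng.standard_normal((width, width)) / np.sqrt(width)
        grad = W.T @ grad  # chain rule: multiply by the layer Jacobian
    return np.linalg.norm(grad)

print("scale 0.5 (vanishing):", final_grad_norm(0.5))  # ~1e-14, signal lost
print("scale 1.5 (exploding):", final_grad_norm(1.5))  # ~1e9, unstable updates
```

With a per-layer gain below 1 the gradient decays roughly as gain^depth and early layers receive almost no learning signal; with a gain above 1 it blows up by the same factor, which is why remedies such as careful initialization, gradient clipping, and gated architectures target this multiplicative chain.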