Shape autotuning activation function [Formula presented]

Yuan Zhou, Dandan Li, Shuwei Huo, Sun Yuan Kung

Research output: Contribution to journal › Article › peer-review


Abstract

The choice of activation function is essential for building state-of-the-art neural networks. At present, the most widely used and effective activation function is ReLU. However, ReLU suffers from several weaknesses, including a non-zero output mean, the loss of negative information, and an unbounded output, which can hinder the optimization process. In this paper, we propose a novel activation function, the “Shape Autotuning Activation Function” (SAAF), to overcome these three challenges simultaneously. SAAF inherits the merits of smooth activation functions (such as Sigmoid and Tanh) and piecewise activation functions (such as ReLU and its variants) while avoiding their deficiencies. Specifically, SAAF adaptively adjusts a pair of independent trainable parameters to capture negative information and provide a near-zero-mean output, resulting in better generalization performance and faster learning. At the same time, it produces bounded outputs, ensuring a more stable output distribution during network training. We evaluated SAAF on deep networks applied to a variety of tasks, including image classification, machine translation, and generative modeling. A comprehensive comparison study shows that the proposed SAAF is superior to state-of-the-art activation functions.
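The record elides the SAAF formula itself ("[Formula presented]"), so the snippet below is only a minimal PyTorch sketch of the general pattern the abstract describes: a bounded, smooth activation whose shape is set by a pair of independent trainable parameters. The parameter names alpha and beta and the tanh-based form are illustrative assumptions, not the published SAAF definition.

```python
# Illustrative sketch only: the exact SAAF formula is not given in this
# record, so this module shows the general idea the abstract describes --
# a bounded, smooth activation with two independent trainable shape
# parameters (hypothetical names alpha and beta) that lets negative
# information pass and keeps the output mean close to zero.
import torch
import torch.nn as nn


class TrainableBoundedActivation(nn.Module):
    """Generic shape-autotuning activation (not the published SAAF formula)."""

    def __init__(self, alpha_init: float = 1.0, beta_init: float = 1.0):
        super().__init__()
        # Two independent trainable parameters, updated by backpropagation
        # together with the rest of the network's weights.
        self.alpha = nn.Parameter(torch.tensor(alpha_init))
        self.beta = nn.Parameter(torch.tensor(beta_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # A tanh-based form keeps the output bounded in (-beta, beta) and
        # symmetric around zero, so negative inputs are preserved and the
        # response tends toward a near-zero mean.
        return self.beta * torch.tanh(self.alpha * x)


if __name__ == "__main__":
    act = TrainableBoundedActivation()
    x = torch.randn(4, 8)
    y = act(x)
    print(y.shape, float(y.mean()))
```

As a usage note, such a module would typically replace a ReLU layer in a network definition; the two parameters are learned per layer (or per channel) along with the weights.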

Original language: English (US)
Article number: 114534
Journal: Expert Systems with Applications
Volume: 171
DOIs
State: Published - Jun 1 2021

All Science Journal Classification (ASJC) codes

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence

Keywords

  • Activation functions
  • Deep learning
