Shawn Jin
2021-09-14

# Activation Function in Neural Network

Activation Function

An activation function is simply the function a node uses to compute its output. It is also known as a Transfer Function. Activation functions can basically be divided into 2 types:

  1. Linear Activation Function

  2. Non-linear Activation Function

     The main terms needed to understand non-linear functions are:

       • Derivative or differential: the change in the y-axis with respect to the change in the x-axis; also known as the slope.

       • Monotonic function: a function that is either entirely non-increasing or entirely non-decreasing.

# Why do we use activation functions with neural networks?

An activation function determines the output of a neural network node, such as yes or no. It maps the resulting values into a range such as 0 to 1 or -1 to 1, depending on the function.

# Common Activation Functions

# 1. Sigmoid

The sigmoid function's curve looks like an S-shape.

The main reason we use the sigmoid function is that its output lies between 0 and 1. Therefore, it is especially used for models where we have to predict a probability as the output. Since the probability of anything exists only in the range of 0 to 1, sigmoid is the right choice.

The function is differentiable, meaning we can find the slope of the sigmoid curve at any point.

The function is monotonic, but its derivative is not.

The softmax function is a more generalized logistic activation function that is used for multiclass classification. The logistic sigmoid function can cause a neural network to get stuck during training, because its gradient becomes very small for large positive or negative inputs.
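As a minimal sketch (using NumPy; not part of the original post), the sigmoid function and its derivative can be written as:

```python
import numpy as np

def sigmoid(x):
    """Map any real input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Slope of the sigmoid curve: s * (1 - s); largest at x = 0, so not monotonic."""
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-5.0, 0.0, 5.0])
print(sigmoid(x))             # ~[0.0067 0.5    0.9933]
print(sigmoid_derivative(x))  # ~[0.0066 0.25   0.0066]
```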

# 2. Tanh or Hyperbolic Tangent Activation Function

Tanh is similar to the logistic sigmoid but often works better. The range of the tanh function is -1 to 1. Tanh is also sigmoidal (S-shaped).

The advantage is that negative inputs are mapped strongly negative and zero inputs are mapped near zero on the tanh graph.

The function is differentiable.

The function is monotonic while its derivative is not monotonic.

The tanh function is mainly used for classification between two classes.

Both tanh and logistic sigmoid activation functions are used in feed-forward nets.
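A comparable sketch for tanh (again using NumPy, as an illustration rather than the post's own code):

```python
import numpy as np

def tanh_derivative(x):
    """Derivative of tanh: 1 - tanh(x)^2, which is not monotonic."""
    return 1.0 - np.tanh(x) ** 2

x = np.array([-2.0, 0.0, 2.0])
print(np.tanh(x))          # ~[-0.964  0.     0.964] -- strongly negative / near zero / strongly positive
print(tanh_derivative(x))  # ~[ 0.071  1.     0.071]
```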

# 3. ReLU (Rectified Linear Unit) Activation Function

ReLU is currently the most widely used activation function, since it appears in almost all convolutional neural networks and other deep learning models.

The function and its derivative both are monotonic.

But the issue is that all negative values become zero immediately, which decreases the model's ability to fit or train on the data properly: any negative input given to the ReLU activation function is turned into zero, so negative values are never mapped appropriately. Neurons that keep receiving negative inputs therefore stop updating, which is known as the dying ReLU problem.
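A minimal NumPy sketch of ReLU, showing how every negative input is clamped to zero:

```python
import numpy as np

def relu(x):
    """ReLU: keep positive values, map all negative values to zero."""
    return np.maximum(0.0, x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))  # [0. 0. 0. 2.] -- negatives are zeroed, which can "kill" a neuron's gradient
```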

# 4. Leaky ReLU

It is an attempt to solve the dying ReLU problem.

The leak, a small slope a applied to negative inputs, helps to increase the range of the ReLU function. Usually, the value of a is about 0.01.

When a is not fixed at 0.01 but is chosen randomly, the variant is called Randomized ReLU.

The range of Leaky ReLU is (-infinity, infinity).

Notes

Theoretically, Leaky ReLU has all the advantages of ReLU and avoids the dying ReLU problem, since negative inputs still receive a small non-zero gradient. In practice, however, it has not been shown to consistently outperform ReLU.
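A small sketch of Leaky ReLU with the usual a = 0.01 (an illustrative value, as noted above):

```python
import numpy as np

def leaky_relu(x, a=0.01):
    """Leaky ReLU: positive values pass through; negative values keep a small slope a."""
    return np.where(x > 0, x, a * x)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(x))  # [-0.03  -0.005  0.     2.   ] -- negatives are scaled, not zeroed
```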

# 5. Softmax

Softmax is different from the max function: instead of selecting only the largest value, it converts a whole vector of scores into a probability distribution over classes.

Unlike the max function, softmax is differentiable everywhere, which is what makes it usable for gradient-based training of the output layer in multiclass classification.
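A minimal NumPy sketch of softmax (the max-subtraction is a standard numerical-stability trick, not something described in the original post):

```python
import numpy as np

def softmax(x):
    """Softmax: turn a vector of scores into a probability distribution."""
    z = x - np.max(x)      # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)        # ~[0.659 0.242 0.099]
print(probs.sum())  # 1.0 -- the outputs sum to one
```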

# Cheatsheet of Common Activation Functions
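As a quick reference, the formulas and output ranges of the functions covered above:

| Function | Formula | Output range |
| --- | --- | --- |
| Sigmoid | 1 / (1 + e^(-x)) | (0, 1) |
| Tanh | (e^x - e^(-x)) / (e^x + e^(-x)) | (-1, 1) |
| ReLU | max(0, x) | [0, +∞) |
| Leaky ReLU | x if x > 0, else a·x (a ≈ 0.01) | (-∞, +∞) |
| Softmax | e^(x_i) / Σ_j e^(x_j) | (0, 1), outputs sum to 1 |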

# References

Activation Functions in Neural Networks | by SAGAR SHARMA | Towards Data Science

The 10 Most Commonly Used Activation Functions in Deep Learning: Mathematical Principles, Advantages and Disadvantages | Sina Technology (sina.com.cn)

#Neural Network
Updated: 2021/09/15, 20:43:56