ArgMax and SoftMax

Mohit Mishra
Mar 29, 2023

Understanding ArgMax and SoftMax in Minutes

Hello everyone! Today's blog covers one of the common sticking points for beginners in the field of neural networks. It is short and informative. Feel free to ask any questions about it, and do clap and follow for future updates if you like it. It really motivates me to work harder. Enough talk, let's get into the main topic.

Introduction

  • Suppose that, for some input data, a neural network produces raw output values that are not confined to the range 0 to 1. Some can be greater than 1 and some can be less than 0.
  • Because of this, the raw outputs are often sent through an ArgMax or SoftMax layer before the final decision is made.

ArgMax

  • It takes a set of output values and sets the largest one to 1 and all the others to 0.
  • So, when we use ArgMax, the neural network's prediction is simply the output that has a 1 in it.
  • This makes the output of the network very easy to interpret.
  • The biggest problem with ArgMax is that we can't use it to optimize the weights and biases in the neural network. Its output stays constant under small changes to the raw values, so its derivative is 0 almost everywhere and undefined at the points where the largest value changes.
  • This means ArgMax gives backpropagation no useful gradient to work with.
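The behaviour described above can be sketched in a few lines of NumPy. This is only an illustration: the helper name `argmax_one_hot` and the raw values are made up for the example, and real frameworks may expose ArgMax differently (for instance, by returning the index instead of a 0/1 vector).

```python
import numpy as np

def argmax_one_hot(raw_outputs):
    """Return a vector with 1 at the largest raw value and 0 everywhere else."""
    raw_outputs = np.asarray(raw_outputs, dtype=float)
    one_hot = np.zeros_like(raw_outputs)
    one_hot[np.argmax(raw_outputs)] = 1.0
    return one_hot

# Hypothetical raw output values (note: some are above 1 or below 0).
raw = [1.43, -0.4, 0.23]
result = argmax_one_hot(raw)
print(result)  # [1. 0. 0.]

# Nudging a raw value slightly does not change the output at all,
# which is why the slope of ArgMax is 0 for small changes.
nudged = argmax_one_hot([1.44, -0.4, 0.23])
print(nudged)  # [1. 0. 0.]
```

Because the output is identical before and after the nudge, there is nothing for gradient-based training to follow.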

Note: People like to use ArgMax for the final output, but they use SoftMax for training.

SoftMax

  • The SoftMax function changes the raw output values but preserves their original order.
  • Every SoftMax output lies between 0 and 1, regardless of how many raw output values there are.
  • The SoftMax outputs always sum to 1.
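The properties listed above can be checked with a minimal NumPy sketch. It uses the standard definition softmax(z)_i = exp(z_i) / sum_j exp(z_j); the function name `softmax` and the raw values are chosen only for this example. Subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def softmax(raw_outputs):
    """Map raw output values to values in (0, 1) that sum to 1."""
    raw_outputs = np.asarray(raw_outputs, dtype=float)
    # Shift by the maximum so np.exp never sees a large positive number.
    shifted = raw_outputs - np.max(raw_outputs)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw output values.
raw = np.array([1.43, -0.4, 0.23])
probs = softmax(raw)

print(probs)        # three values, each between 0 and 1
print(probs.sum())  # 1.0, up to floating-point rounding

# The ranking of the outputs matches the ranking of the raw values.
print(np.argsort(probs), np.argsort(raw))
```

The largest raw value still gets the largest output, so taking ArgMax of the SoftMax outputs picks the same class as taking ArgMax of the raw values.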

Note: Unlike the ArgMax function, whose derivative is always either 0 or undefined, the derivative of the SoftMax function is not always 0, so we can use it for gradient descent.
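To make the note concrete, here is a rough sketch that compares the slopes of the two functions numerically. The helper names (`softmax_jacobian`, `finite_difference`, `argmax_one_hot`) and the raw values are assumptions made for this example. The Jacobian uses the standard result d softmax_j / d z_i = p_j * (delta_ji - p_i), where p is the SoftMax output.

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    exps = np.exp(z - np.max(z))
    return exps / exps.sum()

def softmax_jacobian(z):
    """Matrix of partial derivatives: entry [j, i] is d softmax_j / d z_i."""
    p = softmax(z)
    return np.diag(p) - np.outer(p, p)

def argmax_one_hot(z):
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    out[np.argmax(z)] = 1.0
    return out

def finite_difference(f, z, i, eps=1e-6):
    """Central-difference estimate of the slope of f with respect to z[i]."""
    z = np.asarray(z, dtype=float)
    step = np.zeros_like(z)
    step[i] = eps
    return (f(z + step) - f(z - step)) / (2 * eps)

# Hypothetical raw output values.
raw = np.array([1.43, -0.4, 0.23])

jac = softmax_jacobian(raw)
argmax_slope = finite_difference(argmax_one_hot, raw, 0)
softmax_slope = finite_difference(softmax, raw, 0)

print(argmax_slope)   # all zeros: no signal for gradient descent
print(softmax_slope)  # non-zero values, close to the first column of jac
print(jac[:, 0])
```

The ArgMax slope is zero in every component, while the SoftMax slope is not, which is what lets gradient descent adjust the weights and biases.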

With that, let's wrap up this blog on ArgMax and SoftMax. If you find any issue with the concepts or code, you can message me on Twitter or LinkedIn. The next blog will be published on 02 April 2023.

Some words about me

I'm Mohit.❤️ You can also call me Chessman. I'm a machine learning developer and a competitive programmer. Most of my time is spent staring at a computer screen. During the day, I am usually programming and working to derive insight from large datasets. My skills include Data Analysis, Data Visualization, Machine Learning, Deep Learning, and DevOps, and I am working toward Full Stack. I have developed a strong acumen for problem-solving, and I enjoy occasional challenges.

My Portfolio and Github.
