Why the max value is subtracted when computing softmax

Uncategorized, 2018. 5. 28. 13:30

Reference (original post): https://jamesmccaffrey.wordpress.com/2016/03/04/the-max-trick-when-computing-softmax/

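The softmax formula is

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$

but in actual implementations the maximum input value is usually subtracted from every element before exponentiating:

$$\mathrm{softmax}(x)_i = \frac{e^{x_i - \max(x)}}{\sum_j e^{x_j - \max(x)}}$$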
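Subtracting the max does not change the result. A short derivation, writing $m = \max_j x_j$ (the symbol $m$ is introduced here only for illustration), shows that the common factor $e^{-m}$ cancels between numerator and denominator:

$$
\frac{e^{x_i - m}}{\sum_j e^{x_j - m}}
= \frac{e^{-m}\, e^{x_i}}{e^{-m} \sum_j e^{x_j}}
= \frac{e^{x_i}}{\sum_j e^{x_j}}
$$

Since $x_j - m \le 0$ for every $j$, each exponential lies in $(0, 1]$ and the largest one is exactly $e^0 = 1$, so the denominator is at least 1 and nothing can overflow.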

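According to the original post, the reason is that e is greater than 1, so e raised to a large exponent quickly becomes a number too large for floating point to represent, which breaks the computation.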
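To make the overflow concrete, here is a minimal sketch assuming NumPy and 64-bit floats (the post does not show which library it used, so this is an assumption). The largest finite float64 is about 1.797e308, so `exp(x)` stops being representable once `x` passes roughly 709.78. The variable names are illustrative only.

```python
import numpy as np

# Suppress the overflow / invalid-value warnings so only the values are shown.
with np.errstate(over="ignore", invalid="ignore"):
    finite = np.exp(np.float64(709.0))    # still representable as float64
    overflow = np.exp(np.float64(710.0))  # too large, becomes inf
    ratio = overflow / overflow           # inf / inf is undefined, gives nan

print(finite, overflow, ratio)
```

The last line is what happens inside an unshifted softmax: once both the numerator and the sum are inf, their quotient is nan.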

Results.

Without subtracting the max, an input value of just 2000 is already enough to produce nan.
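The code that produced the listing is not included in the post. A comparison along these lines can be sketched as follows, assuming NumPy; the function names `softmax_with_max` and `softmax_without_max` are hypothetical and chosen only to match the labels in the output.

```python
import numpy as np

def softmax_with_max(x):
    # Shift so the largest exponent is 0; every exp() result is in (0, 1].
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

def softmax_without_max(x):
    # Direct formula; exp() overflows to inf for large inputs.
    e = np.exp(x)
    return e / np.sum(e)

for last in [2.0, 20.0, 200.0, 2000.0, 20000.0]:
    x = np.array([1.0, 1.0, last])
    # Silence overflow / invalid warnings raised by the unshifted version.
    with np.errstate(over="ignore", invalid="ignore"):
        a = softmax_with_max(x)
        b = softmax_without_max(x)
    print("x :", x)
    print("softmax with -max : ", a, np.sum(a))
    print("softmax without -max : ", b, np.sum(b))
```

Both versions agree while `exp` stays finite; they diverge only once the largest input pushes `exp` to inf.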

x : [1. 1. 2.]
softmax with -max :  [0.21194156 0.21194156 0.57611688] 1.0
softmax without -max :  [0.21194156 0.21194156 0.57611688] 1.0
x : [ 1.  1. 20.]
softmax with -max :  [5.60279637e-09 5.60279637e-09 9.99999989e-01] 1.0
softmax without -max :  [5.60279637e-09 5.60279637e-09 9.99999989e-01] 1.0
x : [  1.   1. 200.]
softmax with -max :  [3.76182078e-87 3.76182078e-87 1.00000000e+00] 1.0
softmax without -max :  [3.76182078e-87 3.76182078e-87 1.00000000e+00] 1.0
x : [1.e+00 1.e+00 2.e+03]
softmax with -max :  [0. 0. 1.] 1.0
softmax without -max :  [ 0.  0. nan] nan
x : [1.e+00 1.e+00 2.e+04]
softmax with -max :  [0. 0. 1.] 1.0
softmax without -max :  [ 0.  0. nan] nan

Posted by poterius
