March 23, 18
Slide overview
2018/03/02
Deep Learning JP:
http://deeplearning.jp/seminar-2/
DL reading group materials
DEEP LEARNING JP [DL Papers] Deep Learning Reiji Hatsugai, DeepX http://deeplearning.jp/
https://www.jstage.jst.go.jp/article/sicejl1962/40/10/40_10_735/_pdf
D.P.Kingma, Adam: A Method for Stochastic Optimization, https://arxiv.org/pdf/1412.6980.pdf
John Schulman et al, Trust Region Policy Optimization, https://arxiv.org/pdf/1502.05477.pdf
Yuhuai Wu et al, Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation,
https://arxiv.org/pdf/1708.05144.pdf
Roger Grosse & James Martens, A Kronecker-factored approximate Fisher matrix for convolution layers, https://arxiv.org/pdf/1602.01407.pdf
James Martens & Roger Grosse, Optimizing Neural Networks with Kronecker-factored Approximate Curvature,
https://arxiv.org/pdf/1503.05671.pdf
C.M.Bishop, PRML () 213p
Hippolyt Ritter et al, A Scalable Laplace Approximation for Neural Networks, https://openreview.net/pdf?id=Skdvd2xAZ
Nitish Shirish Keskar et al, On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima,
https://arxiv.org/pdf/1609.04836.pdf
James Kirkpatrick et al, Overcoming catastrophic forgetting in neural networks, https://arxiv.org/pdf/1612.00796.pdf
Yann LeCun et al, Optimal Brain Damage, http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf
Yoojin Choi et al, Towards the Limit of Network Quantization, https://arxiv.org/pdf/1612.01543.pdf
Pang Wei Koh, Understanding Black-box Predictions via Influence Functions, https://arxiv.org/pdf/1703.04730.pdf
C.M.Bishop, PRML p251
D.P.Kingma, Adam: A Method for Stochastic Optimization, https://arxiv.org/pdf/1412.6980.pdf
James Kirkpatrick et al, Overcoming catastrophic forgetting in neural networks, https://arxiv.org/pdf/1612.00796.pdf
Yann LeCun et al, Optimal Brain Damage, http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf
http://www.slis.tsukuba.ac.jp/~fujisawa.makoto.fu/cgibin/wiki/index.php?%CF%A2%CE%A91%BC%A1%CA%FD%C4%F8%BC%B0%A1%A7%B6%A6%CC%F2%B8%FB%C7%DB%CB%A1
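Judging by its URL-encoded title, the page linked above covers the conjugate gradient method for systems of linear equations. The following is a minimal pure-Python sketch of that method, not code from the slides. It assumes a symmetric positive-definite matrix that is only accessed through a matrix-vector product; the names `conjugate_gradient`, `matvec` and the small 2x2 system are hypothetical and chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only matvec(p) = A p."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:     # squared residual norm small enough: stop
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Hypothetical example system: A = [[4, 1], [1, 3]], b = [1, 2].
A = [[4.0, 1.0], [1.0, 3.0]]
solution = conjugate_gradient(
    lambda p: [dot(row, p) for row in A], [1.0, 2.0]
)
```

Because the solver never needs the matrix itself, `matvec` could in principle be any routine that returns a matrix-vector product, such as a Hessian-vector product.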
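import tensorflow as tf  # TF 1.x graph mode
# Assumes x is a list of tensors shaped like params; grads holds one gradient per parameter
grads = tf.gradients(loss, params)
grad_dot_x = tf.add_n([tf.reduce_sum(g * tf.stop_gradient(v)) for g, v in zip(grads, x)])
# Differentiating <grads, x> once more yields the Hessian-vector product Hx
hvp = tf.gradients(grad_dot_x, params)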
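A Hessian-vector product like the one built from two `tf.gradients` calls above can be sanity-checked against a central finite difference of the gradient, Hv ≈ (∇f(θ + εv) − ∇f(θ − εv)) / 2ε. The following is a minimal pure-Python sketch of that check. It assumes a hypothetical quadratic loss f(θ) = ½ θᵀAθ, whose Hessian is exactly A; `grad_quadratic`, `hvp_finite_diff` and the numbers used are illustrative and do not come from the slides.

```python
def grad_quadratic(A, theta):
    """Gradient of f(theta) = 0.5 * theta^T A theta for symmetric A, i.e. A theta."""
    return [sum(a_ij * t_j for a_ij, t_j in zip(row, theta)) for row in A]


def hvp_finite_diff(grad_fn, theta, v, eps=1e-4):
    """Approximate the Hessian-vector product H v by central differences of the gradient."""
    plus = grad_fn([t + eps * vi for t, vi in zip(theta, v)])
    minus = grad_fn([t - eps * vi for t, vi in zip(theta, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


# Hypothetical test problem: the Hessian of the quadratic is A itself.
A = [[2.0, 1.0], [1.0, 3.0]]
theta = [0.5, -1.0]
v = [1.0, 2.0]

hv = hvp_finite_diff(lambda th: grad_quadratic(A, th), theta, v)
exact = [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]  # A v
```

For a quadratic the two results agree up to floating-point error. For a general loss the finite-difference value is only an approximation whose accuracy depends on `eps`.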