(Week 5) March 30
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.experimental.numpy as tnp
tnp.experimental_enable_numpy_behavior()
- $loss=(\frac{1}{2}\beta-1)^2$
- The method we used before has the drawback that we must already know the formula for the derivative
alpha=0.01/6
opt = tf.keras.optimizers.SGD(learning_rate=alpha)
opt.lr  # the learning rate stored in the optimizer
- Let's organize the inputs we need to pass to opt
beta= tf.Variable(-10.0)
beta
with tf.GradientTape(persistent=True) as tape:
    loss = (beta/2-1)**2
tape.gradient(loss,beta)
slope= tape.gradient(loss,beta)  # persistent=True lets us call gradient() more than once
- iter1: pass the values to opt.apply_gradients() to update beta once
- Caution: the input to opt.apply_gradients() must be a list of (gradient, variable) pairs.
opt.apply_gradients([(slope,beta)])
beta
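- Sanity check (not in the original notes, just the SGD update rule): apply_gradients should have computed beta - alpha*slope, which we can verify by hand.
# d/dbeta of (beta/2-1)**2 is beta/2 - 1, which equals -6.0 at beta = -10.0,
# so one SGD step gives -10.0 - (0.01/6)*(-6.0) = -9.99, matching beta above
-10.0 - (0.01/6)*(-6.0)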
- iter2
with tf.GradientTape(persistent=True) as tape:
    loss = (beta/2-1)**2
slope= tape.gradient(loss,beta)
opt.apply_gradients([(slope,beta)])
beta
- Repeating with a for loop (putting it together)
alpha=0.01/6
opt = tf.keras.optimizers.SGD(alpha)
beta= tf.Variable(-10.0)
for epoc in range(10000):
    with tf.GradientTape(persistent=True) as tape:
        loss = (beta/2-1)**2
    slope= tape.gradient(loss,beta)
    opt.apply_gradients([(slope,beta)])
beta
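- Note: the minimizer of $(\frac{1}{2}\beta-1)^2$ is $\beta=2$, so after 10000 iterations beta should be very close to 2.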
alpha=0.01/6
opt = tf.keras.optimizers.SGD(alpha)
beta= tf.Variable(-10.0)
loss_fn = lambda: (beta/2-1)**2
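- opt.minimize() computes the gradient of loss_fn and applies the update in a single call, so no explicit GradientTape is needed.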
- iter1
opt.minimize(loss_fn,beta)
beta
- iter2
opt.minimize(loss_fn,beta)
beta
- Putting it together as code with a for loop
alpha=0.01/6
opt = tf.keras.optimizers.SGD(alpha)
beta= tf.Variable(-10.0)
loss_fn = lambda: (beta/2-1)**2
for epoc in range(10000):
    opt.minimize(loss_fn,beta)
beta
- Is there any difference between tf.keras.optimizers.SGD and tf.optimizers.SGD? No
(Evidence 1)
_opt1=tf.keras.optimizers.SGD()
_opt2=tf.optimizers.SGD()
type(_opt1),type(_opt2)
They are the same..?
(Evidence 2)
alpha=0.01/6
opt = tf.optimizers.SGD(alpha)
beta= tf.Variable(-10.0)
loss_fn = lambda: (beta/2-1)**2
for epoc in range(10000):
    opt.minimize(loss_fn,beta)
beta
(Evidence 3) The module location is the same.
tf.optimizers?
tf.keras.optimizers?
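- A direct check (a sketch; assumes tf.optimizers is simply an alias of tf.keras.optimizers in TF 2.x):
tf.optimizers.SGD is tf.keras.optimizers.SGD  # True when both names point at the same class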
- ${\bf y} \approx 2.5 + 4 {\bf x}$
tnp.random.seed(43052)
N = 200
x = tnp.linspace(0,1,N)
epsilon = tnp.random.randn(N)*0.5
y= 2.5+4*x+epsilon
y_true = 2.5+4*x
plt.plot(x,y,'.')
plt.plot(x,y_true,'--r')
Sxx = sum((x-x.mean())**2)
Sxy = sum((x-x.mean())*(y-y.mean()))
beta1_hat = Sxy/Sxx
beta0_hat = y.mean() - beta1_hat*x.mean()
beta0_hat,beta1_hat
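- Cross-check with NumPy (a sketch, not part of the original notes): np.polyfit should reproduce the same estimates.
# degree-1 fit; returns [slope, intercept], which should be roughly [4, 2.5] here
np.polyfit(np.asarray(x), np.asarray(y), deg=1)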
X=tf.stack([tf.ones(N,dtype='float64'),x],axis=1)
y=y.reshape(N,1)
X.shape,y.shape
tf.linalg.inv(X.T@X)@ X.T @y
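- Equivalent and numerically safer (a sketch): solve the normal equations directly rather than forming the inverse.
tf.linalg.solve(X.T@X, X.T@y)  # same estimates as the inverse-based formula above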
beta= tnp.array([-5.0,10.0]).reshape(2,1)
slope = -2*X.T@y + 2*X.T@X@beta
slope
alpha = 0.001
step = slope * alpha
step
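- Applying the step moves beta against the gradient; this is the update that the exercise below will repeat.
beta = beta - step  # one gradient-descent update
beta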
- Complete Solution 3, i.e., estimate an appropriate beta using gradient descent (a sketch follows below).
- Set the number of iterations to 1000.
- Set the learning rate to 0.001.
- The initial value of beta is
beta= tnp.array([-5.0,10.0]).reshape(2,1)
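- A minimal sketch of one possible solution, reusing the slope formula from above (a sketch under the stated settings, not the official answer):
beta = tnp.array([-5.0,10.0]).reshape(2,1)
alpha = 0.001
for epoc in range(1000):
    slope = -2*X.T@y + 2*X.T@X@beta  # gradient of the squared-error loss
    beta = beta - alpha*slope        # move against the gradient
beta  # should end up close to the least-squares estimates (about 2.5 and 4)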