Lecture videos

- (1/3): additional explanation of some material from the September 14–16 lecture notes

- (2/3): computing yhat with torch.nn.Linear(), computing the loss with torch.nn.MSELoss()

- (3/3): explanation of the homework

Import

import torch 
import numpy as np 

Data

- model: $y_i = w_0 + w_1 x_i + \epsilon_i = 2.5 + 4x_i + \epsilon_i, \quad i=1,2,\dots,n$

- model: ${\bf y}={\bf X}{\bf W}+\boldsymbol{\epsilon}$

  • ${\bf y}=\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad {\bf X}=\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad {\bf W}=\begin{bmatrix} 2.5 \\ 4 \end{bmatrix}, \quad \boldsymbol{\epsilon}=\begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{bmatrix}$
torch.manual_seed(43052)
n=100
ones= torch.ones(n)
x,_ = torch.randn(n).sort()     # inputs, sorted for convenience
X = torch.vstack([ones,x]).T    # design matrix, shape (n, 2)
W = torch.tensor([2.5,4])       # true weights (w0, w1)
ϵ = torch.randn(n)*0.5          # noise
y = X@W + ϵ
ytrue = X@W                     # noiseless signal
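
The shapes line up with the matrix formulation above; a quick check (not in the original cells):

X.shape, W.shape, y.shape
(torch.Size([100, 2]), torch.Size([2]), torch.Size([100]))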

Summary of the previous method

- step1: yhat

- step2: loss

- step3: derivative

- step4: update
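
For reference, a minimal sketch of one iteration of this loop (manual gradient descent; the learning rate lr=0.1 is an assumed value for illustration):

What = torch.tensor([-5.0,10.0],requires_grad=True)
lr = 0.1                                # assumed learning rate (illustrative)
yhat = X@What                           # step1: yhat
loss = torch.mean((y-yhat)**2)          # step2: loss
loss.backward()                         # step3: derivative of loss w.r.t. What
What.data = What.data - lr*What.grad    # step4: update
What.grad = None                        # clear the gradient before the next iteration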

step1: yhat

- the step in which the feedforward network is designed

- if this step is done properly, plugging in an arbitrary $\hat{\bf W}$ should yield the corresponding $\hat{\bf y}$

Method 1: direct declaration (you need to know the formula yourself)

What=torch.tensor([-5.0,10.0],requires_grad=True)
yhat1=X@What
yhat1
tensor([-29.8211, -28.6215, -24.9730, -21.2394, -19.7919, -19.6354, -19.5093,
        -19.4352, -18.7223, -18.0793, -16.9040, -16.0918, -16.0536, -15.8746,
        -14.4690, -14.3193, -13.6426, -12.8578, -12.5486, -12.4213, -11.9484,
        -11.1034, -10.8296, -10.6210, -10.5064, -10.0578,  -9.8063,  -9.7380,
         -9.7097,  -9.6756,  -8.8736,  -8.7195,  -8.6880,  -8.1592,  -7.7752,
         -7.7716,  -7.7339,  -7.7208,  -7.6677,  -7.1551,  -7.0004,  -6.8163,
         -6.7081,  -6.5655,  -6.4480,  -6.3612,  -6.0566,  -5.6031,  -5.5589,
         -5.2137,  -4.3446,  -4.3165,  -3.8047,  -3.5801,  -3.4793,  -3.4325,
         -2.3545,  -2.3440,  -1.8434,  -1.7799,  -1.5386,  -1.0161,  -0.8103,
          0.4426,   0.5794,   0.9125,   1.1483,   1.4687,   1.4690,   1.5234,
          1.6738,   2.0592,   2.1414,   2.8221,   3.1536,   3.6682,   4.2907,
          4.8037,   4.8531,   4.9414,   5.3757,   5.3926,   5.6973,   6.0239,
          6.1261,   6.5317,   7.2891,   8.4032,   8.4936,   9.2794,   9.9943,
         10.0310,  10.4369,  11.7886,  15.8323,  17.4440,  18.9350,  21.0560,
         21.0566,  21.6324], grad_fn=<MvBackward>)

Method 2: using torch.nn.Linear()

net = torch.nn.Linear(in_features=2 ,out_features=1, bias=False) 
net.weight.data
tensor([[0.3320, 0.1982]])
net.weight.data=torch.tensor([[-5.0,10.0]])
net.weight.data
tensor([[-5., 10.]])
net(X)
tensor([[-29.8211],
        [-28.6215],
        [-24.9730],
        [-21.2394],
        [-19.7919],
        [-19.6354],
        [-19.5093],
        [-19.4352],
        [-18.7223],
        [-18.0793],
        [-16.9040],
        [-16.0918],
        [-16.0536],
        [-15.8746],
        [-14.4690],
        [-14.3193],
        [-13.6426],
        [-12.8578],
        [-12.5486],
        [-12.4213],
        [-11.9484],
        [-11.1034],
        [-10.8296],
        [-10.6210],
        [-10.5064],
        [-10.0578],
        [ -9.8063],
        [ -9.7380],
        [ -9.7097],
        [ -9.6756],
        [ -8.8736],
        [ -8.7195],
        [ -8.6880],
        [ -8.1592],
        [ -7.7752],
        [ -7.7716],
        [ -7.7339],
        [ -7.7208],
        [ -7.6677],
        [ -7.1551],
        [ -7.0004],
        [ -6.8163],
        [ -6.7081],
        [ -6.5655],
        [ -6.4480],
        [ -6.3612],
        [ -6.0566],
        [ -5.6031],
        [ -5.5589],
        [ -5.2137],
        [ -4.3446],
        [ -4.3165],
        [ -3.8047],
        [ -3.5801],
        [ -3.4793],
        [ -3.4325],
        [ -2.3545],
        [ -2.3440],
        [ -1.8434],
        [ -1.7799],
        [ -1.5386],
        [ -1.0161],
        [ -0.8103],
        [  0.4426],
        [  0.5794],
        [  0.9125],
        [  1.1483],
        [  1.4687],
        [  1.4690],
        [  1.5234],
        [  1.6738],
        [  2.0592],
        [  2.1414],
        [  2.8221],
        [  3.1536],
        [  3.6682],
        [  4.2907],
        [  4.8037],
        [  4.8531],
        [  4.9414],
        [  5.3757],
        [  5.3926],
        [  5.6973],
        [  6.0239],
        [  6.1261],
        [  6.5317],
        [  7.2891],
        [  8.4032],
        [  8.4936],
        [  9.2794],
        [  9.9943],
        [ 10.0310],
        [ 10.4369],
        [ 11.7886],
        [ 15.8323],
        [ 17.4440],
        [ 18.9350],
        [ 21.0560],
        [ 21.0566],
        [ 21.6324]], grad_fn=<MmBackward>)
yhat2=net(X)
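
Note that yhat1 and yhat2 differ in shape: X@What returns a vector of shape (100,), while net(X) returns a column matrix of shape (100, 1). A quick check (not in the original cells):

yhat1.shape, yhat2.shape
(torch.Size([100]), torch.Size([100, 1]))

This difference matters when the loss is computed in step2 below.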

Method 3: using torch.nn.Linear() with bias=True

net = torch.nn.Linear(in_features=1 ,out_features=1, bias=True) 
net.weight.data
tensor([[0.3480]])
net.weight.data=torch.tensor([[10.0]])
net.bias.data=torch.tensor([-5.0])
net.weight,net.bias
(Parameter containing:
 tensor([[10.]], requires_grad=True),
 Parameter containing:
 tensor([-5.], requires_grad=True))
net(x.reshape(100,1))
tensor([[-29.8211],
        [-28.6215],
        [-24.9730],
        [-21.2394],
        [-19.7919],
        [-19.6354],
        [-19.5093],
        [-19.4352],
        [-18.7223],
        [-18.0793],
        [-16.9040],
        [-16.0918],
        [-16.0536],
        [-15.8746],
        [-14.4690],
        [-14.3193],
        [-13.6426],
        [-12.8578],
        [-12.5486],
        [-12.4213],
        [-11.9484],
        [-11.1034],
        [-10.8296],
        [-10.6210],
        [-10.5064],
        [-10.0578],
        [ -9.8063],
        [ -9.7380],
        [ -9.7097],
        [ -9.6756],
        [ -8.8736],
        [ -8.7195],
        [ -8.6880],
        [ -8.1592],
        [ -7.7752],
        [ -7.7716],
        [ -7.7339],
        [ -7.7208],
        [ -7.6677],
        [ -7.1551],
        [ -7.0004],
        [ -6.8163],
        [ -6.7081],
        [ -6.5655],
        [ -6.4480],
        [ -6.3612],
        [ -6.0566],
        [ -5.6031],
        [ -5.5589],
        [ -5.2137],
        [ -4.3446],
        [ -4.3165],
        [ -3.8047],
        [ -3.5801],
        [ -3.4793],
        [ -3.4325],
        [ -2.3545],
        [ -2.3440],
        [ -1.8434],
        [ -1.7799],
        [ -1.5386],
        [ -1.0161],
        [ -0.8103],
        [  0.4426],
        [  0.5794],
        [  0.9125],
        [  1.1483],
        [  1.4687],
        [  1.4690],
        [  1.5234],
        [  1.6738],
        [  2.0592],
        [  2.1414],
        [  2.8221],
        [  3.1536],
        [  3.6682],
        [  4.2907],
        [  4.8037],
        [  4.8531],
        [  4.9414],
        [  5.3757],
        [  5.3926],
        [  5.6973],
        [  6.0239],
        [  6.1261],
        [  6.5317],
        [  7.2891],
        [  8.4032],
        [  8.4936],
        [  9.2794],
        [  9.9943],
        [ 10.0310],
        [ 10.4369],
        [ 11.7886],
        [ 15.8323],
        [ 17.4440],
        [ 18.9350],
        [ 21.0560],
        [ 21.0566],
        [ 21.6324]], grad_fn=<AddmmBackward>)
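
All three methods produce the same predictions; one way to confirm this (a sanity check added here, with yhat3 as an illustrative name):

yhat3 = net(x.reshape(100,1))
torch.allclose(yhat1.reshape(100,1), yhat2), torch.allclose(yhat2, yhat3)
(True, True)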

step2: loss

Method 1: defining the loss function directly

loss=torch.mean((y-yhat1)**2)
loss
tensor(85.8769, grad_fn=<MeanBackward0>)
loss=torch.mean((y-yhat2)**2)
loss
tensor(176.2661, grad_fn=<MeanBackward0>)
  • 176.2661? This is a wrong result: the shapes of y and yhat2 do not match (see the check after this cell).
loss=torch.mean((y.reshape(100,1)-yhat2)**2)
loss
tensor(85.8769, grad_fn=<MeanBackward0>)
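
To see the broadcasting issue directly (a quick check, not in the original cells):

(y - yhat2).shape
torch.Size([100, 100])

Subtracting the (100,)-shaped y from the (100, 1)-shaped yhat2 broadcasts the pair to a (100, 100) matrix, and the mean over all 10,000 entries is the meaningless 176.2661 above; reshaping y to (100, 1) makes the shapes agree.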

Method 2: defining the loss with torch.nn.MSELoss()

lossfn=torch.nn.MSELoss()
loss=lossfn(y,yhat1)
loss
tensor(85.8769, grad_fn=<MseLossBackward>)
loss=lossfn(y.reshape(100,1),yhat2)
loss
tensor(85.8769, grad_fn=<MseLossBackward>)
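
torch.nn.MSELoss() follows the (input, target) calling convention, i.e. lossfn(yhat, y). For the squared-error loss the value is symmetric in its two arguments, but the shapes must still agree:

loss=lossfn(yhat2,y.reshape(100,1))
loss
tensor(85.8769, grad_fn=<MseLossBackward>)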

- model: $y_i = w_0 + w_1 x_{i1} + w_2 x_{i2} + \epsilon_i = 2.5 + 4x_{i1} - 2x_{i2} + \epsilon_i, \quad i=1,2,\dots,n$

torch.manual_seed(43052)
n=100
ones= torch.ones(n)
x1,_ = torch.randn(n).sort()
x2,_ = torch.randn(n).sort()
X = torch.vstack([ones,x1,x2]).T    # design matrix, shape (n, 3)
W = torch.tensor([2.5,4,-2])        # true weights (w0, w1, w2)
ϵ = torch.randn(n)*0.5              # noise
y = X@W + ϵ
ytrue = X@W                         # noiseless signal
X
tensor([[ 1.0000, -2.4821, -2.3721],
        [ 1.0000, -2.3621, -2.3032],
        [ 1.0000, -1.9973, -2.2271],
        [ 1.0000, -1.6239, -2.0301],
        [ 1.0000, -1.4792, -1.9157],
        [ 1.0000, -1.4635, -1.8241],
        [ 1.0000, -1.4509, -1.6696],
        [ 1.0000, -1.4435, -1.6675],
        [ 1.0000, -1.3722, -1.4723],
        [ 1.0000, -1.3079, -1.4405],
        [ 1.0000, -1.1904, -1.4111],
        [ 1.0000, -1.1092, -1.3820],
        [ 1.0000, -1.1054, -1.3803],
        [ 1.0000, -1.0875, -1.3456],
        [ 1.0000, -0.9469, -1.3255],
        [ 1.0000, -0.9319, -1.2860],
        [ 1.0000, -0.8643, -1.2504],
        [ 1.0000, -0.7858, -1.2095],
        [ 1.0000, -0.7549, -1.1498],
        [ 1.0000, -0.7421, -1.1151],
        [ 1.0000, -0.6948, -1.0980],
        [ 1.0000, -0.6103, -1.0609],
        [ 1.0000, -0.5830, -0.9825],
        [ 1.0000, -0.5621, -0.9672],
        [ 1.0000, -0.5506, -0.9396],
        [ 1.0000, -0.5058, -0.9208],
        [ 1.0000, -0.4806, -0.8768],
        [ 1.0000, -0.4738, -0.7517],
        [ 1.0000, -0.4710, -0.7091],
        [ 1.0000, -0.4676, -0.7027],
        [ 1.0000, -0.3874, -0.6918],
        [ 1.0000, -0.3719, -0.6561],
        [ 1.0000, -0.3688, -0.6153],
        [ 1.0000, -0.3159, -0.5360],
        [ 1.0000, -0.2775, -0.4784],
        [ 1.0000, -0.2772, -0.3936],
        [ 1.0000, -0.2734, -0.3763],
        [ 1.0000, -0.2721, -0.3283],
        [ 1.0000, -0.2668, -0.3227],
        [ 1.0000, -0.2155, -0.2860],
        [ 1.0000, -0.2000, -0.2842],
        [ 1.0000, -0.1816, -0.2790],
        [ 1.0000, -0.1708, -0.2472],
        [ 1.0000, -0.1565, -0.2199],
        [ 1.0000, -0.1448, -0.2170],
        [ 1.0000, -0.1361, -0.1952],
        [ 1.0000, -0.1057, -0.1886],
        [ 1.0000, -0.0603, -0.1829],
        [ 1.0000, -0.0559, -0.1447],
        [ 1.0000, -0.0214, -0.0723],
        [ 1.0000,  0.0655, -0.0667],
        [ 1.0000,  0.0684, -0.0625],
        [ 1.0000,  0.1195, -0.0539],
        [ 1.0000,  0.1420, -0.0356],
        [ 1.0000,  0.1521,  0.0306],
        [ 1.0000,  0.1568,  0.0783],
        [ 1.0000,  0.2646,  0.1328],
        [ 1.0000,  0.2656,  0.1925],
        [ 1.0000,  0.3157,  0.2454],
        [ 1.0000,  0.3220,  0.2519],
        [ 1.0000,  0.3461,  0.3517],
        [ 1.0000,  0.3984,  0.3816],
        [ 1.0000,  0.4190,  0.3831],
        [ 1.0000,  0.5443,  0.3850],
        [ 1.0000,  0.5579,  0.4247],
        [ 1.0000,  0.5913,  0.4431],
        [ 1.0000,  0.6148,  0.4589],
        [ 1.0000,  0.6469,  0.4709],
        [ 1.0000,  0.6469,  0.4711],
        [ 1.0000,  0.6523,  0.4944],
        [ 1.0000,  0.6674,  0.4969],
        [ 1.0000,  0.7059,  0.5234],
        [ 1.0000,  0.7141,  0.5614],
        [ 1.0000,  0.7822,  0.5874],
        [ 1.0000,  0.8154,  0.5899],
        [ 1.0000,  0.8668,  0.6259],
        [ 1.0000,  0.9291,  0.6296],
        [ 1.0000,  0.9804,  0.7098],
        [ 1.0000,  0.9853,  0.7154],
        [ 1.0000,  0.9941,  0.7437],
        [ 1.0000,  1.0376,  0.7786],
        [ 1.0000,  1.0393,  0.8346],
        [ 1.0000,  1.0697,  0.8432],
        [ 1.0000,  1.1024,  0.8558],
        [ 1.0000,  1.1126,  0.8803],
        [ 1.0000,  1.1532,  0.9951],
        [ 1.0000,  1.2289,  1.0430],
        [ 1.0000,  1.3403,  1.0580],
        [ 1.0000,  1.3494,  1.0685],
        [ 1.0000,  1.4279,  1.1723],
        [ 1.0000,  1.4994,  1.2669],
        [ 1.0000,  1.5031,  1.3621],
        [ 1.0000,  1.5437,  1.3738],
        [ 1.0000,  1.6789,  1.4183],
        [ 1.0000,  2.0832,  1.4193],
        [ 1.0000,  2.2444,  1.5095],
        [ 1.0000,  2.3935,  1.6424],
        [ 1.0000,  2.6056,  1.8131],
        [ 1.0000,  2.6057,  2.0058],
        [ 1.0000,  2.6632,  2.2810]])

- Using torch.nn.Linear(), compute $\hat{\bf y}$ for $\hat{\bf W}=\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.
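
One possible solution sketch, following Method 2 above (net is reused here as an illustrative name; this is not the official solution):

net = torch.nn.Linear(in_features=3, out_features=1, bias=False)
net.weight.data = torch.tensor([[1.0,1.0,1.0]])   # What = [1, 1, 1]
yhat = net(X)                                     # shape (100, 1)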