Lecture videos

- (1/10) Dropout, adding batches (1)

- (2/10) Dropout, adding batches (2)

- (3/10) resnet34

- (4/10) How to dissect a model (the hand-built model)

- (5/10) The structure of resnet34 (1)

- (6/10) The structure of resnet34 (2)

- (7/10) Error-term assumptions / activation functions / loss functions depending on the form of $y$; the four axes of deep-learning research; the need for explainable deep learning

- (8/10) CAM (1)

- (9/10) CAM (2)

- (10/10) CAM (3)

import

import torch 
from fastai.vision.all import * 
import graphviz
def gv(s): return graphviz.Source('digraph G{ rankdir="LR"'+ s + ';}')
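
gv is a small helper that renders a DOT snippet as a left-to-right graphviz diagram, presumably for sketching network structures. A usage sketch (the node names here are illustrative):

gv('"X" -> "conv" -> "maxpool" -> "relu" -> "flatten" -> "linear"')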

data

- download data

path = untar_data(URLs.MNIST_SAMPLE)
path.ls()
(#3) [Path('/home/cgb4/.fastai/data/mnist_sample/labels.csv'),Path('/home/cgb4/.fastai/data/mnist_sample/train'),Path('/home/cgb4/.fastai/data/mnist_sample/valid')]

- list

threes=(path/'train'/'3').ls()
sevens=(path/'train'/'7').ls()

- list $\to$ image

Image.open(threes[4])

- image $\to$ tensor

tensor(Image.open(threes[4]))
tensor([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   1,  72, 156,
         241, 254, 255, 188,   9,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  17, 168, 250, 232,
         147,  79, 143, 254,  25,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0, 109, 231, 164,  39,   0,
           0,   0,  86, 251,  24,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,  81,  40,   0,   0,   0,
           0,   4, 200, 157,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,  92, 249,  27,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           5, 221, 128,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         147, 185,  19,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 137,
         224,  20,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 137, 239,
          68,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  83, 239,  95,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  83, 245, 104,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 179, 254, 224, 217,
         147,  36,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  15,  44, 117, 117,
         196, 237, 104,   7,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           6, 117, 246,  95,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,  85, 241,  22,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0, 225, 102,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0, 170, 131,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  28, 104,   0,   0,
           0,   0,  17, 234,  87,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   1, 198, 179,  29,
           0,  42, 199, 235,  17,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,  14, 154, 236,
         250, 252, 163,  12,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
           0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0]],
       dtype=torch.uint8)
  • Here, tensor is the function implemented by fastai, not the one from PyTorch.
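
A quick check of this (a sketch, using the imports above): fastai's tensor accepts a PIL image directly, whereas plain torch.tensor needs a numpy array first.

img = Image.open(threes[4])
t1 = tensor(img)                  # fastai helper: works on a PIL image directly
t2 = torch.tensor(np.array(img))  # plain PyTorch route: via numpy
torch.equal(t1, t2)               # expected: True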

- Let's convert all of the image lists into tensors.

seven_tensor = torch.stack([tensor(Image.open(i)) for i in sevens]).float()/255
three_tensor = torch.stack([tensor(Image.open(i)) for i in threes]).float()/255

- Let's build $X$ and $y$.

seven_tensor.shape, three_tensor.shape
(torch.Size([6265, 28, 28]), torch.Size([6131, 28, 28]))
y=torch.tensor([0.0]*6265+ [1.0]*6131).reshape(12396,1)
X=torch.vstack([seven_tensor,three_tensor]).reshape(12396,-1)
X.shape, y.shape
(torch.Size([12396, 784]), torch.Size([12396, 1]))
X=X.reshape(12396,1,28,28)
X.shape
torch.Size([12396, 1, 28, 28])

1. The model from last time (hand-designed network, pytorch)

2d convolution with window size 5

c1=torch.nn.Conv2d(1,16,5) # in_channels=1 (grayscale), out_channels=16, window size 5
X.shape, c1(X).shape
(torch.Size([12396, 1, 28, 28]), torch.Size([12396, 16, 24, 24]))
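
The 24 follows from the usual convolution size formula: with no padding and stride 1, out = (in - kernel)/stride + 1. A one-line check (sketch):

(28 - 5)//1 + 1   # = 24, matching the 24x24 feature maps above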

MaxPool2d

m1=torch.nn.MaxPool2d(2)
X.shape,c1(X).shape,m1(c1(X)).shape
(torch.Size([12396, 1, 28, 28]),
 torch.Size([12396, 16, 24, 24]),
 torch.Size([12396, 16, 12, 12]))

ReLU

a1=torch.nn.ReLU()
X.shape,c1(X).shape, m1(c1(X)).shape, a1(m1(c1(X))).shape
(torch.Size([12396, 1, 28, 28]),
 torch.Size([12396, 16, 24, 24]),
 torch.Size([12396, 16, 12, 12]),
 torch.Size([12396, 16, 12, 12]))

flatten

class Flatten(torch.nn.Module):
    def forward(self,x): 
        return x.reshape(12396,-1)  # batch size hard-coded for now; generalized to x.shape[0] later
flatten=Flatten()
X.shape,c1(X).shape, m1(c1(X)).shape, a1(m1(c1(X))).shape, flatten(a1(m1(c1(X)))).shape
(torch.Size([12396, 1, 28, 28]),
 torch.Size([12396, 16, 24, 24]),
 torch.Size([12396, 16, 12, 12]),
 torch.Size([12396, 16, 12, 12]),
 torch.Size([12396, 2304]))

linear

l1=torch.nn.Linear(in_features=2304,out_features=1) 
X.shape,\
c1(X).shape, \
m1(c1(X)).shape, \
a1(m1(c1(X))).shape, \
flatten(a1(m1(c1(X)))).shape, \
l1(flatten(a1(m1(c1(X))))).shape
(torch.Size([12396, 1, 28, 28]),
 torch.Size([12396, 16, 24, 24]),
 torch.Size([12396, 16, 12, 12]),
 torch.Size([12396, 16, 12, 12]),
 torch.Size([12396, 2304]),
 torch.Size([12396, 1]))
plt.plot(l1(flatten(a1(m1(c1(X))))).data)
[<matplotlib.lines.Line2D at 0x7f3f9b27b310>]

Network design

net = nn.Sequential(c1,m1,a1,flatten,l1)
## omit the final sigmoid, since it is built into torch.nn.BCEWithLogitsLoss()

- Define the loss function and the optimizer

loss_fn=torch.nn.BCEWithLogitsLoss()
optimizer= torch.optim.Adam(net.parameters())
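
Since the network above leans on the sigmoid being folded into the loss, here is a minimal sketch (with made-up logits z and targets t) confirming that BCEWithLogitsLoss(z,t) equals BCELoss(sigmoid(z),t):

z = torch.randn(5,1)                          # hypothetical raw scores (logits)
t = torch.tensor([[0.],[1.],[1.],[0.],[1.]])  # hypothetical 0/1 targets
torch.nn.BCEWithLogitsLoss()(z,t), torch.nn.BCELoss()(torch.sigmoid(z),t)  # agree up to float error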

- Steps 1~4

for epoch in range(200): 
    ## 1: forward pass
    yhat=net(X)
    ## 2: compute the loss
    loss=loss_fn(yhat,y) 
    ## 3: backpropagate
    loss.backward()
    ## 4: update parameters, then reset gradients
    optimizer.step()
    net.zero_grad()
a2= torch.nn.Sigmoid()
plt.plot(y)
plt.plot(a2(yhat.data),'.')
[<matplotlib.lines.Line2D at 0x7f3f9b25f7c0>]
ypred=a2(yhat.data)>0.5 
sum(ypred==y)/12396
tensor([0.9938])

2. Adding dropout and batches (hand-designed network, pytorch+fastai)

Step 1: build the dls.

ds=torch.utils.data.TensorDataset(X,y)
ds.tensors[0].shape
torch.Size([12396, 1, 28, 28])
ds1,ds2 = torch.utils.data.random_split(ds,[10000,2396]) 
dl1 = torch.utils.data.DataLoader(ds1,batch_size=500) 
dl2 = torch.utils.data.DataLoader(ds2,batch_size=2396) 
dls=DataLoaders(dl1,dl2) 
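
A quick sanity check of the loaders (a sketch): the first training batch should have batch_size=500 with matching targets.

xb, yb = next(iter(dl1))
xb.shape, yb.shape   # expected: (torch.Size([500, 1, 28, 28]), torch.Size([500, 1]))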

Step 2: architecture, loss function, optimizer

class Flatten(torch.nn.Module):
    def forward(self,x): 
        return x.reshape(x.shape[0],-1)
net=torch.nn.Sequential(
    torch.nn.Conv2d(1,16,5), 
    torch.nn.MaxPool2d(2), 
    torch.nn.ReLU(),
    torch.nn.Dropout2d(), 
    Flatten(),
    torch.nn.Linear(2304,1))
loss_fn=torch.nn.BCEWithLogitsLoss()
#optimizer= torch.optim.Adam(net.parameters())
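
Dropout2d zeroes out entire channels during training and acts as the identity in eval mode; a minimal sketch to see the difference (the toy tensor is illustrative):

drop = torch.nn.Dropout2d(p=0.5)
toy = torch.ones(1,4,2,2)
drop.train()
drop(toy)[0,:,0,0]   # some channels zeroed; survivors scaled by 1/(1-p)=2
drop.eval()
drop(toy)[0,:,0,0]   # identity: all ones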

Step 3: create the learner (lrnr) and fit

lrnr1 = Learner(dls,net,opt_func=Adam,loss_func=loss_fn) 
lrnr1.fit(10)
epoch train_loss valid_loss time
0 0.430748 0.218082 00:00
1 0.262507 0.093608 00:00
2 0.178563 0.070512 00:00
3 0.132288 0.061353 00:00
4 0.104299 0.055355 00:00
5 0.086001 0.050486 00:00
6 0.073517 0.047254 00:00
7 0.064433 0.044630 00:00
8 0.057706 0.042601 00:00
9 0.053289 0.040278 00:00

- Visualizing the result:

plt.plot(a2(net(X.to("cuda:0")).to("cpu").data),'.')
[<matplotlib.lines.Line2D at 0x7f3f9b66d9d0>]

- Fast, and the fit is good too.

3. resnet34 (using an existing network, pure fastai)

- Build a new DataLoaders from the data and call it dls2.

path=untar_data(URLs.MNIST_SAMPLE) 
path
Path('/home/cgb4/.fastai/data/mnist_sample')
dls2=ImageDataLoaders.from_folder(
    path,
    train='train',
    valid_pct=0.2)     
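
To eyeball what from_folder produced, fastai's show_batch can be used (a usage sketch):

dls2.show_batch(max_n=9)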

- Create a learner object and train it.

lrnr2=cnn_learner(dls2,resnet34,metrics=error_rate)
lrnr2.fine_tune(1)
epoch train_loss valid_loss error_rate time
0 0.284949 0.159780 0.055787 00:08
epoch train_loss valid_loss error_rate time
0 0.042842 0.016358 0.006584 00:09

- Inspect the results

lrnr2.show_results()

How to dissect a model (lrnr1.model)

- First, go back to approach 2.

net(X.to("cuda:0"))
tensor([[-8.1382],
        [-6.9877],
        [ 0.7937],
        ...,
        [12.1038],
        [15.0634],
        [ 7.9055]], device='cuda:0', grad_fn=<AddmmBackward>)

- Network structure

net
Sequential(
  (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): ReLU()
  (3): Dropout2d(p=0.5, inplace=False)
  (4): Flatten()
  (5): Linear(in_features=2304, out_features=1, bias=True)
)

- Layer-by-layer transformations

print(X.shape, '--> input image')
print(net[0](X.to("cuda:0")).shape, '--> 2dConv')
print(net[1](net[0](X.to("cuda:0"))).shape, '--> MaxPool2d')
print(net[2](net[1](net[0](X.to("cuda:0")))).shape, '--> ReLU')
print(net[3](net[2](net[1](net[0](X.to("cuda:0"))))).shape, '--> Dropout2d')
print(net[4](net[3](net[2](net[1](net[0](X.to("cuda:0")))))).shape, '--> Flatten')
print(net[5](net[4](net[3](net[2](net[1](net[0](X.to("cuda:0"))))))).shape, '--> Linear')
torch.Size([12396, 1, 28, 28]) --> input image
torch.Size([12396, 16, 24, 24]) --> 2dConv
torch.Size([12396, 16, 12, 12]) --> MaxPool2d
torch.Size([12396, 16, 12, 12]) --> ReLU
torch.Size([12396, 16, 12, 12]) --> Dropout2d
torch.Size([12396, 2304]) --> Flatten
torch.Size([12396, 1]) --> Linear
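
The same trace can be written as a loop instead of nested calls (a sketch, assuming net is still on the GPU):

h = X.to("cuda:0")
for layer in net:
    h = layer(h)
    print(h.shape, '-->', layer.__class__.__name__)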

- Final output

net[5](net[4](net[3](net[2](net[1](net[0](X.to("cuda:0")))))))
tensor([[-8.1382],
        [-6.9877],
        [ 0.7937],
        ...,
        [12.1038],
        [15.0634],
        [ 7.9055]], device='cuda:0', grad_fn=<AddmmBackward>)
net(X.to("cuda:0"))
tensor([[-8.1382],
        [-6.9877],
        [ 0.7937],
        ...,
        [12.1038],
        [15.0634],
        [ 7.9055]], device='cuda:0', grad_fn=<AddmmBackward>)

- The layer-by-layer transformations can also be traced through lrnr1 itself (using the fact that lrnr1.model is net).

lrnr1.model
Sequential(
  (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): ReLU()
  (3): Dropout2d(p=0.5, inplace=False)
  (4): Flatten()
  (5): Linear(in_features=2304, out_features=1, bias=True)
)
lrnr1.model[0]
Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
lrnr1.model(X.to("cuda:0"))
tensor([[-8.1382],
        [-6.9877],
        [ 0.7937],
        ...,
        [12.1038],
        [15.0634],
        [ 7.9055]], device='cuda:0', grad_fn=<AddmmBackward>)
print(X.shape, '--> input image')
print(lrnr1.model[0](X.to("cuda:0")).shape, '--> 2dConv')
print(lrnr1.model[1](lrnr1.model[0](X.to("cuda:0"))).shape, '--> MaxPool2d')
print(lrnr1.model[2](lrnr1.model[1](lrnr1.model[0](X.to("cuda:0")))).shape, '--> ReLU')
print(lrnr1.model[3](lrnr1.model[2](lrnr1.model[1](lrnr1.model[0](X.to("cuda:0"))))).shape, '--> Dropout2d')
print(lrnr1.model[4](lrnr1.model[3](lrnr1.model[2](lrnr1.model[1](lrnr1.model[0](X.to("cuda:0")))))).shape, '--> Flatten')
print(lrnr1.model[5](lrnr1.model[4](lrnr1.model[3](lrnr1.model[2](lrnr1.model[1](lrnr1.model[0](X.to("cuda:0"))))))).shape, '--> Linear')
torch.Size([12396, 1, 28, 28]) --> input image
torch.Size([12396, 16, 24, 24]) --> 2dConv
torch.Size([12396, 16, 12, 12]) --> MaxPool2d
torch.Size([12396, 16, 12, 12]) --> ReLU
torch.Size([12396, 16, 12, 12]) --> Dropout2d
torch.Size([12396, 2304]) --> Flatten
torch.Size([12396, 1]) --> Linear

- Summary: the model always splits into a 2d part and a 1d part, as below.

torch.Size([12396, 1, 28, 28]) --> input image
torch.Size([12396, 16, 24, 24]) --> 2dConv
torch.Size([12396, 16, 12, 12]) --> MaxPool2d
torch.Size([12396, 16, 12, 12]) --> ReLU
torch.Size([12396, 16, 12, 12]) --> Dropout2d
===============================================================
torch.Size([12396, 2304]) --> Flatten
torch.Size([12396, 1]) --> Linear

- 2d-part:

  • 2d linear transform: torch.nn.Conv2d()
  • 2d nonlinear transforms: torch.nn.MaxPool2d(), torch.nn.ReLU()

- 1d-part:

  • 1d linear transform: torch.nn.Linear()
  • 1d nonlinear transform: torch.nn.ReLU()

Split net accordingly into the two parts:
_net1=torch.nn.Sequential(
    net[0],
    net[1],
    net[2],
    net[3])
_net2=torch.nn.Sequential(
    net[4],
    net[5])
_net1
Sequential(
  (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): ReLU()
  (3): Dropout2d(p=0.5, inplace=False)
)
_net2
Sequential(
  (0): Flatten()
  (1): Linear(in_features=2304, out_features=1, bias=True)
)
_net=torch.nn.Sequential(_net1,_net2)
_net[1](_net[0](X.to('cuda:0')))
tensor([[-8.1382],
        [-6.9877],
        [ 0.7937],
        ...,
        [12.1038],
        [15.0634],
        [ 7.9055]], device='cuda:0', grad_fn=<AddmmBackward>)

Analyzing lrnr2.model

- The model below is resnet, one of the best-performing (state-of-the-art) models today.

lrnr2.model
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (5): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (6): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (4): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (5): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (7): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten(full=False)
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
    (4): Linear(in_features=1024, out_features=512, bias=False)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5, inplace=False)
    (8): Linear(in_features=512, out_features=2, bias=False)
  )
)

- Features

  • 2d-part: the input has 3 channels; Conv2d uses padding/stride options; there is no dropout; there is batch normalization.
  • 1d-part: there is batch normalization; the output dimension is 2.

DLS, Networks

  • The form of the dls must match the form of the network.
  • MLP model: the input is $784$; the first layer is a torch.nn.Linear() mapping $784 \to 30$.
  • CNN model: the input is $1\times 28\times 28$; the first layer is a torch.nn.Conv2d() mapping $1\times 28\times 28 \to 16\times 24\times 24$.
  • Resnet34: the input is $3\times 28\times 28$; the first layer maps $3\times 28\times 28 \to ??$
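
One way to answer the ?? (a sketch, assuming lrnr2.model is still on the GPU): push a dummy batch through the first conv layer and read off the output shape.

dummy = torch.zeros(1,3,28,28).to("cuda:0")
lrnr2.model[0][0](dummy).shape   # Conv2d(3,64,7,stride=2,padding=3): (28+2*3-7)//2+1 = 14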

Reference

| $y$ | distributional assumption | final-layer activation | loss function (PyTorch) |
|---|---|---|---|
| 3.45, 4.43, ... (continuous) | normal | Linear | MSELoss |
| 0 or 1 | binomial (Bernoulli) | Sigmoid | BCELoss |
| [0,0,1], [0,1,0], [1,0,0] | multinomial | Softmax | CrossEntropyLoss |

The four axes of deep-learning research

(1) Architecture $(\star)$

  • Does not seem to require specialized expertise in any one field.
  • Persistence, a little luck, intuition, a good computer..

(2) Loss functions

  • Requires statistical knowledge // usually a modification of an existing loss function (e.g., adding a penalty term)

(3) Gradient computation

  • Requires knowledge of parallel processing, etc.

(4) Optimizers

  • Requires a theoretical foundation in optimization

- Architecture research before deep learning:

  • Parametric models: experts
  • Nonparametric models: experts
  • Deep learning: relative non-experts

- Characteristics: non-experts can build models + black box (we can open up the internal computations, but they are hard for us to interpret)

- Hence the demand for explainable deep learning (XAI)

An explainable CNN model

- The model so far:

  • Stage 1: 2d linear transforms $\to$ 2d nonlinear transforms
  • Stage 2: Flatten $\to$ MLP

- Review lrnr1's model (the one built earlier) again

lrnr1.model
Sequential(
  (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): ReLU()
  (3): Dropout2d(p=0.5, inplace=False)
  (4): Flatten()
  (5): Linear(in_features=2304, out_features=1, bias=True)
)
net1=torch.nn.Sequential(
    lrnr1.model[0],
    lrnr1.model[1],
    lrnr1.model[2],
    lrnr1.model[3])
net1(X.to('cuda:0')).shape
torch.Size([12396, 16, 12, 12])

- Visualize the output after stage 1

fig, axs = plt.subplots(4,4) 
k=0
for i in range(4):
    for j in range(4):
        axs[i,j].imshow(net1(X.to("cuda:0"))[0][k].to("cpu").data)
        k=k+1
fig.set_figheight(8)
fig.set_figwidth(8)
fig.tight_layout()

Keep net1 + change the structure of net2!!

lrnr1.model
Sequential(
  (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
  (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (2): ReLU()
  (3): Dropout2d(p=0.5, inplace=False)
  (4): Flatten()
  (5): Linear(in_features=2304, out_features=1, bias=True)
)

- Plan

  • net2 before: $(n,16,12,12) \overset{flatten}{\Longrightarrow} (n,?) \overset{Linear(?,1)}{\Longrightarrow} (n,1)$
  • net2 after: $(n,16,12,12) \overset{gap+flatten}{\Longrightarrow} (n,16) \overset{Linear(16,1)}{\Longrightarrow} (n,1)$

- gap: average the $12\times 12$ pixels into a single representative value (why?)

ap=torch.nn.AdaptiveAvgPool2d(output_size=1)
ap(net1(X.to("cuda:0"))).shape
torch.Size([12396, 16, 1, 1])

--

Supplement: ap is just the mean

torch.tensor([[0.1,0.2],[0.3,0.4]])
tensor([[0.1000, 0.2000],
        [0.3000, 0.4000]])
ap(torch.tensor([[0.1,0.2],[0.3,0.4]]))
tensor([[0.2500]])

--

- flatten

flatten(ap(net1(X.to("cuda:0")))).shape
torch.Size([12396, 16])

- linear

_l1=torch.nn.Linear(16,1,bias=False) 
_l1.to("cuda:0")
Linear(in_features=16, out_features=1, bias=False)
_l1(flatten(ap(net1(X.to("cuda:0"))))).shape
torch.Size([12396, 1])

- Assemble this into net2 $\to$ bundle (net1, net2) into one new network.

net2=torch.nn.Sequential(
    torch.nn.AdaptiveAvgPool2d(1),
    Flatten(),
    torch.nn.Linear(16,1,bias=False))
net=torch.nn.Sequential(net1,net2) 
net
Sequential(
  (0): Sequential(
    (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1))
    (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (2): ReLU()
    (3): Dropout2d(p=0.5, inplace=False)
  )
  (1): Sequential(
    (0): AdaptiveAvgPool2d(output_size=1)
    (1): Flatten()
    (2): Linear(in_features=16, out_features=1, bias=False)
  )
)

- Build lrnr3 with the modified network and retrain

ds=torch.utils.data.TensorDataset(X,y)
ds1,ds2=torch.utils.data.random_split(ds,[10000,2396])
dl1=torch.utils.data.DataLoader(ds1,batch_size=1000)
dl2=torch.utils.data.DataLoader(ds2,batch_size=2396)
dls=DataLoaders(dl1,dl2)
lrnr3=Learner(dls,net,opt_func=Adam,loss_func=loss_fn,lr=0.1)
lrnr3.fit(10)
epoch train_loss valid_loss time
0 0.224906 0.109848 00:00
1 0.224250 0.108319 00:00
2 0.224349 0.101010 00:00
3 0.225390 0.109090 00:00
4 0.223415 0.098581 00:00
5 0.219940 0.092961 00:00
6 0.217830 0.105528 00:00
7 0.215317 0.097136 00:00
8 0.212853 0.094199 00:00
9 0.212849 0.100530 00:00

CAM: fix a single observation and visualize by swapping the order of the layers in net2

- Plan

  • net2 before: $(n,16,12,12) \overset{flatten}{\Longrightarrow} (n,?) \overset{Linear(?,1)}{\Longrightarrow} (n,1)$
  • net2 after: $(n,16,12,12) \overset{gap+flatten}{\Longrightarrow} (n,16) \overset{Linear(16,1)}{\Longrightarrow} (n,1)$
  • CAM: $(1,16,12,12) \overset{Linear(16,1)+flatten}{\Longrightarrow} (12,12) \overset{gap}{\Longrightarrow} 1$

- Preparation 1: pick one sample to visualize.

x=X[100]
X.shape,x.shape
(torch.Size([12396, 1, 28, 28]), torch.Size([1, 28, 28]))
  • The dimensions differ, so this would cause problems when fed into the network later $\to$ match the dimensions
x=x.reshape(1,1,28,28) 
plt.imshow(x.squeeze())
<matplotlib.image.AxesImage at 0x7f4058887250>

- Preparation 2: move each network to the cpu for computation and visualization (they are on the GPU right after training with fastai).

net1.to('cpu')
net2.to('cpu')
Sequential(
  (0): AdaptiveAvgPool2d(output_size=1)
  (1): Flatten()
  (2): Linear(in_features=16, out_features=1, bias=False)
)

- Check the forward pass: remember this value.

net2(net1(x)) ## negative, so the CNN decides class=7
tensor([[-5.3201]], grad_fn=<MmBackward>)

- Modify net2 and check the forward value

net2
Sequential(
  (0): AdaptiveAvgPool2d(output_size=1)
  (1): Flatten()
  (2): Linear(in_features=16, out_features=1, bias=False)
)
  • Swap the order in which Linear and AdaptiveAvgPool2d are applied in net2

Check the dimensions

net1(x).squeeze().shape
torch.Size([16, 12, 12])
net2[2].weight.squeeze().shape
torch.Size([16])

Apply Linear(in_features=16, out_features=1, bias=False): $16 \times (16,12,12) \to (12,12)$

net2[2].weight.squeeze() @ net1(x).squeeze()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_56901/2879013373.py in <module>
----> 1 net2[2].weight.squeeze() @ net1(x).squeeze()

RuntimeError: mat1 and mat2 shapes cannot be multiplied (192x12 and 16x1)
  • Fails..
camimg=torch.einsum('i,ijk -> jk',net2[2].weight.squeeze(), net1(x).squeeze()) 
camimg.shape
torch.Size([12, 12])
  • Success
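
The einsum above contracts the 16 weights against the 16 channel maps; an equivalent sketch with plain broadcasting:

w = net2[2].weight.squeeze()        # (16,)
h = net1(x).squeeze()               # (16, 12, 12)
torch.allclose((w[:,None,None]*h).sum(dim=0), camimg)   # expected: True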

Apply AdaptiveAvgPool2d(output_size=1)

ap(camimg)
tensor([[-5.3201]], grad_fn=<MeanBackward1>)

!!!! They're identical?

- The two values below are the same.

net2(net1(x)),ap(camimg)
(tensor([[-5.3201]], grad_fn=<MmBackward>),
 tensor([[-5.3201]], grad_fn=<MeanBackward1>))

- Because both ap and the linear transform are linear, their order can be swapped without changing the result.

- Ultimately the same principle as below:

_x= np.array([1,2,3,4])
_x
array([1, 2, 3, 4])
np.mean(_x*2+1)
6.0
2*np.mean(_x)+1
6.0

- Now let's turn our attention to camimg.

camimg
tensor([[   0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000,
            0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000],
        [   0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000,
            0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000],
        [   0.0000,    0.0000,    0.0000,    0.0000,    5.9098,    0.0000,
            0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000],
        [   0.0000,  -19.2289,    0.0000,    0.0000,    0.0000,   29.9900,
           37.3474,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000],
        [   0.0000,  -29.6655,    0.0000,    0.0000,    5.6040,   13.2785,
            1.8883,   30.8203,    4.3253,    0.0000,    0.0000,    0.0000],
        [   0.0000,  -21.9075,    0.0000,    0.0000,  -13.0436,   17.4574,
           30.4819,    6.4937,    0.0000,    0.0000,    0.0000,  -12.9101],
        [   0.0000,  -36.6223,    0.0000,    0.0000,   -3.6448,    0.0000,
            2.4763,   20.4865,    8.1199,    0.0000,    0.0000,   -7.6647],
        [  -2.8867, -175.0864, -110.9451,   -6.6493,    0.0000,    0.0000,
            0.0000,  -18.9344,  -43.5822,    0.0000,   -6.6248,    0.0000],
        [   2.1586,   -1.7907,   -9.5646,  -11.2632,    0.0000,    0.0000,
            0.0000,  -33.7909,    0.0000,    0.0000,  -14.1396,    0.0000],
        [   0.0000,    0.0000,   -0.4893,    0.0000,    0.0000,    0.0000,
          -15.7184,  -42.0344,    0.0000,   -3.9603,   -2.1219,    0.0000],
        [   0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000,
          -34.1539,  -15.9513,    0.0000,  -12.8468,    0.0000,    0.0000],
        [   0.0000,    0.0000,    0.0000,    0.0000,    0.0000,    0.0000,
          -10.5112, -252.0293,   -1.3688,  -11.8025,    0.0000,    0.0000]],
       grad_fn=<ViewBackward>)
ap(camimg), torch.mean(camimg)
(tensor([[-5.3201]], grad_fn=<MeanBackward1>),
 tensor(-5.3201, grad_fn=<MeanBackward0>))
  • Most of the values in the image are 0, but the overall mean must ultimately come out negative.

- In the end, the mean is negative because certain pixels take large negative values.

  • The mean is negative $\leftrightarrow$ the image represents a 7.
  • A certain pixel has a large negative value $\leftrightarrow$ that pixel clearly signals that the image is a 7.

- So where are those pixels?

plt.imshow(camimg.data)
<matplotlib.image.AxesImage at 0x7f4058863d00>
  • The regions shown in green are the grounds on which the CNN model judged this digit to be a 7.

- Compare with the original image

plt.imshow(x.squeeze()) 
<matplotlib.image.AxesImage at 0x7f40587ce4c0>

- Overlaying the two images should give a nice picture.

- step1: draw the original image in grayscale.

plt.imshow(x.squeeze(),cmap='gray',alpha=0.5)
<matplotlib.image.AxesImage at 0x7f40587a7e20>

- step2: the original image is (28,28) while camimg is (12,12) pixels $\to$ stretch camimg's pixels.

plt.imshow(camimg.data,alpha=0.5, extent=(0,27,27,0),interpolation='bilinear',cmap='magma')
<matplotlib.image.AxesImage at 0x7f405870e970>

- step3: combine them.

plt.imshow(x.squeeze(),cmap='gray',alpha=0.5)
plt.imshow(camimg.data,alpha=0.5, extent=(0,27,27,0),interpolation='bilinear',cmap='magma')
<matplotlib.image.AxesImage at 0x7f40587338e0>

Homework

- Choose an image of the digit 3 as the observation and visualize it with CAM as above.