07wk-1: (Convolutional Neural Networks) – Why CNNs Are Great, Core CNN Layers

Author

최규빈

Published

April 16, 2025

1. Lecture Video

2. Imports

import torch
import torchvision
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (4.5, 3.0)

3. Why CNNs Are Great

A. Strong performance

Fashion MNIST

train_dataset = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True)
test_dataset = torchvision.datasets.FashionMNIST(root='./data', train=False, download=True)
train_dataset = torch.utils.data.Subset(train_dataset, range(5000))
test_dataset = torch.utils.data.Subset(test_dataset, range(1000))
to_tensor = torchvision.transforms.ToTensor()
X = torch.stack([to_tensor(img) for img, lbl in train_dataset]).to("cuda:0")
y = torch.tensor([lbl for img, lbl in train_dataset])
y = torch.nn.functional.one_hot(y).float().to("cuda:0")
XX = torch.stack([to_tensor(img) for img, lbl in test_dataset]).to("cuda:0")
yy = torch.tensor([lbl for img, lbl in test_dataset])
yy = torch.nn.functional.one_hot(yy).float().to("cuda:0")

A desperately oversized fully-connected network

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(784,2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048,10)
).to("cuda")
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())
for epoch in range(1,500):
    #1 forward pass
    logits = net(X)
    #2 compute loss
    loss = loss_fn(logits, y)
    #3 backpropagate gradients
    loss.backward()
    #4 update parameters, then reset gradients
    optimizer.step()
    optimizer.zero_grad()
(net(X).argmax(axis=1) == y.argmax(axis=1)).float().mean()
tensor(1., device='cuda:0')
(net(XX).argmax(axis=1) == yy.argmax(axis=1)).float().mean()
tensor(0.8530, device='cuda:0')

A casually designed convolutional neural network

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Conv2d(1,16,2),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2),
    torch.nn.Flatten(),
    torch.nn.Linear(2704,10),
).to("cuda")
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())
for epoch in range(1,500):
    #1 forward pass
    logits = net(X)
    #2 compute loss
    loss = loss_fn(logits, y)
    #3 backpropagate gradients
    loss.backward()
    #4 update parameters, then reset gradients
    optimizer.step()
    optimizer.zero_grad()
(net(X).argmax(axis=1) == y.argmax(axis=1)).float().mean()
tensor(0.9670, device='cuda:0')
(net(XX).argmax(axis=1) == yy.argmax(axis=1)).float().mean()
tensor(0.8730, device='cuda:0')
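Even though the CNN fits the training set less perfectly (0.9670 vs 1.0), it beats the fully-connected network on the test set (0.8730 vs 0.8530). One detail worth unpacking is the `in_features=2704` of the final `Linear` layer. A quick shape trace (a sketch, assuming 28×28 single-channel inputs as in Fashion-MNIST) shows where that number comes from:

```python
import torch

# Trace a dummy batch through each layer to see where 2704 comes from.
x = torch.zeros(1, 1, 28, 28)
x = torch.nn.Conv2d(1, 16, 2)(x)   # (28-2)/1 + 1 = 27  -> (1, 16, 27, 27)
print(x.shape)
x = torch.nn.MaxPool2d(2)(x)       # floor(27/2) = 13   -> (1, 16, 13, 13)
print(x.shape)
x = torch.nn.Flatten()(x)          # 16*13*13 = 2704    -> (1, 2704)
print(x.shape)
```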

B. Fewer parameters

net1 = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(784,2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048,10)
)
net2 = torch.nn.Sequential(
    torch.nn.Conv2d(1,16,2),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2),
    torch.nn.Flatten(),
    torch.nn.Linear(2704,10),
)
net1_params = list(net1.parameters())
print(net1_params[0].shape)
print(net1_params[1].shape)
print(net1_params[2].shape)
print(net1_params[3].shape)
torch.Size([2048, 784])
torch.Size([2048])
torch.Size([10, 2048])
torch.Size([10])
2048*784 + 2048 + 10*2048 + 10 
1628170
net2_params = list(net2.parameters())
print(net2_params[0].shape)
print(net2_params[1].shape)
print(net2_params[2].shape)
print(net2_params[3].shape)
torch.Size([16, 1, 2, 2])
torch.Size([16])
torch.Size([10, 2704])
torch.Size([10])
16*1*2*2 + 16 + 10*2704 + 10 
27130
27130/1628170
0.01666287918337766
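The hand-counted totals above can be double-checked with `numel()`, which returns the number of elements in a tensor; a small sketch:

```python
import torch

net1 = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(784, 2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048, 10),
)
net2 = torch.nn.Sequential(
    torch.nn.Conv2d(1, 16, 2),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2),
    torch.nn.Flatten(),
    torch.nn.Linear(2704, 10),
)
# Sum the element counts of every parameter tensor in each network.
n1 = sum(p.numel() for p in net1.parameters())
n2 = sum(p.numel() for p in net2.parameters())
print(n1, n2, n2 / n1)  # 1628170 27130 0.0166...
```

The CNN uses under 2% of the fully-connected network's parameters, yet generalized better in the previous section.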

C. Famous

- https://brunch.co.kr/@hvnpoet/109

4. Core CNN Layers

A. torch.nn.ReLU

(Example 1) How the operation works

img = torch.randn(1,1,4,4) # one (4,4) grayscale image
relu = torch.nn.ReLU()
img
tensor([[[[ 0.0052,  1.1922,  0.7636,  0.0099],
          [-0.9365,  0.0695, -0.1974, -0.1691],
          [-1.9972,  0.9638, -0.8581, -1.1956],
          [ 1.2276,  0.9221,  1.3697, -0.2663]]]])
relu(img)
tensor([[[[0.0052, 1.1922, 0.7636, 0.0099],
          [0.0000, 0.0695, 0.0000, 0.0000],
          [0.0000, 0.9638, 0.0000, 0.0000],
          [1.2276, 0.9221, 1.3697, 0.0000]]]])
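ReLU is applied elementwise: every negative entry becomes 0 and everything else passes through unchanged. A quick sketch confirming that `torch.nn.ReLU` matches this manual rule:

```python
import torch

img = torch.randn(1, 1, 4, 4)
relu = torch.nn.ReLU()
# max(x, 0) applied entry by entry: keep positives, zero out negatives.
manual = torch.where(img > 0, img, torch.zeros_like(img))
print(torch.allclose(relu(img), manual))  # True
```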

B. torch.nn.MaxPool2d

(Example 1) How the operation works, and what kernel_size means

img = torch.rand(1,1,4,4)
mp = torch.nn.MaxPool2d(kernel_size=2)
img
tensor([[[[0.4601, 0.1505, 0.8785, 0.2573],
          [0.4426, 0.5923, 0.4630, 0.9225],
          [0.9051, 0.5439, 0.8494, 0.6388],
          [0.9822, 0.1382, 0.6126, 0.9961]]]])
mp(img)
tensor([[[[0.5923, 0.9225],
          [0.9822, 0.9961]]]])
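With `kernel_size=2`, the stride defaults to the kernel size, so the pooling windows are non-overlapping 2×2 blocks and each output entry is the maximum over one block. A sketch verifying all four outputs against manual maxima:

```python
import torch

img = torch.rand(1, 1, 4, 4)
mp = torch.nn.MaxPool2d(kernel_size=2)
out = mp(img)
# Each output entry is the max over one non-overlapping 2x2 block.
assert out[0, 0, 0, 0] == img[0, 0, :2, :2].max()  # top-left block
assert out[0, 0, 0, 1] == img[0, 0, :2, 2:].max()  # top-right block
assert out[0, 0, 1, 0] == img[0, 0, 2:, :2].max()  # bottom-left block
assert out[0, 0, 1, 1] == img[0, 0, 2:, 2:].max()  # bottom-right block
print(out.shape)  # (1, 1, 2, 2)
```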

(Example 2) What if the kernel does not fit the image size exactly?

img = torch.rand(1,1,5,5)
mp = torch.nn.MaxPool2d(kernel_size=3)
img
tensor([[[[0.4661, 0.5162, 0.0087, 0.7542, 0.1391],
          [0.0969, 0.5140, 0.3865, 0.1853, 0.5127],
          [0.7183, 0.3710, 0.5541, 0.1578, 0.4765],
          [0.6287, 0.1574, 0.6492, 0.9207, 0.5921],
          [0.7354, 0.9558, 0.8880, 0.9573, 0.7333]]]])
mp(img)
tensor([[[[0.7183]]]])
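The output size is floor((5−3)/3)+1 = 1, so the leftover right and bottom borders are simply dropped: only the top-left 3×3 block contributes, and 0.7183 is its maximum. If you want the partial edge windows kept, `MaxPool2d` has a `ceil_mode` option; a sketch (assuming the default stride, which equals the kernel size):

```python
import torch

img = torch.rand(1, 1, 5, 5)
mp_floor = torch.nn.MaxPool2d(kernel_size=3)                  # default: drop the remainder
mp_ceil = torch.nn.MaxPool2d(kernel_size=3, ceil_mode=True)   # keep partial edge windows
print(mp_floor(img).shape)  # (1, 1, 1, 1): only the top-left 3x3 block is used
print(mp_ceil(img).shape)   # (1, 1, 2, 2): partial windows on the edges included
assert mp_floor(img)[0, 0, 0, 0] == img[0, 0, :3, :3].max()
```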

(Example 3) A non-square kernel

img = torch.rand(1,1,4,4)
mp = torch.nn.MaxPool2d(kernel_size=(4,2))
img
tensor([[[[0.9469, 0.4182, 0.0710, 0.1394],
          [0.2413, 0.7493, 0.0440, 0.7918],
          [0.9179, 0.8230, 0.0547, 0.4162],
          [0.9223, 0.5011, 0.8517, 0.9853]]]])
mp(img)
tensor([[[[0.9469, 0.9853]]]])
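With `kernel_size=(4,2)` the height and width are pooled independently: floor(4/4) = 1 row and floor(4/2) = 2 columns of output, each entry the max over a 4×2 block. A sketch:

```python
import torch

img = torch.rand(1, 1, 4, 4)
mp = torch.nn.MaxPool2d(kernel_size=(4, 2))  # each window is 4 rows x 2 cols
out = mp(img)
print(out.shape)  # (1, 1, 1, 2)
assert out[0, 0, 0, 0] == img[0, 0, :, :2].max()  # max over the left half
assert out[0, 0, 0, 1] == img[0, 0, :, 2:].max()  # max over the right half
```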

C. torch.nn.Conv2d

(Example 1) How the operation works, stride=2

img = torch.rand(1,1,4,4)
conv = torch.nn.Conv2d(in_channels=1,out_channels=1,kernel_size=2,stride=2)
img
tensor([[[[0.0197, 0.3086, 0.0321, 0.0743],
          [0.5398, 0.4104, 0.7244, 0.0238],
          [0.9728, 0.4270, 0.2396, 0.1358],
          [0.1888, 0.2525, 0.1224, 0.5778]]]])
conv(img)
tensor([[[[-0.3077, -0.4760],
          [ 0.0550, -0.0650]]]], grad_fn=<ConvolutionBackward0>)

How were these values computed? With kernel_size=2 and stride=2 the kernel slides over the image in non-overlapping 2×2 blocks, and each output entry is the elementwise product of a block with the kernel, summed, plus the bias. We can verify each of the four entries by hand:

conv.weight.data, conv.bias.data
(tensor([[[[ 0.3095,  0.0207],
           [-0.3130,  0.2836]]]]),
 tensor([-0.2675]))
(img[:,  :,  :2,  :2] * conv.weight.data).sum()+conv.bias.data, conv(img)
(tensor([-0.3077]),
 tensor([[[[-0.3077, -0.4760],
           [ 0.0550, -0.0650]]]], grad_fn=<ConvolutionBackward0>))
(img[:,  :,  :2,  2:] * conv.weight.data).sum()+conv.bias.data, conv(img)
(tensor([-0.4760]),
 tensor([[[[-0.3077, -0.4760],
           [ 0.0550, -0.0650]]]], grad_fn=<ConvolutionBackward0>))
(img[:,  :,  2:,  :2] * conv.weight.data).sum()+conv.bias.data, conv(img)
(tensor([0.0550]),
 tensor([[[[-0.3077, -0.4760],
           [ 0.0550, -0.0650]]]], grad_fn=<ConvolutionBackward0>))
(img[:,  :,  2:,  2:] * conv.weight.data).sum()+conv.bias.data, conv(img)
(tensor([-0.0650]),
 tensor([[[[-0.3077, -0.4760],
           [ 0.0550, -0.0650]]]], grad_fn=<ConvolutionBackward0>))
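The four block-wise checks above can be done in one shot with `torch.nn.functional.unfold`, which extracts every (2,2) stride-2 patch as a column so the convolution becomes a single matrix product; a sketch:

```python
import torch

torch.manual_seed(0)
img = torch.rand(1, 1, 4, 4)
conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)

# Extract all 2x2 stride-2 patches: shape (1, 4, 4) = (batch, k*k, n_patches).
patches = torch.nn.functional.unfold(img, kernel_size=2, stride=2)
# Dot each patch with the flattened kernel, then add the bias.
manual = conv.weight.reshape(1, -1) @ patches + conv.bias
print(torch.allclose(manual.reshape(1, 1, 2, 2), conv(img)))  # True
```

This "im2col" view (patches as columns, convolution as matrix multiplication) is also how many convolution implementations are organized internally.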