Quiz-6 (2024.11.12) // Scope: through 08wk-2

Author

최규빈

Published

November 12, 2024

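Item | Allowed? | Notes
Consulting lecture notes | Allowed | You may consult the lecture notes provided in class or notes you prepared yourself
Google search | Allowed | You may search the internet for materials and check information
Generative models | Not allowed | AI-based tools (GPT, etc.) may not be used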
import torch
import datasets
import transformers

1. Modules – 10 points

Fill in the ???? placeholders in

from numpy import ???? as ???? 

so that the following behavior works.

a = arr([1,2,3]) 
a
array([1, 2, 3])
type(a)
numpy.ndarray
b = np.array([1,2,3]) 
NameError: name 'np' is not defined

Note: the function arr has the same effect as the existing np.array.

(Solution)

from numpy import array as arr 
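
The same `from ... import ... as ...` form works for any module. The sketch below uses the standard-library `statistics` module purely for illustration (the names `avg`, `statistics_bound`, and `mean_bound` are made up for this example and are not part of the quiz). It suggests why `np.array` raises NameError above: only the alias is bound, not the module name or the original function name.

```python
from statistics import mean as avg  # binds only the alias `avg`

result = avg([1, 2, 3])

# The import above did not bind the module name or the original function name.
statistics_bound = "statistics" in globals()
mean_bound = "mean" in globals()
print(result, statistics_bound, mean_bound)
```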

2. PyTorch – 70 points

(1) Modify the code below to fix the error so that the computation runs. – 10 points

a = torch.tensor([1,2,3]).to("cuda:0")
b = torch.tensor([1,2,3])
a-b
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

(Solution)

a = torch.tensor([1,2,3]).to("cuda:0")
b = torch.tensor([1,2,3]).to("cuda:0")
a-b
tensor([0, 0, 0], device='cuda:0')
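
One possible variation, assuming the code may also need to run on a machine without a GPU: pick the device once and create both tensors on it with the `device=` argument. The fallback to CPU and the variable names `device` and `diff` are additions for this sketch, not part of the quiz.

```python
import torch

# Assumption: fall back to the CPU when no CUDA device is available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Creating both tensors on the same device avoids the mismatch error.
a = torch.tensor([1, 2, 3], device=device)
b = torch.tensor([1, 2, 3], device=device)
diff = a - b
print(diff)
```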

(2) Modify the code below to fix the error and correctly declare the variable x with the requires_grad=True option. – 10 points

x = torch.tensor(3, requires_grad = True)
RuntimeError: Only Tensors of floating point and complex dtype can require gradients

(Solution)

x = torch.tensor(3.0, requires_grad = True)
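
The error arises because the integer literal 3 produces an integer tensor, and only floating-point (or complex) tensors can require gradients. Two other ways to get a float tensor are sketched below; the names `x_a` and `x_b` are illustrative only.

```python
import torch

# Option A: keep the integer literal but request a floating-point dtype.
x_a = torch.tensor(3, dtype=torch.float32, requires_grad=True)

# Option B: convert an existing integer tensor, then enable gradients in place.
x_b = torch.tensor(3).float().requires_grad_()

print(x_a, x_b)
```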

(3) Using PyTorch's automatic differentiation, compute \(f'(0)\) for \(f(x)=\sin(x)\). – 10 points

hint: implement \(\sin(x)\) with torch.sin(x).

(Solution)

x = torch.tensor(0.0, requires_grad = True)
y = torch.sin(x)
y.backward()
x.grad
tensor(1.)
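
As a sanity check that does not rely on PyTorch, the result can be compared against a finite-difference approximation, since \(f'(x)=\cos(x)\) gives \(f'(0)=1\). The helper `central_difference` and the step size `h` are hypothetical choices for this sketch.

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Should be very close to cos(0) = 1, in agreement with x.grad above.
approx = central_difference(math.sin, 0.0)
print(approx)
```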

(4) Consider the functions \(l(x)\) and \(a(x)\) below.

  • \(l(x)= 0.5x -1\)
  • \(a(x)= \frac{\exp(x)}{1+\exp(x)}\)

The function \(f(x)\) is the composition of \(l\) and \(a\), that is,

\[f(x) = a(l(x))\]

For this function \(f\), compute \(f'(\frac{1}{4})\). – 20 points

hint: implement \(\exp(x)\) with torch.exp(x).

(Solution 1)

x = torch.tensor(1/4, requires_grad = True)
y = 0.5*x - 1 
z = torch.exp(y)/(1+torch.exp(y))
z.backward()
x.grad
tensor(0.1038)

(Solution 2)

x = torch.tensor(1/4, requires_grad = True)
def l(x):
    return 0.5*x -1 
def a(x):
    return torch.exp(x) / (1+torch.exp(x))
y = a(l(x))
y.backward()
x.grad
tensor(0.1038)
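
The value 0.1038 can be cross-checked by hand with the chain rule. The derivation below is a sketch using the definitions of \(l\) and \(a\) from the problem; the intermediate numbers are rounded to four decimal places.

```latex
\[
a'(u) = \frac{\exp(u)}{(1+\exp(u))^2} = a(u)\,\bigl(1-a(u)\bigr), \qquad l'(x) = 0.5
\]
\[
f'(x) = a'(l(x))\, l'(x) = 0.5\, a(l(x))\,\bigl(1-a(l(x))\bigr)
\]
\[
l\!\left(\tfrac{1}{4}\right) = -0.875, \quad a(-0.875) \approx 0.2942, \quad
f'\!\left(\tfrac{1}{4}\right) \approx 0.5 \times 0.2942 \times 0.7058 \approx 0.1038
\]
```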

(5) Modify the code below so that logits does not include grad_fn=<AddmmBackward0>. – 20 points

tsr = torch.randn(10,16,3,224,224)
model = transformers.VideoMAEForVideoClassification.from_pretrained(
    "MCG-NJU/videomae-base",
)
model(tsr)
Some weights of VideoMAEForVideoClassification were not initialized from the model checkpoint at MCG-NJU/videomae-base and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
ImageClassifierOutput(loss=None, logits=tensor([[0.5141, 0.4869],
        [0.6019, 0.6288],
        [0.5439, 0.2437],
        [0.6385, 0.3787],
        [0.5531, 0.1795],
        [0.5203, 0.3516],
        [0.4432, 0.3479],
        [0.4908, 0.4129],
        [0.6014, 0.3230],
        [0.5718, 0.3518]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

(Solution)

tsr = torch.randn(10,16,3,224,224)
model = transformers.VideoMAEForVideoClassification.from_pretrained(
    "MCG-NJU/videomae-base"
)
torch.set_grad_enabled(False)
model(tsr)
Some weights of VideoMAEForVideoClassification were not initialized from the model checkpoint at MCG-NJU/videomae-base and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
ImageClassifierOutput(loss=None, logits=tensor([[-0.1791, -0.0549],
        [ 0.0663,  0.0705],
        [-0.2028, -0.0002],
        [-0.1335, -0.0124],
        [ 0.0198,  0.1779],
        [-0.0024, -0.0916],
        [-0.1839, -0.0414],
        [ 0.0090,  0.0989],
        [-0.1907,  0.0802],
        [ 0.0009, -0.0485]]), hidden_states=None, attentions=None)
torch.set_grad_enabled(False)
<torch.autograd.grad_mode.set_grad_enabled at 0x7bf7100bcc20>
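
Note that calling torch.set_grad_enabled(False) as a plain function switches gradient tracking off for all later code as well. A scoped alternative is the torch.no_grad() context manager. The sketch below uses a small torch.nn.Linear layer as a hypothetical stand-in for the pretrained model so that it runs without a download.

```python
import torch

layer = torch.nn.Linear(4, 2)   # hypothetical stand-in for the pretrained model
inputs = torch.randn(3, 4)

# Gradient tracking is disabled only inside this block.
with torch.no_grad():
    logits = layer(inputs)

# Outside the block, the default gradient mode is unchanged.
grad_still_enabled = torch.is_grad_enabled()
print(logits.grad_fn, grad_still_enabled)
```

With this form, later training code in the same session still records gradients.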

3. Data Analysis – 20 points

(1) Fix the error in the code below appropriately. – 10 points

model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=2
)
model_input = {
    'input_ids': torch.tensor([[101, 2023, 3185, 2003, 6659, 2021, 2009, 2038, 2070, 2204, 3896, 1012, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]),
    'labels': torch.tensor([0.0])
}
model(**model_input)
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert/distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
ValueError: Target size (torch.Size([1])) must be the same as input size (torch.Size([1, 2]))

(Solution)

model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=2
)
model_input = {
    'input_ids': torch.tensor([[101, 2023, 3185, 2003, 6659, 2021, 2009, 2038, 2070, 2204, 3896, 1012, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]),
    'labels': torch.tensor([0])
}
model(**model_input)
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert/distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
SequenceClassifierOutput(loss=tensor(0.7860), logits=tensor([[-0.0562,  0.1216]]), hidden_states=None, attentions=None)
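
A likely explanation, stated here as an assumption about the library internals rather than something the quiz confirms: with floating-point labels and num_labels=2 the model appears to fall back to an element-wise (multi-label) loss, which requires the labels to have the same shape as the logits. Integer labels are treated as class indices instead. The sketch below reproduces both behaviors with plain PyTorch loss functions, using the logits printed in the solution output.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[-0.0562, 0.1216]])  # logits printed in the solution output

# Integer class indices: cross-entropy accepts shape [batch] against logits [batch, classes].
loss = F.cross_entropy(logits, torch.tensor([0]))

# Float labels of shape [1]: an element-wise loss needs the same shape as the logits.
try:
    F.binary_cross_entropy_with_logits(logits, torch.tensor([0.0]))
    shape_error = None
except ValueError as err:
    shape_error = str(err)

print(loss.item())
print(shape_error)
```

The cross-entropy value agrees with the loss of about 0.7860 shown above, and the caught message has the same form as the ValueError in the problem.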

(2) Observe the code below.

tsr = torch.randn(10,3,224,224)
model = transformers.AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=3
)
model(tsr)
Some weights of ViTForImageClassification were not initialized from the model checkpoint at google/vit-base-patch16-224-in21k and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
ImageClassifierOutput(loss=None, logits=tensor([[-0.0173, -0.1017, -0.0902],
        [-0.0138, -0.0895, -0.1029],
        [-0.0285, -0.1240, -0.0857],
        [-0.0270, -0.1114, -0.0732],
        [-0.0047, -0.0962, -0.1087],
        [ 0.0087, -0.1476, -0.0941],
        [ 0.0340, -0.1530, -0.0872],
        [-0.0320, -0.0898, -0.0756],
        [-0.0015, -0.0830, -0.0583],
        [ 0.0087, -0.1169, -0.0601]]), hidden_states=None, attentions=None)

Perform the same computation on CUDA. That is, move the tsr and model declared above to CUDA and then run the computation. – 10 points

(Solution)

model.to("cuda")
model(tsr.to("cuda"))
ImageClassifierOutput(loss=None, logits=tensor([[-0.0173, -0.1017, -0.0902],
        [-0.0138, -0.0895, -0.1029],
        [-0.0285, -0.1240, -0.0857],
        [-0.0270, -0.1114, -0.0732],
        [-0.0047, -0.0962, -0.1087],
        [ 0.0087, -0.1476, -0.0941],
        [ 0.0340, -0.1530, -0.0872],
        [-0.0320, -0.0898, -0.0756],
        [-0.0015, -0.0830, -0.0583],
        [ 0.0087, -0.1169, -0.0601]], device='cuda:0'), hidden_states=None, attentions=None)
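
The solution relies on a small asymmetry: model.to("cuda") moves the module's parameters in place, whereas tsr.to("cuda") leaves tsr itself where it is and returns the moved tensor, which is why the result is passed directly to the model. The sketch below illustrates this with a hypothetical torch.nn.Linear stand-in and, as an added assumption, falls back to the CPU when CUDA is unavailable.

```python
import torch

# Assumption: fall back to the CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = torch.nn.Linear(4, 3)   # hypothetical stand-in for the ViT classifier
batch = torch.randn(10, 4)

layer.to(device)                    # modules are moved in place
batch_on_device = batch.to(device)  # tensors are not: use the returned tensor

with torch.no_grad():
    logits = layer(batch_on_device)

print(logits.device)
```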