final

Author

최규빈

Published

December 13, 2022

Final Exam

import torch 
from fastai.text.all import *

1. COVID19 tweets \(\to\) text generation (30 points)

Download the data using the code below.

df = pd.read_csv('https://raw.githubusercontent.com/guebin/STML2022/main/posts/Corona_NLP_train.csv',encoding="ISO-8859-1")
df
UserName ScreenName Location TweetAt OriginalTweet Sentiment
0 3799 48751 London 16-03-2020 @MeNyrbie @Phil_Gahan @Chrisitv https://t.co/iFz9FAn2Pa and https://t.co/xX6ghGFzCC and https://t.co/I2NlzdxNo8 Neutral
1 3800 48752 UK 16-03-2020 advice Talk to your neighbours family to exchange phone numbers create contact list with phone numbers of neighbours schools employer chemist GP set up online shopping accounts if poss adequate supplies of regular meds but not over order Positive
2 3801 48753 Vagabonds 16-03-2020 Coronavirus Australia: Woolworths to give elderly, disabled dedicated shopping hours amid COVID-19 outbreak https://t.co/bInCA9Vp8P Positive
3 3802 48754 NaN 16-03-2020 My food stock is not the only one which is empty...\r\r\n\r\r\nPLEASE, don't panic, THERE WILL BE ENOUGH FOOD FOR EVERYONE if you do not take more than you need. \r\r\nStay calm, stay safe.\r\r\n\r\r\n#COVID19france #COVID_19 #COVID19 #coronavirus #confinement #Confinementotal #ConfinementGeneral https://t.co/zrlG0Z520j Positive
4 3803 48755 NaN 16-03-2020 Me, ready to go at supermarket during the #COVID19 outbreak.\r\r\n\r\r\nNot because I'm paranoid, but because my food stock is litteraly empty. The #coronavirus is a serious thing, but please, don't panic. It causes shortage...\r\r\n\r\r\n#CoronavirusFrance #restezchezvous #StayAtHome #confinement https://t.co/usmuaLq72n Extremely Negative
... ... ... ... ... ... ...
41152 44951 89903 Wellington City, New Zealand 14-04-2020 Airline pilots offering to stock supermarket shelves in #NZ lockdown #COVID-19 https://t.co/cz89uA0HNp Neutral
41153 44952 89904 NaN 14-04-2020 Response to complaint not provided citing COVID-19 related delays. Yet prompt in rejecting policy before consumer TAT is over. Way to go ? Extremely Negative
41154 44953 89905 NaN 14-04-2020 You know it’s getting tough when @KameronWilds is rationing toilet paper #coronavirus #toiletpaper @kroger martinsville, help us out!! Positive
41155 44954 89906 NaN 14-04-2020 Is it wrong that the smell of hand sanitizer is starting to turn me on?\r\r\n\r\r\n#coronavirus #COVID19 #coronavirus Neutral
41156 44955 89907 i love you so much || he/him 14-04-2020 @TartiiCat Well new/used Rift S are going for $700.00 on Amazon rn although the normal market price is usually $400.00 . Prices are really crazy right now for vr headsets since HL Alex was announced and it's only been worse with COVID-19. Up to you whethe Negative

41157 rows × 6 columns

(1) Create a dls object using TextDataLoaders.from_df.

  • Set text_col='OriginalTweet'
  • Set is_lm=True
  • Set seq_len=64
## If dls is created correctly, dls.show_batch() produces output like the following.
text text_
0 xxbos xxmaj and xxmaj iâ’m not even talking about the xxunk alone . \r\r\n\r\r\n xxmaj iâ’m just saying everybody is about to be arguing with everybody cause a lot of folks have more time to be arguing with everybody and idle hands are the xxunk xxunk and i just … . https : / / t.co / xxunk xxbos xxmaj stop being so selfish xxmaj and xxmaj iâ’m not even talking about the xxunk alone . \r\r\n\r\r\n xxmaj iâ’m just saying everybody is about to be arguing with everybody cause a lot of folks have more time to be arguing with everybody and idle hands are the xxunk xxunk and i just … . https : / / t.co / xxunk xxbos xxmaj stop being so selfish when
1 is the big problem . \r\r\n\r\r\n xxmaj richa xxmaj arora of xxmaj tata xxmaj consumer xxmaj products says that xxunk of govt notifications have been a xxunk . \r\r\n\r\r\n▁ # xxmaj covid_19 # coronavirus https : / / t.co / xxunk xxbos xxmaj hey @borisjohnson \r\r\n xxmaj if you donâ’t put a measure in place for people to stop panic buying / hoarding food the big problem . \r\r\n\r\r\n xxmaj richa xxmaj arora of xxmaj tata xxmaj consumer xxmaj products says that xxunk of govt notifications have been a xxunk . \r\r\n\r\r\n▁ # xxmaj covid_19 # coronavirus https : / / t.co / xxunk xxbos xxmaj hey @borisjohnson \r\r\n xxmaj if you donâ’t put a measure in place for people to stop panic buying / hoarding food and
2 sold my bags gon na pick up when the prices drop . # coronavirus # cryptocurrency # btc # quarentinelife # xxmaj crypto # xxup hodl xxbos xxmaj online grocery shopping services have ground to a virtual halt in stricken xxmaj italy it xxbos xxmaj so @instacart is canceling orders without rescheduling them due to high demand despite putting in the orders a week my bags gon na pick up when the prices drop . # coronavirus # cryptocurrency # btc # quarentinelife # xxmaj crypto # xxup hodl xxbos xxmaj online grocery shopping services have ground to a virtual halt in stricken xxmaj italy it xxbos xxmaj so @instacart is canceling orders without rescheduling them due to high demand despite putting in the orders a week or
3 xxunk through the grocery store to get my share . xxmaj and i forgot xxup tp . xxmaj ugh . xxbos xxmaj epic fail alert . xxmaj person waiting in line at supermarket with mask to avoid # coronavirusuk but taking it off to puff on a cigarette . xxup wtf ? \r\r\n\r\r\n▁ # coronavirus , bad for lungs so wear mask . xxmaj through the grocery store to get my share . xxmaj and i forgot xxup tp . xxmaj ugh . xxbos xxmaj epic fail alert . xxmaj person waiting in line at supermarket with mask to avoid # coronavirusuk but taking it off to puff on a cigarette . xxup wtf ? \r\r\n\r\r\n▁ # coronavirus , bad for lungs so wear mask . xxmaj take
4 xxmaj where is that ? " # socialdistancing xxbos xxmaj just managed to get a slot for an online shopping delivery , the absolute xxunk rush of it , felt like i was watching my lottery numbers come up # xxmaj covid_19 # selfisolating https : / / t.co / xxunk xxbos xxunk xxmaj you are right . xxup covid-19 is xxunk attacking older where is that ? " # socialdistancing xxbos xxmaj just managed to get a slot for an online shopping delivery , the absolute xxunk rush of it , felt like i was watching my lottery numbers come up # xxmaj covid_19 # selfisolating https : / / t.co / xxunk xxbos xxunk xxmaj you are right . xxup covid-19 is xxunk attacking older people
5 social distancing . # xxmaj rant 1 / 2 xxbos xxup covid-19 xxmaj xxunk : xxmaj frankie 's xxmaj supermarket is stocked up on supplies , only shortage is chicken supply xxmaj frankie 's xxmaj supermarket chain one of the largest in the country informed xxmaj xxunk xxmaj xxunk at this stage they are stocked up with foodstuffs . \r\r\n xxmaj the only shortage distancing . # xxmaj rant 1 / 2 xxbos xxup covid-19 xxmaj xxunk : xxmaj frankie 's xxmaj supermarket is stocked up on supplies , only shortage is chicken supply xxmaj frankie 's xxmaj supermarket chain one of the largest in the country informed xxmaj xxunk xxmaj xxunk at this stage they are stocked up with foodstuffs . \r\r\n xxmaj the only shortage they
6 greed , stupidity , ignorance of older generations . \r\r\n xxmaj now , # coronavirus restricts their lives , to protect the old \r\r\n 1 ( 3 ) xxbos xxmaj best xxmaj online xxmaj shopping xxmaj sites for xxmaj womenâ’s xxmaj clothing and xxmaj accessories xxmaj that xxmaj are xxmaj giving xxmaj back xxmaj during xxup covid-19 # onlineshopping # xxmaj ecommerce [ video , stupidity , ignorance of older generations . \r\r\n xxmaj now , # coronavirus restricts their lives , to protect the old \r\r\n 1 ( 3 ) xxbos xxmaj best xxmaj online xxmaj shopping xxmaj sites for xxmaj womenâ’s xxmaj clothing and xxmaj accessories xxmaj that xxmaj are xxmaj giving xxmaj back xxmaj during xxup covid-19 # onlineshopping # xxmaj ecommerce [ video ]
7 xxmaj world in xxup covid-19 xxmaj deaths , xxmaj trump xxmaj pushes to re - open xxmaj country in xxmaj early xxmaj may https : / / t.co / xxunk # mondaythoughts # coronavirus # xxup covid2019 xxbos xxmaj in the wake of xxup covid 19 , xxmaj lets xxmaj share xxmaj burden . \r\r\n\r\r\n xxmaj enjoy xxmaj flat 10 % xxmaj off + world in xxup covid-19 xxmaj deaths , xxmaj trump xxmaj pushes to re - open xxmaj country in xxmaj early xxmaj may https : / / t.co / xxunk # mondaythoughts # coronavirus # xxup covid2019 xxbos xxmaj in the wake of xxup covid 19 , xxmaj lets xxmaj share xxmaj burden . \r\r\n\r\r\n xxmaj enjoy xxmaj flat 10 % xxmaj off + xxmaj
8 paper is being purchased online \r\r\n https : / / t.co / xxunk # covid19 # supplychain # ecommerce xxbos # coronavirus xxmaj do n't forget that with any deliveries you receive make sure that you wipe them down with sanitizer wipes or disinfectant . xxmaj outer packaging gets xxunk straight away . xxmaj wipe down door handles and keep washing hands . xxbos is being purchased online \r\r\n https : / / t.co / xxunk # covid19 # supplychain # ecommerce xxbos # coronavirus xxmaj do n't forget that with any deliveries you receive make sure that you wipe them down with sanitizer wipes or disinfectant . xxmaj outer packaging gets xxunk straight away . xxmaj wipe down door handles and keep washing hands . xxbos #

(2) Create a learner object with language_model_learner, then train it with lrnr.fine_tune(3,1e-1).

  • Use arch=AWD_LSTM
  • Use metrics=[accuracy,perplexity]

(3) Generate the words that follow “the price of”. (Set n_words=20.)

## Example of generated text
'the price of stuff increases in other states as a result of the # coronavirus pandemic . So it makes alternatives .'
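
Steps (1) to (3) of this problem can be chained roughly as in the sketch below. It is not the official solution: the function name `generate_tweet_text` and the constant `URL` are hypothetical, the imports are done lazily inside the function so that defining it does not require fastai, and `Perplexity()` is used as the class form of the perplexity metric named in the task. Training downloads the pretrained AWD_LSTM weights and is slow without a GPU, and the generated text differs from run to run.

```python
URL = ('https://raw.githubusercontent.com/guebin/STML2022/main/posts/'
       'Corona_NLP_train.csv')

def generate_tweet_text(prompt='the price of', n_words=20, epochs=3, base_lr=1e-1):
    """Hypothetical helper: fine-tune a language model on the tweets,
    then let it continue `prompt` for `n_words` words."""
    # Lazy imports: fastai and pandas are only needed when the function is called.
    import pandas as pd
    from fastai.text.all import (TextDataLoaders, language_model_learner,
                                 AWD_LSTM, accuracy, Perplexity)

    df = pd.read_csv(URL, encoding='ISO-8859-1')

    # (1) language-model dataloaders: the target is the input shifted by one token
    dls = TextDataLoaders.from_df(df, text_col='OriginalTweet',
                                  is_lm=True, seq_len=64)

    # (2) pretrained AWD_LSTM, fine-tuned with the settings given in the task
    lrnr = language_model_learner(dls, AWD_LSTM,
                                  metrics=[accuracy, Perplexity()])
    lrnr.fine_tune(epochs, base_lr)

    # (3) sample a continuation of the prompt
    return lrnr.predict(prompt, n_words=n_words)

# Usage (slow; requires fastai and network access):
# print(generate_tweet_text())
```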

2. COVID19 tweets \(\to\) classification (30 points)

Download the data using the code below.

df = pd.read_csv('https://raw.githubusercontent.com/guebin/STML2022/main/posts/Corona_NLP_train.csv',encoding="ISO-8859-1")
df

(1) Create a dls object using TextDataLoaders.from_df.

  • Set text_col='OriginalTweet'
  • Set label_col='Sentiment'
  • Set seq_len=64
## If dls is created correctly, dls.show_batch() produces output like the following.
text category
0 xxbos xxrep 5 ? ? ? xxrep 7 ? ? ? xxrep 7 ? xxrep 4 ? xxrep 4 ? xxrep 11 ? ? ? xxrep 6 ? xxrep 4 ? , xxrep 3 ? xxrep 3 ? ? ? xxrep 3 ? xxrep 4 ? xxrep 3 ? ? ? ? ? xxrep 4 ? ? ? xxrep 3 ? , xxrep 4 ? ? ? ? ? xxrep 6 ? xxrep 3 ? xxrep 3 ? xxrep 3 ? ? ? xxrep 3 ? \r\r\n▁ xxrep 5 ? xxrep 6 ? ? ? xxrep 3 ? xxrep 4 ? xxrep 4 ? ? ? xxrep 4 ? xxrep 6 ? xxrep 4 ? xxrep 8 ? ? ? xxrep 6 ? ? ? xxrep 5 ? ? ? xxrep 3 ? xxrep 4 ? ? ? xxrep 7 ? xxrep 5 ? - xxrep 8 ? xxrep 5 Neutral
1 xxbos xxrep 5 ? xxrep 5 ? ? ? xxrep 6 ? xxrep 5 ? xxrep 3 ? ? ? xxrep 3 ? ? ? xxrep 4 ? xxrep 3 ? xxrep 3 ? xxrep 5 ? xxrep 10 ? xxrep 5 ? xxrep 5 ? xxrep 3 ? xxrep 5 ? ? ? xxrep 4 ? xxrep 7 ? xxrep 3 ? xxrep 3 ? \r\r\n▁ # sindh government spokesman @murtazawahab1 terms # xxmaj quarantine facilities at # xxmaj xxunk border a joke . xxmaj watch the exclusive visuals of criminal negligence ? ? https : / / t.co / xxunk Negative
2 xxbos xxunk xxup very xxup soon xxup the xxup food xxup will xxup be xxup their xxup leverage xxup to xxup control xxup people . xxup hungry xxup people xxup are xxup easy xxup to xxup lead xxup if xxup you xxup promise xxup them xxup food . xxup they xxup are xxup not xxup just xxup killing xxup people xxup with xxup this xxup covid-19 , xxup but xxup the xxup big xxup farmers , xxup processors , xxup and xxup the xxup endless xxup chain xxup of xxup supply xxup and xxup dema Extremely Positive
3 xxbos xxmaj when xxmaj disneyland xxup reopens it will xxup feature a xxup harrowing xxup new xxup death - defying xxup xxunk -- xxup going to the xxup grocery xxup store & & xxup interacting w / xxup people w / in 6 ' when your xxup facemask xxup suddenly xxup slips , your xxup gloves xxup fall xxup off & & xxup you xxup forgot your xxup hand xxup sanitizer ! ! # xxmaj disneyland # xxunk # xxmaj covid_19 # xxup covid https : / / t.co / xxunk Positive
4 xxbos @gavinnewsom @govmurphy https : / / t.co / xxunk \r\r\n xxup this xxup is xxup why xxup the # xxup coronavirus xxup is xxup so xxup contagious , a xxup single xxup cough xxup can xxup spread xxup across a xxup supermarket xxup aisle xxup right xxup over xxup the xxup aisle xxup and xxup into xxup the xxup next xxup aisle , xxup gross xxrep 4 ! xxup what xxup took xxup you xxup so xxup long xxup to xxup sign xxup an xxup eo xxup making xxup people Extremely Negative
5 xxbos # xxup xxunk : # xxup worldwide , xxup y' all xxup alright xxup out xxup there ? xxup the # xxup xxunk xxup is xxup indoors xxup xxunk ' xxup care xxup of # xxup xxunk xxup behind xxup the xxup scenes , xxup xxunk ' xxup this # xxup coronavirus xxup outbreak … supermarket xxup xxunk ' xxup daily : xxup xxunk ' xxup like # xxup xxunk xxup out xxup here … . xxup bought 20 xxup xxunk … https : / / t.co / xxunk Extremely Positive
6 xxbos # xxunk ? \r\r\n\r\r\n xxup with xxup no xxup sports xxup in xxup our xxup lives i xxup wanna xxup provide xxup y all xxup with xxup quality xxup xxunk xxup with xxup lower xxup prices xxup than xxup retail xxup all xxup xxunk xxup now xxup are 40 $ xxup each xxup until xxup sports xxup returns xxup due xxup to # xxmaj covid_19 \r\r\n\r\r\n xxup dms xxup are xxup always xxup open xxrep 4 ! xxup hmu xxrep 3 ! https : / / t.co / xxunk Extremely Negative
7 xxbos # xxup cbd xxup can not xxup cure xxup the # xxup coronavirus xxup but xxup it xxup can \r\r\n▁ ? xxup cure xxup coronavirus xxup symptoms \r\r\n▁ ? xxup ease xxup your xxup anxiety \r\r\n▁ ? xxup boost xxup immune xxup system \r\r\n▁ ? xxup act xxup as a xxup natural xxup painkiller ! \r\r\n * prices have been reduced at this difficult time to help everyone ? ? \r\r\n https : / / t.co / wrlhyzizaa # keep safe https : / / t.co / xxunk Extremely Positive
8 xxbos xxmaj what # coronavirus taught us : \r\r\n 1 . xxmaj to stay at home and be with family . \r\r\n 2 . xxmaj to eat home made , healthy , food . \r\r\n 3 . xxmaj to maintain hygiene . \r\r\n 4 . xxmaj to meditate . \r\r\n 5 . xxmaj to give up junk food . \r\r\n 6 . xxmaj to avoid unnecessary travel . \r\r\n 7 . xxmaj to stockup groceries on time . \r\r\n 8 . xxmaj help your spouse in daily xxunk . Positive

(2) Create a learner object with text_classifier_learner, then train it with lrnr.fine_tune(5,1e-2).

  • Use arch=AWD_LSTM
  • Use metrics=accuracy

(3) Check the classification results for the texts below.

  • “the government’s approach to the pendemic has been a complete disaster”
  • “the new vaccines hold the promise of a quick return to economic growth”

Hint: “the government’s approach to the pendemic has been a complete disaster” should be predicted as negative, and “the new vaccines hold the promise of a quick return to economic growth” should be predicted as positive.
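
The classification steps (1) to (3) could be put together along the following lines. This is a sketch under assumptions, not the official solution: `classify_tweets`, `URL`, and `TEXTS` are hypothetical names, and the imports are lazy so that the definition itself does not need fastai. The predicted labels depend on the random train/validation split and on training, so no output is shown.

```python
URL = ('https://raw.githubusercontent.com/guebin/STML2022/main/posts/'
       'Corona_NLP_train.csv')

# The two sentences from step (3), copied as written in the task.
TEXTS = [
    'the government’s approach to the pendemic has been a complete disaster',
    'the new vaccines hold the promise of a quick return to economic growth',
]

def classify_tweets(texts=TEXTS, epochs=5, base_lr=1e-2):
    """Hypothetical helper: fine-tune a sentiment classifier on the tweets
    and return the predicted label for each string in `texts`."""
    # Lazy imports: fastai and pandas are only needed when the function is called.
    import pandas as pd
    from fastai.text.all import (TextDataLoaders, text_classifier_learner,
                                 AWD_LSTM, accuracy)

    df = pd.read_csv(URL, encoding='ISO-8859-1')

    # (1) classification dataloaders: text in, Sentiment label out
    dls = TextDataLoaders.from_df(df, text_col='OriginalTweet',
                                  label_col='Sentiment', seq_len=64)

    # (2) pretrained AWD_LSTM encoder with a classification head
    lrnr = text_classifier_learner(dls, AWD_LSTM, metrics=accuracy)
    lrnr.fine_tune(epochs, base_lr)

    # (3) predict() returns (label, label index, class probabilities)
    return [str(lrnr.predict(text)[0]) for text in texts]

# Usage (slow; requires fastai and network access):
# print(classify_tweets())
```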

3. human numbers 5 (40 points)

Suppose we have the following data.

txt = (['one',',','two',',','three',',','four',',','five',',']*100)[:-1]
mapping = {',':0, 'one':1, 'two':2, 'three':3, 'four':4, 'five':5} 
txt_x = txt[:-1]
txt_y = txt[1:] 
txt_x[:5], txt_y[:5]
(['one', ',', 'two', ',', 'three'], [',', 'two', ',', 'three', ','])
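
Before any of the networks below can be trained, the tokens have to be turned into tensors. One possible encoding, shown here as a sketch, uses one-hot rows for the inputs and integer class indices for the targets. The helper `encode` is a hypothetical name and does not come from the exam.

```python
import torch

txt = (['one', ',', 'two', ',', 'three', ',', 'four', ',', 'five', ','] * 100)[:-1]
mapping = {',': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
txt_x = txt[:-1]
txt_y = txt[1:]

def encode(tokens, mapping):
    """Hypothetical helper: look up the integer index of each token."""
    return torch.tensor([mapping[tok] for tok in tokens])

# Inputs as one-hot rows; targets as class indices, the form CrossEntropyLoss expects.
x = torch.nn.functional.one_hot(encode(txt_x, mapping),
                                num_classes=len(mapping)).float()
y = encode(txt_y, mapping)

print(x.shape, y.shape)  # torch.Size([998, 6]) torch.Size([998])
```

Note that ',' is followed by a different word each time it appears, so a model that sees only the current token cannot predict what comes after ','. That is why the sub-problems ask for recurrent architectures.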

(1) Design and train a neural network that predicts the next word using torch.nn.RNNCell().
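
A minimal sketch of one way to approach (1), not the author's solution. The hidden size of 8, the Adam optimizer with lr=0.05, the 30 epochs, the fixed seed, and the class name `RNNCellNet` are all assumptions chosen for illustration. With `RNNCell` the time loop is written by hand, feeding the hidden state back in at every step.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

txt = (['one', ',', 'two', ',', 'three', ',', 'four', ',', 'five', ','] * 100)[:-1]
mapping = {',': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
idx = torch.tensor([mapping[tok] for tok in txt])
x = torch.nn.functional.one_hot(idx[:-1], num_classes=6).float()  # same as txt_x
y = idx[1:]                                                        # same as txt_y

class RNNCellNet(torch.nn.Module):
    def __init__(self, n_tokens=6, n_hidden=8):  # n_hidden=8 is an assumption
        super().__init__()
        self.n_hidden = n_hidden
        self.cell = torch.nn.RNNCell(n_tokens, n_hidden)
        self.out = torch.nn.Linear(n_hidden, n_tokens)

    def forward(self, x):
        h = torch.zeros(1, self.n_hidden)        # initial hidden state, batch of 1
        hs = []
        for xt in x:                             # step through the sequence
            h = self.cell(xt.unsqueeze(0), h)    # new hidden state, (1, n_hidden)
            hs.append(h)
        return self.out(torch.cat(hs))           # logits, (len(x), n_tokens)

net = RNNCellNet()
loss_fn = torch.nn.CrossEntropyLoss()
optimizr = torch.optim.Adam(net.parameters(), lr=0.05)

losses = []
for epoch in range(30):                          # assumption: 30 epochs
    loss = loss_fn(net(x), y)
    optimizr.zero_grad()
    loss.backward()
    optimizr.step()
    losses.append(loss.item())

acc = (net(x).argmax(dim=1) == y).float().mean().item()
print(f'loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {acc:.3f}')
```

More epochs or a different hidden size may be needed before the predictions after ',' become reliable.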

(2) Design and train a neural network that predicts the next word using torch.nn.RNN().
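
For (2), `torch.nn.RNN` runs the time loop internally, so the whole sequence can be passed in one call. The sketch below is illustrative rather than the official answer: hidden size 8, Adam with lr=0.05, 100 epochs, and the seed are assumptions. The sequence is given an explicit batch dimension of 1, matching the default `(seq_len, batch, input_size)` layout.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

txt = (['one', ',', 'two', ',', 'three', ',', 'four', ',', 'five', ','] * 100)[:-1]
mapping = {',': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
idx = torch.tensor([mapping[tok] for tok in txt])
x = torch.nn.functional.one_hot(idx[:-1], num_classes=6).float()  # same as txt_x
y = idx[1:]                                                        # same as txt_y
x3 = x.unsqueeze(1)                        # (seq_len, batch=1, 6)

rnn = torch.nn.RNN(input_size=6, hidden_size=8)   # hidden_size=8 is an assumption
linr = torch.nn.Linear(8, 6)
loss_fn = torch.nn.CrossEntropyLoss()
optimizr = torch.optim.Adam(list(rnn.parameters()) + list(linr.parameters()),
                            lr=0.05)

def forward(x3):
    hidden, _ = rnn(x3)                    # hidden states for all steps, (998, 1, 8)
    return linr(hidden.squeeze(1))         # logits, (998, 6)

losses = []
for epoch in range(100):                   # assumption: 100 epochs
    loss = loss_fn(forward(x3), y)
    optimizr.zero_grad()
    loss.backward()
    optimizr.step()
    losses.append(loss.item())

acc = (forward(x3).argmax(dim=1) == y).float().mean().item()
print(f'loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {acc:.3f}')
```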

(3) Design and train a neural network that predicts the next word using torch.nn.LSTMCell().
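
For (3), the structure mirrors the `RNNCell` version, except that `LSTMCell` carries two states, the hidden state `h` and the cell state `c`. Again this is a sketch under assumptions (hidden size 8, Adam with lr=0.05, 30 epochs, fixed seed, hypothetical class name `LSTMCellNet`), not the author's solution.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

txt = (['one', ',', 'two', ',', 'three', ',', 'four', ',', 'five', ','] * 100)[:-1]
mapping = {',': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
idx = torch.tensor([mapping[tok] for tok in txt])
x = torch.nn.functional.one_hot(idx[:-1], num_classes=6).float()  # same as txt_x
y = idx[1:]                                                        # same as txt_y

class LSTMCellNet(torch.nn.Module):
    def __init__(self, n_tokens=6, n_hidden=8):  # n_hidden=8 is an assumption
        super().__init__()
        self.n_hidden = n_hidden
        self.cell = torch.nn.LSTMCell(n_tokens, n_hidden)
        self.out = torch.nn.Linear(n_hidden, n_tokens)

    def forward(self, x):
        h = torch.zeros(1, self.n_hidden)        # initial hidden state
        c = torch.zeros(1, self.n_hidden)        # initial cell state
        hs = []
        for xt in x:                             # step through the sequence
            h, c = self.cell(xt.unsqueeze(0), (h, c))
            hs.append(h)
        return self.out(torch.cat(hs))           # logits, (len(x), n_tokens)

net = LSTMCellNet()
loss_fn = torch.nn.CrossEntropyLoss()
optimizr = torch.optim.Adam(net.parameters(), lr=0.05)

losses = []
for epoch in range(30):                          # assumption: 30 epochs
    loss = loss_fn(net(x), y)
    optimizr.zero_grad()
    loss.backward()
    optimizr.step()
    losses.append(loss.item())

acc = (net(x).argmax(dim=1) == y).float().mean().item()
print(f'loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {acc:.3f}')
```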

(4) Design and train a neural network that predicts the next word using torch.nn.LSTM().
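
For (4), `torch.nn.LSTM` handles both the time loop and the two states internally and returns the per-step hidden states together with the final `(h, c)` pair. The following is a hedged sketch with assumed settings (hidden size 8, Adam with lr=0.05, 100 epochs, fixed seed), not the official solution.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

txt = (['one', ',', 'two', ',', 'three', ',', 'four', ',', 'five', ','] * 100)[:-1]
mapping = {',': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
idx = torch.tensor([mapping[tok] for tok in txt])
x = torch.nn.functional.one_hot(idx[:-1], num_classes=6).float()  # same as txt_x
y = idx[1:]                                                        # same as txt_y
x3 = x.unsqueeze(1)                        # (seq_len, batch=1, 6)

lstm = torch.nn.LSTM(input_size=6, hidden_size=8)  # hidden_size=8 is an assumption
linr = torch.nn.Linear(8, 6)
loss_fn = torch.nn.CrossEntropyLoss()
optimizr = torch.optim.Adam(list(lstm.parameters()) + list(linr.parameters()),
                            lr=0.05)

def forward(x3):
    hidden, (h_last, c_last) = lstm(x3)    # hidden: (998, 1, 8)
    return linr(hidden.squeeze(1))         # logits, (998, 6)

losses = []
for epoch in range(100):                   # assumption: 100 epochs
    loss = loss_fn(forward(x3), y)
    optimizr.zero_grad()
    loss.backward()
    optimizr.step()
    losses.append(loss.item())

acc = (forward(x3).argmax(dim=1) == y).float().mean().item()
print(f'loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {acc:.3f}')
```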

Note: see the solution to problem 1 at https://guebin.github.io/DL2022/posts/2022-11-29-13wk-2-final.html

ref

https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification