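Lecture videos

- (1/9): Announcements

- (2/9): Plotting with axes (1)

- (3/9): Plotting with axes (2)

- (4/9): Plotting with axes (3)

- (5/9): Plotting with axes (4)

- (6/9): Plotting with axes (5)

- (7/9): Setting the title

- (8/9): Setting axis limits; independence and the correlation coefficient (1)

- (9/9): Independence and the correlation coefficient (2)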

Drawing figures with matplotlib (the really hard way)

Example 1: Plotting with axes

- Goal: draw the figure below without using plt.plot().

import matplotlib.pyplot as plt 
plt.plot([1,2,3],'or')

[<matplotlib.lines.Line2D at 0x7ff0fbe36af0>]

- Structure: axis $\subset$ axes $\subset$ figure

https://matplotlib.org/stable/gallery/showcase/anatomy.html#sphx-glr-gallery-showcase-anatomy-py

- Strategy: create a figure (prepare the canvas) $\to$ create an axes (make a rectangular frame) $\to$ draw on the axes (using .plot()).

- First, create a figure object.

fig = plt.figure() # prepare the canvas

<Figure size 432x288 with 0 Axes>

fig # check the current state of the canvas

<Figure size 432x288 with 0 Axes>

Displaying the figure object shows nothing, because it does not contain anything yet.

- fig.add_axes(): adds an axes to fig.
- fig.axes: the list of axes currently in fig.

fig.axes # check the current frames

[]

fig.add_axes([0,0,1,1]) # create a frame at position (0,0) on the canvas with width 1 and height 1

<Axes:>

fig.axes # check the current frames --> there is now one frame

[<Axes:>]

fig # check the canvas --> the (single) frame has been added

axs1=fig.axes[0] ## the first axes

axs1.plot([1,2,3],'or') # access the first axes and draw on it

[<matplotlib.lines.Line2D at 0x7ff0fc072070>]

fig # check the canvas --> the plot has been drawn
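
Collected into a single script, the steps of Example 1 look roughly like the sketch below. The Agg backend is an assumption made so that the script runs outside a notebook; in Jupyter, evaluating fig displays the result instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()                 # prepare the canvas
ax = fig.add_axes([0, 0, 1, 1])    # add a frame: [left, bottom, width, height]
lines = ax.plot([1, 2, 3], 'or')   # draw red circles on that frame

print(fig.axes)                    # the canvas now holds exactly one axes
```

No call to plt.plot() appears anywhere: the figure, the axes, and the line are each created through the object that owns them.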

Example 2: Subplots with axes (method 1)

- Goal: draw subplots.

fig # display the current canvas

- Add an axes.

fig.add_axes([1,0,1,1])

<Axes:>

fig.axes

[<Axes:>, <Axes:>]

fig

axs2=fig.axes[1] ## the second axes

- Draw on the second axes.

axs2.plot([1,2,3],'ok') ## draw on the second axes

[<matplotlib.lines.Line2D at 0x7ff0fc0846d0>]

fig ## check the canvas

- Add a plot to the first axes.

axs1.plot([1,2,3],'--') ### add a dashed line to axes 1

[<matplotlib.lines.Line2D at 0x7ff0fc084fa0>]

fig ## check the canvas
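
The list passed to fig.add_axes() is [left, bottom, width, height], measured as fractions of the canvas, so the second frame in Example 2, [1,0,1,1], starts at the right edge of the canvas. A minimal sketch, assuming we instead want both frames to sit side by side inside the canvas (the specific fractions are illustrative choices, not values from the lecture):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure()
# each list is [left, bottom, width, height] as a fraction of the canvas
left_ax = fig.add_axes([0.05, 0.10, 0.40, 0.80])
right_ax = fig.add_axes([0.55, 0.10, 0.40, 0.80])

left_ax.plot([1, 2, 3], 'or')
left_ax.plot([1, 2, 3], '--')
right_ax.plot([1, 2, 3], 'ok')

# get_position() reports the frame back in the same fractional units
left_bounds = left_ax.get_position().bounds
right_bounds = right_ax.get_position().bounds
print(left_bounds, right_bounds)
```

Choosing these fractions by hand is tedious, which is the motivation for letting matplotlib compute the grid in the next example.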

Example 3: Subplots with axes (method 2)

- The layout in Example 2 leaves something to be desired.

- Let's draw it again.

fig = plt.figure()

<Figure size 432x288 with 0 Axes>

fig.axes

[]

fig.subplots(1,2)

array([<AxesSubplot:>, <AxesSubplot:>], dtype=object)

fig.axes

[<AxesSubplot:>, <AxesSubplot:>]

ax1,ax2 = fig.axes

ax1.plot([1,2,3],'or')
ax2.plot([1,2,3],'ob')

[<matplotlib.lines.Line2D at 0x7ff0fc072190>]

fig

The plots look a bit narrow. (Let's widen the canvas.)

fig.set_figwidth(10)

fig

ax1.plot([1,2,3],'--')

[<matplotlib.lines.Line2D at 0x7ff0fb906d60>]

fig
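
The canvas size can also be fixed when the figure is created, instead of calling set_figwidth() afterwards. A sketch, assuming a 10 x 4 inch canvas (the height of 4 is an illustrative value):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 4))   # (width, height) in inches
ax1, ax2 = fig.subplots(1, 2)       # a 1 x 2 grid comes back as two axes
ax1.plot([1, 2, 3], 'or')
ax1.plot([1, 2, 3], '--')
ax2.plot([1, 2, 3], 'ob')

print(fig.get_figwidth(), fig.get_figheight())
```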

Example 4: Drawing 2 $\times$ 2 subplots with axes

fig = plt.figure()
fig.axes

[]

<Figure size 432x288 with 0 Axes>

fig.subplots(2,2) 
fig.axes

[<AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>]

ax1,ax2,ax3,ax4=fig.axes

ax1.plot([1,2,3],'ob')
ax2.plot([1,2,3],'or')
ax3.plot([1,2,3],'ok')
ax4.plot([1,2,3],'oy')

[<matplotlib.lines.Line2D at 0x7ff0fb177430>]

fig

Example 5: 2 $\times$ 2 subplots with plt.subplots() (review)

x=[1,2,3,4]
y=[1,2,4,3]
_, axs = plt.subplots(2,2) 
axs[0,0].plot(x,y,'o:r') 
axs[0,1].plot(x,y,'Xb') 
axs[1,0].plot(x,y,'xm') 
axs[1,1].plot(x,y,'.--k')

[<matplotlib.lines.Line2D at 0x7ff0fac18dc0>]

- When you want to run the code step by step

x=[1,2,3,4]
y=[1,2,4,3]

_, axs = plt.subplots(2,2)

axs[0,0].plot(x,y,'o:r') 
axs[0,1].plot(x,y,'Xb') 
axs[1,0].plot(x,y,'xm') 
axs[1,1].plot(x,y,'.--k')

[<matplotlib.lines.Line2D at 0x7ff0faaa0940>]

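Hmm, how do we display the figure?

_

Evaluating _ displays it, because the figure object was bound to _.

- When drawing step by step, it is more readable to bind the figure object explicitly to a variable named fig.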

x=[1,2,3,4]
y=[1,2,4,3]

fig, axs = plt.subplots(2,2)

axs[0,0].plot(x,y,'o:r') 
axs[0,1].plot(x,y,'Xb') 
axs[1,0].plot(x,y,'xm') 
axs[1,1].plot(x,y,'.--k')

[<matplotlib.lines.Line2D at 0x7ff0fa935160>]

fig # check the current canvas

Example 6: Drawing 2 $\times$ 2 subplots with plt.subplots() -- storing each axes in its own variable

x=[1,2,3,4]
y=[1,2,4,3]
fig, axs = plt.subplots(2,2)

ax1,ax2,ax3,ax4 =axs

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_1794991/648347195.py in <module>
----> 1 ax1,ax2,ax3,ax4 =axs

ValueError: not enough values to unpack (expected 4, got 2)

(ax1,ax2), (ax3,ax4) = axs

ax1.plot(x,y,'o:r') 
ax2.plot(x,y,'Xb') 
ax3.plot(x,y,'xm') 
ax4.plot(x,y,'.--k')

[<matplotlib.lines.Line2D at 0x7ff0fb3650d0>]

fig
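
The ValueError above arises because axs is a 2 x 2 array, so iterating over it yields two rows rather than four axes. Besides the nested unpacking used above, one alternative (a sketch, not part of the lecture) is to flatten the array first with NumPy's ravel():

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 2, 4, 3]
fig, axs = plt.subplots(2, 2)

# axs has shape (2, 2); ravel() flattens it row by row,
# so a flat four-way unpacking works
ax1, ax2, ax3, ax4 = axs.ravel()

# the flattened array is also convenient for looping
for ax, fmt in zip(axs.ravel(), ['o:r', 'Xb', 'xm', '.--k']):
    ax.plot(x, y, fmt)

print(axs.shape)
```

The flattened order is row-major: ax1 and ax2 are the top row, ax3 and ax4 the bottom row, matching the nested unpacking.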

Example 7: Drawing 2 $\times$ 2 subplots with plt.subplots() -- accessing the axes through fig.axes!

fig, _ = plt.subplots(2,2)

fig.axes

[<AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>]

ax1, ax2, ax3, ax4= fig.axes

ax1.plot(x,y,'o:r') 
ax2.plot(x,y,'Xb') 
ax3.plot(x,y,'xm') 
ax4.plot(x,y,'.--k')

[<matplotlib.lines.Line2D at 0x7ff0faddc670>]

fig

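- Compare Example 7 with Example 4: they are nearly identical.

- matplotlib lets you draw graphs the easy way, but also the hard way.

- Because controlling the objects directly is cumbersome, several shortcut versions exist.

For that reason, it is impractical to memorize subplot techniques as a list of method 1, method 2, method 3, and so on.

- Once you understand the principle, you can use the various approaches freely. (There is a lot of flexibility.)

Setting the title

Example 1: plt.plot()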

x=[1,2,3]
y=[1,2,2]

plt.plot(x,y)
plt.title('title')

Text(0.5, 1.0, 'title')

Example 2: Using axes

fig = plt.figure()
fig.subplots()

<AxesSubplot:>

ax1=fig.axes[0]

ax1.set_title('title')

Text(0.5, 1.0, 'title')

fig

- Once the syntax is clear, it is easy to see how to set the title of each subplot.

Example 3: Setting a title for each subplot

fig, ax = plt.subplots(2,2)

(ax1,ax2),(ax3,ax4) =ax

ax1.set_title('title1')
ax2.set_title('title2')
ax3.set_title('title3')
ax4.set_title('title4')

Text(0.5, 1.0, 'title4')

fig

- This looks cluttered $\to$ rearrange the subplot layout.

fig.tight_layout() # memorize this one

Example 4: Axes titles + figure title

fig.suptitle('sup title')

Text(0.5, 0.98, 'sup title')

fig

fig.tight_layout()

fig
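
Put together, the title-related calls from Examples 3 and 4 can be sketched as one script. The loop over the flattened axes array is a convenience introduced here, not something shown in the lecture:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 2)
for i, ax in enumerate(axs.ravel(), start=1):
    ax.set_title(f'title{i}')      # title of each individual axes

sup = fig.suptitle('sup title')    # title of the whole figure
fig.tight_layout()                 # re-space the subplots so the text does not collide

print([ax.get_title() for ax in fig.axes], sup.get_text())
```

Calling tight_layout() after suptitle() lets the layout pass take the figure title into account as well.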

Setting axis limits

Example 1

x=[1,2,3]
y=[4,5,6]

plt.plot(x,y,'o')

[<matplotlib.lines.Line2D at 0x7ff0fa3e0f40>]

plt.plot(x,y,'o')
plt.xlim(-1,5)
plt.ylim(3,7)

(3.0, 7.0)

Example 2

fig = plt.figure()
fig.subplots()

<AxesSubplot:>

ax1=fig.axes[0]

import numpy as np

ax1.plot(np.random.normal(size=100),'o')

[<matplotlib.lines.Line2D at 0x7ff0fa2b4d60>]

fig

ax1.set_xlim(-10,110)
ax1.set_ylim(-5,5)

(-5.0, 5.0)

fig
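
To see what set_xlim() and set_ylim() change, it helps to read the limits back before and after. A small sketch along the lines of Example 2 (the seed is reused from the statistics examples only to make the run repeatable):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(43052)
fig, ax = plt.subplots()
ax.plot(np.random.normal(size=100), 'o')

auto_xlim = ax.get_xlim()   # limits matplotlib chose from the data
ax.set_xlim(-10, 110)       # override the x range
ax.set_ylim(-5, 5)          # override the y range

print(auto_xlim, ax.get_xlim(), ax.get_ylim())
```

The automatic x limits hug the data (indices 0 to 99 plus a small margin), while the explicit limits add more room on both sides.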

Statistics examples

- Scatter plots and sample correlation coefficients in various cases

Example 1

np.random.seed(43052)
x1=np.linspace(-1,1,100,endpoint=True)
y1=x1**2+np.random.normal(scale=0.1,size=100)

plt.plot(x1,y1,'o')
plt.title('y=x**2')

Text(0.5, 1.0, 'y=x**2')

np.corrcoef(x1,y1)

array([[1.        , 0.00688718],
       [0.00688718, 1.        ]])

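- A (sample) correlation coefficient close to 0 means that the linear relationship between the two variables is weak; it does not mean that there is no functional relationship between them.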
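
The same point can be checked without any noise. In the sketch below (the symmetric grid and the line 2x+1 are illustrative assumptions), y is an exact function of x in both cases, yet only the linear one has a correlation near 1:

```python
import numpy as np

x = np.linspace(-1, 1, 101)   # grid symmetric about 0 (assumed for illustration)

y_quad = x**2                 # exact functional relationship, but not linear
y_line = 2*x + 1              # exact linear relationship

r_quad = np.corrcoef(x, y_quad)[0, 1]
r_line = np.corrcoef(x, y_line)[0, 1]
print(r_quad, r_line)
```

Because the grid is symmetric, the contributions from negative and positive x cancel in the covariance of x and x**2, so r_quad is zero up to rounding error, while r_line equals 1 up to rounding error.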

Example 2

- Consider the following data.

np.random.seed(43052)
x2=np.random.uniform(low=-1,high=1,size=100000)
y2=np.random.uniform(low=-1,high=1,size=100000)

plt.plot(x2,y2,'.')
plt.title('rect')

Text(0.5, 1.0, 'rect')

np.corrcoef(x2,y2)

array([[1.        , 0.00521001],
       [0.00521001, 1.        ]])

Example 3

np.random.seed(43052)
_x3=np.random.uniform(low=-1,high=1,size=100000)
_y3=np.random.uniform(low=-1,high=1,size=100000)

plt.plot(_x3,_y3,'.')

[<matplotlib.lines.Line2D at 0x7ff0fadd4760>]

radius = _x3**2+_y3**2 # squared distance from the origin

x3=_x3[radius<1]
y3=_y3[radius<1]
plt.plot(_x3,_y3,'.')
plt.plot(x3,y3,'.')

[<matplotlib.lines.Line2D at 0x7ff0faf53730>]

plt.plot(x3,y3,'.')
plt.title('circ')

Text(0.5, 1.0, 'circ')

np.corrcoef(x3,y3)

array([[ 1.        , -0.00362687],
       [-0.00362687,  1.        ]])

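Homework 1

- Draw Examples 1, 2, and 3 as subplots in a single figure (arranged like a 1 $\times$ 3 matrix).

Independence of two variables, as seen through Examples 2~3

- For Examples 2 and 3, consider the following procedure.

(1) Think about the distribution of $Y$ when $X\in [-h,h]$, and draw its histogram.

(2) Think about the distribution of $Y$ when $X\in [0.9-h,0.9+h]$, and draw its histogram.

(3) Compare (1) and (2).

- Let's look at this with plots.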

h=0.05
plt.hist(y2[(x2> -h )*(x2< h )])

(array([508., 527., 450., 512., 500., 521., 500., 515., 494., 506.]),
 array([-9.99973293e-01, -7.99983163e-01, -5.99993034e-01, -4.00002904e-01,
        -2.00012774e-01, -2.26437887e-05,  1.99967486e-01,  3.99957616e-01,
         5.99947746e-01,  7.99937876e-01,  9.99928006e-01]),
 <BarContainer object of 10 artists>)

h=0.05
_,axs= plt.subplots(2,2) 
axs[0,0].hist(y2[(x2> -h )*(x2< h )])
axs[0,1].hist(y2[(x2> 0.9-h )*(x2< 0.9+h )])
axs[1,0].hist(y3[(x3> -h )*(x3< h )])
axs[1,1].hist(y3[(x3> 0.9-h )*(x3< 0.9+h )])

(array([105., 194., 256., 259., 262., 270., 244., 245., 188.,  64.]),
 array([-0.5171188 , -0.41349885, -0.30987891, -0.20625896, -0.10263902,
         0.00098093,  0.10460087,  0.20822082,  0.31184076,  0.41546071,
         0.51908066]),
 <BarContainer object of 10 artists>)

- Let's adjust the axis ranges.

h=0.05
_,axs= plt.subplots(2,2) 
axs[0,0].hist(y2[(x2> -h )*(x2< h )])
axs[0,0].set_xlim(-1.1,1.1)
axs[0,1].hist(y2[(x2> 0.9-h )*(x2< 0.9+h )])
axs[0,1].set_xlim(-1.1,1.1)
axs[1,0].hist(y3[(x3> -h )*(x3< h )])
axs[1,0].set_xlim(-1.1,1.1)
axs[1,1].hist(y3[(x3> 0.9-h )*(x3< 0.9+h )])
axs[1,1].set_xlim(-1.1,1.1)

(-1.1, 1.1)
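
The comparison in steps (1)-(3) can also be made numerically. The sketch below regenerates the square and disc data and prints the range of $Y$ in each slice of $X$; the helper y_given_x and the variable names are hypothetical, introduced only for this illustration, and & is used in place of * to combine the two conditions.

```python
import numpy as np

np.random.seed(43052)
_x = np.random.uniform(low=-1, high=1, size=100000)
_y = np.random.uniform(low=-1, high=1, size=100000)

inside = _x**2 + _y**2 < 1             # points inside the unit disc
x_sq, y_sq = _x, _y                    # square data, as in Example 2
x_ci, y_ci = _x[inside], _y[inside]    # disc data, as in Example 3

def y_given_x(x, y, center, h):
    """Hypothetical helper: values of y whose x lies in (center-h, center+h)."""
    return y[(x > center - h) & (x < center + h)]

h = 0.05
spread = {}
for name, (x, y) in {'rect': (x_sq, y_sq), 'circ': (x_ci, y_ci)}.items():
    for center in (0.0, 0.9):
        ys = y_given_x(x, y, center, h)
        spread[name, center] = (ys.min(), ys.max())
        print(name, center, spread[name, center])
```

For the square, the range of $Y$ is about the same in both slices. For the disc, the slice near 0.9 is much narrower, so the distribution of $Y$ depends on $X$ even though the sample correlation is close to 0.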

Example 4

np.random.seed(43052)
x4=np.random.normal(size=10000)
y4=np.random.normal(size=10000)

plt.plot(x4,y4,'o')

[<matplotlib.lines.Line2D at 0x7ff0f36d0b20>]

plt.plot(x4,y4,'.')

[<matplotlib.lines.Line2D at 0x7ff0f3649400>]

- From a design standpoint, this is not a sound visualization. (The plot distorts the density.)

- A plot like the one below is better. (It uses transparency to convey density.)

plt.scatter(x4,y4,alpha=0.01)

<matplotlib.collections.PathCollection at 0x7ff0f352d610>

np.corrcoef(x4,y4)

array([[ 1.        , -0.01007718],
       [-0.01007718,  1.        ]])

h=0.05
fig, _ = plt.subplots(3,3)

fig.tight_layout()

fig

fig.set_figwidth(10)
fig.set_figheight(10)
fig

fig.axes

[<AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>,
 <AxesSubplot:>]

k=np.linspace(-2,2,9)
k

array([-2. , -1.5, -1. , -0.5,  0. ,  0.5,  1. ,  1.5,  2. ])

h

0.05

h=0.2
for i in range(9):
    fig.axes[i].hist(y4[(x4>k[i]-h) * (x4<k[i]+h)])

fig
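
The nine histograms can be complemented with a numerical summary. A sketch, assuming the same seed and window width as above, that prints the count, mean, and standard deviation of y4 within each slice of x4:

```python
import numpy as np

np.random.seed(43052)
x4 = np.random.normal(size=10000)
y4 = np.random.normal(size=10000)

h = 0.2
centers = np.linspace(-2, 2, 9)
summary = []
for c in centers:
    ys = y4[(x4 > c - h) & (x4 < c + h)]   # y values whose x is near c
    summary.append((c, len(ys), ys.mean(), ys.std()))
    print(f'x near {c:+.1f}: n={len(ys):4d}, mean={ys.mean():+.3f}, std={ys.std():.3f}')
```

If the two variables are independent, the mean and standard deviation should stay roughly the same across slices; only the counts change, since fewer points fall in the tails of x4.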

Homework 2

plt.scatter(x4,y4,alpha=0.01)

<matplotlib.collections.PathCollection at 0x7ff0f30f1a60>

- Redraw this plot in red. (Note: the method was not covered in class.)

plt.scatter(x4,y4,alpha=0.01,'r')

  File "/tmp/ipykernel_1794991/399356376.py", line 1
    plt.scatter(x4,y4,alpha=0.01,'r')
                                 ^
SyntaxError: positional argument follows keyword argument