
[SeSAC] 혼공 머신러닝+딥러닝 Ch8. Artificial Neural Networks for Images

08-1 Components of a Convolutional Neural Network

Convolution

  • An artificial neural network that uses one or more convolutional layers is called a convolutional neural network.

  • A convolutional filter holds the same kind of values that were traditionally used in edge-detection filters.

CNN (Convolutional Neural Network)

  • Convolutional Neural Network (CNN): a deep learning architecture specialized for processing visual data, such as image recognition and video analysis.

Key Components and Terminology

  • Filter (Kernel, Mask):
    • A small matrix used to detect a specific feature in the image
    • Produces a feature map by applying the convolution operation to the input image
    • Convolution can be applied not only to 1-D input but also to 2-D input
  • Feature Map:
    • The result of applying a filter to the image; it represents the specific feature detected in the image
  • Stride:
    • The step size by which the filter moves across the input

How a feature map is created as the input passes through a filter

  • Pooling (or Subsampling):
    • A process that downsamples the feature map, reducing its dimensions
    • Reduces the amount of computation and makes the model less sensitive to small spatial shifts
  • Max Pooling:
    • A pooling method that keeps the largest value within each region of the feature map
  • Mean Pooling (or Average Pooling):
    • A pooling method that takes the average of all values within each region of the feature map (a small NumPy sketch of convolution and pooling follows after this list)
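
A minimal NumPy sketch (not from the book; the array values are made up) of how a single 3×3 filter slides over a 2-D input with stride 1 to build a feature map, and how 2×2 max pooling then downsamples it:

import numpy as np

x = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
w = np.array([[-1., 0., 1.],                   # toy 3x3 edge-like filter
              [-1., 0., 1.],
              [-1., 0., 1.]])

# "valid" convolution (no padding), stride 1 -> 4x4 feature map
feature_map = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        feature_map[i, j] = np.sum(x[i:i+3, j:j+3] * w)

# 2x2 max pooling with stride 2 -> 2x2 output
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))

print(feature_map.shape, pooled.shape)   # (4, 4) (2, 2)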

How a CNN Works

  1. Input layer:
    • The original image is fed into the network
  2. Convolutional layer:
    • Convolution is applied to the input image using filters (kernels)
    • Each filter detects a different kind of feature in the image (e.g., edges, texture)
    • This process produces multiple feature maps
  3. Pooling layer:
    • Pooling is applied to reduce the size of the generated feature maps
    • This lowers the amount of computation and helps prevent overfitting
  4. Fully connected layer:
    • Finally, all feature maps pass through fully connected layers to produce the final classification result
  5. Output layer:
    • Provides the network's final output; for a classification problem, a probability for each class (a compact Keras sketch of these five steps follows after this list)
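
A compact sketch (with assumed layer sizes, not the book's exact model) mapping the five steps above onto Keras layers; the full Fashion MNIST model is built step by step in section 08-2 below:

from tensorflow import keras

sketch = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),                 # 1. input layer
    keras.layers.Conv2D(16, kernel_size=3, padding='same',
                        activation='relu'),                 # 2. convolutional layer
    keras.layers.MaxPooling2D(2),                           # 3. pooling layer
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation='relu'),              # 4. fully connected layer
    keras.layers.Dense(10, activation='softmax'),           # 5. output layer (class probabilities)
])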

08-2 Image Classification with a Convolutional Neural Network

# Set a random seed for Keras and make TensorFlow operations deterministic so every run gives the same result.
import tensorflow as tf

tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()

Loading the Fashion MNIST Data

from tensorflow import keras
from sklearn.model_selection import train_test_split

(train_input, train_target), (test_input, test_target) = \
    keras.datasets.fashion_mnist.load_data()

train_scaled = train_input.reshape(-1, 28, 28, 1) / 255.0

train_scaled, val_scaled, train_target, val_target = train_test_split(
    train_scaled, train_target, test_size=0.2, random_state=42)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 [==============================] - 1s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 [==============================] - 1s 0us/step

Building the Convolutional Neural Network

model = keras.Sequential()
model.add(keras.layers.Conv2D(32, kernel_size=3, activation='relu',
                              padding='same', input_shape=(28,28,1)))

padding='same': pads the input with zeros so that the feature map keeps the same height and width as the input instead of shrinking.
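
A quick shape check (a small sketch, not from the book) showing how padding='same' differs from the default padding='valid':

import tensorflow as tf

x = tf.zeros((1, 28, 28, 1))  # one dummy 28x28 grayscale image
same = tf.keras.layers.Conv2D(32, kernel_size=3, padding='same')(x)
valid = tf.keras.layers.Conv2D(32, kernel_size=3, padding='valid')(x)
print(same.shape, valid.shape)  # (1, 28, 28, 32) (1, 26, 26, 32)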

model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Conv2D(64, kernel_size=(3,3), activation='relu',
                              padding='same'))
model.add(keras.layers.MaxPooling2D(2))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(100, activation='relu'))
model.add(keras.layers.Dropout(0.4))
model.add(keras.layers.Dense(10, activation='softmax'))
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 28, 28, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 14, 14, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 14, 14, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 7, 7, 64)         0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 3136)              0         
                                                                 
 dense (Dense)               (None, 100)               313700    
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense_1 (Dense)             (None, 10)                1010      
                                                                 
=================================================================
Total params: 333,526
Trainable params: 333,526
Non-trainable params: 0
_________________________________________________________________
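
The parameter counts in the summary can be verified by hand: the first Conv2D layer has 3 × 3 × 1 × 32 kernel weights plus 32 biases = 320; the second has 3 × 3 × 32 × 64 + 64 = 18,496; the Dense(100) layer after Flatten has 3,136 × 100 + 100 = 313,700; and the output Dense(10) layer has 100 × 10 + 10 = 1,010. Pooling, Flatten, and Dropout layers add no trainable parameters, which gives the total of 333,526.
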
keras.utils.plot_model(model)

keras.utils.plot_model(model, show_shapes=True)

Compiling and Training the Model

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics='accuracy')

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-cnn-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=2,
                                                  restore_best_weights=True)

history = model.fit(train_scaled, train_target, epochs=20,
                    validation_data=(val_scaled, val_target),
                    callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/20
1500/1500 [==============================] - 20s 5ms/step - loss: 0.5073 - accuracy: 0.8198 - val_loss: 0.3131 - val_accuracy: 0.8833
Epoch 2/20
1500/1500 [==============================] - 6s 4ms/step - loss: 0.3361 - accuracy: 0.8782 - val_loss: 0.2760 - val_accuracy: 0.8951
Epoch 3/20
1500/1500 [==============================] - 7s 5ms/step - loss: 0.2912 - accuracy: 0.8952 - val_loss: 0.2475 - val_accuracy: 0.9087
Epoch 4/20
1500/1500 [==============================] - 6s 4ms/step - loss: 0.2575 - accuracy: 0.9064 - val_loss: 0.2417 - val_accuracy: 0.9109
Epoch 5/20
1500/1500 [==============================] - 8s 5ms/step - loss: 0.2349 - accuracy: 0.9132 - val_loss: 0.2266 - val_accuracy: 0.9166
Epoch 6/20
1500/1500 [==============================] - 7s 5ms/step - loss: 0.2155 - accuracy: 0.9215 - val_loss: 0.2178 - val_accuracy: 0.9197
Epoch 7/20
1500/1500 [==============================] - 7s 5ms/step - loss: 0.1999 - accuracy: 0.9252 - val_loss: 0.2090 - val_accuracy: 0.9232
Epoch 8/20
1500/1500 [==============================] - 8s 5ms/step - loss: 0.1820 - accuracy: 0.9319 - val_loss: 0.2211 - val_accuracy: 0.9186
Epoch 9/20
1500/1500 [==============================] - 7s 4ms/step - loss: 0.1707 - accuracy: 0.9366 - val_loss: 0.2150 - val_accuracy: 0.9244
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

model.evaluate(val_scaled, val_target)
375/375 [==============================] - 1s 2ms/step - loss: 0.2090 - accuracy: 0.9232

[0.20898833870887756, 0.9231666922569275]
plt.imshow(val_scaled[0].reshape(28, 28), cmap='gray_r')
plt.show()

preds = model.predict(val_scaled[0:1])
print(preds)
1/1 [==============================] - 0s 118ms/step
[[4.0748841e-15 2.2528982e-23 1.1966817e-18 3.3831836e-18 3.4746105e-17
  1.5839887e-14 1.3949707e-13 3.1241570e-14 1.0000000e+00 7.2195376e-16]]
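
The ninth value (class index 8) is essentially 1.0, so the model is almost certain about this sample; the bar chart and the argmax lookup below make this explicit.
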
plt.bar(range(1, 11), preds[0])
plt.xlabel('class')
plt.ylabel('prob.')
plt.show()

classes = ['티셔츠', '바지', '스웨터', '드레스', '코트',
           '샌달', '셔츠', '스니커즈', '가방', '앵클 부츠']
import numpy as np
print(classes[np.argmax(preds)])
가방
test_scaled = test_input.reshape(-1, 28, 28, 1) / 255.0
model.evaluate(test_scaled, test_target)
313/313 [==============================] - 1s 3ms/step - loss: 0.2338 - accuracy: 0.9157

[0.23381413519382477, 0.9157000184059143]

Vanilla CNN: the most basic form of a CNN model

The number of nodes should be greater than the number of input values

keras.layers.Conv2D

parameters (a short usage sketch follows after this list):

  • filters: int, the dimension of the output space (the number of filters in the convolution).
  • kernel_size: int or tuple/list of 2 integers, specifying the size of the convolution window.
  • strides: int or tuple/list of 2 integers, specifying the stride length of the convolution. strides > 1 is incompatible with dilation_rate > 1.
  • padding: string, either "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.
  • data_format: string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch_size, height, width, channels) while "channels_first" corresponds to inputs with shape (batch_size, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
  • dilation_rate: int or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution.
  • groups: A positive int specifying the number of groups in which the input is split along the channel axis. Each group is convolved separately with filters // groups filters. The output is the concatenation of all the groups results along the channel axis. Input channels and filters must both be divisible by groups.
  • activation: Activation function. If None, no activation is applied.
  • use_bias: bool, if True, bias will be added to the output.
  • kernel_initializer: Initializer for the convolution kernel. If None, the default initializer ("glorot_uniform") will be used.
  • bias_initializer: Initializer for the bias vector. If None, the default initializer ("zeros") will be used.
  • kernel_regularizer: Optional regularizer for the convolution kernel.
  • bias_regularizer: Optional regularizer for the bias vector.
  • activity_regularizer: Optional regularizer function for the output.
  • kernel_constraint: Optional projection function to be applied to the kernel after being updated by an Optimizer (e.g. used to implement norm constraints or value constraints for layer weights). The function must take as input the unprojected variable and must return the projected variable (which must have the same shape). Constraints are not safe to use when doing asynchronous distributed training.
  • bias_constraint: Optional projection function to be applied to the bias after being updated by an Optimizer.
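
A small usage sketch (made-up layer sizes, not from the book) exercising several of the parameters above; printing the output shape shows how strides and padding interact:

import tensorflow as tf

layer = tf.keras.layers.Conv2D(
    filters=8,            # 8 output channels
    kernel_size=(3, 3),   # 3x3 convolution window
    strides=2,            # move the window 2 pixels at a time
    padding='same',       # output size becomes ceil(input_size / strides)
    activation='relu',
    use_bias=True)

x = tf.zeros((1, 28, 28, 1))   # one dummy 28x28 single-channel image
print(layer(x).shape)          # (1, 14, 14, 8)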

08-3 Visualizing a Convolutional Neural Network

XAI: Explainable Artificial Intelligence

DNN: Deep Neural Network

  • Structured (tabular) data

CNN: Convolutional Neural Network

  • Data with spatial extent (not only images and natural language, but structured data is also possible)

RNN: Recurrent Neural Network

  • Sequential (ordered) data

Visualizing the Convolutional Neural Network

# Set a random seed for Keras and make TensorFlow operations deterministic so every run gives the same result.
import tensorflow as tf

tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()

Visualizing the Weights

from tensorflow import keras
# If you are running this in Colab, run the following command to download the best-cnn-model.h5 file.
!wget https://github.com/rickiepark/hg-mldl/raw/master/best-cnn-model.h5
--2023-07-14 08:39:47--  https://github.com/rickiepark/hg-mldl/raw/master/best-cnn-model.h5
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/rickiepark/hg-mldl/master/best-cnn-model.h5 [following]
--2023-07-14 08:39:48--  https://raw.githubusercontent.com/rickiepark/hg-mldl/master/best-cnn-model.h5
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4046712 (3.9M) [application/octet-stream]
Saving to: ‘best-cnn-model.h5’

best-cnn-model.h5   100%[===================>]   3.86M  --.-KB/s    in 0.02s   

2023-07-14 08:39:49 (222 MB/s) - ‘best-cnn-model.h5’ saved [4046712/4046712]
model = keras.models.load_model('best-cnn-model.h5')
model.layers
[<keras.layers.convolutional.conv2d.Conv2D at 0x7a0182544820>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x7a020cc2d840>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x7a0179a19060>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x7a0179a189a0>,
 <keras.layers.reshaping.flatten.Flatten at 0x7a01790ad450>,
 <keras.layers.core.dense.Dense at 0x7a01790ad390>,
 <keras.layers.regularization.dropout.Dropout at 0x7a01790ae2f0>,
 <keras.layers.core.dense.Dense at 0x7a01790aeb90>]
conv = model.layers[0]

print(conv.weights[0].shape, conv.weights[1].shape)
(3, 3, 1, 32) (32,)
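
The kernel shape (3, 3, 1, 32) reads as (kernel height, kernel width, input channels, number of filters), and (32,) is one bias per filter.
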
conv_weights = conv.weights[0].numpy()

print(conv_weights.mean(), conv_weights.std())
-0.02494116 0.24951957
import matplotlib.pyplot as plt
plt.hist(conv_weights.reshape(-1, 1))
plt.xlabel('weight')
plt.ylabel('count')
plt.show()

fig, axs = plt.subplots(2, 16, figsize=(15,2))

for i in range(2):
    for j in range(16):
        axs[i, j].imshow(conv_weights[:,:,0,i*16 + j], vmin=-0.5, vmax=0.5)
        axs[i, j].axis('off')

plt.show()

no_training_model = keras.Sequential()

no_training_model.add(keras.layers.Conv2D(32, kernel_size=3, activation='relu',
                                          padding='same', input_shape=(28,28,1)))
no_training_conv = no_training_model.layers[0]

print(no_training_conv.weights[0].shape)
(3, 3, 1, 32)
no_training_weights = no_training_conv.weights[0].numpy()

print(no_training_weights.mean(), no_training_weights.std())
-0.010310263 0.0773888
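
Compared with the trained weights above (standard deviation ≈ 0.25), these untrained weights stay much closer to zero (≈ 0.08, from the layer's default glorot_uniform initializer); this contrast is the point of the comparison, since training spreads the weights out as the filters specialize.
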
plt.hist(no_training_weights.reshape(-1, 1))
plt.xlabel('weight')
plt.ylabel('count')
plt.show()

fig, axs = plt.subplots(2, 16, figsize=(15,2))

for i in range(2):
    for j in range(16):
        axs[i, j].imshow(no_training_weights[:,:,0,i*16 + j], vmin=-0.5, vmax=0.5)
        axs[i, j].axis('off')

plt.show()

Functional API

print(model.input)
KerasTensor(type_spec=TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='conv2d_input'), name='conv2d_input', description="created by layer 'conv2d_input'")
conv_acti = keras.Model(model.input, model.layers[0].output)
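
conv_acti is a new model that reuses model's input tensor and stops at the first convolutional layer's output. A minimal, self-contained sketch of the same functional-API idea (hypothetical layer sizes, not the book's model):

from tensorflow import keras

inputs = keras.Input(shape=(28, 28, 1))
conv_out = keras.layers.Conv2D(4, kernel_size=3, padding='same', activation='relu')(inputs)
outputs = keras.layers.Dense(10, activation='softmax')(keras.layers.Flatten()(conv_out))

full_model = keras.Model(inputs, outputs)         # input -> final prediction
activation_model = keras.Model(inputs, conv_out)  # input -> first conv layer's output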

Visualizing Feature Maps

(train_input, train_target), (test_input, test_target) = keras.datasets.fashion_mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 [==============================] - 0s 1us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 [==============================] - 2s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 [==============================] - 1s 0us/step
plt.imshow(train_input[0], cmap='gray_r')
plt.show()

inputs = train_input[0:1].reshape(-1, 28, 28, 1)/255.0

feature_maps = conv_acti.predict(inputs)
1/1 [==============================] - 7s 7s/step
print(feature_maps.shape)
(1, 28, 28, 32)
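
The shape (1, 28, 28, 32) means one sample, a 28×28 spatial size (preserved by padding='same'), and 32 channels, one per filter in the first convolutional layer.
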
fig, axs = plt.subplots(4, 8, figsize=(15,8))

for i in range(4):
    for j in range(8):
        axs[i, j].imshow(feature_maps[0,:,:,i*8 + j])
        axs[i, j].axis('off')

plt.show()

conv2_acti = keras.Model(model.input, model.layers[2].output)
feature_maps = conv2_acti.predict(train_input[0:1].reshape(-1, 28, 28, 1)/255.0)
1/1 [==============================] - 0s 99ms/step
print(feature_maps.shape)
(1, 14, 14, 64)
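
Here the spatial size is 14×14 because the input passed through the first max pooling layer before reaching the second convolution, and there are 64 channels, one per filter in that layer.
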
fig, axs = plt.subplots(8, 8, figsize=(12,12))

for i in range(8):
    for j in range(8):
        axs[i, j].imshow(feature_maps[0,:,:,i*8 + j])
        axs[i, j].axis('off')

plt.show()
