
[SeSAC] 혼공 머신러닝+딥러닝 Ch9. Artificial Neural Networks for Text

09-1 Sequential Data and Recurrent Neural Networks

Vector: adding direction to a scalar (which has only magnitude) lets values of the same magnitude be represented as different quantities.

Recurrent neural network (RNN)

Weighted sum: the quantity a perceptron computes.

N:1 structure

  • Many inputs, one output

Autoencoder: encode & decode

  • Runs N:1 (encode), then expands 1:M (decode)
  • N and M do not have to be equal

image embedding

word embedding

  • Word importance: frequency (words that appear evenly in both positive and negative reviews are removed)
    • Important words get higher weights
  • Word relatedness: co-occurrence in the same sentence
    • Related words are placed near each other in the embedding space, as in the sketch below
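A minimal sketch of what "near each other" means, using made-up 3-dimensional embedding vectors (the words and numbers here are illustrative assumptions, not learned values):

import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors:
    # close to 1 means the vectors point in similar directions.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embeddings, for illustration only.
king = np.array([0.9, 0.8, 0.1])
queen = np.array([0.85, 0.75, 0.2])
apple = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(king, queen))  # high: related words
print(cosine_similarity(king, apple))  # low: unrelated words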

09-2 Classifying IMDB Reviews with a Recurrent Neural Network

# Set a random seed for Keras and make TensorFlow operations deterministic so every run gives the same results.
import tensorflow as tf

tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()

The IMDB Review Dataset

from tensorflow.keras.datasets import imdb

(train_input, train_target), (test_input, test_target) = imdb.load_data(
    num_words=300)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 [==============================] - 0s 0us/step

num_words=300: this does not mean sentences are 300 words long; it means only the 300 most frequent words of the vocabulary are used.
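As a sketch of what those integers encode: in Keras's IMDB encoding, 0 is padding, 1 marks the start of a review, 2 is the out-of-vocabulary token (every word outside the top 300 here), and real word indices are offset by 3:

word_index = imdb.get_word_index()
index_to_word = {i + 3: w for w, i in word_index.items()}
index_to_word.update({0: '<pad>', 1: '<start>', 2: '<oov>'})

# Decode the first training review; most words show up as <oov>
# because num_words=300 keeps only the 300 most frequent words.
print(' '.join(index_to_word[i] for i in train_input[0][:15]))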

print(train_input.shape, test_input.shape)
(25000,) (25000,)
print(len(train_input[0]))
218
print(len(train_input[1]))
189
print(train_input[0])
[1, 14, 22, 16, 43, 2, 2, 2, 2, 65, 2, 2, 66, 2, 4, 173, 36, 256, 5, 25, 100, 43, 2, 112, 50, 2, 2, 9, 35, 2, 284, 5, 150, 4, 172, 112, 167, 2, 2, 2, 39, 4, 172, 2, 2, 17, 2, 38, 13, 2, 4, 192, 50, 16, 6, 147, 2, 19, 14, 22, 4, 2, 2, 2, 4, 22, 71, 87, 12, 16, 43, 2, 38, 76, 15, 13, 2, 4, 22, 17, 2, 17, 12, 16, 2, 18, 2, 5, 62, 2, 12, 8, 2, 8, 106, 5, 4, 2, 2, 16, 2, 66, 2, 33, 4, 130, 12, 16, 38, 2, 5, 25, 124, 51, 36, 135, 48, 25, 2, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 2, 16, 82, 2, 8, 4, 107, 117, 2, 15, 256, 4, 2, 7, 2, 5, 2, 36, 71, 43, 2, 2, 26, 2, 2, 46, 7, 4, 2, 2, 13, 104, 88, 4, 2, 15, 297, 98, 32, 2, 56, 26, 141, 6, 194, 2, 18, 4, 226, 22, 21, 134, 2, 26, 2, 5, 144, 30, 2, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 2, 88, 12, 16, 283, 5, 16, 2, 113, 103, 32, 15, 16, 2, 19, 178, 32]
print(train_target[:20])
[1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 1]
from sklearn.model_selection import train_test_split

train_input, val_input, train_target, val_target = train_test_split(
    train_input, train_target, test_size=0.2, random_state=42)
import numpy as np

lengths = np.array([len(x) for x in train_input])
print(np.mean(lengths), np.median(lengths))
239.00925 178.0
import matplotlib.pyplot as plt

plt.hist(lengths)
plt.xlabel('length')
plt.ylabel('frequency')
plt.show()

The lengths follow a long-tail (Pareto-style) distribution.

from tensorflow.keras.preprocessing.sequence import pad_sequences

train_seq = pad_sequences(train_input, maxlen=100)

Reviews longer than 100 words are truncated, and reviews shorter than 100 words are padded with 0s.

print(train_seq.shape)
(20000, 100)
print(train_seq[0])
[ 10   4  20   9   2   2   2   5  45   6   2   2  33 269   8   2 142   2
   5   2  17  73  17 204   5   2  19  55   2   2  92  66 104  14  20  93
  76   2 151  33   4  58  12 188   2 151  12 215  69 224 142  73 237   6
   2   7   2   2 188   2 103  14  31  10  10   2   7   2   5   2  80  91
   2  30   2  34  14  20 151  50  26 131  49   2  84  46  50  37  80  79
   6   2  46   7  14  20  10  10   2 158]
print(train_input[0][-10:])
[6, 2, 46, 7, 14, 20, 10, 10, 2, 158]

pad_sequences truncates the front of a sequence by default, not the back.
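A quick check of both defaults on a toy input; padding='post' and truncating='post' are the standard arguments for changing either behavior:

toy = [[1, 2, 3, 4, 5], [6, 7]]
print(pad_sequences(toy, maxlen=3))
# [[3 4 5]   <- truncated from the front (truncating='pre' is the default)
#  [0 6 7]]  <- padded at the front (padding='pre' is the default)
print(pad_sequences(toy, maxlen=3, truncating='post'))
# [[1 2 3]   <- now truncated from the back
#  [0 6 7]]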

print(train_seq[5])
[  0   0   0   0   1   2 195  19  49   2   2 190   4   2   2   2 183  10
  10  13  82  79   4   2  36  71 269   8   2  25  19  49   7   4   2   2
   2   2   2  10  10  48  25  40   2  11   2   2  40   2   2   5   4   2
   2  95  14 238  56 129   2  10  10  21   2  94   2   2   2   2  11 190
  24   2   2   7  94 205   2  10  10  87   2  34  49   2   7   2   2   2
   2   2 290   2  46  48  64  18   4   2]
val_seq = pad_sequences(val_input, maxlen=100)

Building a Recurrent Neural Network

from tensorflow import keras

model = keras.Sequential()

model.add(keras.layers.SimpleRNN(8, input_shape=(100, 300))) # 100: time steps (words per review), 300: one-hot vector length (vocabulary size)
model.add(keras.layers.Dense(1, activation='sigmoid'))
train_oh = keras.utils.to_categorical(train_seq)

to_categorical performs one-hot encoding.
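A tiny demonstration on a toy array; each integer becomes a vector with a single 1 at that index:

print(keras.utils.to_categorical([0, 2, 1]))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]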

print(train_oh.shape)
(20000, 100, 300)
print(train_oh[0][0][:12])
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
print(np.sum(train_oh[0][0]))
1.0
val_oh = keras.utils.to_categorical(val_seq)
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_rnn (SimpleRNN)      (None, 8)                 2472      
                                                                 
 dense (Dense)               (None, 1)                 9         
                                                                 
=================================================================
Total params: 2,481
Trainable params: 2,481
Non-trainable params: 0
_________________________________________________________________
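The parameter counts can be verified by hand. The SimpleRNN layer has input weights, recurrent weights, and biases; the Dense layer has one weight per hidden unit plus a bias:

# SimpleRNN: input weights (300*8) + recurrent weights (8*8) + biases (8)
print(300 * 8 + 8 * 8 + 8)  # 2472
# Dense: 8 weights + 1 bias
print(8 * 1 + 1)            # 9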

Training the Recurrent Neural Network

rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model.compile(optimizer=rmsprop, loss='binary_crossentropy',
              metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-simplernn-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model.fit(train_oh, train_target, epochs=100, batch_size=64,
                    validation_data=(val_oh, val_target),
                    callbacks=[checkpoint_cb, early_stopping_cb])

Why batch_size tends to be larger in NLP than with images: text is highly varied, which makes individual batches noisier, so larger batches such as 128 or 256 are often used.

Epoch 1/100
313/313 [==============================] - 31s 81ms/step - loss: 0.7003 - accuracy: 0.5002 - val_loss: 0.6970 - val_accuracy: 0.5058
Epoch 2/100
313/313 [==============================] - 24s 78ms/step - loss: 0.6956 - accuracy: 0.5123 - val_loss: 0.6946 - val_accuracy: 0.5124
Epoch 3/100
313/313 [==============================] - 24s 76ms/step - loss: 0.6917 - accuracy: 0.5282 - val_loss: 0.6909 - val_accuracy: 0.5318
Epoch 4/100
313/313 [==============================] - 24s 77ms/step - loss: 0.6844 - accuracy: 0.5549 - val_loss: 0.6833 - val_accuracy: 0.5690
Epoch 5/100
313/313 [==============================] - 24s 76ms/step - loss: 0.6784 - accuracy: 0.5778 - val_loss: 0.6797 - val_accuracy: 0.5770
Epoch 6/100
313/313 [==============================] - 24s 75ms/step - loss: 0.6733 - accuracy: 0.5892 - val_loss: 0.6753 - val_accuracy: 0.5856
Epoch 7/100
313/313 [==============================] - 23s 75ms/step - loss: 0.6679 - accuracy: 0.6072 - val_loss: 0.6696 - val_accuracy: 0.5980
Epoch 8/100
313/313 [==============================] - 24s 76ms/step - loss: 0.6616 - accuracy: 0.6195 - val_loss: 0.6633 - val_accuracy: 0.6104
Epoch 9/100
313/313 [==============================] - 24s 76ms/step - loss: 0.6539 - accuracy: 0.6312 - val_loss: 0.6568 - val_accuracy: 0.6200
Epoch 10/100
313/313 [==============================] - 23s 75ms/step - loss: 0.6462 - accuracy: 0.6448 - val_loss: 0.6474 - val_accuracy: 0.6372
Epoch 11/100
313/313 [==============================] - 23s 72ms/step - loss: 0.6375 - accuracy: 0.6571 - val_loss: 0.6414 - val_accuracy: 0.6506
Epoch 12/100
313/313 [==============================] - 23s 74ms/step - loss: 0.6282 - accuracy: 0.6684 - val_loss: 0.6304 - val_accuracy: 0.6614
Epoch 13/100
313/313 [==============================] - 23s 75ms/step - loss: 0.6194 - accuracy: 0.6791 - val_loss: 0.6212 - val_accuracy: 0.6740
Epoch 14/100
313/313 [==============================] - 24s 76ms/step - loss: 0.6094 - accuracy: 0.6877 - val_loss: 0.6124 - val_accuracy: 0.6842
Epoch 15/100
313/313 [==============================] - 24s 77ms/step - loss: 0.5996 - accuracy: 0.7000 - val_loss: 0.6027 - val_accuracy: 0.6918
Epoch 16/100
313/313 [==============================] - 24s 75ms/step - loss: 0.5903 - accuracy: 0.7082 - val_loss: 0.5928 - val_accuracy: 0.6978
Epoch 17/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5818 - accuracy: 0.7134 - val_loss: 0.5849 - val_accuracy: 0.7026
Epoch 18/100
313/313 [==============================] - 23s 75ms/step - loss: 0.5731 - accuracy: 0.7204 - val_loss: 0.5782 - val_accuracy: 0.7106
Epoch 19/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5647 - accuracy: 0.7261 - val_loss: 0.5691 - val_accuracy: 0.7162
Epoch 20/100
313/313 [==============================] - 24s 77ms/step - loss: 0.5570 - accuracy: 0.7321 - val_loss: 0.5623 - val_accuracy: 0.7204
Epoch 21/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5510 - accuracy: 0.7337 - val_loss: 0.5561 - val_accuracy: 0.7254
Epoch 22/100
313/313 [==============================] - 24s 78ms/step - loss: 0.5445 - accuracy: 0.7391 - val_loss: 0.5517 - val_accuracy: 0.7260
Epoch 23/100
313/313 [==============================] - 25s 79ms/step - loss: 0.5382 - accuracy: 0.7420 - val_loss: 0.5477 - val_accuracy: 0.7296
Epoch 24/100
313/313 [==============================] - 24s 77ms/step - loss: 0.5336 - accuracy: 0.7470 - val_loss: 0.5423 - val_accuracy: 0.7320
Epoch 25/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5300 - accuracy: 0.7472 - val_loss: 0.5383 - val_accuracy: 0.7340
Epoch 26/100
313/313 [==============================] - 23s 74ms/step - loss: 0.5261 - accuracy: 0.7509 - val_loss: 0.5358 - val_accuracy: 0.7370
Epoch 27/100
313/313 [==============================] - 23s 74ms/step - loss: 0.5223 - accuracy: 0.7531 - val_loss: 0.5326 - val_accuracy: 0.7394
Epoch 28/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5194 - accuracy: 0.7552 - val_loss: 0.5321 - val_accuracy: 0.7390
Epoch 29/100
313/313 [==============================] - 24s 77ms/step - loss: 0.5160 - accuracy: 0.7574 - val_loss: 0.5453 - val_accuracy: 0.7302
Epoch 30/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5134 - accuracy: 0.7582 - val_loss: 0.5268 - val_accuracy: 0.7430
Epoch 31/100
313/313 [==============================] - 23s 75ms/step - loss: 0.5115 - accuracy: 0.7595 - val_loss: 0.5254 - val_accuracy: 0.7450
Epoch 32/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5092 - accuracy: 0.7615 - val_loss: 0.5248 - val_accuracy: 0.7450
Epoch 33/100
313/313 [==============================] - 24s 77ms/step - loss: 0.5071 - accuracy: 0.7620 - val_loss: 0.5239 - val_accuracy: 0.7424
Epoch 34/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5047 - accuracy: 0.7627 - val_loss: 0.5239 - val_accuracy: 0.7430
Epoch 35/100
313/313 [==============================] - 24s 75ms/step - loss: 0.5040 - accuracy: 0.7640 - val_loss: 0.5217 - val_accuracy: 0.7472
Epoch 36/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5023 - accuracy: 0.7645 - val_loss: 0.5195 - val_accuracy: 0.7456
Epoch 37/100
313/313 [==============================] - 24s 76ms/step - loss: 0.5008 - accuracy: 0.7658 - val_loss: 0.5200 - val_accuracy: 0.7474
Epoch 38/100
313/313 [==============================] - 24s 78ms/step - loss: 0.4994 - accuracy: 0.7647 - val_loss: 0.5198 - val_accuracy: 0.7484
Epoch 39/100
313/313 [==============================] - 25s 80ms/step - loss: 0.4982 - accuracy: 0.7648 - val_loss: 0.5206 - val_accuracy: 0.7472
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

Using Word Embeddings

model2 = keras.Sequential()

model2.add(keras.layers.Embedding(300, 16, input_length=100))
model2.add(keras.layers.SimpleRNN(8))
model2.add(keras.layers.Dense(1, activation='sigmoid'))

model2.summary()

Each word is represented by a 16-dimensional vector.

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       (None, 100, 16)           4800      
                                                                 
 simple_rnn_1 (SimpleRNN)    (None, 8)                 200       
                                                                 
 dense_1 (Dense)             (None, 1)                 9         
                                                                 
=================================================================
Total params: 5,009
Trainable params: 5,009
Non-trainable params: 0
_________________________________________________________________
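Again the counts can be checked by hand; the embedding stores one 16-dimensional vector per vocabulary entry, and the SimpleRNN now sees 16-dimensional inputs instead of 300-dimensional one-hot vectors:

# Embedding: 300 vocabulary entries * 16 dimensions each
print(300 * 16)            # 4800
# SimpleRNN: input weights (16*8) + recurrent weights (8*8) + biases (8)
print(16 * 8 + 8 * 8 + 8)  # 200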
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model2.compile(optimizer=rmsprop, loss='binary_crossentropy',
               metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-embedding-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model2.fit(train_seq, train_target, epochs=100, batch_size=64,
                     validation_data=(val_seq, val_target),
                     callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 42s 127ms/step - loss: 0.6893 - accuracy: 0.5351 - val_loss: 0.6706 - val_accuracy: 0.5872
Epoch 2/100
313/313 [==============================] - 36s 116ms/step - loss: 0.6399 - accuracy: 0.6467 - val_loss: 0.6234 - val_accuracy: 0.6664
Epoch 3/100
313/313 [==============================] - 36s 114ms/step - loss: 0.6051 - accuracy: 0.6941 - val_loss: 0.6003 - val_accuracy: 0.6948
Epoch 4/100
313/313 [==============================] - 37s 118ms/step - loss: 0.5831 - accuracy: 0.7172 - val_loss: 0.5888 - val_accuracy: 0.7026
Epoch 5/100
313/313 [==============================] - 37s 117ms/step - loss: 0.5663 - accuracy: 0.7305 - val_loss: 0.5669 - val_accuracy: 0.7300
Epoch 6/100
313/313 [==============================] - 36s 114ms/step - loss: 0.5527 - accuracy: 0.7408 - val_loss: 0.5536 - val_accuracy: 0.7356
Epoch 7/100
313/313 [==============================] - 37s 118ms/step - loss: 0.5410 - accuracy: 0.7475 - val_loss: 0.5422 - val_accuracy: 0.7404
Epoch 8/100
313/313 [==============================] - 36s 116ms/step - loss: 0.5313 - accuracy: 0.7509 - val_loss: 0.5443 - val_accuracy: 0.7352
Epoch 9/100
313/313 [==============================] - 36s 114ms/step - loss: 0.5236 - accuracy: 0.7549 - val_loss: 0.5463 - val_accuracy: 0.7238
Epoch 10/100
313/313 [==============================] - 36s 115ms/step - loss: 0.5177 - accuracy: 0.7578 - val_loss: 0.5377 - val_accuracy: 0.7382
Epoch 11/100
313/313 [==============================] - 35s 113ms/step - loss: 0.5135 - accuracy: 0.7591 - val_loss: 0.5264 - val_accuracy: 0.7454
Epoch 12/100
313/313 [==============================] - 35s 112ms/step - loss: 0.5094 - accuracy: 0.7619 - val_loss: 0.5265 - val_accuracy: 0.7422
Epoch 13/100
313/313 [==============================] - 36s 116ms/step - loss: 0.5056 - accuracy: 0.7641 - val_loss: 0.5207 - val_accuracy: 0.7468
Epoch 14/100
313/313 [==============================] - 36s 114ms/step - loss: 0.5027 - accuracy: 0.7657 - val_loss: 0.5226 - val_accuracy: 0.7434
Epoch 15/100
313/313 [==============================] - 36s 117ms/step - loss: 0.4998 - accuracy: 0.7689 - val_loss: 0.5678 - val_accuracy: 0.6992
Epoch 16/100
313/313 [==============================] - 36s 116ms/step - loss: 0.4977 - accuracy: 0.7677 - val_loss: 0.5155 - val_accuracy: 0.7500
Epoch 17/100
313/313 [==============================] - 37s 119ms/step - loss: 0.4944 - accuracy: 0.7713 - val_loss: 0.5222 - val_accuracy: 0.7450
Epoch 18/100
313/313 [==============================] - 36s 115ms/step - loss: 0.4927 - accuracy: 0.7712 - val_loss: 0.5149 - val_accuracy: 0.7486
Epoch 19/100
313/313 [==============================] - 36s 114ms/step - loss: 0.4905 - accuracy: 0.7730 - val_loss: 0.5138 - val_accuracy: 0.7464
Epoch 20/100
313/313 [==============================] - 37s 117ms/step - loss: 0.4880 - accuracy: 0.7740 - val_loss: 0.5124 - val_accuracy: 0.7500
Epoch 21/100
313/313 [==============================] - 36s 115ms/step - loss: 0.4871 - accuracy: 0.7749 - val_loss: 0.5143 - val_accuracy: 0.7488
Epoch 22/100
313/313 [==============================] - 36s 114ms/step - loss: 0.4847 - accuracy: 0.7775 - val_loss: 0.5148 - val_accuracy: 0.7486
Epoch 23/100
313/313 [==============================] - 36s 116ms/step - loss: 0.4830 - accuracy: 0.7775 - val_loss: 0.5135 - val_accuracy: 0.7500
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

09-3 LSTM and GRU Cells

  • Long Short-Term Memory
  • Designed to keep short-term memories around for a long time

Hidden state:

  • Multiply the input and the previous time step's hidden state by the weights, then pass the result through an activation function to produce the next hidden state (see the formula below)
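Written out with tanh as the activation, the update is $h_{t} = \tanh(W_{x} x_{t} + W_{h} h_{t-1} + b)$, where $W_{x}$ and $W_{h}$ are the input and recurrent weight matrices and $b$ is the bias.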

Cell state:

  • A value that is not passed on to the next layer and only circulates inside the LSTM cell

SNN: Spiking Neural Network

The LSTM cell

  • One large cell containing four small internal layers (gates)

  • Forget gate (× operation)

  • Input gate (+ operation)

  • Output gate

The structure of an LSTM

  1. Cell state $C_{t}$: the core memory of the LSTM, responsible for preserving past information. The cell state is updated at every time step and carries both the information to remember and the information to forget.

  2. Input gate $i_{t}$: decides how much of the current input to reflect.

2-1. The input vector at the current time step ($x_{t}$) is combined with the hidden state from the previous time step ($h_{t-1}$), and the result is passed through a tanh function to squash it into the range -1 to 1. This value is the candidate cell state.

2-2. A separate sigmoid branch of the input gate produces a value between 0 and 1 that decides how much of the current input to accept: the closer to 1, the more the input is reflected. This value is multiplied by the candidate cell state, and the product is added to the cell state.

  3. Forget gate: decides which information to discard from the cell state. A sigmoid function produces a value between 0 and 1; the closer to 0, the more of that information is forgotten.

  4. Output gate: decides what to output at the current time step, based on the cell state. It produces a sigmoid value and a tanh value and uses their product as the output.

  5. Cell state update: new information is added through the input gate, and some information is removed through the forget gate. The updated cell state then passes through the output gate to produce the final output.

  • One big node containing four small nodes
  • The cell state determines how much of each word to remember
  • Values that should be forgotten are multiplied (×) by gate values close to 0 so they fade quickly
  • Values that should be remembered are formed by multiplying a sigmoid output with a tanh output and adding (+) the result to the cell state, as in the sketch below
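To make the gate arithmetic concrete, here is a minimal single-timestep sketch in NumPy. The packed weight layout (W, U, b in the order forget, input, candidate, output) is an assumption for illustration; Keras organizes its weights differently:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W: (4h, d) input weights, U: (4h, h) recurrent weights, b: (4h,) biases
    h = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:h])        # forget gate: what to erase from the cell state
    i = sigmoid(z[h:2*h])      # input gate: how much new input to accept
    g = np.tanh(z[2*h:3*h])    # candidate cell state, squashed to [-1, 1]
    o = sigmoid(z[3*h:4*h])    # output gate: how much of the cell to expose
    c_t = f * c_prev + i * g   # multiply (x) to forget, add (+) to remember
    h_t = o * np.tanh(c_t)     # new hidden state
    return h_t, c_t

# Toy dimensions: input size 3, hidden size 2
rng = np.random.default_rng(42)
d, h = 3, 2
h_t, c_t = lstm_step(rng.normal(size=d), np.zeros(h), np.zeros(h),
                     rng.normal(size=(4*h, d)), rng.normal(size=(4*h, h)),
                     np.zeros(4*h))
print(h_t, c_t)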
# Set a random seed for Keras and make TensorFlow operations deterministic so every run gives the same results.
import tensorflow as tf

tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()

Training an LSTM Network

from tensorflow.keras.datasets import imdb
from sklearn.model_selection import train_test_split

(train_input, train_target), (test_input, test_target) = imdb.load_data(
    num_words=500)

train_input, val_input, train_target, val_target = train_test_split(
    train_input, train_target, test_size=0.2, random_state=42)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 [==============================] - 2s 0us/step
from tensorflow.keras.preprocessing.sequence import pad_sequences

train_seq = pad_sequences(train_input, maxlen=100)
val_seq = pad_sequences(val_input, maxlen=100)
from tensorflow import keras

model = keras.Sequential()

model.add(keras.layers.Embedding(500, 16, input_length=100))
model.add(keras.layers.LSTM(8))
model.add(keras.layers.Dense(1, activation='sigmoid'))

model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding (Embedding)       (None, 100, 16)           8000      
                                                                 
 lstm (LSTM)                 (None, 8)                 800       
                                                                 
 dense (Dense)               (None, 1)                 9         
                                                                 
=================================================================
Total params: 8,809
Trainable params: 8,809
Non-trainable params: 0
_________________________________________________________________
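The LSTM layer has exactly four times the parameters of a SimpleRNN of the same size, one set per internal layer (forget, input, candidate, output):

# Embedding: 500 vocabulary entries * 16 dimensions each
print(500 * 16)                  # 8000
# LSTM: 4 x (input weights (16*8) + recurrent weights (8*8) + biases (8))
print(4 * (16 * 8 + 8 * 8 + 8))  # 800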
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model.compile(optimizer=rmsprop, loss='binary_crossentropy',
              metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-lstm-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model.fit(train_seq, train_target, epochs=100, batch_size=64,
                    validation_data=(val_seq, val_target),
                    callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 23s 44ms/step - loss: 0.6927 - accuracy: 0.5277 - val_loss: 0.6924 - val_accuracy: 0.5368
Epoch 2/100
313/313 [==============================] - 6s 20ms/step - loss: 0.6916 - accuracy: 0.5677 - val_loss: 0.6911 - val_accuracy: 0.5668
Epoch 3/100
313/313 [==============================] - 7s 23ms/step - loss: 0.6896 - accuracy: 0.6011 - val_loss: 0.6883 - val_accuracy: 0.6062
Epoch 4/100
313/313 [==============================] - 6s 19ms/step - loss: 0.6842 - accuracy: 0.6363 - val_loss: 0.6792 - val_accuracy: 0.6618
Epoch 5/100
313/313 [==============================] - 6s 20ms/step - loss: 0.6589 - accuracy: 0.6827 - val_loss: 0.6277 - val_accuracy: 0.7076
Epoch 6/100
313/313 [==============================] - 6s 19ms/step - loss: 0.6084 - accuracy: 0.7135 - val_loss: 0.5964 - val_accuracy: 0.7114
Epoch 7/100
313/313 [==============================] - 7s 21ms/step - loss: 0.5834 - accuracy: 0.7280 - val_loss: 0.5746 - val_accuracy: 0.7330
Epoch 8/100
313/313 [==============================] - 6s 19ms/step - loss: 0.5617 - accuracy: 0.7452 - val_loss: 0.5553 - val_accuracy: 0.7452
Epoch 9/100
313/313 [==============================] - 6s 20ms/step - loss: 0.5421 - accuracy: 0.7560 - val_loss: 0.5375 - val_accuracy: 0.7578
Epoch 10/100
313/313 [==============================] - 7s 22ms/step - loss: 0.5248 - accuracy: 0.7675 - val_loss: 0.5226 - val_accuracy: 0.7646
Epoch 11/100
313/313 [==============================] - 6s 19ms/step - loss: 0.5089 - accuracy: 0.7749 - val_loss: 0.5087 - val_accuracy: 0.7740
Epoch 12/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4946 - accuracy: 0.7822 - val_loss: 0.4974 - val_accuracy: 0.7750
Epoch 13/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4822 - accuracy: 0.7883 - val_loss: 0.4860 - val_accuracy: 0.7860
Epoch 14/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4706 - accuracy: 0.7948 - val_loss: 0.4763 - val_accuracy: 0.7866
Epoch 15/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4605 - accuracy: 0.7986 - val_loss: 0.4776 - val_accuracy: 0.7790
Epoch 16/100
313/313 [==============================] - 6s 21ms/step - loss: 0.4518 - accuracy: 0.8043 - val_loss: 0.4627 - val_accuracy: 0.7866
Epoch 17/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4438 - accuracy: 0.8074 - val_loss: 0.4557 - val_accuracy: 0.7920
Epoch 18/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4376 - accuracy: 0.8080 - val_loss: 0.4491 - val_accuracy: 0.7932
Epoch 19/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4320 - accuracy: 0.8108 - val_loss: 0.4467 - val_accuracy: 0.7942
Epoch 20/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4278 - accuracy: 0.8116 - val_loss: 0.4420 - val_accuracy: 0.7982
Epoch 21/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4246 - accuracy: 0.8116 - val_loss: 0.4392 - val_accuracy: 0.7986
Epoch 22/100
313/313 [==============================] - 6s 21ms/step - loss: 0.4220 - accuracy: 0.8141 - val_loss: 0.4414 - val_accuracy: 0.7982
Epoch 23/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4198 - accuracy: 0.8146 - val_loss: 0.4356 - val_accuracy: 0.8000
Epoch 24/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4175 - accuracy: 0.8149 - val_loss: 0.4367 - val_accuracy: 0.7986
Epoch 25/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4161 - accuracy: 0.8140 - val_loss: 0.4334 - val_accuracy: 0.8000
Epoch 26/100
313/313 [==============================] - 6s 21ms/step - loss: 0.4143 - accuracy: 0.8163 - val_loss: 0.4328 - val_accuracy: 0.7990
Epoch 27/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4131 - accuracy: 0.8158 - val_loss: 0.4314 - val_accuracy: 0.8000
Epoch 28/100
313/313 [==============================] - 6s 21ms/step - loss: 0.4120 - accuracy: 0.8150 - val_loss: 0.4326 - val_accuracy: 0.8002
Epoch 29/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4106 - accuracy: 0.8156 - val_loss: 0.4333 - val_accuracy: 0.8034
Epoch 30/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4092 - accuracy: 0.8170 - val_loss: 0.4315 - val_accuracy: 0.8010
import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

Applying Dropout to the Recurrent Layer

model2 = keras.Sequential()

model2.add(keras.layers.Embedding(500, 16, input_length=100))
model2.add(keras.layers.LSTM(8, dropout=0.3))
model2.add(keras.layers.Dense(1, activation='sigmoid'))
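Here dropout=0.3 randomly drops 30% of the layer's inputs at each time step. Keras recurrent layers also accept a separate recurrent_dropout argument for the step-to-step connections; a variant sketch (note that recurrent_dropout disables the fast cuDNN kernel, so training is slower):

alt_model = keras.Sequential()
alt_model.add(keras.layers.Embedding(500, 16, input_length=100))
# Also drop 30% of the recurrent connections between time steps.
alt_model.add(keras.layers.LSTM(8, dropout=0.3, recurrent_dropout=0.3))
alt_model.add(keras.layers.Dense(1, activation='sigmoid'))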
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model2.compile(optimizer=rmsprop, loss='binary_crossentropy',
               metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-dropout-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model2.fit(train_seq, train_target, epochs=100, batch_size=64,
                     validation_data=(val_seq, val_target),
                     callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 15s 40ms/step - loss: 0.6924 - accuracy: 0.5317 - val_loss: 0.6917 - val_accuracy: 0.5696
Epoch 2/100
313/313 [==============================] - 7s 22ms/step - loss: 0.6907 - accuracy: 0.5788 - val_loss: 0.6897 - val_accuracy: 0.5984
Epoch 3/100
313/313 [==============================] - 6s 21ms/step - loss: 0.6879 - accuracy: 0.6057 - val_loss: 0.6861 - val_accuracy: 0.6232
Epoch 4/100
313/313 [==============================] - 7s 21ms/step - loss: 0.6826 - accuracy: 0.6251 - val_loss: 0.6787 - val_accuracy: 0.6392
Epoch 5/100
313/313 [==============================] - 6s 18ms/step - loss: 0.6674 - accuracy: 0.6511 - val_loss: 0.6529 - val_accuracy: 0.6630
Epoch 6/100
313/313 [==============================] - 7s 21ms/step - loss: 0.6258 - accuracy: 0.6999 - val_loss: 0.6108 - val_accuracy: 0.7108
Epoch 7/100
313/313 [==============================] - 6s 19ms/step - loss: 0.5967 - accuracy: 0.7250 - val_loss: 0.5894 - val_accuracy: 0.7274
Epoch 8/100
313/313 [==============================] - 7s 21ms/step - loss: 0.5771 - accuracy: 0.7402 - val_loss: 0.5723 - val_accuracy: 0.7396
Epoch 9/100
313/313 [==============================] - 6s 18ms/step - loss: 0.5599 - accuracy: 0.7513 - val_loss: 0.5534 - val_accuracy: 0.7522
Epoch 10/100
313/313 [==============================] - 6s 20ms/step - loss: 0.5442 - accuracy: 0.7606 - val_loss: 0.5381 - val_accuracy: 0.7640
Epoch 11/100
313/313 [==============================] - 6s 18ms/step - loss: 0.5293 - accuracy: 0.7683 - val_loss: 0.5235 - val_accuracy: 0.7690
Epoch 12/100
313/313 [==============================] - 6s 20ms/step - loss: 0.5166 - accuracy: 0.7732 - val_loss: 0.5114 - val_accuracy: 0.7700
Epoch 13/100
313/313 [==============================] - 5s 17ms/step - loss: 0.5052 - accuracy: 0.7791 - val_loss: 0.5009 - val_accuracy: 0.7844
Epoch 14/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4928 - accuracy: 0.7819 - val_loss: 0.4905 - val_accuracy: 0.7852
Epoch 15/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4837 - accuracy: 0.7881 - val_loss: 0.4857 - val_accuracy: 0.7782
Epoch 16/100
313/313 [==============================] - 6s 21ms/step - loss: 0.4753 - accuracy: 0.7911 - val_loss: 0.4750 - val_accuracy: 0.7890
Epoch 17/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4669 - accuracy: 0.7939 - val_loss: 0.4688 - val_accuracy: 0.7916
Epoch 18/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4599 - accuracy: 0.7969 - val_loss: 0.4637 - val_accuracy: 0.7898
Epoch 19/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4538 - accuracy: 0.7991 - val_loss: 0.4582 - val_accuracy: 0.7918
Epoch 20/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4504 - accuracy: 0.7995 - val_loss: 0.4541 - val_accuracy: 0.7952
Epoch 21/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4468 - accuracy: 0.8002 - val_loss: 0.4514 - val_accuracy: 0.7942
Epoch 22/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4440 - accuracy: 0.8015 - val_loss: 0.4500 - val_accuracy: 0.7952
Epoch 23/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4414 - accuracy: 0.8028 - val_loss: 0.4478 - val_accuracy: 0.7942
Epoch 24/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4390 - accuracy: 0.8008 - val_loss: 0.4482 - val_accuracy: 0.7948
Epoch 25/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4371 - accuracy: 0.8021 - val_loss: 0.4450 - val_accuracy: 0.7960
Epoch 26/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4352 - accuracy: 0.8040 - val_loss: 0.4442 - val_accuracy: 0.7998
Epoch 27/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4324 - accuracy: 0.8061 - val_loss: 0.4417 - val_accuracy: 0.7974
Epoch 28/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4325 - accuracy: 0.8048 - val_loss: 0.4414 - val_accuracy: 0.7976
Epoch 29/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4299 - accuracy: 0.8065 - val_loss: 0.4431 - val_accuracy: 0.7990
Epoch 30/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4293 - accuracy: 0.8058 - val_loss: 0.4400 - val_accuracy: 0.7984
Epoch 31/100
313/313 [==============================] - 5s 18ms/step - loss: 0.4276 - accuracy: 0.8077 - val_loss: 0.4421 - val_accuracy: 0.7936
Epoch 32/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4258 - accuracy: 0.8087 - val_loss: 0.4384 - val_accuracy: 0.7978
Epoch 33/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4255 - accuracy: 0.8073 - val_loss: 0.4382 - val_accuracy: 0.8018
Epoch 34/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4244 - accuracy: 0.8074 - val_loss: 0.4375 - val_accuracy: 0.8026
Epoch 35/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4247 - accuracy: 0.8068 - val_loss: 0.4369 - val_accuracy: 0.8034
Epoch 36/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4223 - accuracy: 0.8102 - val_loss: 0.4357 - val_accuracy: 0.7988
Epoch 37/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4238 - accuracy: 0.8077 - val_loss: 0.4401 - val_accuracy: 0.8006
Epoch 38/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4199 - accuracy: 0.8087 - val_loss: 0.4349 - val_accuracy: 0.8016
Epoch 39/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4214 - accuracy: 0.8091 - val_loss: 0.4339 - val_accuracy: 0.8018
Epoch 40/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4215 - accuracy: 0.8102 - val_loss: 0.4375 - val_accuracy: 0.7934
Epoch 41/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4203 - accuracy: 0.8094 - val_loss: 0.4332 - val_accuracy: 0.8018
Epoch 42/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4176 - accuracy: 0.8083 - val_loss: 0.4341 - val_accuracy: 0.8046
Epoch 43/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4189 - accuracy: 0.8105 - val_loss: 0.4345 - val_accuracy: 0.7974
Epoch 44/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4178 - accuracy: 0.8117 - val_loss: 0.4335 - val_accuracy: 0.8038
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

Stacking Two Recurrent Layers

model3 = keras.Sequential()

model3.add(keras.layers.Embedding(500, 16, input_length=100))
model3.add(keras.layers.LSTM(8, dropout=0.3, return_sequences=True))
model3.add(keras.layers.LSTM(8, dropout=0.3))
model3.add(keras.layers.Dense(1, activation='sigmoid'))

model3.summary()
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_2 (Embedding)     (None, 100, 16)           8000      
                                                                 
 lstm_2 (LSTM)               (None, 100, 8)            800       
                                                                 
 lstm_3 (LSTM)               (None, 8)                 544       
                                                                 
 dense_2 (Dense)             (None, 1)                 9         
                                                                 
=================================================================
Total params: 9,353
Trainable params: 9,353
Non-trainable params: 0
_________________________________________________________________
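return_sequences=True makes the first LSTM emit its hidden state at every time step, giving the (None, 100, 8) output that the second LSTM consumes. The second layer's parameter count follows the same formula, now with 8-dimensional inputs:

# Second LSTM: 4 x (input weights (8*8) + recurrent weights (8*8) + biases (8))
print(4 * (8 * 8 + 8 * 8 + 8))  # 544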
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model3.compile(optimizer=rmsprop, loss='binary_crossentropy',
               metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-2rnn-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model3.fit(train_seq, train_target, epochs=100, batch_size=64,
                     validation_data=(val_seq, val_target),
                     callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 17s 44ms/step - loss: 0.6929 - accuracy: 0.5168 - val_loss: 0.6927 - val_accuracy: 0.5280
Epoch 2/100
313/313 [==============================] - 8s 25ms/step - loss: 0.6914 - accuracy: 0.5788 - val_loss: 0.6893 - val_accuracy: 0.6228
Epoch 3/100
313/313 [==============================] - 7s 24ms/step - loss: 0.6750 - accuracy: 0.6473 - val_loss: 0.6432 - val_accuracy: 0.6782
Epoch 4/100
313/313 [==============================] - 8s 24ms/step - loss: 0.6067 - accuracy: 0.6889 - val_loss: 0.5798 - val_accuracy: 0.7046
Epoch 5/100
313/313 [==============================] - 8s 25ms/step - loss: 0.5611 - accuracy: 0.7233 - val_loss: 0.5406 - val_accuracy: 0.7358
Epoch 6/100
313/313 [==============================] - 7s 22ms/step - loss: 0.5311 - accuracy: 0.7439 - val_loss: 0.5173 - val_accuracy: 0.7542
Epoch 7/100
313/313 [==============================] - 8s 25ms/step - loss: 0.5142 - accuracy: 0.7568 - val_loss: 0.5006 - val_accuracy: 0.7670
Epoch 8/100
313/313 [==============================] - 7s 24ms/step - loss: 0.4986 - accuracy: 0.7653 - val_loss: 0.4950 - val_accuracy: 0.7680
Epoch 9/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4860 - accuracy: 0.7739 - val_loss: 0.4806 - val_accuracy: 0.7778
Epoch 10/100
313/313 [==============================] - 8s 25ms/step - loss: 0.4782 - accuracy: 0.7804 - val_loss: 0.4716 - val_accuracy: 0.7838
Epoch 11/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4717 - accuracy: 0.7793 - val_loss: 0.4670 - val_accuracy: 0.7842
Epoch 12/100
313/313 [==============================] - 8s 25ms/step - loss: 0.4669 - accuracy: 0.7839 - val_loss: 0.4628 - val_accuracy: 0.7850
Epoch 13/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4599 - accuracy: 0.7875 - val_loss: 0.4589 - val_accuracy: 0.7896
Epoch 14/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4566 - accuracy: 0.7892 - val_loss: 0.4575 - val_accuracy: 0.7864
Epoch 15/100
313/313 [==============================] - 8s 25ms/step - loss: 0.4520 - accuracy: 0.7928 - val_loss: 0.4676 - val_accuracy: 0.7814
Epoch 16/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4506 - accuracy: 0.7913 - val_loss: 0.4557 - val_accuracy: 0.7878
Epoch 17/100
313/313 [==============================] - 8s 24ms/step - loss: 0.4492 - accuracy: 0.7930 - val_loss: 0.4538 - val_accuracy: 0.7876
Epoch 18/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4460 - accuracy: 0.7958 - val_loss: 0.4477 - val_accuracy: 0.7942
Epoch 19/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4446 - accuracy: 0.7962 - val_loss: 0.4477 - val_accuracy: 0.7932
Epoch 20/100
313/313 [==============================] - 7s 24ms/step - loss: 0.4403 - accuracy: 0.7969 - val_loss: 0.4451 - val_accuracy: 0.7952
Epoch 21/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4405 - accuracy: 0.7970 - val_loss: 0.4439 - val_accuracy: 0.7940
Epoch 22/100
313/313 [==============================] - 8s 25ms/step - loss: 0.4382 - accuracy: 0.7983 - val_loss: 0.4471 - val_accuracy: 0.7950
Epoch 23/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4367 - accuracy: 0.8008 - val_loss: 0.4430 - val_accuracy: 0.7932
Epoch 24/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4350 - accuracy: 0.8009 - val_loss: 0.4456 - val_accuracy: 0.7908
Epoch 25/100
313/313 [==============================] - 8s 24ms/step - loss: 0.4353 - accuracy: 0.8008 - val_loss: 0.4414 - val_accuracy: 0.7936
Epoch 26/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4335 - accuracy: 0.7994 - val_loss: 0.4426 - val_accuracy: 0.7892
Epoch 27/100
313/313 [==============================] - 8s 24ms/step - loss: 0.4313 - accuracy: 0.8015 - val_loss: 0.4407 - val_accuracy: 0.7932
Epoch 28/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4313 - accuracy: 0.8003 - val_loss: 0.4401 - val_accuracy: 0.7932
Epoch 29/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4312 - accuracy: 0.8038 - val_loss: 0.4453 - val_accuracy: 0.7968
Epoch 30/100
313/313 [==============================] - 8s 25ms/step - loss: 0.4296 - accuracy: 0.8054 - val_loss: 0.4393 - val_accuracy: 0.7938
Epoch 31/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4296 - accuracy: 0.8038 - val_loss: 0.4470 - val_accuracy: 0.7890
Epoch 32/100
313/313 [==============================] - 7s 23ms/step - loss: 0.4270 - accuracy: 0.8030 - val_loss: 0.4395 - val_accuracy: 0.7910
Epoch 33/100
313/313 [==============================] - 7s 22ms/step - loss: 0.4283 - accuracy: 0.8027 - val_loss: 0.4434 - val_accuracy: 0.7984
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

Training a GRU Network

model4 = keras.Sequential()

model4.add(keras.layers.Embedding(500, 16, input_length=100))
model4.add(keras.layers.GRU(8))
model4.add(keras.layers.Dense(1, activation='sigmoid'))

model4.summary()
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_3 (Embedding)     (None, 100, 16)           8000      
                                                                 
 gru (GRU)                   (None, 8)                 624       
                                                                 
 dense_3 (Dense)             (None, 1)                 9         
                                                                 
=================================================================
Total params: 8,633
Trainable params: 8,633
Non-trainable params: 0
_________________________________________________________________
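A GRU cell has three internal layers (update gate, reset gate, and candidate state). With Keras's default reset_after=True, each carries two bias vectors, which accounts for the 624 parameters:

# GRU: 3 x (input weights (16*8) + recurrent weights (8*8) + 2 bias vectors (2*8))
print(3 * (16 * 8 + 8 * 8 + 2 * 8))  # 624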
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model4.compile(optimizer=rmsprop, loss='binary_crossentropy',
               metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-gru-model.h5',
                                                save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
                                                  restore_best_weights=True)

history = model4.fit(train_seq, train_target, epochs=100, batch_size=64,
                     validation_data=(val_seq, val_target),
                     callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 15s 38ms/step - loss: 0.6920 - accuracy: 0.5486 - val_loss: 0.6912 - val_accuracy: 0.5724
Epoch 2/100
313/313 [==============================] - 6s 19ms/step - loss: 0.6898 - accuracy: 0.5790 - val_loss: 0.6889 - val_accuracy: 0.5830
Epoch 3/100
313/313 [==============================] - 7s 21ms/step - loss: 0.6865 - accuracy: 0.5975 - val_loss: 0.6852 - val_accuracy: 0.5958
Epoch 4/100
313/313 [==============================] - 6s 18ms/step - loss: 0.6813 - accuracy: 0.6107 - val_loss: 0.6792 - val_accuracy: 0.5994
Epoch 5/100
313/313 [==============================] - 6s 20ms/step - loss: 0.6732 - accuracy: 0.6204 - val_loss: 0.6701 - val_accuracy: 0.6134
Epoch 6/100
313/313 [==============================] - 6s 19ms/step - loss: 0.6608 - accuracy: 0.6387 - val_loss: 0.6556 - val_accuracy: 0.6302
Epoch 7/100
313/313 [==============================] - 6s 21ms/step - loss: 0.6404 - accuracy: 0.6581 - val_loss: 0.6304 - val_accuracy: 0.6640
Epoch 8/100
313/313 [==============================] - 6s 20ms/step - loss: 0.6019 - accuracy: 0.6898 - val_loss: 0.5781 - val_accuracy: 0.7058
Epoch 9/100
313/313 [==============================] - 6s 19ms/step - loss: 0.5315 - accuracy: 0.7373 - val_loss: 0.5158 - val_accuracy: 0.7470
Epoch 10/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4969 - accuracy: 0.7618 - val_loss: 0.4963 - val_accuracy: 0.7624
Epoch 11/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4798 - accuracy: 0.7743 - val_loss: 0.4828 - val_accuracy: 0.7722
Epoch 12/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4667 - accuracy: 0.7823 - val_loss: 0.4724 - val_accuracy: 0.7772
Epoch 13/100
313/313 [==============================] - 6s 19ms/step - loss: 0.4567 - accuracy: 0.7879 - val_loss: 0.4655 - val_accuracy: 0.7826
Epoch 14/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4485 - accuracy: 0.7962 - val_loss: 0.4589 - val_accuracy: 0.7856
Epoch 15/100
313/313 [==============================] - 7s 21ms/step - loss: 0.4425 - accuracy: 0.7997 - val_loss: 0.4651 - val_accuracy: 0.7866
Epoch 16/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4375 - accuracy: 0.8038 - val_loss: 0.4533 - val_accuracy: 0.7942
Epoch 17/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4331 - accuracy: 0.8059 - val_loss: 0.4504 - val_accuracy: 0.7922
Epoch 18/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4300 - accuracy: 0.8080 - val_loss: 0.4471 - val_accuracy: 0.7930
Epoch 19/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4270 - accuracy: 0.8095 - val_loss: 0.4477 - val_accuracy: 0.7920
Epoch 20/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4246 - accuracy: 0.8117 - val_loss: 0.4448 - val_accuracy: 0.7942
Epoch 21/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4231 - accuracy: 0.8102 - val_loss: 0.4453 - val_accuracy: 0.7912
Epoch 22/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4215 - accuracy: 0.8124 - val_loss: 0.4476 - val_accuracy: 0.7946
Epoch 23/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4201 - accuracy: 0.8131 - val_loss: 0.4430 - val_accuracy: 0.7964
Epoch 24/100
313/313 [==============================] - 5s 17ms/step - loss: 0.4187 - accuracy: 0.8145 - val_loss: 0.4433 - val_accuracy: 0.7908
Epoch 25/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4181 - accuracy: 0.8130 - val_loss: 0.4419 - val_accuracy: 0.7932
Epoch 26/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4168 - accuracy: 0.8151 - val_loss: 0.4433 - val_accuracy: 0.7934
Epoch 27/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4162 - accuracy: 0.8148 - val_loss: 0.4412 - val_accuracy: 0.7930
Epoch 28/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4157 - accuracy: 0.8142 - val_loss: 0.4418 - val_accuracy: 0.7916
Epoch 29/100
313/313 [==============================] - 6s 20ms/step - loss: 0.4148 - accuracy: 0.8148 - val_loss: 0.4455 - val_accuracy: 0.7964
Epoch 30/100
313/313 [==============================] - 6s 18ms/step - loss: 0.4136 - accuracy: 0.8165 - val_loss: 0.4423 - val_accuracy: 0.7914
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

Wrapping Up

test_seq = pad_sequences(test_input, maxlen=100)

rnn_model = keras.models.load_model('best-2rnn-model.h5')

rnn_model.evaluate(test_seq, test_target)
782/782 [==============================] - 4s 5ms/step - loss: 0.4346 - accuracy: 0.7975

[0.4346226453781128, 0.7974799871444702]