This post is based on the following document:
www.tensorflow.org/guide/keras/train_and_evaluate
(The translation was done by hand with help from Papago and Google Translate, so please bear with any awkward phrasing.)
Part 2: Writing your own training & evaluation loops from scratch
If you want lower-level control over training and evaluation than fit() and evaluate() provide, you can write your own loops from scratch quite easily. Be prepared, however, to do a lot more of your own debugging.
Using the GradientTape: a first end-to-end example
To get the gradients of a layer's trainable weights with respect to a loss value, open a GradientTape scope, compute the loss inside it, and ask the tape for the gradients. You can then use an optimizer to apply those gradients to model.trainable_variables and update the weights.
Let's look at an end-to-end example.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Prepare the MNIST data as flat float vectors (as in the earlier parts of this series).
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
# Reserve 10,000 samples for validation (used in the later examples).
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

# Get the model.
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Instantiate an optimizer.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
# Instantiate a loss function.
loss_fn = keras.losses.SparseCategoricalCrossentropy()

# Prepare the training dataset.
batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

# Iterate over epochs.
for epoch in range(3):
    print('Start of epoch %d' % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):

        # Open a GradientTape to record the operations run
        # during the forward pass, which enables autodifferentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass of the layer.
            # The operations that the layer applies
            # to its inputs are going to be recorded
            # on the GradientTape.
            logits = model(x_batch_train)  # Logits for this minibatch

            # Compute the loss value for this minibatch.
            loss_value = loss_fn(y_batch_train, logits)

        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        grads = tape.gradient(loss_value, model.trainable_variables)

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Log every 200 batches.
        if step % 200 == 0:
            print('Training loss (for one batch) at step %s: %s' % (step, float(loss_value)))
            print('Seen so far: %s samples' % ((step + 1) * 64))
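To make the GradientTape mechanics above concrete in isolation, here is a minimal standalone sketch (not from the guide itself): it records a single scalar computation and retrieves its derivative, which is exactly what happens per batch in the loop above, just with the model's weights instead of one variable.

import tensorflow as tf

# Minimal sketch of the tape mechanism: record a computation, then ask
# for the gradient of the result with respect to a variable.
w = tf.Variable(3.0)
with tf.GradientTape() as tape:
    loss = w ** 2            # any differentiable ops run here are recorded
grad = tape.gradient(loss, w)
print(float(grad))           # 2 * w = 6.0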
Low-level handling of metrics
You can readily reuse the built-in metrics (or custom ones you wrote) in a training loop written from scratch. The flow is as follows:
- Instantiate the metric at the start of the loop.
- Call metric.update_state() after each batch.
- Call metric.result() when you need to display the current value of the metric.
- Call metric.reset_states() when you need to clear the state of the metric (typically at the end of an epoch).
Let's look at an example using SparseCategoricalAccuracy (a standalone illustration of just this metric API follows the full example below).
# Get model
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Instantiate an optimizer to train the model.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
# Instantiate a loss function.
loss_fn = keras.losses.SparseCategoricalCrossentropy()

# Prepare the metrics.
train_acc_metric = keras.metrics.SparseCategoricalAccuracy()
val_acc_metric = keras.metrics.SparseCategoricalAccuracy()

# Prepare the training dataset.
batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

# Prepare the validation dataset.
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.batch(64)

# Iterate over epochs.
for epoch in range(3):
    print('Start of epoch %d' % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(x_batch_train)
            loss_value = loss_fn(y_batch_train, logits)
        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Update training metric.
        train_acc_metric(y_batch_train, logits)

        # Log every 200 batches.
        if step % 200 == 0:
            print('Training loss (for one batch) at step %s: %s' % (step, float(loss_value)))
            print('Seen so far: %s samples' % ((step + 1) * 64))

    # Display metrics at the end of each epoch.
    train_acc = train_acc_metric.result()
    print('Training acc over epoch: %s' % (float(train_acc),))
    # Reset training metrics at the end of each epoch
    train_acc_metric.reset_states()

    # Run a validation loop at the end of each epoch.
    for x_batch_val, y_batch_val in val_dataset:
        val_logits = model(x_batch_val)
        # Update val metrics
        val_acc_metric(y_batch_val, val_logits)
    val_acc = val_acc_metric.result()
    val_acc_metric.reset_states()
    print('Validation acc: %s' % (float(val_acc),))
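As promised above, here is a standalone illustration (not from the guide) of just the metric API, outside of any training loop; the labels and predictions here are made-up dummy values for the example.

import numpy as np
from tensorflow import keras

metric = keras.metrics.SparseCategoricalAccuracy()       # instantiate the metric
# Pretend this is one batch of labels and softmax predictions.
metric.update_state([1, 2], np.array([[0.1, 0.8, 0.1],
                                      [0.2, 0.2, 0.6]]))
print(float(metric.result()))                            # current value: 1.0
metric.reset_states()                                     # clear state, e.g. at epoch end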
Low-level handling of extra losses
In the previous section, we saw that a layer can add regularization losses by calling self.add_loss(value) inside its call method. In the general case, you will want to take these extra losses into account in your custom training loop (unless you wrote the model yourself and already know that it creates no such losses). Consider this example of a layer that creates an activity regularization loss:
class ActivityRegularizationLayer(layers.Layer):

    def call(self, inputs):
        self.add_loss(1e-2 * tf.reduce_sum(inputs))
        return inputs
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
Build the model, then call it on a batch of data; the losses created during that forward pass are added to model.losses:
logits = model(x_train[:64])
print(model.losses)
The tracked losses are cleared at the start of the model's __call__, so you will only see the losses created during a single forward pass. For instance, if you call the model repeatedly and then query losses, it only shows the latest losses, created during the last call:
logits = model(x_train[:64])
logits = model(x_train[64: 128])
logits = model(x_train[128: 192])
print(model.losses)
[<tf.Tensor: id=999851, shape=(), dtype=float32, numpy=6.88884>]
If you want to take all of these losses into account during training, your training loop should add sum(model.losses) to the total loss value, as below:
optimizer = keras.optimizers.SGD(learning_rate=1e-3)

for epoch in range(3):
    print('Start of epoch %d' % (epoch,))

    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(x_batch_train)
            loss_value = loss_fn(y_batch_train, logits)

            # Add extra losses created during this forward pass:
            loss_value += sum(model.losses)

        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Log every 200 batches.
        if step % 200 == 0:
            print('Training loss (for one batch) at step %s: %s' % (step, float(loss_value)))
            print('Seen so far: %s samples' % ((step + 1) * 64))
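As a side note, losses added through weight regularizers (e.g. a kernel_regularizer passed to a Dense layer) are tracked in model.losses as well, so the sum(model.losses) term above covers them too. A minimal sketch of that, assuming nothing beyond the standard Keras regularizers:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation='relu',
                 kernel_regularizer=keras.regularizers.l2(1e-3))(inputs)
outputs = layers.Dense(10, activation='softmax')(x)
reg_model = keras.Model(inputs, outputs)

# After a forward pass, model.losses contains the L2 penalty on the Dense
# kernel alongside any add_loss values.
_ = reg_model(np.zeros((1, 784), dtype='float32'))
print(reg_model.losses)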