tensorflow 2.0 keras Training and evaluation (1)

이글은 다음 문서를 참조합니다.

www.tensorflow.org/guide/keras/train_and_evaluate

(번역은 자력 + 파파고 + 구글 번역기를 사용하였으니, 부자연스럽더라도 양해바랍니다.)

Training and Evaluation with Tensorflow Keras

이번 가이드는 tensorflow 2.0에서 겪을 수 있는 training, evaluation, prediction을 다룹니다.

(model.fit(), model.evaluate(), model.predict())를 사용하는 경우
eager execution and the GradientTape를 사용하는 경우

일반적으로, 여러분이 내장된 루프를 사용하든, 여러분 자신의 루프를 쓰든, 모델 훈련과 평가는 모든 종류의 케라스 모델, 즉 Sequential 모델, funtional API로 제작된 모델, 모델 하위 분류를 통해 처음부터 작성된 모델에서 엄격하게 같은 방식으로 작동한다.

이 가이드는 분산 트레이닝에 대한 내용은 다루지 않습니다.

Part 1: Using build-in training & evaluation loops

model의 내장된 traninig 루프에 데이터를 넣어줄 떄, Numpy array(데이터가 작거나 메모리에 적당한 경우)나 tf.data Dataset 객체를 사용해야만 합니다. 다음에 나오는 내용들은 optimizers, losses, metrics를 어떻게 사용하는지에 대해 numpy array를 사용한 MNIST 데이터셋에 대한 예제를 다룰 것입니다.

API overview: a first end-to-end example

다음 예제를 보자.

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)

model = keras.Model(inputs=inputs, outputs=outputs)

test data에 대한 평가와 train data에서 생성되는 hold out set에 대한 검증, train data로 구성되는 전형적인 end-to-end의 workflow는 다음과 같습니다.

# Load a toy dataset for the sake of this example
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Preprocess the data (these are Numpy arrays)
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

# Specify the training configuration (optimizer, loss, metrics)
model.compile(optimizer=keras.optimizers.RMSprop(),  # Optimizer
              # Loss function to minimize
              loss=keras.losses.SparseCategoricalCrossentropy(),
              # List of metrics to monitor
              metrics=[keras.metrics.SparseCategoricalAccuracy()])

# Train the model by slicing the data into "batches"
# of size "batch_size", and repeatedly iterating over
# the entire dataset for a given number of "epochs"
print('# Fit model on training data')
history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=3,
                    # We pass some validation for
                    # monitoring validation loss and metrics
                    # at the end of each epoch
                    validation_data=(x_val, y_val))

# The returned "history" object holds a record
# of the loss values and metric values during training
print('\nhistory dict:', history.history)

# Evaluate the model on the test data using `evaluate`
print('\n# Evaluate on test data')
results = model.evaluate(x_test, y_test, batch_size=128)
print('test loss, test acc:', results)

# Generate predictions (probabilities -- the output of the last layer)
# on new data using `predict`
print('\n# Generate predictions for 3 samples')
predictions = model.predict(x_test[:3])
print('predictions shape:', predictions.shape)

Specifying a loss, metrics, and an optimizer

model의 fit함수를 사용하여 train시키려면 우리는 구체적인 loss function, optimizer, metric에 대해 정의해야 합니다.

이러한 것들은 compile()에 인자로 넘길 수 있습니다.

model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss=keras.losses.SparseCategoricalCrossentropy(),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])

metrics 인자는 list형태여야 하며 우리가 원하는 어떠한 metric도 추가 될 수 있습니다.

만약 우리의 모델이 multiple output이라면, 각각 다른 loss와 metrics를 지정할 수 있고, total loss에 대한 영향을 조절할 수 있습니다. 향후 다룰 "Passing data to multi-input, multi-output models" section에서 더욱 자세한 내용을 찾아 볼 수 있습니다.

다음 예시와 같이 loss, metrics는 string 식별자로도 접근할 수 있습니다.

model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

나중에 재사용할 수 있도록 모델 정의와 기능의 단계를 정리해 봅시다. 이 가이드의 다른 예에서 여러 번 부르기로 합시다.

def get_uncompiled_model():
  inputs = keras.Input(shape=(784,), name='digits')
  x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
  x = layers.Dense(64, activation='relu', name='dense_2')(x)
  outputs = layers.Dense(10, activation='softmax', name='predictions')(x)
  model = keras.Model(inputs=inputs, outputs=outputs)
  return model

def get_compiled_model():
  model = get_uncompiled_model()
  model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
  return model

Many built-in optimizers, losses, and metrics are available

일반적으로 직접 정의하는 것 보단 Keras API에 정의된 것을 사용하는게 편할 것입니다.

Optimizers : - SGD()(with or without momentum) - RMSprop() - Adam() - etc.
Losses : - MeanSquaredError() - KLDivergence() - CosineSimilarity() - etc.
Metrics : - AUC() - Precision() - Recall() - etc.

Writing custom losses and metrics

API에 존재하지 않는 지표가 필요하다면, Metric class를 이용해 쉽게 customizing할 수 있습니다. 4개의 method가 필요합니다.

__init__(self), 우리의 metrics를 위한 state variables를 정의합니다.
update_state(self, y_true, y_pred, sample_weight=None), y_true값과 state variables를 업데이트할 y_pred를 사용합니다.
result(self), 결과를 도출합니다.
reset_states(self), 지표의 상태를 초기화합니다.

State update와 결과 도출은 결과를 계산하는 것이 매우 오래걸리기 때문에 각각 updatae_state()와 result()로 나누어져 있습니다.

아래 예제는 분류문제에서 각 Sample이 얼마나 맞췄는지에 대해 갯수를 세는 CategoricalTruePositives를 정의한 것입니다.

class CatgoricalTruePositives(keras.metrics.Metric):
  
    def __init__(self, name='binary_true_positives', **kwargs):
      super(CatgoricalTruePositives, self).__init__(name=name, **kwargs)
      self.true_positives = self.add_weight(name='tp', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
      y_pred = tf.argmax(y_pred)
      values = tf.equal(tf.cast(y_true, 'int32'), tf.cast(y_pred, 'int32'))
      values = tf.cast(values, 'float32')
      if sample_weight is not None:
        sample_weight = tf.cast(sample_weight, 'float32')
        values = tf.multiply(values, sample_weight)
      return self.true_positives.assign_add(tf.reduce_sum(values))  # TODO: fix

    def result(self):
      return tf.identity(self.true_positives)  # TODO: fix
    
    def reset_states(self):
      # The state of the metric will be reset at the start of each epoch.
      self.true_positives.assign(0.)


model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss=keras.losses.SparseCategoricalCrossentropy(),
              metrics=[CatgoricalTruePositives()])
model.fit(x_train, y_train,
          batch_size=64,
          epochs=3)

Handling losses and metrics that don't fit the standard signature

압도적인 다수의 손실과 측정 지표는 y_true와 y_pred로 계산할 수 있습니다. 여기서 y_pred는 모델의 산출물입니다. 하지만 전부 다 그런 것은 아닙니다. 예를 들어, 정규화 손실은 계층의 활성화만 필요할 수 있으며(이 경우에는 대상이 없다), 이 활성화는 모델 출력이 아닐 수 있 수 있습니다.

이러한 경우에 우리는 custom layer의 call 메서드를 통한 self.add_loss(loss_value)를 사용할 수 있습니다.

class ActivityRegularizationLayer(layers.Layer):
  
  def call(self, inputs):
    self.add_loss(tf.reduce_sum(inputs) * 0.1)
    return inputs  # Pass-through layer.
  
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)

# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)

x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy')

# The displayed loss will be much higher than before
# due to the regularization component.
model.fit(x_train, y_train,
          batch_size=64,
          epochs=1)

metric 값 logging에 대해서도 동일한 작업을 진행할 수 있습니다.

class MetricLoggingLayer(layers.Layer):
  
  def call(self, inputs):
    # The `aggregation` argument defines
    # how to aggregate the per-batch values
    # over each epoch:
    # in this case we simply average them.
    self.add_metric(keras.backend.std(inputs),
                    name='std_of_activation',
                    aggregation='mean')
    return inputs  # Pass-through layer.

  
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)

# Insert std logging as a layer.
x = MetricLoggingLayer()(x)

x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy')
model.fit(x_train, y_train,
          batch_size=64,
          epochs=1)

Functional API에서 또한, model.add_loss(loss_tensor) or model.add_metric(metric_tensor, name, aggregation)을 호출할 수 있습니다.

간단한 예시입니다.

inputs = keras.Input(shape=(784,), name='digits')
x1 = layers.Dense(64, activation='relu', name='dense_1')(inputs)
x2 = layers.Dense(64, activation='relu', name='dense_2')(x1)
outputs = layers.Dense(10, activation='softmax', name='predictions')(x2)
model = keras.Model(inputs=inputs, outputs=outputs)

model.add_loss(tf.reduce_sum(x1) * 0.1)

model.add_metric(keras.backend.std(x1),
                 name='std_of_activation',
                 aggregation='mean')

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss='sparse_categorical_crossentropy')
model.fit(x_train, y_train,
          batch_size=64,
          epochs=1)

Automatically setting apart a validation holdout set

우리는 각 epoch마다 (x_val, y_val)을 통해 평가할 수 있습니다.

이를 위한 옵션이 있습니다 : validation_split 인자는 자동적으로 우리의 training_data에서 우리가 허용한 만큼 뒷 부분을 validation_data로 만듭니다. 이 인자값은 데이터에서 얼마만큼의 검증데이터를 만들 것인지를 나타내며, 0과 1사이의 값으로 조정합니다. 예를 들어, validation_split = 0.2는 traninig 데이터의 20퍼를 검증데이터로 활용하겠다는 뜻입니다.

이러한 검증 방법은 shuffling 전에 fit함수에서의 호출을 통해 데이터의 끝 x%만큼을 validation_data로 활용하게 됩니다.

model = get_compiled_model()
model.fit(x_train, y_train, batch_size=64, validation_split=0.2, epochs=3)

'# Machine Learning > TensorFlow doc 정리' 카테고리의 다른 글

tensorflow 2.0 keras Training and evaluation (3) (1)	2019.04.09
tensorflow 2.0 keras Training and evaluation (2) (0)	2019.04.09
tensorflow 2.0 keras Functional API (2) (0)	2019.03.30
tensorflow 2.0 keras Functional API (1) (0)	2019.03.29
tensorflow 2.0 keras Overview (2) (0)	2019.03.24

대학원생이 쉽게 설명해보기