tensorflow 2.0 keras Writing layers and models with tf keras (1)

This post is based on the following document.

www.tensorflow.org/guide/keras/custom_layers_and_models

(I translated this myself with help from Papago and Google Translate, so please excuse any awkward phrasing.)

Writing layers and models with Tensorflow Keras

The Layer class

Layers encapsulate a state (weights) and some computation

The main data structure we will work with is the layer. A layer encapsulates both a state (its weights) and a transformation from inputs to outputs.

Let's look at an example.

import tensorflow as tf
from tensorflow.keras import layers


class Linear(layers.Layer):

  def __init__(self, units=32, input_dim=32):
    super(Linear, self).__init__()
    w_init = tf.random_normal_initializer()
    self.w = tf.Variable(initial_value=w_init(shape=(input_dim, units),
                                              dtype='float32'),
                         trainable=True)
    b_init = tf.zeros_initializer()
    self.b = tf.Variable(initial_value=b_init(shape=(units,),
                                              dtype='float32'),
                         trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

x = tf.ones((2, 2))
linear_layer = Linear(4, 2)
y = linear_layer(x)
print(y)

tf.Tensor(
[[ 0.04076247  0.12488913 -0.09827997 -0.00854541]
 [ 0.04076247  0.12488913 -0.09827997 -0.00854541]], shape=(2, 4), dtype=float32)

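Note that the weights w and b are automatically tracked by the layer once they are set as layer attributes. (In short, it roughly means you can get at the layer's weights and bias directly through the layer.)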

assert linear_layer.weights == [linear_layer.w, linear_layer.b]

The add_weight method is a quicker shortcut for creating weights, so the code is shorter than in the example above.

class Linear(layers.Layer):

  def __init__(self, units=32, input_dim=32):
    super(Linear, self).__init__()
    self.w = self.add_weight(shape=(input_dim, units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(units,),
                             initializer='zeros',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

x = tf.ones((2, 2))
linear_layer = Linear(4, 2)
y = linear_layer(x)
print(y)
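
As a quick sanity check of the add_weight version, here is a minimal sketch (my own addition, not from the original guide). It checks that both weights are tracked by the layer and that the output equals x @ w + b. The class name TrackedLinear is a hypothetical name used only for this illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


class TrackedLinear(layers.Layer):  # hypothetical name, same logic as Linear above

  def __init__(self, units=32, input_dim=32):
    super(TrackedLinear, self).__init__()
    self.w = self.add_weight(shape=(input_dim, units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(units,),
                             initializer='zeros',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b


x = tf.ones((2, 2))
tracked = TrackedLinear(4, 2)
y = tracked(x)

# Both variables created with add_weight show up in `weights`.
weight_shapes = [tuple(w.shape) for w in tracked.weights]
print(weight_shapes)

# The layer output matches a plain NumPy computation of x @ w + b.
expected = x.numpy() @ tracked.w.numpy() + tracked.b.numpy()
outputs_match = bool(np.allclose(y.numpy(), expected, atol=1e-5))
print(outputs_match)
```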

Layers can have non-trainable weights

Besides trainable weights, a layer can also have non-trainable weights. Such weights are not taken into account during backpropagation when you train the layer.

class ComputeSum(layers.Layer):

  def __init__(self, input_dim):
    super(ComputeSum, self).__init__()
    self.total = tf.Variable(initial_value=tf.zeros((input_dim,)),
                             trainable=False)

  def call(self, inputs):
    self.total.assign_add(tf.reduce_sum(inputs, axis=0))
    return self.total

x = tf.ones((2, 2))
my_sum = ComputeSum(2)
y = my_sum(x)
print(y.numpy())
y = my_sum(x)
print(y.numpy())

tf.Variable(..., trainable=False) means the variable is not updated by backpropagation.

It is still part of layer.weights, where it is categorized as a non-trainable weight.

print('weights:', len(my_sum.weights))
print('non-trainable weights:', len(my_sum.non_trainable_weights))

# It's not included in the trainable weights:
print('trainable_weights:', my_sum.trainable_weights)
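
To make the effect on training concrete, here is a small sketch (my own addition, under the assumption that weights are created with add_weight). The layer ScaleAndShift and its attribute names are hypothetical. Only the trainable weight receives a gradient when we differentiate with respect to trainable_weights.

```python
import tensorflow as tf
from tensorflow.keras import layers


class ScaleAndShift(layers.Layer):  # hypothetical layer for illustration

  def __init__(self):
    super(ScaleAndShift, self).__init__()
    # `scale` is trainable, `shift` is not.
    self.scale = self.add_weight(shape=(), initializer='ones', trainable=True)
    self.shift = self.add_weight(shape=(), initializer='zeros', trainable=False)

  def call(self, inputs):
    return inputs * self.scale + self.shift


layer = ScaleAndShift()
with tf.GradientTape() as tape:
  # loss = sum over 6 entries of (1 * scale + shift)
  loss = tf.reduce_sum(layer(tf.ones((2, 3))))

# Only trainable weights are passed to the tape, so only `scale` gets a gradient.
grads = tape.gradient(loss, layer.trainable_weights)
num_trainable = len(layer.trainable_weights)
num_non_trainable = len(layer.non_trainable_weights)
scale_grad = float(grads[0])  # d(loss)/d(scale) = number of input entries
print(num_trainable, num_non_trainable, scale_grad)
```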

Best practice: deferring weight creation until the shape of the inputs is known

In the Linear example above, the Linear class takes an input_dim argument in __init__, which determines the shapes of w and b.

class Linear(layers.Layer):

  def __init__(self, units=32, input_dim=32):
    super(Linear, self).__init__()
    self.w = self.add_weight(shape=(input_dim, units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(units,),
                             initializer='random_normal',
                             trainable=True)

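In many cases we do not know the size of the inputs in advance, and we would like to create the weights lazily once that value becomes known, some time after the layer has been instantiated.

In the Keras API, the recommended way to do this is to create the layer weights in build(input_shape), as in the following example.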

class Linear(layers.Layer):

  def __init__(self, units=32):
    super(Linear, self).__init__()
    self.units = units

  def build(self, input_shape):
    self.w = self.add_weight(shape=(input_shape[-1], self.units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(self.units,),
                             initializer='random_normal',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

__call__ automatically runs build the first time the layer is called. The layer is now lazy, which makes it easier to use.

linear_layer = Linear(32)  # At instantiation, we don't know on what inputs this is going to get called
y = linear_layer(x)  # The layer's weights are created dynamically the first time the layer is called
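
The sketch below (my own addition) makes the lazy behavior visible: the layer has no weights right after instantiation, and the kernel shape is derived from the input on the first call. LazyLinear is a hypothetical name that mirrors the build-based Linear above, except that the bias uses a 'zeros' initializer.

```python
import tensorflow as tf
from tensorflow.keras import layers


class LazyLinear(layers.Layer):  # hypothetical name; mirrors the build-based Linear

  def __init__(self, units=32):
    super(LazyLinear, self).__init__()
    self.units = units

  def build(self, input_shape):
    # input_shape[-1] is the size of the last input axis, known only at call time.
    self.w = self.add_weight(shape=(input_shape[-1], self.units),
                             initializer='random_normal',
                             trainable=True)
    self.b = self.add_weight(shape=(self.units,),
                             initializer='zeros',
                             trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b


lazy = LazyLinear(8)
weights_before = len(lazy.weights)  # no weights exist yet

y = lazy(tf.ones((3, 5)))           # the first call triggers build
weights_after = len(lazy.weights)
kernel_shape = tuple(lazy.w.shape)
output_shape = tuple(y.shape)
print(weights_before, weights_after, kernel_shape, output_shape)
```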

Layers are recursively composable

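If you assign a Layer instance as an attribute of another Layer, the outer layer starts tracking the weights of the inner layer.

We recommend creating such sublayers in the __init__ method (since sublayers typically have a build method, they get built when the outer layer is built).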

# Let's assume we are reusing the Linear class
# with a `build` method that we defined above.

class MLPBlock(layers.Layer):

  def __init__(self):
    super(MLPBlock, self).__init__()
    self.linear_1 = Linear(32)
    self.linear_2 = Linear(32)
    self.linear_3 = Linear(1)

  def call(self, inputs):
    x = self.linear_1(inputs)
    x = tf.nn.relu(x)
    x = self.linear_2(x)
    x = tf.nn.relu(x)
    return self.linear_3(x)


mlp = MLPBlock()
y = mlp(tf.ones(shape=(3, 64)))  # The first call to the `mlp` will create the weights
print('weights:', len(mlp.weights))
print('trainable weights:', len(mlp.trainable_weights))
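
As a self-contained variant of the same idea, here is a sketch (my own addition) that uses the built-in layers.Dense for the sublayers instead of the custom Linear class. TwoLayerBlock is a hypothetical name. The outer layer reports the weights of both sublayers, and the parameter count can be checked by hand.

```python
import tensorflow as tf
from tensorflow.keras import layers


class TwoLayerBlock(layers.Layer):  # hypothetical name for illustration

  def __init__(self):
    super(TwoLayerBlock, self).__init__()
    # Sublayers are created in __init__; their weights are created on first call.
    self.hidden = layers.Dense(32, activation='relu')
    self.out = layers.Dense(1)

  def call(self, inputs):
    return self.out(self.hidden(inputs))


block = TwoLayerBlock()
y = block(tf.ones(shape=(3, 64)))

# The outer layer tracks the weights of both sublayers: 2 kernels + 2 biases.
num_weights = len(block.weights)

# Total parameter count: (64 * 32 + 32) + (32 * 1 + 1).
num_params = sum(w.numpy().size for w in block.weights)
print(num_weights, num_params)
```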

Layers recursively collect losses created during the forward pass

When writing the call method of a layer, you can create loss tensors that you will want to use later, when writing the training loop. This is done by calling self.add_loss(value).

# A layer that creates an activity regularization loss
class ActivityRegularizationLayer(layers.Layer):

  def __init__(self, rate=1e-2):
    super(ActivityRegularizationLayer, self).__init__()
    self.rate = rate

  def call(self, inputs):
    self.add_loss(self.rate * tf.reduce_sum(inputs))
    return inputs

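These losses (including those created by any inner layer) can be retrieved via layer.losses. The property is reset at the start of every __call__ to the top-level layer, so layer.losses always contains the loss values created during the last forward pass.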

class OuterLayer(layers.Layer):

  def __init__(self):
    super(OuterLayer, self).__init__()
    self.activity_reg = ActivityRegularizationLayer(1e-2)

  def call(self, inputs):
    return self.activity_reg(inputs)


layer = OuterLayer()
assert len(layer.losses) == 0  # No losses yet since the layer has never been called
_ = layer(tf.zeros((1, 1)))
assert len(layer.losses) == 1  # We created one loss value

# `layer.losses` gets reset at the start of each __call__
_ = layer(tf.zeros((1, 1)))
assert len(layer.losses) == 1  # This is the loss created during the call above
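
To see the actual value stored by add_loss, here is a minimal sketch (my own addition). SumPenalty is a hypothetical name for a layer with the same idea as ActivityRegularizationLayer. With an all-ones input of shape (2, 3) and rate 1e-2, the registered loss should be 1e-2 * 6.

```python
import tensorflow as tf
from tensorflow.keras import layers


class SumPenalty(layers.Layer):  # hypothetical name for illustration

  def __init__(self, rate=1e-2):
    super(SumPenalty, self).__init__()
    self.rate = rate

  def call(self, inputs):
    # Register a scalar loss proportional to the sum of the activations.
    self.add_loss(self.rate * tf.reduce_sum(inputs))
    return inputs


penalty = SumPenalty(rate=1e-2)
_ = penalty(tf.ones((2, 3)))

# One forward pass registers one loss: rate * (number of entries) for all-ones input.
num_losses = len(penalty.losses)
loss_value = float(penalty.losses[0])
print(num_losses, loss_value)
```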

In addition, the losses property also contains the regularization losses created for the weights of any inner layer.

class OuterLayer(layers.Layer):

  def __init__(self):
    super(OuterLayer, self).__init__()
    self.dense = layers.Dense(32, kernel_regularizer=tf.keras.regularizers.l2(1e-3))

  def call(self, inputs):
    return self.dense(inputs)


layer = OuterLayer()
_ = layer(tf.zeros((1, 1)))

# This is `1e-3 * tf.reduce_sum(layer.dense.kernel ** 2)`,
# created by the `kernel_regularizer` above.
print(layer.losses)
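
The L2 penalty can be verified by hand. The sketch below (my own addition) applies the regularizer to a standalone layers.Dense and compares the reported loss with 1e-3 times the sum of the squared kernel entries.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

dense = layers.Dense(4, kernel_regularizer=tf.keras.regularizers.l2(1e-3))
_ = dense(tf.ones((1, 3)))  # the first call builds a kernel of shape (3, 4)

# Recompute the L2 penalty by hand: 1e-3 * sum of squared kernel entries.
kernel = dense.kernel.numpy()
manual_penalty = 1e-3 * np.sum(kernel ** 2)

reported_penalty = float(dense.losses[0])
penalties_match = bool(np.isclose(reported_penalty, manual_penalty, rtol=1e-4))
print(penalties_match)
```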

These losses are meant to be added to the main loss when writing a training loop, as in the following example.

# Instantiate an optimizer.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# `train_dataset` is assumed to be a tf.data.Dataset yielding
# (inputs, integer labels) batches; it is not defined in this post.
# Iterate over the batches of a dataset.
for x_batch_train, y_batch_train in train_dataset:
  with tf.GradientTape() as tape:
    logits = layer(x_batch_train)  # Logits for this minibatch
    # Loss value for this minibatch
    loss_value = loss_fn(y_batch_train, logits)
    # Add extra losses created during this forward pass:
    loss_value += sum(layer.losses)

  # Compute and apply the gradients outside the tape context.
  grads = tape.gradient(loss_value, layer.trainable_variables)
  optimizer.apply_gradients(zip(grads, layer.trainable_variables))
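
For a version that runs end to end, here is a sketch (my own addition) with synthetic data. The dataset sizes, the learning rate, and the use of a single Dense layer as the model are all assumptions made for illustration, not part of the original guide.

```python
import math

import tensorflow as tf
from tensorflow.keras import layers

tf.random.set_seed(0)

# Synthetic data (assumption): 64 samples, 8 features, 3 classes.
features = tf.random.normal((64, 8))
labels = tf.random.uniform((64,), maxval=3, dtype=tf.int32)
train_dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(16)

# A single Dense layer with an L2 kernel regularizer acts as the model.
model = layers.Dense(3, kernel_regularizer=tf.keras.regularizers.l2(1e-3))

optimizer = tf.keras.optimizers.SGD(learning_rate=1e-1)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

loss_history = []
for x_batch_train, y_batch_train in train_dataset:
  with tf.GradientTape() as tape:
    logits = model(x_batch_train)
    loss_value = loss_fn(y_batch_train, logits)
    # Add the regularization losses created during this forward pass.
    loss_value += sum(model.losses)
  # Gradients are computed after leaving the tape context.
  grads = tape.gradient(loss_value, model.trainable_variables)
  optimizer.apply_gradients(zip(grads, model.trainable_variables))
  loss_history.append(float(loss_value))

print(loss_history)
```

The key line is loss_value += sum(model.losses): without it, the regularization term would be computed but never influence the gradients.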
