Interface ActivationFunction
- All Superinterfaces:
 Serializable
The activation function in hidden layers.
- 
Method Summary
- void f: The output function.
- void g: The gradient function.
- static ActivationFunction leaky(): The leaky rectifier activation function max(x, 0.01x).
- static ActivationFunction leaky(double a): The leaky rectifier activation function max(x, ax) where 0 <= a < 1.
- static ActivationFunction linear(): Linear/Identity activation function.
- name(): Returns the name of the activation function.
- static ActivationFunction rectifier(): The rectifier activation function max(0, x).
- static ActivationFunction sigmoid(): Logistic sigmoid function sigmoid(v) = 1/(1 + exp(-v)).
- static ActivationFunction tanh(): Hyperbolic tangent activation function.
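As an illustration of the three instance methods above, the sketch below implements a custom softplus activation against this interface. The signatures used for f and g are assumptions made for this example (f transforming an array of pre-activation values in place, g scaling the incoming gradient array using the stored layer output); this page does not list their parameters.

    // A sketch of a custom activation (softplus). The f/g signatures below are
    // assumptions for illustration, not confirmed by this page.
    public class Softplus implements ActivationFunction {
        @Override
        public String name() {
            return "softplus";
        }

        @Override
        public void f(double[] x) {
            // softplus(v) = log(1 + exp(v)), a smooth approximation of the rectifier
            for (int i = 0; i < x.length; i++) {
                x[i] = Math.log1p(Math.exp(x[i]));
            }
        }

        @Override
        public void g(double[] g, double[] y) {
            // d softplus/dv = sigmoid(v); since y = log(1 + exp(v)),
            // the derivative can be recovered as sigmoid(v) = 1 - exp(-y)
            for (int i = 0; i < g.length; i++) {
                g[i] *= 1.0 - Math.exp(-y[i]);
            }
        }
    }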
- 
Method Details
- 
name
Returns the name of the activation function.
 - 
f
The output function.
 - 
g
The gradient function.
 - 
linear
Linear/Identity activation function.
- Returns:
 - the linear activation function.
 
 - 
rectifier
The rectifier activation function max(0, x). It was introduced with strong biological motivations and mathematical justifications, and it is the most popular activation function for deep neural networks. A unit employing the rectifier is called a rectified linear unit (ReLU).

ReLU neurons can sometimes be pushed into states in which they become inactive for essentially all inputs. In this state, no gradients flow backward through the neuron, so it becomes stuck in a perpetually inactive state and "dies". This is a form of the vanishing gradient problem. In some cases, large numbers of neurons in a network can become stuck in dead states, effectively decreasing the model capacity. The problem typically arises when the learning rate is set too high. It may be mitigated by using leaky ReLUs instead, which assign a small positive slope for x < 0, although this can reduce performance (see the sketch after this entry).
- Returns:
 - the rectifier activation function.
 
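A hypothetical, self-contained numeric sketch (not part of this interface) of the point above: a unit whose pre-activation has been pushed far negative gets zero gradient under the plain rectifier, while the leaky variant keeps a small slope and can recover.

    // Hypothetical standalone sketch: rectifier forward value and derivative,
    // showing why strongly negative pre-activations stop gradient flow while
    // the leaky variant does not.
    public class ReluDemo {
        static double relu(double v)                { return Math.max(0.0, v); }
        static double reluGrad(double v)            { return v > 0.0 ? 1.0 : 0.0; }
        static double leaky(double v, double a)     { return Math.max(v, a * v); }
        static double leakyGrad(double v, double a) { return v > 0.0 ? 1.0 : a; }

        public static void main(String[] args) {
            double v = -3.5;  // pre-activation pushed far negative, e.g. by a large update
            System.out.printf("relu(%.1f)  = %.2f, grad = %.2f%n", v, relu(v), reluGrad(v));
            System.out.printf("leaky(%.1f) = %.3f, grad = %.2f (a = 0.01)%n",
                              v, leaky(v, 0.01), leakyGrad(v, 0.01));
            // The relu gradient is 0, so no error signal reaches the unit's weights;
            // the leaky gradient is 0.01, so the unit can still recover.
        }
    }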
 - 
leaky
The leaky rectifier activation function max(x, 0.01x).
- Returns:
 - the leaky rectifier activation function.
 
 - 
leaky
The leaky rectifier activation function max(x, ax) where 0 <= a < 1. By default a = 0.01. Leaky ReLUs allow a small, positive gradient when the unit is not active, and they are related to "maxout" networks. A usage sketch follows this entry.
- Parameters:
 - a - the parameter of the leaky ReLU.
- Returns:
 - the leaky rectifier activation function.
 
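A short usage sketch of the static factories, under the assumption that f applies the activation element-wise and in place to an array of pre-activation values (an assumption; the parameters of f are not shown on this page):

    public class LeakyDemo {
        public static void main(String[] args) {
            double[] x = {-2.0, -0.5, 0.0, 1.5};

            ActivationFunction relu  = ActivationFunction.rectifier();  // max(0, x)
            ActivationFunction leaky = ActivationFunction.leaky(0.2);   // max(x, 0.2x)

            double[] a = x.clone();
            relu.f(a);    // assumed in-place: a becomes {0.0, 0.0, 0.0, 1.5}

            double[] b = x.clone();
            leaky.f(b);   // assumed in-place: b becomes {-0.4, -0.1, 0.0, 1.5}

            System.out.println(java.util.Arrays.toString(a));
            System.out.println(java.util.Arrays.toString(b));
        }
    }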
 - 
sigmoid
Logistic sigmoid function: sigmoid(v) = 1/(1 + exp(-v)). For multi-class classification, each unit in the output layer corresponds to a class. For binary classification with the cross-entropy error function, there is only one output unit, whose value can be regarded as the posterior probability (see the sketch after this entry).
- Returns:
 - the logistic sigmoid activation function.
 
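A hypothetical numeric sketch of reading the single sigmoid output of a binary classifier as a posterior probability:

    // Hypothetical sketch: the single sigmoid output read as P(y = 1 | x).
    public class SigmoidDemo {
        static double sigmoid(double v) {
            return 1.0 / (1.0 + Math.exp(-v));
        }

        public static void main(String[] args) {
            double v = 1.2;          // pre-activation of the single output unit
            double p = sigmoid(v);   // 1 / (1 + exp(-1.2)) is about 0.769
            System.out.printf("P(y = 1 | x) = %.3f, P(y = 0 | x) = %.3f%n", p, 1.0 - p);
        }
    }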
 - 
tanh
Hyperbolic tangent activation function. The tanh function is a rescaling of the logistic sigmoid such that its outputs range from -1 to 1: tanh(v) = 2 sigmoid(2v) - 1. A numeric check follows this entry.
- Returns:
 - the hyperbolic tangent activation function.
 
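The rescaling relationship can be verified numerically with a minimal standalone check (not part of this interface):

    // Numeric check of the rescaling tanh(v) = 2 * sigmoid(2v) - 1,
    // which maps the sigmoid's (0, 1) range onto (-1, 1).
    public class TanhCheck {
        static double sigmoid(double v) {
            return 1.0 / (1.0 + Math.exp(-v));
        }

        public static void main(String[] args) {
            for (double v : new double[] {-2.0, -0.5, 0.0, 0.5, 2.0}) {
                double lhs = Math.tanh(v);
                double rhs = 2.0 * sigmoid(2.0 * v) - 1.0;
                System.out.printf("v = %5.2f  tanh = %+.6f  2*sigmoid(2v)-1 = %+.6f%n",
                                  v, lhs, rhs);
            }
        }
    }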
 