DL4J provides the following classes to configure networks:
MultiLayerNetwork
ComputationGraph
MultiLayerNetwork consists of a single input layer and a single output layer with a stack of layers in between them.
ComputationGraph is used for constructing networks with a more complex architecture than MultiLayerNetwork. It can have multiple input layers and multiple output layers, and the layers in between can be connected through a directed acyclic graph.
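To make the directed acyclic graph idea concrete, here is a minimal plain-Java sketch. It does not use DL4J; the class name LayerGraphSketch, the method forwardOrder and the vertex names in the usage note are hypothetical and chosen only for illustration. It shows how a set of layers, each declaring which layers feed into it, can be put into an order where every layer comes after all of its inputs, which is only possible when the connections contain no cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a hypothetical helper, not part of DL4J.
class LayerGraphSketch {

    // inputsOf maps each vertex name to the names of the vertices feeding into it.
    // Returns an order in which every vertex appears after all of its inputs.
    static List<String> forwardOrder(Map<String, List<String>> inputsOf) {
        Map<String, Integer> pending = new LinkedHashMap<>();   // inputs not yet processed
        Map<String, List<String>> consumers = new HashMap<>();  // vertex -> vertices it feeds
        for (Map.Entry<String, List<String>> e : inputsOf.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String in : e.getValue()) {
                consumers.computeIfAbsent(in, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Vertices without inputs (the network inputs) are ready immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String c : consumers.getOrDefault(v, Collections.emptyList())) {
                // A consumer becomes ready once all of its inputs have been emitted.
                if (pending.merge(c, -1, Integer::sum) == 0) ready.add(c);
            }
        }

        if (order.size() != inputsOf.size()) {
            throw new IllegalArgumentException("graph has a cycle or an undeclared input");
        }
        return order;
    }
}
```

For example, a graph where "in" feeds "dense1", "dense1" feeds "dense2", and a merge vertex takes both "dense2" and "in" (a skip connection) can be ordered this way, while a graph in which two layers feed each other is rejected.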
Whether you create a MultiLayerNetwork or a ComputationGraph, you have to provide a network configuration to it through NeuralNetConfiguration.Builder. As the name implies, it provides a Builder pattern to configure a network. To create a MultiLayerNetwork, we build a MultiLayerConfiguration, and for a ComputationGraph, it’s a ComputationGraphConfiguration.
The pattern goes like this: [High Level Configuration] -> [Configure Layers] -> [Build Configuration]
Here we are calling list() to get the ListBuilder. It provides us the necessary API to add layers to the network through the layer(arg1, arg2) function.
The first parameter is the index of the position where the layer needs to be added.
The second parameter is the type of layer we need to add to the network.
To build and add a layer we use a similar builder pattern as:
Finally, the last build() call builds the configuration for us.
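The three stages of the pattern can be sketched in plain Java without any DL4J dependency. Everything in this block is a hypothetical stand-in written for illustration: ToyNetConfig, its Builder and its ListBuilder are not DL4J classes, layers are represented as plain strings, and the rule that layers must be added in index order is a simplification made for this sketch. It only mirrors the shape described above: high-level settings first, then list() and layer(index, layer), then build().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration; these are NOT DL4J classes.
class ToyNetConfig {
    final long seed;
    final List<String> layers;

    private ToyNetConfig(long seed, List<String> layers) {
        this.seed = seed;
        this.layers = Collections.unmodifiableList(new ArrayList<>(layers));
    }

    // Stage 1: high-level configuration shared by the whole network.
    static class Builder {
        private long seed = 0L;

        Builder seed(long seed) {
            this.seed = seed;
            return this;
        }

        // Switches from global settings to the list of layers.
        ListBuilder list() {
            return new ListBuilder(seed);
        }
    }

    // Stage 2: configure layers one by one.
    static class ListBuilder {
        private final long seed;
        private final List<String> layers = new ArrayList<>();

        ListBuilder(long seed) {
            this.seed = seed;
        }

        // index: position of the layer; layer: description of the layer to add.
        // Simplification for this sketch: layers must be added in order.
        ListBuilder layer(int index, String layer) {
            if (index != layers.size()) {
                throw new IllegalArgumentException("expected index " + layers.size());
            }
            layers.add(layer);
            return this;
        }

        // Stage 3: build the immutable configuration.
        ToyNetConfig build() {
            return new ToyNetConfig(seed, layers);
        }
    }
}
```

A call chain such as new ToyNetConfig.Builder().seed(123L).list().layer(0, "dense").layer(1, "output").build() then reads top to bottom in the same order as the pattern.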
You can get your network configuration as String, JSON or YAML for sanity checking. For JSON we can use the toJson() function.
Finally, to create a MultiLayerNetwork, we pass the configuration to it as shown below:
The only difference here is the way we are building layers. Instead of calling the list() function, we call graphBuilder() to get a GraphBuilder for building our ComputationGraphConfiguration. The following table explains what each function of a GraphBuilder does.
The output layers defined here use another function, lossFunction, to define what loss function to use.
You can get your network configuration as String, JSON or YAML for sanity checking. For JSON we can use the toJson() function.
Finally, to create a ComputationGraph, we pass the configuration to it as shown below:
with Skip Connections
With deep learning, we can compose a deep neural network to suit the input data and its features. The goal is to train the network on the data to make predictions, and those predictions are tied to the outcomes that you care about: for example, whether a transaction is fraudulent, or which object a photo contains. There are different techniques to configure a neural network, and all of them build a relational hierarchy between the inputs and outputs.
In this tutorial, we are going to configure the simplest neural network: a logistic regression model.
Regression is a process that helps show the relations between the independent variables (inputs) and the dependent variables (outputs). Logistic regression is one in which the dependent variable is categorical rather than continuous, meaning that it can predict only a limited number of classes or categories, like a switch you flip on or off. For example, it can predict that an image contains a cat or a dog, or it can classify input into ten buckets with the integers 0 through 9.
A simple logistic regression calculates x*w + b = y, where x is an instance of input data, w is the weight or coefficient that transforms that input, b is the bias, and y is the output, or prediction about the data. The biological terms show how this artificial neuron loosely maps to a neuron in the human brain. The most important point is how data flows through and is transformed by this structure.
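The calculation can be sketched in a few lines of plain Java, independent of DL4J. The class name LogisticNeuronSketch and its method names are hypothetical. The linear method computes the weighted sum x*w + b from the text for vector inputs. The sigmoid step is an addition not spelled out above: it is the standard logistic function that squashes the weighted sum into a value between 0 and 1, which is what lets the result be read as a class probability in the two-class case.

```java
// Illustration only: a hypothetical helper, not part of DL4J.
class LogisticNeuronSketch {

    // Weighted sum from the text, x*w + b, for a vector of inputs.
    static double linear(double[] x, double[] w, double b) {
        if (x.length != w.length) {
            throw new IllegalArgumentException("x and w must have the same length");
        }
        double sum = b;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * w[i];
        }
        return sum;
    }

    // Logistic (sigmoid) function: maps any real number into (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Prediction for the two-class case: probability of the "on" class.
    static double predict(double[] x, double[] w, double b) {
        return sigmoid(linear(x, w, b));
    }
}
```

With all weights and the bias at zero, the weighted sum is 0 and the prediction sits exactly between the two classes; training consists of adjusting w and b so that predictions move towards the correct class.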
We’re going to configure the simplest network, with just one input layer and one output layer, to show how logistic regression works.
We are going to first build the layers and then feed these layers into the network configuration.
You may be wondering why we didn’t write any code for building our input layer. The input layer is only a set of input values fed into the network. It doesn’t perform a calculation. It’s just an input sequence (raw or pre-processed data) coming into the network, data to be trained on or to be evaluated. Later, we are going to work with data iterators, which feed input to a network in a specific pattern, and which can be thought of as an input layer of the network.
In our previous tutorial, we learned about a very simple neural network model: the logistic regression model. Although you can solve many tasks with a simple model like that, most problems require a much more complex network configuration. A typical deep learning model consists of many layers between the inputs and outputs. In this tutorial, we are going to learn about one such configuration: the feed-forward neural network.
Feed-forward networks are those in which there are no cyclic connections between the network layers. The input flows forward towards the output after going through several intermediate layers. A typical feed-forward network looks like this:
Here you can see a new kind of layer, the hidden layer. The layers in between our input and output layers are called hidden layers. They are called hidden because we don’t interact with them directly, so they are not visible from outside the network. There can be more than one hidden layer in the network.
Just like the softmax activation after our output layer in the previous tutorial, there can be activation functions between each layer of the network. They are responsible for allowing (activating) or blocking the output of a node on its way to the nodes of the next layer. There are different activation functions, such as sigmoid and ReLU.
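The three activation functions named so far can be written out in plain Java to show what each one does to a value. This is a sketch using the standard textbook definitions, not DL4J’s own implementation; the class name ActivationSketch is hypothetical.

```java
// Illustration only: textbook definitions, not DL4J's implementation.
class ActivationSketch {

    // ReLU: passes positive values through unchanged and blocks negative ones.
    static double relu(double z) {
        return Math.max(0.0, z);
    }

    // Sigmoid: squashes any real number into the range (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Softmax: turns a vector of scores into probabilities that sum to 1.
    static double[] softmax(double[] z) {
        // Subtracting the maximum keeps exp() from overflowing on large scores.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) {
            max = Math.max(max, v);
        }
        double[] out = new double[z.length];
        double sum = 0.0;
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.exp(z[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < z.length; i++) {
            out[i] /= sum;
        }
        return out;
    }
}
```

ReLU and sigmoid act on one value at a time, which is why they fit between layers, while softmax looks at all outputs together, which is why it is typically placed after the output layer of a classifier.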
As you can see above, we have made a feed-forward network configuration with one hidden layer. We have used a ReLU activation between our hidden and output layers. ReLUs are among the most widely used activation functions. Activation functions also introduce non-linearities into our network so that it can learn more complex features present in our data. Hidden layers learn features from the input layer and send those features on to be analyzed by the output layer to produce the corresponding outputs. You can similarly make network configurations with more hidden layers as:
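To see what stacking hidden layers means for the data itself, here is a plain-Java sketch of a forward pass. It is not a DL4J configuration and does not train anything; the class name FeedForwardSketch, the array layout of the weights (weights[layer][input][output]) and the choice of ReLU for every hidden layer with softmax on the last layer are assumptions made for this illustration.

```java
// Illustration only: a hypothetical forward pass, not DL4J code.
class FeedForwardSketch {

    // One dense layer: out[j] = sum over i of in[i] * w[i][j], plus b[j].
    static double[] dense(double[] in, double[][] w, double[] b) {
        double[] out = b.clone();
        for (int i = 0; i < in.length; i++) {
            for (int j = 0; j < out.length; j++) {
                out[j] += in[i] * w[i][j];
            }
        }
        return out;
    }

    // ReLU applied element-wise: negative values become 0.
    static double[] relu(double[] z) {
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.max(0.0, z[i]);
        }
        return out;
    }

    // Softmax: scores to probabilities that sum to 1.
    static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) {
            max = Math.max(max, v);
        }
        double[] out = new double[z.length];
        double sum = 0.0;
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.exp(z[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < z.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    // Data flows forward through every layer in turn. Hidden layers use ReLU;
    // the last layer uses softmax. Adding a hidden layer means adding one more
    // entry to weights and biases.
    static double[] forward(double[] x, double[][][] weights, double[][] biases) {
        double[] a = x;
        for (int l = 0; l < weights.length; l++) {
            double[] z = dense(a, weights[l], biases[l]);
            a = (l == weights.length - 1) ? softmax(z) : relu(z);
        }
        return a;
    }
}
```

The loop in forward never feeds a layer’s output back into an earlier layer, which is exactly the absence of cyclic connections that defines a feed-forward network.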