1 of 2

Contribute

How to contribute to the Eclipse Deeplearning4j source code.

Prerequisites

Before contributing, make sure you know the structure of all of the Eclipse Deeplearning4j libraries. As of early 2018, all libraries now live in the Deeplearning4j monorepo. These include:

DeepLearning4J: Contains all of the code for learning neural networks, both on a single machine and distributed.
ND4J: “N-Dimensional Arrays for Java”. ND4J is the mathematical backend upon which DL4J is built. All of DL4J’s neural networks are built using the operations (matrix multiplications, vector operations, etc) in ND4J. ND4J is how DL4J supports both CPU and GPU training of networks, without any changes to the networks themselves. Without ND4J, there would be no DL4J.
DataVec: DataVec handles the data import and conversion side of the pipeline. If you want to import images, video, audio or simply CSV data into DL4J: you probably want to use DataVec to do this.
RL4J: Reinforcement Learning for Java. This set of libraries contains the ability to do reinforcement learning built on the deeplearning4j library.
Samediff: Built within the nd4j library, this library contains a tensorflow/pytorch like library for building data flow graphs.

We also have an extensive examples repository at dl4j-examples.

Ways to contribute

There are numerous ways to contribute to DeepLearning4J (and related projects), depending on your interests and experince. Here’s some ideas:

Add new types of neural network layers (for example: different types of RNNs, locally connected networks, etc)
Add a new training feature
Bug fixes
DL4J examples: Is there an application or network architecture that we don’t have examples for?
Testing performance and identifying bottlenecks or areas to improve
Improve website documentation (or write tutorials, etc)
Improve the JavaDocs

There are a number of different ways to find things to work on. These include:

Looking at the issue trackers:
https://github.com/eclipse/deeplearning4j/issues
https://github.com/eclipse/deeplearning4j-examples/issues
Reviewing our Roadmap
Talking to the developers on the community forums
Reviewing recent papers and blog posts on training features, network architectures and applications
Reviewing the website and examples - what seems missing, incomplete, or would simply be useful (or cool) to have?

General guidelines

Before you dive in, there’s a few things you need to know. In particular, the tools we use:

Maven: a dependency management and build tool, used for all of our projects. See this for details on Maven.
Git: the version control system we use
Project Lombok: Project Lombok is a code generation/annotation tool that is aimed to reduce the amount of ‘boilerplate’ code (i.e., standard repeated code) needed in Java. To work with source, you’ll need to install the Project Lombok plugin for your IDE
VisualVM: A profiling tool, most useful to identify performance issues and bottlenecks.
IntelliJ IDEA: This is our IDE of choice, though you may of course use alternatives such as Eclipse and NetBeans. You may find it easier to use the same IDE as the developers in case you run into any issues. But this is up to you.

Things to keep in mind:

Code should be Java 7 compliant
If you are adding a new method or class: add JavaDocs
You are welcome to add an author tag for significant additions of functionality. This can also help future contributors, in case they need to ask questions of the original author. If multiple authors are present for a class: provide details on who did what (“original implementation”, “added feature x” etc)
Provide informative comments throughout your code. This helps to keep all code maintainable.
Any new functionality should include unit tests (using JUnit) to test your code. This should include edge cases.
If you add a new layer type, you must include numerical gradient checks, as per these unit tests. These are necessary to confirm that the calculated gradients are correct
If you are adding significant new functionality, consider also updating the relevant section(s) of the website, and providing an example. After all, functionality that nobody knows about (or nobody knows how to use) isn’t that helpful. Adding documentation is definitely encouraged when appropriate, but strictly not required.
If you are unsure about something - ask us on the community forums!

Eclipse Contributors

IP/Copyright requirements for Eclipse Foundation Projects

This page explains steps required to contribute code to the projects in the eclipse/deeplearning4j GitHub repository: https://github.com/eclipse/deeplearning4j

Contributors (anyone who wants to commit code to the repository) need to do two things, before their code can be merged:

Sign the Eclipse Contributor Agreement (once)
Sign commits (each time)

Why Is This Required?

These two requirements must be satisfied for all Eclipse Foundation projects, not just DL4J and ND4J. A full list of Eclipse Foundation Projects can be found here: https://projects.eclipse.org/

By signing the ECA, you are essentially asserting that the code you are submitting is something that either you wrote, or that you have the right to contribute to the project. This is a necessary legal protection to avoid copyright issues.

By signing your commits, you are asserting that the code in that particular commit is your own.

Signing the Eclipse Contributor Agreement

You only need to sign the Eclipse Contributor Agreement (ECA) once. Here's the process:

Step 1: Sign up for an Eclipse account

This can be done at https://accounts.eclipse.org/user/register

Note: You must register using the same email as your GitHub account (the GitHub account you want to submit pull requests from).

Step 2: Sign the ECA

Go to https://accounts.eclipse.org/user/eca and follow the instructions.

Signing Your Commits

Signing a New Commit

There are a few ways to sign commits. Note that you can use any of these aoptions.

Option 1: Use -s When Committing on Command Line

Signing commits here is simple:

git commit -s -m "My signed commit"

Note the use of -s (lower case s) - upper-case S (i.e., -S) is for GPG signing (see below).

Option 2: Set up Bash Alias (or Windows cmd Alias) for Automated Signing

For example, you could set up the following alias in Bash:

alias gcm='git commit -s -m'

Then committing would be done with the following:

gcm "My Commit"

For Windows command line, similar options are available through a few mechanisms (see here)

One simple way is to create a gcm.bat file with the following contents, and add it to your system path:

@echo off
echo.
git commit -s -m %*

You can then commit using the same process as above (i.e., gcm "My Commit")

Option 3: Use GPG Signing

For details on GPG signing, see this link

Note that this option can be combined with aliases (above), as in alias gcm='git commit -S -m' - note the upper case -S for GPG signing.

Option 4: Commit using IntelliJ with Auto Signing

IntelliJ can be used to perform git commits, including through signed commits. See this page for details.

Checking If A Commit Is Signed

After performing a commit, you can check in a few different ways. One way is to use git log --show-signature -1 to show the signature for the last commit (use -5 to show the last 5 commits, for example)

The output will look like:

$ git log --show-signature -2
commit 81681455918371e29da1490d3f0ca3deecaf0490 (HEAD -> commit_test_branch)
Author: YourName <you@email.com>
Date:   Fri Jun 21 22:27:50 2019 +1000

    This commit is unsigned

commit 2349c6aa3497bd65866d7d0a18fe82bb691bb868
Author: YourName <you@email.com>
Date:   Fri Jun 21 21:42:38 2019 +1000

    My signed commit

    Signed-off-by: YourName <you@email.com>

The top commit is unsigned, and the bottom commit is signed (note the presence of the Signed-off-by).

If You Forget to Sign a Commit - Amending the Last Commit

If you forgot to sign the last commit, you can use the following command:

git commit --amend --signoff

If You Forget to Sign Multiple Commits

Suppose your branch has 3 new commits, all of which are unsigned:

$ git log -4 --oneline
4b164026 (HEAD -> commit_test_branch) Your new commit 3
d7799615 Your new commit 2
6bb6113a Your new commit 1
ef09606c This commit already exists

One simple way is to squash and sign these commits. To do this for the last 3 commits, use the following: (note you might want to make a backup first)

git reset --soft HEAD~3
git commit -s -m "Squashed and signed"

The result:

$ git log -2 --oneline
31658e11 (HEAD -> commit_test_branch) Squashed and signed
ef09606c This commit already exists

You can confirm that the commit is signed using git log -1 --show-signature as shown earlier.

Note that your commits will be squashed once they are merged to master anyway, so the loss of the commit history does not matter.

If you are updating an existing PR, you may need to force push using -f (as in git push X -f).