functions: The basic building blocks to define SameDiff variables and graphs.
execution: Everything related to SameDiff graph execution.
gradcheck: Utility functionality for checking SameDiff gradients, similar in structure to the respective tool in DL4J.
loss: Loss functions for SameDiff.
samediff: The main SameDiff module used to define, set up and run SameDiff operations and graphs.
DifferentialFunction, which underlies pretty much everything in SameDiff. Mathematically, what we're doing in SameDiff is building a directed acyclic graph whose nodes are differential functions, for which we can compute gradients. In that regard,
DifferentialFunction makes up a SameDiff graph on a fundamental level.
DifferentialFunction comes with a
SameDiff instance. We'll discuss
SameDiff and this relationship later on. Also, while there are only a few key abstractions, they're used essentially everywhere, so it's almost impossible to discuss SameDiff concepts in isolation. Eventually we'll get around to each part.
resolvePropertiesFromSameDiffBeforeExecution are what you want to look at to get started.
args(): returns all input variables.
arg(): returns the first input variable (the only one for unary operations).
larg() and rarg(): return the first and second (read "left" and "right") arguments for binary operations.
outputVariables(): returns a list of all output variables. Depending on the operation, this may be computed dynamically. As we'll see later on, to get the result for ops with a single output, we'll call
calculateOutputShape for a differential function might be necessary, but if implemented incorrectly it can lead to hard-to-debug failures. (Note that SameDiff will eventually call op execution in
libnd4j, and dynamic custom ops either infer output shapes or need to be provided with the correct output shape.)
doDiff. Each operation has to provide an implementation of
doDiff. If you're implementing a SameDiff operation for a
x and you're lucky to find
x_bp (as in "back-propagation"), you can use that and your
doDiff implementation comes essentially for free.
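To make this concrete, here is a rough, hypothetical sketch of such a delegating doDiff (someOpBp is a made-up factory method standing in for the actual x_bp wrapper, and f() is the factory accessor discussed just below):

```java
// Hypothetical sketch: delegate the whole gradient computation to an
// existing backprop op exposed on the function factory.
@Override
public List<SDVariable> doDiff(List<SDVariable> gradAtOutput) {
    // someOpBp is illustrative only; it stands for the factory wrapper of "x_bp".
    SDVariable gradAtInput = f().someOpBp(arg(), gradAtOutput.get(0));
    return Collections.singletonList(gradAtInput);
}
```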
diff implementation that's used internally and calls
DifferentialFunctionFactory, by calling
f(). More precisely, this returns the factory of the SameDiff instance that the differential function belongs to:
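Sketched from memory (details may differ between versions), the accessor amounts to little more than this:

```java
public DifferentialFunctionFactory f() {
    // Return the op factory attached to this function's SameDiff instance.
    return sameDiff.f();
}
```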
Sum operation defined elsewhere in ND4J and then return the first output variable (of type
SDVariable, discussed in a second). Disregarding the implementation details for now, what this allows you to do is call
f().sum(...) from anywhere you have access to a differential function factory. For instance, when implementing a SameDiff op
x and you already have
x_bp in your function factory, you can override
SDVariable (read SameDiff variable) extends
DifferentialFunction and is to SameDiff what
INDArray is to good old ND4J. In particular, SameDiff graphs operate on these variables and each individual operation takes in and spits out a list of
SDVariable comes with a name, is equipped with a
SameDiff instance, has shape information and knows how to initialize itself with an ND4J
WeightInitScheme. You'll also find a few helpers to set and get these properties.
SDVariable can do that a
DifferentialFunction can't is evaluate its result and return an underlying
eval(). This will run SameDiff internally and retrieve the result. A similar getter is
getArr(), which you can call at any point to get the current value of this variable. This functionality is used extensively in testing to assert proper results. An
SDVariable also has access to its current gradient through
gradient(). Upon initialization there won't be any gradient; it will usually be computed at a later point.
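As a small, hedged end-to-end sketch (exact signatures of var(), eval() and getArr() may differ between SameDiff versions), evaluating a variable looks roughly like this:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class EvalExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();

        // Two variables backed by concrete ND4J arrays.
        SDVariable a = sd.var("a", Nd4j.ones(2, 2));
        SDVariable b = sd.var("b", Nd4j.ones(2, 2).mul(3));

        // c is purely symbolic at this point; nothing has been computed yet.
        SDVariable c = a.add(b);

        // eval() runs SameDiff internally and returns the underlying INDArray;
        // getArr() retrieves the current value associated with the variable.
        INDArray result = c.eval();
        System.out.println(result);
        System.out.println(c.getArr());
    }
}
```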
SDVariable also carries methods for concrete ops (and is in that regard a little similar to
DifferentialFunctionFactory). For instance, defining
c = a.add(b) on two SameDiff variables, the result of which can be accessed by
SameDiff class is the main workhorse of the module and brings together most of the concepts discussed so far. Somewhat unfortunately, the converse is also true, and
SameDiff instances are part of all other SameDiff module abstractions in one way or another (which is why you've seen them many times already). Generally speaking,
SameDiff is the main entry point for automatic differentiation and you use it to define a symbolic graph that carries operations on
SDVariables. Once built, a SameDiff graph can be run in a few ways, for instance
propertiesForFunction are of particular note.
SameDiff is also the place where you expose new operations to the SameDiff module. Essentially, you write a little wrapper for the respective operation in the
f(). Here's an example for cross products:
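Sketched from memory (helper names like updateVariableNameAndReference may have changed between versions), the wrapper follows this pattern:

```java
public SDVariable cross(SDVariable a, SDVariable b) {
    return cross(null, a, b);
}

public SDVariable cross(String name, SDVariable a, SDVariable b) {
    // Delegate to the differential function factory, then register the
    // result variable under the requested name.
    SDVariable ret = f().cross(a, b);
    return updateVariableNameAndReference(ret, name);
}
```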
SameDiff to expose them to SameDiff at various levels. As for actually implementing these ops, you need to know a few things. In libnd4j you find two classes of operations, which are described here in detail. We'll show how to implement both op types.
layers, which is reserved for deep learning layer implementations (like
Conv2D). These higher-level ops are based on the concept of Modules, similar to modules in PyTorch or layers in TensorFlow. These layer op implementations also provide a source of more involved op implementations.
cos legacy op from libnd4j: Cosine implementation. When it comes to SameDiff, the good thing about legacy ops is that they're already available in ND4J, but they need to be augmented with SameDiff-specific functionality to pass muster. Since the cosine function does not have any properties, this implementation is straightforward. The parts that make this op SameDiff compliant are:
BaseTransformOp, which implements other SameDiff functionality. (Note that
BaseOp, which extends
DifferentialFunction from earlier.) For instance,
calculateOutputShape is implemented there. If you want to implement a new transform, you can simply inherit from
BaseTransformOp, too. For other op types like reductions etc. there are op base classes available as well, meaning you only need to address the three bullet points above.
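For example, since d/dx cos(x) = -sin(x), a doDiff for the cosine op only needs to chain that with the incoming gradient; roughly (factory method names sketched from memory, details may differ between versions):

```java
@Override
public List<SDVariable> doDiff(List<SDVariable> gradAtOutput) {
    // d/dx cos(x) = -sin(x), multiplied by the gradient flowing in from above.
    SDVariable dLdIn = f().sin(arg()).neg().mul(gradAtOutput.get(0));
    return Collections.singletonList(dLdIn);
}
```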
crops. Note how
crops, which are both of integer type, get added to integer arguments for the operation by calling
addIArgument. For float arguments and other types, use
DynamicPartition. This op has precisely one property, called
numPartitions in SameDiff. To map and use this property, you do the following:
addArgs that is used in the constructor of the op and in an import helper one-liner that we'll discuss next. It's not necessary, but it is encouraged to do this and to call it
addArgs consistently, for clarity.
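As a hedged illustration (the field name is taken from the numPartitions property discussed above), such a helper can be as small as:

```java
// Collect all op arguments in one place so the constructors and the
// TensorFlow/ONNX import path stay in sync.
protected void addArgs() {
    addIArgument(numPartitions);
}
```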
DynamicPartition has proper property mapping, it currently does not have a working
Dilation2D. Not only does this op have far more properties to map, as you can see in
mappingsForFunction, the properties also come with property values, as defined in
attributeAdaptersForFunction. We've chosen to show this op because it is one that has property mapping, but is neither exposed to
DynamicCustomOp examples shown each come with their own defects and are representative of the work that has to be done for SameDiff. To summarize, to add a new SameDiff op you need to: