# Interactive Automatic Differentiation With Clad and Jupyter Notebooks

*Tutorial level: Intro*

xeus-cling is a Jupyter kernel for C++ built on the C++ interpreter cling and on xeus, a native implementation of the Jupyter protocol.

Within the xeus-cling framework, Clad provides automatic differentiation (AD): it generates C++ code that computes the derivatives of user-defined functions.

## Rosenbrock Function

In mathematical optimization, the Rosenbrock function is a non-convex function used as a performance test problem for optimization algorithms. The function is defined as:

$$f(x, y) = (x - 1)^2 + 100\,(y - x^2)^2$$

```
#include "clad/Differentiator/Differentiator.h"
#include <chrono>
#include <cstdio>   // printf
#include <cstdlib>  // rand, RAND_MAX
```

```
// Rosenbrock function definition
double rosenbrock_func(double x, double y) {
    return (x - 1) * (x - 1) + 100 * (y - x * x) * (y - x * x);
}
```
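For reference when checking the AD results, the partial derivatives can also be worked out by hand from the expression in `rosenbrock_func`. This derivation is added for illustration and is not part of Clad's output:

$$
\frac{\partial f}{\partial x} = 2\,(x - 1) - 400\,x\,(y - x^2),
\qquad
\frac{\partial f}{\partial y} = 200\,(y - x^2)
$$

Both partial derivatives vanish at $(x, y) = (1, 1)$, the function's global minimum.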

To compute the function’s derivatives, we can use either Clad’s forward mode or its reverse mode, as detailed below.

## Forward Mode AD

For a function *f* of several inputs and a single (scalar) output, forward-mode
AD can be used to compute (or, in the case of Clad, to generate a function that
computes) a directional derivative of *f* with respect to a single specified
input variable. The generated derivative function has the same signature as the
original function *f*, but its return value is the value of the derivative.

```
double rosenbrock_forward(double x[], int size) {
    double sum = 0;
    // Derivatives with respect to the first (index 0) and second (index 1) parameter.
    auto rosenbrockX = clad::differentiate(rosenbrock_func, 0);
    auto rosenbrockY = clad::differentiate(rosenbrock_func, 1);
    for (int i = 0; i < size - 1; i++) {
        double one = rosenbrockX.execute(x[i], x[i + 1]); // df/dx at (x[i], x[i + 1])
        double two = rosenbrockY.execute(x[i], x[i + 1]); // df/dy at (x[i], x[i + 1])
        sum = sum + one + two;
    }
    return sum;
}
```
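To make the forward-mode idea concrete without depending on Clad, here is a minimal hand-written sketch based on dual numbers. The names `Dual`, `rosenbrock_dual`, `d_rosenbrock_dx`, and `d_rosenbrock_dy` are hypothetical and are not part of Clad's API. Clad generates derivative source code rather than relying on operator overloading, so this only illustrates the underlying rule of carrying a derivative alongside each value.

```cpp
// Hypothetical helper type, not part of Clad: a value paired with its
// derivative with respect to one chosen input (the "seed" direction).
struct Dual {
    double val; // function value
    double dot; // derivative carried alongside the value
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }
Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.dot - b.dot}; }
// Product rule: (ab)' = a'b + ab'
Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.dot * b.val + a.val * b.dot};
}

// Same expression as rosenbrock_func, evaluated on dual numbers.
Dual rosenbrock_dual(Dual x, Dual y) {
    Dual one{1.0, 0.0};       // constants carry a zero derivative
    Dual hundred{100.0, 0.0};
    return (x - one) * (x - one) + hundred * (y - x * x) * (y - x * x);
}

// Seeding x with dot = 1 yields df/dx; seeding y with dot = 1 yields df/dy.
double d_rosenbrock_dx(double x, double y) {
    return rosenbrock_dual({x, 1.0}, {y, 0.0}).dot;
}
double d_rosenbrock_dy(double x, double y) {
    return rosenbrock_dual({x, 0.0}, {y, 1.0}).dot;
}
```

In this sketch, `d_rosenbrock_dx(2.0, 3.0)` evaluates to 802 and `d_rosenbrock_dy(2.0, 3.0)` to -200. One evaluation is needed per input variable, which is why two derivative functions are requested in `rosenbrock_forward`.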

```
const int size = 100000000;
double Xarray[size];
// Fill the input array with pseudo-random values in [0, 1].
for (int i = 0; i < size; i++)
    Xarray[i] = ((double)rand() / RAND_MAX);
```

```
double forward_time = timeForwardMode();
```

Elapsed time for rosenbrock_forward: 3.038005 s

The result of the function is 3232877463.475859.
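The definition of `timeForwardMode()` is not shown in this notebook. As a sketch of one plausible shape, assuming it simply wraps the call in a `std::chrono` measurement and prints the two lines above, a generic helper could look like the following. The name `time_call` and its parameters are hypothetical.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical generic timer. The notebook's timeForwardMode() and
// timeReverseMode() are not shown, so this is only an assumed equivalent.
template <typename Fn>
double time_call(const char* name, Fn fn, double x[], int size) {
    auto start = std::chrono::steady_clock::now();
    double result = fn(x, size); // run the function under test once
    auto stop = std::chrono::steady_clock::now();

    // Convert the elapsed interval to seconds as a double.
    std::chrono::duration<double> elapsed = stop - start;
    std::printf("Elapsed time for %s: %f s\n", name, elapsed.count());
    std::printf("The result of the function is %f.\n", result);
    return elapsed.count();
}
```

Under that assumption, the forward-mode timing would amount to `time_call("rosenbrock_forward", rosenbrock_forward, Xarray, size)`. `steady_clock` is used here because it is monotonic, which suits interval measurements.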

## Reverse Mode AD

Reverse-mode AD computes the full gradient in a single pass over the
computation graph of *f*, using at most a constant factor (around 4) more
arithmetic operations than the original function. Although its constant
factor and memory overhead are higher than those of forward mode, its cost is
independent of the number of inputs.

The generated function has a void return type and takes the same input
arguments as *f*, plus an additional argument of type T*, where T is the return
type of *f*. This “result” argument must point to the beginning of the array
where the gradient will be stored. Because the generated code accumulates into
this array with `+=`, the array should be zero-initialized before the call.

```
double rosenbrock_reverse(double x[], int size) {
    double sum = 0;
    auto rosenbrock_dX_dY = clad::gradient(rosenbrock_func);
    for (int i = 0; i < size - 1; i++) {
        double result[2] = {}; // zero-initialized: the gradient is accumulated into it
        rosenbrock_dX_dY.execute(x[i], x[i + 1], result);
        sum = sum + result[0] + result[1];
    }
    return sum;
}
```
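The calling convention used above can be illustrated with a hand-written reverse sweep for the same function. This is a sketch for intuition only: `rosenbrock_grad_sketch` is a hypothetical name, and the code is a simplified manual derivation rather than what Clad emits.

```cpp
// Hand-written reverse sweep for f(x, y) = (x - 1)^2 + 100 (y - x^2)^2.
// Hypothetical name; it mirrors the convention described above: void return,
// the original arguments, plus a pointer to where the gradient is accumulated.
void rosenbrock_grad_sketch(double x, double y, double* result) {
    // Forward sweep: compute the intermediates the backward sweep needs.
    double a = x - 1;     // first factor
    double b = y - x * x; // second factor
    // f = a * a + 100 * b * b

    // Backward sweep: start from df/df = 1 and propagate adjoints.
    double d_f = 1.0;
    double d_a = 2 * a * d_f;   // adjoint of a, from the term a * a
    double d_b = 200 * b * d_f; // adjoint of b, from the term 100 * b * b

    result[0] += d_a * 1.0;      // a depends on x with slope 1
    result[0] += d_b * (-2 * x); // b depends on x with slope -2x
    result[1] += d_b * 1.0;      // b depends on y with slope 1
}
```

A single call fills both gradient entries. For example, with `double g[2] = {};`, calling `rosenbrock_grad_sketch(2.0, 3.0, g)` leaves `g[0]` equal to 802 and `g[1]` equal to -200.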

```
double reverse_time = timeReverseMode();
```

Elapsed time for rosenbrock_reverse: 2.912199 s

The result of the function is 3232877463.475859.

## Performance Comparison

The derivative function created by forward-mode AD is guaranteed to have at most a constant factor (around 2-3) more arithmetic operations than the original function. With reverse-mode AD, for a function having N inputs and consisting of T arithmetic operations, computing the gradient takes a single execution of the generated function and around 4T operations. In comparison, it would take N executions of the forward-mode derivative, requiring up to N · 3T operations.
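As a rough worked instance of these estimates for the two-input Rosenbrock function ($N = 2$), taking the quoted constants 3 and 4 at face value:

$$
\underbrace{N \cdot 3T}_{\text{forward mode, upper bound}} = 2 \cdot 3T = 6T
\qquad \text{versus} \qquad
\underbrace{4T}_{\text{reverse mode}}
$$

These are bounds on operation counts, not timing predictions. The forward-mode bound grows linearly with $N$, while the reverse-mode estimate does not depend on $N$.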

```
double difference = forward_time - reverse_time;
printf("Forward - Reverse timing for an array of size: %d is: %fs\n", size, difference);
```

Forward - Reverse timing for an array of size: 100000000 is: 0.125806s

## Clad Produced Code

We can now call `rosenbrockX.dump()`, `rosenbrockY.dump()`, or `rosenbrock_dX_dY.dump()` to print the code that Clad generated. As an illustration, the reverse-mode code is obtained with:

```
auto rosenbrock_dX_dY = clad::gradient(rosenbrock_func);
rosenbrock_dX_dY.dump();
```

The code is:

    void rosenbrock_func_grad(double x, double y, double *_result) {
        double _t2;
        double _t3;
        double _t4;
        double _t5;
        double _t6;
        double _t7;
        double _t8;
        double _t9;
        double _t10;
        _t3 = (x - 1);
        _t2 = (x - 1);
        _t7 = x;
        _t6 = x;
        _t5 = (y - _t7 * _t6);
        _t8 = 100 * _t5;
        _t10 = x;
        _t9 = x;
        _t4 = (y - _t10 * _t9);
        double rosenbrock_func_return = _t3 * _t2 + _t8 * _t4;
        goto _label0;
      _label0:
        {
            double _r0 = 1 * _t2;
            _result[0UL] += _r0;
            double _r1 = _t3 * 1;
            _result[0UL] += _r1;
            double _r2 = 1 * _t4;
            double _r3 = _r2 * _t5;
            double _r4 = 100 * _r2;
            _result[1UL] += _r4;
            double _r5 = -_r4 * _t6;
            _result[0UL] += _r5;
            double _r6 = _t7 * -_r4;
            _result[0UL] += _r6;
            double _r7 = _t8 * 1;
            _result[1UL] += _r7;
            double _r8 = -_r7 * _t9;
            _result[0UL] += _r8;
            double _r9 = _t10 * -_r7;
            _result[0UL] += _r9;
        }
    }
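One way to read this output is to collect the contributions added to each entry of `_result`. Writing $r_i$ for the generated variable `_ri`, substituting the stored values (`_t2` and `_t3` hold $x - 1$; `_t4` and `_t5` hold $y - x^2$; `_t6`, `_t7`, `_t9`, and `_t10` hold $x$), and assuming `_result` starts at zero, the sums work out as follows. This is a hand check added for illustration, not something Clad prints.

$$
\begin{aligned}
\texttt{\_result[0]} &= r_0 + r_1 + r_5 + r_6 + r_8 + r_9 \\
&= 2\,(x - 1) - 4 \cdot 100\,x\,(y - x^2) = 2\,(x - 1) - 400\,x\,(y - x^2), \\
\texttt{\_result[1]} &= r_4 + r_7 = 200\,(y - x^2).
\end{aligned}
$$

These match the partial derivatives of $(x - 1)^2 + 100\,(y - x^2)^2$ with respect to $x$ and $y$. The value `_r3` is computed but never added to `_result`, since it corresponds to the constant factor 100 rather than to an input.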