# Neural ODEs with Euler's Method

}

In classical data modelling problems, we have $N$ pairs of data $(x_{i}, y_{i})$ that are used in training of the model. To make a prediction, we feed a new data point $x^{*}$ into the model to find $y^{*}$. We assume that there is an inherent relationship between pairs of $(x,y)$. In other words, there is an unknown function $F$ that maps between domains $X$ and $Y$. The machine learning approaches can be used to approximate the function $F$.

## How would we like to approximate $F$?

We can try to approximate the function $F$ with curve fitting or neural networks. By using neural networks, we assume that there is a continuous and differentiable relationship between $x$ and $y$. Therefore, our approximation of $F$ is differentiable. What about modeling derivative of $F$ instead of itself? In terms of parametric efficiency, modeling the derivative may be practical.

$F(x)=\int f(x) dx \textrm{, where } F^{\prime}(x)=f$ $\frac{d y}{d x} \approx f(x, y)$

We are planning to approximate an ODE that governs the function F which is $f$. Solving the approximated ODE is equivalent to function approximation of $F$. Numerical methods can be utilized to solve an approximated ODE. In this post, we will use a simple numerical method which is Euler’s Method. The source of the idea is the beautiful blog of “Understanding Neural ODEs” by Jonth Sinai.

## What is Euler’s Method?

Euler Method is simply re-writing the basic definition of the gradient at a point with a fixed step size $\delta$.

$\frac{d y}{d x} \approx \frac{y(x+\delta)-y(x)}{\delta}$ $y(x+\delta)=y(x)+\delta \frac{d y}{d x}$

The algorithm for the Euler’s Method is following:

• define $f=(x,y)$ derivative of $F(x)$
• input $x_{0}$ and $y_{0}$
• input step size $\delta$ and number of steps $n$
• for $j$ from 1 to $n$:
• $m=f(x_{0},y_{0})$
• $y_{1}$ = $y_{0}+\delta * m$
• $x_{1}=x_{0}+h$
• $x_{0}=x_{1}$
• $y_{0}=y_{1}$

It is iteratively using the gradient at the current point to decide in which direction to move in and by how much.


def euler(givenfunction,x0,y0,h,n,i):
m=givenfunction(x0,y0)
y1=y0+h*m
x1=x0+h
i=i+1
if i==n:
xs.append(round(x1,2))
ys.append(round(y1,5))
return (x1,y1)
else:
xs.append(round(x1,2))
ys.append(round(y1,5))
return euler(givenfunction,x1,y1,h,n,i)


For the given equation and initial condition

$y^{\prime}=-sin(x)$ with $y(0)=1$

def f_prime(x,y):
y_prime = -np.sin(x)
return y_prime


Pick the values for $\delta$ and $n$ then run the algorithm with

delta=0.1
x0=0
y0=1
n=len(np.arange(0, 5.0, delta))
xs=[x0]
ys=[y0]

_=euler(f_prime,x0,y0,delta,n,i=1)


Also, find the analytical solution to compare with Euler’s Method. Notice that we cannot do the comparison in a real problem we do not have the $f$ itself.

$y(x)=cos(x)$

def f(x):
y=np.cos(x)
return y

x_range = np.arange(0.0, 5.0, 0.02)

plt.figure(figsize=(10,5))
plt.plot(xs,ys,'bo',label='Euler\'s method')
plt.plot(x_range, f(x_range), 'k', label='Analytical Solution')
plt.legend()
plt.show()
plt.show()



As we see in the figure above, Euler’s Method gives a good approximation of the function $F$ by using the ODE that describes the rate of change.

## Approximating the governing ODE with a NN

We will try to approximate derivative of unknown function $y=F(x)$.

$y(t)=1+\frac{1}{2} \mathbf{e}^{-4 t}-\frac{1}{2} \mathbf{e}^{-2 t}$ $\frac{d y}{d x}=f(x, y)=a$

In the approximation, the value $a$ is the only free parameter that should be optimized. We will again utilize Euler’s Method for our forward pass on our data.

• Select $\delta$ as stepsize.
• Choose $k$ ordered data pairs in our dataset.
• Create a regularly spaced computation trajectory between first and last data point where each interval has a length of $\delta$.
• Compute predictions along the computation trajectory with the ODE approximation and Euler’s Method.
• Measure the loss at $k$ datapoints
• Perform backpropagation with your optimizer.

Now, let’s see all of the steps in code. You can go over the steps in Google Colab.


import torch

#function F
def Func(x):

y = 1+0.5*np.exp(-4*x)-0.5*np.exp(-2*x)
return y

#dataset
delta=0.1
x = np.arange(-1,4,delta)
y = Func(x)

#sample sorted sequences in dataset
N = 1000
seq_len=5
data = np.empty((N,2,seq_len))
for i in range(N):
data[i]= np.stack((X,Y))

#neural network for approximating f which is derivative of F
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):

def __init__(self):
super(Net, self).__init__()
self.fc1 = nn.Linear(2, 4)
self.fc2 = nn.Linear(4, 8)
self.fc3 = nn.Linear(8, 4)
self.fc4 = nn.Linear(4, 1)

def forward(self, x):
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = F.relu(self.fc3(x))
x = (self.fc4(x))
return x

#initialize the network
net = Net()
print(net)

#initialize loss function and optimizer
criterion = nn.MSELoss()

#training
N=1000
epochs=10
bs = 1
for epoch in range(epochs):
for index in range(N):
Ys=torch.zeros((1,seq_len-1))
data_point_X = data[index,[0]]
data_point_Y = data[index,[1]]

X = torch.Tensor(data_point_X[:,0])
Y = torch.Tensor(data_point_Y[:,0])
XY = torch.stack([X,Y]).view(bs,2).detach()

for j in range(data_point_X.shape[1]-1):
for i in np.arange(data_point_X[:,j][0],data_point_X[:,j+1][0],delta):

m_net_out = net(XY.detach())
Y=(Y.detach()+delta*m_net_out[0])
X=(X+delta)
XY = torch.stack([X,Y]).view(bs,2)
Ys[0,j]=Y
loss = criterion(Ys,torch.Tensor(data_point_Y[:,1:seq_len]))
print("loss",loss.item())
loss.backward()
optimizer.step()

#inference
x = np.arange(-1,4,delta)
y0 = Func(-1)
X = torch.Tensor([x[0]])
Y = torch.Tensor([y0])

XY = torch.stack([X,Y]).view(bs,2).detach()
Ys=torch.zeros((1,len(x)))
net.eval()
Ys[0,0]=y0
for i in range(len(x)-1):
m_net_out = net(XY)
Y=(Y+delta*m_net_out[0])
X=(X+delta)
XY = torch.stack([X,Y]).view(bs,2)
Ys[0,i+1]=Y

#real trajectory
x_compare = np.arange(-1,4,0.01)
y_compare = Func(x_compare)


plt.figure(figsize=(10,7))
plt.plot(x_compare,y_compare,'k',label='Real trajectory')
plt.plot(x[0], Ys.numpy().reshape(len(x))[0], 'r*', label='Initial condition', ms=10)
plt.plot(x[1:], Ys.numpy().reshape(len(x))[1:], 'bo', label='Predictions')
plt.legend()
plt.show()



As we can see above, we inferred the function itself by approximating the governing ODE and solving it numerically with Euler’s method.

Sources:

Written on August 25, 2020