1 Mathematics Behind Dynamics

We have motivated the entire thesis with the usefulness of the knowledge and the understanding of synchronization in neuroscience. However, we will not describe any experiments performed on real networks of neurons in a lab. Instead, we shall deal with the mathematical abstraction for the studied object, e.g., coupled neurons.

This abstraction is typically called a (mathematical) model. The model should, in theory, capture all the important characteristics of the underlying reality. If the state of the model evolves causally in time, e.g., a model of a neuron starts spiking when evolving in time, with dependence solely on the current state, we often use a dynamical system as the model. On the other hand, if the governing rules are noncausal with respect to only the current state due to the dependence on the history of the system, the arising dynamical system becomes infinite dimensional (and thus harder to work with). For better “mathematical ergonomics” we will define a semidynamical system to deal with delayed dependencies.

1.1 A Primer on Dynamical Systems

First and foremost, we will formally introduce the notion of a dynamical system, along with its properties. This section is mostly based on [1] and [2] in terms of content. It should be noted the notation follows [3], [4], [5], or [6] for better compatibility with semidynamical systems. Also, [7] was used as a valuable inspiration for the structure of this section.

Definition 1.1 Dynamical system (more precisely deterministic dynamical system) is a triple \((\timeSet, \stateSpace, \evolOp)\), where \((\timeSet, +, \leq)\), \(\timeSet \subseteq \R\), is an ordered commutative semigroup with identity element \(0\), \(\stateSpace\) (called state space) is a metric space, and \(\evolOp\) is a continuous¹ evolution map \[ \begin{aligned} \evolOp : {} & \timeSet \times \timeSet \times \stateSpace & &\to & &\stateSpace \\ & (t, s, x) &&\mapsto &&\evolOp(t, s, x), \end{aligned} \] projecting a state \(x\) from time \(s\) to time \(t\), which satisfies

(deterministic property): \(\evolOp(s,s, x) = x\) for each \(x \in \stateSpace\) and \(s \in \timeSet\),
(composition property): \(\evolOp(t,s,x) = \evolOp(t,r,\evolOp(r,s,x))\) for each \(x \in \stateSpace\) and \(t,r,s \in \timeSet\).

If \((\timeSet, +)\) forms a group, we shall speak of an invertible dynamical system.

For simplicity, we shall assume the commutative monoid \((\timeSet, +)\) is cancellative, i.e., \(a + c = b + c \implies a = b\) for each \(a,b,c \in \timeSet\). Then it can be embedded in a Grothendieck group \(\timeSet_G\), see [8].

Definition 1.2 Autonomous dynamical system is a dynamical system such that the autonomous property, \[ \evolOp(t, s, x) = \evolOp(t + r, s + r, x) \tag{1.1}\] for each \(x \in \stateSpace\), \(t,s \in \timeSet\), and \(r \in \timeSet_G\), is fulfilled.

Note 1.1: Evolution operator as monoid action

In the case of Definition 1.2, we can always vanish the starting time by taking \(r = -s\) in (1.1), which is ensured by \(r \in \timeSet_G\). Then, by abuse of notation, we will write \(\evolOp: \timeSet \times \stateSpace \to \stateSpace\). Moreover, the composition and autonomous properties combine into the semigroup property, \[ \evolOp(t, \evolOp(s, x)) = \evolOp(t + s, x) \; \forall x \in \stateSpace, \forall t,s \in \timeSet, \] i.e., the map \(\evolOp\) is a (left) monoid action of \((\timeSet, +)\) on \(\stateSpace\). The operator \(\evolOp\) is typically called flow of the dynamical system.

In Definition 1.1, the time set \(\timeSet\) can take on various forms. In ecology, we often see a discrete \(\timeSet = \Z\) or \(\N_0\) representing a yearly intervals between measurements of our system — given this choice of \(\timeSet\), we call the dynamical system discrete.

On the other hand, in physics (and neuroscience) we typically employ \(\timeSet = \R\) or \(\R^+\) as we are concerned with even the shortest time intervals and associated changes — corresponding systems are called continuous or continuous-time dynamical systems. Similarly, the exact choice of the state space \(\stateSpace\) is dependent on the system in question, but typically we use \(\R^n\), although we will also encounter infinite-dimensional state space.

The deterministic property in Definition 1.1 guarantees the system does not abruptly change state. Should this condition be broken, we would call such a system stochastic, see Section 1.1.5. Moreover, a common assumption that the “laws of nature” do not change over time restricts us to autonomous dynamical systems, which will be exclusively used in this thesis, where it is captured by the semigroup (or autonomous) property. In particular, dependence on the past is permitted, but requires special attention, as we shall see in Section 1.2. If the system is invertible, the (semi)group property implies that the solution can be found both forward and backward in time.

Most often, a dynamical system is given implicitly by a differential or difference equation. For example, in population dynamics, one of the earliest models, the Verhulst model of population², can prescribed by an ordinary differential equation (ODE) \[ \dot{x}(t) = \deriv {x(t)} t = x(t) \cdot r_0 \cdot \brackets{1 - \frac {x(t)} K}, \tag{1.2}\] or a difference equation (although these two models are not equivalent) \[ x(t+1) = x(t) \cdot r_0 \cdot \brackets{1 - \frac {x(t)} K}. \tag{1.3}\]

1.1.1 Basic Concepts

In this section, we shall introduce basic concepts regarding dynamical systems, including, but not limited to, notions of certain special solutions and their stability. Little comment besides the definitions themselves will be provided, as an interested reader can find much more in [2], [9], or [1].

Definition 1.3 (Orbit) An orbit (trajectory) with an initial point \(x_0 \in \stateSpace\) is an ordered subset of the state space \(\stateSpace\), \[ \Orbit{x_0} = \set{x \in \stateSpace \divider x = \evolOp(t, x_0), \forall t \in \timeSet \text{ such that } \evolOp(t, x_0) \text{ is defined}} \]

In the case of a continuous dynamical system, the orbits are oriented curves in the state space. For a discrete dynamical system, they become sequences of points in \(\stateSpace\).

Definition 1.4 (Phase portrait) A phase portrait of a dynamical system is a partitioning of the state space into trajectories.

Definition 1.5 (Equilibrium) A point \(x^* \in \stateSpace\) is called an equilibrium (fixed point) if \(\evolOp(t, x^*) = x^*\) for all \(t \in \timeSet\).

Definition 1.6 (Cycle) A cycle is a periodic orbit, namely a non-equilibrium orbit \(L\), such that each point \(x_0 \in L\) satisfies \(\evolOp(t + T, x_0) = \evolOp(t, x_0)\) with some \(T > 0\), for all \(t \in \timeSet\). The smallest admissible \(T\) is called the period of the cycle \(L\).

Definition 1.7 (Invariant set) An invariant set of a dynamical system \((\timeSet, \stateSpace, \evolOp)\) is a subset \(\obj S \subset \stateSpace\) which satisfies \[ x \in \obj{S} \implies \evolOp(t, x) \in \obj{S} \; \forall t \in \timeSet. \]

Definition 1.8 (\(\omega\)-limit and \(\alpha\)-limit point) A point \(x_* \in \stateSpace\) is called an \(\omega\)-limit point (resp. \(\alpha\)-limit point) of the orbit \(\Orbit{x_0}\) starting at \(x_0 \in \stateSpace\) if their exists a sequence of times \(\seqnc{t}{k}{1}{\infty} \subseteq \timeSet\) with \(t_k \to \infty\) (resp. \(t_k \to - \infty\))³, such that \[ \evolOp(t_k, x_0) \onBottom{\longrightarrow}{k \to \infty} x_*. \]

Definition 1.9 (\(\omega\)-limit and \(\alpha\)-limit set) A set \(\obj \Omega(\Orbit{x_0})\) of all \(\omega\)-limit points of the orbit \(\Orbit{x_0}\), see Definition 1.8, is called an \(\omega\)-limit set. Similarly, a set \(\obj A(\Orbit{x_0})\) of all \(\alpha\)-limit points of the orbit \(\Orbit{x_0}\) is called an \(\alpha\)-limit set.

The set \(\limitSet(\Orbit{x_0}) = \obj \Omega(\Orbit{x_0}) \cup \obj A(\Orbit{x_0})\) of all limit points of the orbit \(\Orbit{x_0}\) is called its limit set.

Definition 1.10 (Limit cycle) A cycle of a continuous-time dynamical system, in a neighborhood of which there are no other cycles, is called a limit cycle.

Note 1.2: Equivalent definition of a limit cycle

Equivalently to Definition 1.10, one can define a limit cycle as a cycle, which is the limit set, see Definition 1.9, of orbits in its neighborhood.

Definition 1.11 An invariant set \(S_0\) is called:

Lyapunov stable if for any sufficiently small neighborhood \(U \supset S_0\) there exists a neighborhood \(V \supset S_0\) such that \(\evolOp(t, x) \in U\) for all \(x \in V\) and all \(t > 0\);
asymptotically stable if there exists a neighborhood \(U_0 \supset S_0\) such that \(\evolOp(t, x) \to S_0\) for all \(x \in U_0\), as \(t \to +\infty\);
stable if it is both Lyapunov and asymptotically stable;
unstable if it is not stable.

A stable invariant set is called an attractor, whereas if the invariant set is unstable it is called a repeller.

Topologically Equivalent Dynamical Systems

So far, we have described dynamical systems mainly in general terms. Later, we will present concrete examples of dynamical systems, primarily from the neuroscience field, see Section 1.4, which will be too complex to apply certain techniques from bifurcation theory directly. Hence, we introduce the notion of (local) topological equivalence to remedy this issue.

Definition 1.12 (Topological equivalence) A dynamical system \(\obj D_1 = (\timeSet, \R^n, \evolOp)\) is said to be topologically equivalent to a dynamical system \(\obj D_2 = (\timeSet, \R^n, \oper{\psi})\), if there exists a homeomorphism⁴ \(\vi h: \R^n \to \R^n\), which maps orbits of system \(\obj D_1\) to orbits of system \(\obj D_2\), such that their orientation is kept. In such a case, their phase portraits are called (topologically) equivalent.

Definition 1.13 (Local topological equivalence) A dynamical system \(\obj D_1 = (\timeSet, \R^n, \evolOp)\) is said to be locally topologically equivalent in the neighborhood of its equilibrium \(\vi x^*\) to a dynamical system \(\obj D_2 = (\timeSet, \R^n, \oper{\psi})\) in the neighborhood of its equilibrium \(\vi y^*\), if there exists a homeomorphism \(\vi h: \R^n \to \R^n\), such that

\(\vi h\) is defined on a (small) neighborhood \(\neigh{\vi x^*} \subset \R^n\),
satisfies \(\vi y^* = \vi h(\vi x^*)\) and
maps orbits of dynamical system \(\obj D_1\) in the neighborhood \(\neigh{\vi x_0}\) to orbits of system \(\obj D_2\) in the neighborhood \(\neigh{\vi y^*}\), such that their orientation is kept.

1.1.2 Continuous-time Autonomous Systems

Definition 1.14 An autonomous system of (ordinary) differential equations is a system of the form \[ \dvi x = \vi f(\vi x), \tag{1.4}\] where \(\vi x \in \stateSpace = \R^n\) and a vector-valued function \(\vi f: \R^n \to \R^n\) is sufficiently smooth. The symbol \(\dvi x\) denotes a derivative of \(\vi x(t)\) with respect to time \(t \in \timeSet = \R\).

It can be shown the system of ODEs (1.4) induces an invertible continuous-time dynamical system if \(\vi f\) is sufficiently smooth, see [1] (Theorem 1.4).

In (1.4), we have given the ODE in explicit form, but, in general, it can also be defined by an implicit higher-order equation \[ \vi F(\vi x, \dvi x, \ddot{\vi x}, \dots, \vi x^{(k)}) = \vi 0. \tag{1.5}\] Such a system can always be translated into a first-order system of equations \[ \vi G(\vi y, \dvi y) = \vi 0 \] and, if \(\vi F\) (or \(\vi G\), respectively) allows it, even to the explicit form (1.4).

Tip 1.1: Vector field

The function \(\vi f\) is called a vector field, as it maps a vector \(\vi f(\vi x)\) to each point \(\vi x\) of the state space. See Figure 1.1 for an example.

Code

fig = Figure(size=(450, 350))
ax = Axis(fig[1,1])
streamplot!(ax, (x,y) -> Point2f(-x, 2y), -2..4, -2..2, colormap=:Zissou1)

fig

Figure 1.1: Vector field (or a phase portrait) corresponding to the dynamical system \(\dot{x} = -x, \dot{y} = 2y\).

Theorem 1.1 (Lyapunov) Consider a dynamical system (1.4) with an equilibrium \(\vi x^*\). Let \[ \Jacobi^* = \jacobi \vi f(\vi x^*) = \mtr{ \pDeriv {f_1} {x_1} (\vi x^*) & \pDeriv {f_1} {x_2} (\vi x^*) & \dots & \pDeriv {f_1} {x_n} (\vi x^*) \\ \pDeriv {f_2} {x_1} (\vi x^*) & \pDeriv {f_2} {x_2} (\vi x^*) & \dots & \pDeriv {f_2} {x_n} (\vi x^*) \\ \vdots & \vdots & \ddots & \vdots \\ \pDeriv {f_n} {x_1} (\vi x^*) & \pDeriv {f_n} {x_2} (\vi x^*) & \dots & \pDeriv {f_n} {x_n} (\vi x^*) } \] denote a Jacobian matrix evaluated at \(\vi x^*\). Then \(\vi x^*\) is stable, if all eigenvalues \(\lmbd_i\), where \(i \in \oneToN{n}\)⁵, of the matrix \(\Jacobi^*\) satisfy \(\reOf{\lmbd_i} < 0\).

Proof. See [10], page 160, or [11], page 185.

Definition 1.15 (Hyperbolic equilibrium) An equilibrium \(\vi x^*\) of the system (1.4) is called hyperbolic if none of the eigenvalues corresponding to the Jacobian matrix \(\Jacobi^* = \jacobi \vi f(\vi x^*)\) lies on the imaginary axis.

Theorem 1.2 (Grobman-Hartman) The system (1.4) in a neighborhood of its hyperbolic equilibrium \(\vi x^*\) is locally topologically equivalent, in the sense of Definition 1.13, to its linearization \[ \dvi x = \jacobi \vi f(\vi x^*) \vi x. \tag{1.6}\]

Proof. See [10], page 306, or [11], page 120.

This theorem is very prominently used in conjunction with linear continuous-time dynamical systems, i.e., \[ \dvi y = \vi A \vi y, \] where \(\vi y \in \R^n\) and \(\vi A \in \R^{n\times n}\), with an equilibrium \(\vi y^*\). Thus, we can partition the state space into disjoint linear (vector) subspaces:

stable subspace \(\obj{E}^s = \Span{\vi v^1, \dots, \vi v^{n_{-}}}\) of dimension \(n_{-}\), where \(\vi v^1, \dots, \vi v^{n_{-}}\) are the eigenvectors of \(\vi A\) corresponding to eigenvalues with negative real parts;
unstable subspace \(\obj{E}^u = \Span{\vi u^1, \dots, \vi u^{n_{+}}}\) of dimension \(n_{+}\), where \(\vi u^1, \dots, \vi u^{n_{+}}\) are the eigenvectors of \(\vi A\) corresponding to eigenvalues with positive real parts;
center subspace \(\obj{E}^c = \Span{\vi w^1, \dots, \vi w^{n_0}}\) of dimension \(n_{0}\), where \(\vi w^1, \dots, \vi w^{n_0}\) are the eigenvectors of \(\vi A\) corresponding to strictly imaginary eigenvalues, i.e., eigenvalues with zero real part.

Then it holds that \(n_{-} + n_{+} + n_0 = n\). Similarly, for linear discrete-time dynamical systems, one can perform an equivalent partitioning, such that \(\obj{E}^s\) is spanned by eigenvectors corresponding to eigenvalues lying inside the unit circle (on the imaginary plane), see Theorem 1.5 below. Furthermore, \(\obj{E}^u\) and \(\obj{E}^c\) can be defined for discrete-time dynamical systems analogously.

Moreover, by Theorem 1.2 we know that these subspaces will be preserved in some neighborhood of the hyperbolic equilibrium of the nonlinear system (1.4). Specifically, because the equilibrium is hyperbolic, \(\obj{E}^c = \set{\vi 0}\).

Lyapunov’s Direct Method

Definition 1.16 A function \(V: \R^n \to \R\) is called locally positive-definite (LPD) at \(\vi x^*\), if the following holds:

\(V(\vi x^*) = 0\),
\(V(\vi x) > 0\), \(\vi x \in \neigh{\vi x^*} \setminus \set{\vi x^*}\) for some neighborhood \(\neigh{\vi x^*}\).

If only \(V(\vi x) \geq 0\) holds on a neighborhood \(\neigh{\vi x^*}\), then the function is called locally positive semi-definite (LPSD). Analogously, we can define a locally negative definite and semi-definite functions.

Definition 1.17 (Lyapunov function) Let \(\vi \vf(t; \vi x_0)\) be a solution of the system (1.4) together with the initial \(\vi x(0) = \vi x_0\). A function \(V: \R^n \to \R\) is called Lyapunov at \(\vi x^*\), if \(V\) is locally positive definite and also \(\forall \vi x_0 \in \neigh{\vi x^*}\) is the function \(V \circ \vi \vf(t; \vi x_0)\) non-increasing for all \(t > 0\).

Moreover, the function \(V\) is called strictly Lyapunov if \(V \circ \vi \vf(t; \vi x_0)\) is (strictly) decreasing for all \(t > 0\).

Note 1.3: Monotonicity of \(V \circ \vi \vf(t; \vi x_0)\)

Equivalently to requiring a non-increasing \(V \circ \vi \vf(t; \vi x_0)\), one can instead demand the derivatives with respect to the trajectories of \(V\), i.e., \(\dot V(\vi x(t))\), to be non-positive. In other words, the derivative w.r.t. the trajectories \(\dot V\) must be locally negative semi-definite.

Theorem 1.3 (Lyapunov’s direct method) If \(\vi x^*\) is an equilibrium of the system (1.4) and \(V\) is a Lyapunov function for the system at \(\vi x^*\), then \(\vi x^*\) is Lyapunov stable. If, in addition, \(V\) is a strict Lyapunov function, then \(\vi x^*\) is stable.

Proof. See [10], page 24.

1.1.3 Discrete Dynamical Systems

Definition 1.18 An autonomous system of difference equations is a system of the form \[ \vi x \mapsto \vi f(\vi x) \quad \iff \quad \vi x_{m+1} = \vi f(\vi x_m), \tag{1.7}\] where \(\vi x, \vi x_m \in \stateSpace = \R^n\) and the function \(\vi f : \R^n \to \R^n\) is sufficiently smooth.

The system (1.7) induces a discrete-time autonomous dynamical system.

Theorem 1.4 (Lyapunov, analogous to Theorem 1.1) Consider a dynamical system (1.7) with a fixed point \(\vi x^*\). Let \(\Jacobi^* = \jacobi \vi f(\vi x^*)\) denote the Jacobian matrix evaluated at \(\vi x^*\). Then \(\vi x^*\) is stable, if all eigenvalues⁶ \(\lambda_i\), where \(i \in \oneToN{n}\), of the matrix \(\Jacobi^*\) satisfy \(\absval{\lambda_i} < 1\).

Proof. See [12], page 222.

Definition 1.19 (Hyperbolic fixed point) A fixed point \(\vi x^*\) of the system (1.7) is called hyperbolic if none of the eigenvalues corresponding to the Jacobian matrix \(\Jacobi^* = \jacobi \vi f(\vi x^*)\) have unit magnitude.

Theorem 1.5 (Grobman-Hartman, analogous to Theorem 1.2) The system (1.7) is locally topologically equivalent in the neighborhood of its hyperbolic fixed point \(\vi x^*\) to its linearization \[ \vi x \mapsto \jacobi \vi f(\vi x^*) \vi x. \]

Proof. See [10], page 311.

Example 1.1 (Fixed points of two-dimensional discrete-time dynamical system) As an example, consider a two-dimensional discrete-time dynamical system \[ \vi x_{m+1} = \vi f(\vi x_{m}), \tag{1.8}\] where \(\vi x_m = (x_{m, 1}, x_{m, 2})\Tr\) and \(\vi f : \R^2 \to \R^2\) is smooth. Moreover, let us assume there exists a hyperbolic equilibrium \(\vi x^* = \vi f(\vi x^*)\) of the system (1.8) and let \(\Jacobi^* = \jacobi \vi f(\vi x^*)\) denote the corresponding Jacobian matrix evaluated at \(\vi x^*\). Then \(\Jacobi^*\) has two eigenvalues \(\lambda_1, \lambda_2\), which satisfy \[ \det \brackets{\Jacobi^* - \lambda \ident_2} = \lambda^2 - \trace \Jacobi^* \lambda + \det \Jacobi^*. \] Here, \(I_2\) corresponds to a 2-by-2 identity matrix, \(\trace \Jacobi^* = \lambda_1 + \lambda_2\) is the trace of the determinant and finally, \(\det \Jacobi^* = \lambda_1 \lambda_2\) denotes the determinant of \(\Jacobi^*\).

By Theorem 1.5, we know that the system (1.8) is locally topologically equivalent to its linearization in a neighborhood of its hyperbolic fixed point \(\vi x^*\). Interestingly, we can classify the said fixed point based on the number of stable (and unstable) eigenvectors of the corresponding \(\Jacobi^*\), i.e., the classification is based on the dimensions of \(\obj E^s\), \(\obj E^u\) and \(\obj E^c\) of the partitioning of the state space per linearization.

A fixed point is classified as a sink, when both eigenvalues are real with a modulus below 1. Similarly, it is called a spiral sink if both eigenvalues are complex with modulus less than 1. Should one eigenvalue have modulus below 1 and the other modulus greater than 1, the resulting fixed point is a saddle. Finally, an equilibrium is a source, resp. spiral source, when both eigenvalues have modulus above 1 and are real, resp. complex. For an overview, see Figure 1.2.

Code

function integrate(A, x, n)
    X = similar(x, n, 2)
    X[1, :] = x
    for i = 2:n
        X[i, :] = A*X[i - 1, :]
    end
    return X
end

fig = Figure(size = (1000, 200), backgroundcolor=:transparent)

x0 = [1.0, 1.0]
x_eq = [0.01, 0.01]
n = 12

ax = Axis(fig[1,1], title="sink")
hidedecorations!(ax)
scatterlines!(ax, integrate([0.5 0; 0 0.25], x0, n), colormap=:Zissou1, color=1:n)

ax = Axis(fig[1,2], title="spiral sink")
hidedecorations!(ax)
scatterlines!(ax, integrate([0.5 0.25; -0.25 0.5], x0, n), colormap=:Zissou1, color=1:n)

ax = Axis(fig[1,3], title="saddle")
hidedecorations!(ax)
scatterlines!(ax, integrate([1.25 0; 0 0.25], x0, n), colormap=:Zissou1, color=1:n)
scatterlines!(ax, integrate([1.25 0; 0 0.25], x0 .* [-1.0, 1.0], n), colormap=:Zissou1, color=1:n)

ax = Axis(fig[1,4], title="source")
hidedecorations!(ax)
scatterlines!(ax, integrate([1.5 0; 0 1.25], x_eq, n), colormap=:Zissou1, color=1:n)

ax = Axis(fig[1,5], title="spiral source")
hidedecorations!(ax)
scatterlines!(ax, integrate([1.5 0.75; -0.75 1.5], x_eq, n), colormap=:Zissou1, color=1:n)

fig

Figure 1.2: Orbits near different equilibria of a linear discrete-time dynamical system (time flows from blue to red).

For completeness’ sake, we give a classical result from the theory of ODEs concerning the existence, uniqueness, and smooth dependence on the initial conditions of the solution for a given ODE.

Theorem 1.6 (Existence, uniqueness, and smooth dependence) Consider a system of ordinary differential equations \[ \dvi x = \vi f(\vi x), \vi x \in \R, \] where \(\vi f: \R^n \to \R^n\) is smooth in an open region \(U \subset \R^n\). Then there is a unique function \(\vi x = \vi x(t, \vi x_0)\), \(\vi x : \R^1 \times \R^n \to \R^n\), that is smooth in \((t,x)\), and satisfies, for each \(\vi x_0 \in U\), the following conditions:

\(\vi x(0, \vi x_0) = \vi x_0\);
there is an interval \(\timeInt = (-\delta_1, \delta_2)\), where \(\delta_{1,2} = \delta_{1,2}(\vi x_0) > 0\), such that, for all \(t \in \timeInt\), \[ \vi y(t) = \vi x(t, \vi x_0) \in U, \] and \[ \dvi y(t) = \vi f(\vi y(t)). \]

Proof. See [9], page 94.

1.1.4 Poincaré Map

The study of continuous-time dynamical systems naturally leads to discrete-time dynamical systems. One possible route is via sampling the continuous orbit at discrete times, e.g., separated by \(\Delta t\) which induces a time-shift map. Another way to obtain a discrete-time dynamical system from a continuous-time one is through a so-called Poincaré map.

Consider a continuous-time dynamical system of the form \[ \dvi x = \vi f(\vi x), \quad \vi x \in \R^n, \tag{1.9}\] where \(\vi f\) is smooth and assume that (1.9) has a periodic orbit \(L_0\). Let \(\vi x_0\) be a point on \(L_0\) and denote \(\crossSection\) the cross-section to the cycle at this point, see Figure 1.3.

Figure 1.3: Poincaré map corresponding to the system (1.9), the cycle \(L_0\) and its point \(\vi x_0\).

The cross-section \(\crossSection\) is a smooth hypersurface of dimension \(n-1\) (thus we say \(\codim \crossSection = 1\), i.e., the cross-section hypersurface \(\crossSection\) is of codimension one), which intersects \(L_0\) at a nonzero angle. The nonzero angle requirement is called the transversality condition, which effectively dictates that the hypersurface is not parallel to a through-going trajectory, thus the trajectory truly intersects the cross-section \(\crossSection\).

Consider now orbits of (1.9) near the cycle \(L_0\) and recall that the cycle \(L_0\) itself is an orbit which starts at point \(\vi x_0\) on \(\crossSection\) and returns to the same point on \(\crossSection\). As Theorem 1.6 guarantees, the solution of (1.9) depends smoothly on its initial condition, an orbit starting at \(\vi x \in \crossSection\) sufficiently close to \(\vi x_0\) will transversally intersect the hypersurface \(\crossSection\) at some other point \(\tilde{\vi x}\) near \(\vi x_0\). Therefore, this induces a map \(\vi P: \crossSection \to \crossSection\), \[ \vi x \mapsto \tilde{\vi x} = \vi P(\vi x). \]

Definition 1.20 (Poincaré map) The map \(\vi P\) defined above is called a Poincaré map associated with the cycle \(L_0\).

Similarly, we can characterize the Poincaré map \(\vi P\) using local coordinates \(\vi \xi = (\xi_1, \dots, \xi_{n - 1})\) on \(\crossSection\), such that the choice \(\vi \xi = \vi 0\) corresponds to \(\vi x_0\). Then the Poincaré map can be locally defined as a function \(\vi P: \R^{n - 1} \to \R^{n - 1}\), which maps \(\vi \xi\) corresponding to \(\vi x\) to \(\tilde{\vi \xi}\) corresponding to \(\tilde {\vi x}\), i.e., \[ \vi P(\vi \xi) = \tilde{\vi \xi}. \]

In other words, the Poincaré map \(\vi P\) prescribes a discrete-time dynamical system on the hypersurface \(\crossSection\). Its origin \(\vi \xi = \vi 0\) is a fixed point of this mapping. To our advantage, the stability of the underlying cycle \(L_0\) is then equivalent to the stability of the fixed point \(\vi \xi_0 = \vi 0\). By Theorem 1.4, we know the cycle is thus stable if all eigenvalues (also called multipliers) \(\mu_1, \dots, \mu_{n - 1}\) of the \((n - 1) \times (n - 1)\) Jacobian matrix of \(\vi P\), \[ \Jacobi_{\vi P} = \jacobi_{\vi \xi} \vi P(\vi \xi_0), \] are located inside the unit circle \(\norm{\vi \mu} = 1\). The Poincaré map will throughout this thesis paint itself as a powerful tool in the bifurcation analysis of dynamical systems, and the following lemma hints at its usefulness.

Lemma 1.1 The multipliers \(\mu_1, \dots, \mu_{n - 1}\) of the Jacobian matrix \(\Jacobi_{\vi P}\) of the Poincaré map \(\vi P\) associated with a cycle \(L_0\) are independent of the point \(\vi x_0\) on \(L_0\), the cross-section \(\crossSection\), and local coordinates on it.

Proof. See [1], page 27.

Consider now the cycle \(L_0\) and let \(\vi x^0(t)\) denote its corresponding periodic solution of (1.9) with the period \(T_0\), i.e., \(\vi x^0(t + T_0) = \vi x^0(t)\). Then any solution \(\vi x(t)\) of (1.9) can be written as \[ \vi x(t) = \vi x^0(t) + \vi u(t), \] where \(\vi u(t)\) stands for the deviation of the solution from the referential periodic solution. Then, \[\begin{align*} \dvi u(t) &= \dvi x(t) - \dvi x^0(t) \\ &= \vi f(\vi x^0(t) + \vi u(t)) - \vi f(\vi x^0(t)) \\ &= \Jacobi(t) \vi u(t) + O(\norm{\vi u(t)}^2). \end{align*}\] Omitting \(O(\norm{\vi u(t)}^2)\) terms gives us a linear \(T_0\)-periodic system \[ \dvi u = \Jacobi(t) \vi u, \quad \vi u \in \R^n, \tag{1.10}\] where \(\Jacobi(t) = \jacobi_{\vi x} \vi f(\vi x^0(t))\) and \(\Jacobi(t + T_0) = \Jacobi(t)\).

Definition 1.21 System (1.10) is called the variational equation about the cycle \(L_0\).

As the variational equation describes the evolution of perturbations in the proximity of the cycle \(L_0\), its stability naturally depends on the properties of the variational equation.

Definition 1.22 The time-dependent matrix \(\vi M(t)\) is called the fundamental matrix solution of (1.9) if it satisfies \[ \dvi M = \Jacobi(t) \vi M, \] with the initial condition \(\vi M(0) = \ident_n\). The matrix \(\vi M(T_0)\) is called a monodromy matrix of the cycle \(L_0\).

Theorem 1.7 (Floquet exponents) The monodromy matrix \(\vi M(T_0)\) has eigenvalues (called Floquet exponents or multipliers) \[ 1, \mu_1, \dots, \mu_{n - 1}, \] where \(\mu_i\) are the multipliers of the Poincaré map \(\vi P\) associated with the cycle \(L_0\), see Lemma 1.1.

Proof. See [1], page 30, for a sketch of the proof.

1.1.5 Stochastic Dynamical Systems

So far, we have only considered deterministic dynamical systems. Their defining feature was the absence of abrupt changes, i.e., the total lack of randomness. Contrastingly, in practice one has to constantly deal with noise in measurements of the studied system. This effectively results in a stochastic dynamical system induced by the stochastic random process arising from the measurements. Let us note this section is primarily based on [13].

Let us first introduce the necessary theory of stochastic analysis.

Definition 1.23 (Stochastic process) Let \((\eventSet, \sAlgebra, P)\) be a probability space and \(\timeSet \subseteq \R\) some index set of times. Then random/stochastic process \(\set{\rX(t, \omega); \; t \in \timeSet}\) is a mapping \[ \rX : \timeSet \times \eventSet \to \R, \] which is \(\sAlgebra\)-measurable, i.e.,⁷ \[ \forall t \in \timeSet, \; \forall B \in \borelAlgebra: \; \set{\omega \in \eventSet : \; \rX(t, \omega) \in B} \in \sAlgebra. \] For a fixed \(\omega \in \eventSet\), the mapping \(\rX(t) = \rX(t,\omega)\) is called a trajectory. Similarly, for a fixed \(t\in \timeSet\) the resulting \(\rX(\omega) = \rX(t,\omega)\) is a random variable.

Definition 1.24 (Wiener process) Wiener process, sometimes also called Brownian motion, is a random process \(\set{\rW(t, \omega); \; t \geq 0}\) satisfying

\(\rW(0, \omega) = 0\),
trajectories are almost surely continuous, i.e., \[ P(\set{\omega \in \eventSet: \; \text{trajectory } t \mapsto \rW(t, \omega) \text{ is discontinuous}}) = 0, \]
for all \(0 \leq s_1 < t_1 \leq s_2 < t_2 \dots \leq s_k < t_k \leq \cdots\), the increments of Wiener process \[ \set{\diff \rW_1(\omega), \diff \rW_2(\omega), \dots} \] are stochastically independent random variables with Gaussian probability distribution \[ \diff \rW_k \sim \normalD{0, t_k - s_k}, \] where \(\diff \rW_k(\omega) = \rW(t_k, \omega) - \rW(s_k, \omega)\) for \(k = 1,2,\cdots\).

Theorem 1.8 (Properties of Wiener process) Let \(\set{\rW(t);\; t \geq 0}\) be Wiener process. Then, for all \(t,s \geq 0\), it holds⁸

\(\procMean(t) = 0\),
\(\acov(t,t) = t\),
\(\rW(t) \sim \normalD{0,t}\),
\(\acov(s,t) = \min\set{s,t}\),
\(\acf(s,t) = \sqrt{\frac {\min\set{s,t}} {\max\set{s,t}}}\).

Proof. See [13], page 28.

In other words, in Definition 1.23, we have constructed a process that selects a random trajectory with each sample. Moreover, we have defined in Definition 1.24 a special process that is almost always continuous everywhere but differentiable nowhere (not unlike the Weierstrass function) with stochastically independent increments on disjunct intervals!

Our goal in this endeavour is to construct a sensible calculus on stochastic processes, which we could then use to understand deterministic dynamical systems perturbed by (small) stochastic noise. Let us consider a (trivial) differential equation (mainly focus on the differential formulation) \[ \deriv {x(t)} t = a x(t) \iff \d x(t) = a x(t) \dd t, \] which has the solution \(x(t) = e^{at}\), where \(x(t)\) often represents size of some population. Taking \(a = r + \sigma \rve(t, \omega)\), where \(\rve(t, \omega)\) is a random process, the population size now becomes a random variable, \[ \d \rX(t,\omega) = \brackets{r + \sigma \rve(t, \omega)} \rX(t, \omega) \dd t, \] which is a stochastic differential equation (SDE). For a moment, let us use differences instead of differentials, i.e., \[\begin{align*} \diff \rX(t,\omega) &= \brackets{r + \sigma \rve(t, \omega)} \rX(t, \omega) \, \diff t \\ &= r \rX(t, \omega) \, \diff t + \sigma \rX(t, \omega) \underbrace{\rve(t) \, \diff t}_{\diff \rZ(t)}. \end{align*}\]

What is the form of the random process \(\rZ(t)\), such that \(\diff \rZ(t) = \rve(t) \, \diff t\), where \(\rve(t)\) is a Gaussian white noise? To a careful reader, this may seem at least a little reminiscent of the Wiener process. In other words, differences \(\diff Z(t)\) are Gaussian white noise⁹ and if we require (almost sure) continuity and \(\rZ(0) = 0\), we get that \(\rZ(t)\) is the Wiener process — that is, \(\diff \rW(t)\) has the same properties as \(\rve(t) \diff t\). With the limit transition \(\lim_{\diff t \to 0}\) in the \(\ltwo\) sense we get, arguably after much more mathematics, \[ \d \rW(t) = \rve(t) \dd t, \] which can be understood as a distributional derivative. Now let us return to the original stochastic differential equation, that now becomes \[ \dd \rX(t,\omega) = r \rX(t, \omega) \dd t + \sigma \rX(t, \omega) \overbrace{\rve(t, \omega) \dd t}^{\d \rW(t)}, \] which we are tasked with solving for a random process \(\rX(t)\), \(t \geq 0\), that satisfies this SDE. This rather simple SDE prescribes so-called geometric Brownian motion. In general, a random process \(\rX(t)\) is called Itô process, if it is a solution to \[ \dd \rX(t) = \rnd U(t) \dd t + \rnd V(t) \dd \rW(t), \; \rX(0) = \rX_0, \] where \(\rnd U(t, \omega), \rnd V(t, \omega)\) are random processes satisfying certain conditions and \(\rX_0\) is required to have finite second moment.

1.2 Dynamical Systems with Delays

To understand how delays influence dynamical systems and define a semidynamical system, we have to generalize our notion of a differential equation. This modification will rely on introducing a deviating argument, i.e., some of the derivatives of the differential equation will be evaluated at different argument values. This section will be primarily sourced from [14], [15], [16], [17], [6], and [4].

1.2.1 Functional Differential Equations

So far, we have mainly discussed ordinary differential equations, which are equations relating the values of an unknown function and some of its derivatives at a single and the same argument value, see (1.4) and (1.5), e.g., \(F(t, x, \dot{x}, \ddot{x}) = 0\).

In contrast, a functional equation (FE) is an equation connecting outputs of an unknown function evaluated at different argument values. As an example, \(x(-t) + 3x(2t) = 0\), \(x(x(t)) = x(t) + 2\), and \(x(t) = tx(t+1) - \brackets{x(t-2)}^3\) are all examples of functional equations. The differences between the argument values of an unknown function and \(t\) (its “default” argument) in an FE are called argument deviations. If all argument deviations are constant (like in the last example shown above), then the FE is called a difference-functional equation.

Although we have so far shown only examples of FEs with discrete (or concentrated) argument deviations, another possibility is FEs with continuous and mixed (both continuous and discrete) argument deviations. They are called integral and integral-functional (or integral-difference) equations.

We can further combine these notions of ordinary differential equations and functional equations, yielding in the process functional differential equations (FDEs). FDEs can also contain discrete, continuous, or mixed argument deviations. Thus, one can introduce differential-difference equations, and integro-differential equations in a similar way as seen above.

The aforementioned generalization in its full form leads to functional differential equations — equations describing the relation between the unknown function and some of its derivatives for, in general, different argument values. Our main subject of study will be a subclass of differential-difference equations called the functional differential equations with aftereffects, which are of the form \[ \vi x^{(m)}(t) = \vi f\brackets{t, \vi x^{(m_1)}(t - \tau_1(t)), \dots, \vi x^{(m_k)}(t - \tau_k(t))}, \] where \(\vi x(t) \in \R^n\), \(m_1, \dots, m_k \geq 0\), and \(\tau_1(t), \dots, \tau_k(t) \geq 0\). They can be further classified as:

retarded FDEs (RFDEs) or equivalently delay differential equations (DDEs), if \(\max \set{m_1, \dots, m_k} < m\);
neutral FDEs (NFDEs), if \(\max \set{m_1, \dots, m_k} = m\);
advanced FDEs (AFDEs), if \(\max \set{m_1, \dots, m_k} > m\).

Let us also note that the argument deviations can, in theory, be dependent on the state, e.g., \(\dot{x}(t) = f(t, x(t), x(t - h(t, x(t))))\), but we shall omit them from our considerations.

Caution 1.1: Note on naming convention

For clarity and brevity, throughout this thesis RFDEs with (discrete) constant delays will be called delay differential equations (DDEs), although the term can be more general in other literature.

1.2.2 Method of Steps

All throughout this work, we shall see many parallels between ODEs and DDEs. While their connection is intricate and would require a more thorough discussion, one way it can be partly understood is via the method of steps — a way of solving delay differential equations. This subsection is mainly based on [6] and [18].

To begin gently, we will start with a simple case of \(x \in \R\). Let us thus consider a nonlinear DDE of the form \[ \dot x(t) = f(t, x(t), x(t - r)) \tag{1.11}\] with a single delay \(r > 0\). Assume that \(f(t,x,y)\) and \(f_x(t,x,y)\) are continuous on \(\R^3\). Let \(s \in \R\) and a continuous history function \(\phi: [s - r, s] \to \R\) be given. We aim to find a solution of (1.11) such that \[ x(t) = \phi(t), \; s-r \leq t \leq s \tag{1.12}\] and satisfying (1.11) on \(s \leq t \leq s + \sigma\) for some \(\sigma > 0\). As a careful reader might notice, we must interpret \(\dot x(s)\) as a right-hand derivative at \(s\), i.e., \(\dot x(s)^+\).

The system of equations (1.11) and (1.12) can now be solved with the so-called method of steps as follows. For \(s \leq t \leq s + r\), \(x(t)\) must satisfy the initial-value problem (IVP) for the following ODE: \[ \dot y(t) = \underbrace{f(t, y(t), \phi(t - r))}_{g(t,y)}, \; y(s) = \phi(s), \; s \leq t \leq s + r. \]

From the continuity of \(f\) and \(f_x\) immediately follows the continuity of \(g\) and \(g_y\), thus the existence of a unique local solution is guaranteed by classic results from the ODE theory, see, for example, [10]. Moreover, if this local solution \(x(t)\) exists on the entire interval \(s \leq t \leq s + r\), then, together with the history \(\phi\), we know the solution \(x(t)\) of (1.11)-(1.12) on \([s-r, s+r]\) and we may repeat the presented argument to extend our solution to the right. Indeed, considering now the interval \(s+r \leq t \leq s + 2r\), a solution of \(x(t)\) of (1.11)-(1.12) must now satisfy the following ODE problem: \[ \begin{aligned} \dot y(t) &= f(t, y(t), x(t - r)), && s+r \leq t \leq s+2r, \\ y(s+r) &= x(s+r). && \end{aligned} \tag{1.13}\] Just as before, the continuity of \(f\) and \(f_x\) yields a locally unique solution called, by abuse of notation, again \(x(t)\), defined on some subinterval \([s+r, \sigma) \subset [s+r, s + 2r]\) or, possibly, the entire interval. Naturally, \(x(t)\), now existing on \([s-r, \sigma)\) where \(\sigma > s + r\), is again a solution of (1.11)-(1.12). Finally, if the solution exists on the entire interval \([s+r, s+2r]\) we can perform another step, i.e., repeat this procedure, to extend \(x(t)\) further to the right to \([s+2r, s+3r]\) or some subinterval.

Theorem 1.9 (Single-delay method of steps on \(\R\)) Let \(f(t,x,y)\) and \(f_x(x,y,t)\) be continuous on \(\R^3\), \(s \in \R\), and let \(\phi: [s-r, s] \to \R\) be continuous history function. Then there exists \(\sigma > s\) and a unique solution of the initial-value problem (1.11)-(1.12) on \([s-r, \sigma]\)

Proof. Follows directly from the calculations above.

A careful reader might notice that Theorem 1.9 guarantees only a local solution, but we might want a solution for any \(t \geq s\). We will use the notation \([s-r, \sigma\rco\) to denote either the right-open interval, \([s-r, \sigma)\), or the right-closed one, \([s-r, \sigma]\).

From the uniqueness follows that for two solutions \(x(t)\) on \([s-r, \sigma\rco\) and \(\hat x(t)\) on \([s-r, \rho\rco\), the equality \(x(t) = \hat x(t)\) must hold for all \(t\), such that both sides of the equality are defined. Moreover, if \([s-r, \sigma\rco \subset [s-r, \rho\rco\), then \(\hat x\) is called an extension of \(x\) and we write \(x \subset \hat x\).

One can prove there exists a unique maximally defined solution, i.e., one for which there are no extensions, \(x : [s-r, \sigma) \to \R\), similarly as for ODEs, for \(\sigma \in \R \cup \set{\infty}\). Such a solution is called non-extendable and is necessarily defined on a right-open interval \([s-r, \sigma)\). Indeed, if \(\sigma < \infty\) and \(x\) were a solution on \([s-r, \sigma]\), then, by Theorem 1.9, it could be extended to a larger interval, which contradicts non-extendability of \(x\).

Theorem 1.10 (Finite-time blow-up) Let \(f\) satisfy the conditions of Theorem 1.9 and let \(x: [s-r, \sigma) \to \R\), \(\sigma \in \R \cup \set{\infty}\), be the non-extendable solution of the initial value problem (1.11)-(1.12). If \(\sigma < \infty\), then \[ \lim_{t \to \sigma^-} \absval{x(t)} = \infty. \]

Proof. See [6], page 26.

Note 1.4: On multiple delays and \(\vi x \in \R^n\)

One can rather easily show that Theorem 1.9 and Theorem 1.10 extend to the case of \(\vi x(\cdot) \in \R^n\) and \(\vi f : \R \times \R^n \times \R^n \to \R^n\). Similarly, it can also be generalized to multiple discrete delays \(r_1 < \dots < r_H\) where \[ \vi f = \vi f(t, \vi y(t), \vi y(t - r_1), \dots, \vi y(t - r_H)) \] with little to no change.

So far, we have presented a method to calculate the solution of the IVP given by (1.11) and (1.12). Per Note 1.4, we can slightly generalize the studied equation and consider \[ \rcases{ \dvi x(t) = \vi f(t, \vi x(t), \vi x(t - r)), && t_0 \leq t \leq t_f, \\ \vi x(t) = \vi \phi(t), && t \leq t_0, } \tag{1.14}\] where \(\vi x(\cdot) \in \R^n\) and \(\vi f : [t_0, t_f] \times \R^n \times \R^n \to \R^n\). From the discussion above follows that a unique solution exists for continuously differentiable function \(\vi f\). Together with (1.14), we further obtain that \(\vi x(\cdot)\) is also differentiable on each interval \((t_0 + kr, t_0 + (k+1)r) \subseteq (t_0, t_f)\) for appropriate \(k \in \N_0\). Moreover, \(\vi x(\cdot)\) is necessarily continuous on \([t_0, t_f]\) by (1.13) and Theorem 1.9.

However, the initial history function \(\vi \phi\) might not link smoothly to the actual solution \(\vi x\), i.e., \[ \dvi \phi(t_0)^- \neq \dvi x(t_0)^+ = \vi f(t_0, \vi \phi (t_0), \vi \phi(t_0 - r)), \tag{1.15}\] where \(\dvi \phi(\cdot)^-\) denotes the left derivative of \(\vi \phi\) with respect to \(t\) at \(t_0\) and analogously for \(\dvi x(t_0)^+\). As such, the discontinuity is then propagated to \(t_0 + r\), where, if \(\vi f\) and \(\vi \phi\) are continuous, \(\dvi x\) is now continuous but not differentiable. In general, the number and location of discontinuity points depends, in principle, on the so-called deviated arguments, \[ \alpha(t) = t - r(t, \vi x(t)), \tag{1.16}\] where we assume a more general version of (1.14) with \(\dvi x(t) = \vi f(t, \vi x(t), \vi x(\alpha(t)))\). For our purposes, we have \(\alpha(t) = t - r\) (then \(\alpha \in \contf {\R} {\infty}\)).

Definition 1.25 (Vanishing delay) Consider the generalized version of (1.14) with \(\dvi x(t) = \vi f(t, \vi x(t), \vi x(\alpha(t)))\). Then we say such system has a vanishing delay, if there exists a point \(s \in (t_0, t_f)\) such that the delay \(\alpha(t)\) vanishes, i.e., \[ \alpha(s) = s. \]

Theorem 1.11 Suppose \(s\) is a vanishing delay point of (1.14) with \(\dvi x(t) = \vi f(t, \vi x(t), \vi x(\alpha(t)))\) and \(\alpha\) is continuous. Moreover assume (1.15) holds, i.e., the history \(\vi \phi\) does not link smoothly to the actual solution \(\vi x\). Then, there are infinitely many discontinuity points in any left neighborhood of \(s\).

Proof. See [18], page 24.

Should \(\vi f\), \(\vi \phi\) and \(\alpha\) be also differentiable, then \(\ddot{\vi x}(t)\) exists for any \(t\) except the solutions \(s_{1, i} \in (t_0, t_f)\) of \[ \alpha(s_{1,i}) = t_0 \quad \& \quad \alpha'(s_{1,i}) \neq 0, \tag{1.17}\] i.e., the simple roots of \(\alpha(s) - t_0 = 0\). We call \(s_{1,i}\) the first level primary discontinuities (of \(\vi x(\cdot)\)). Indeed, for any smooth function \(\vi f(t, \vi x, \vi \xi)\) we can formally write (see (1.14) and (1.16)) \[ \begin{aligned} \ddot{\vi x}(t)^{\pm} ={}& \jacobi_t \vi f(t, \vi x(t), \vi x(\alpha(t))) \\ &+ \jacobi_{\vi x} \vi f(t, \vi x(t), \vi x(\alpha(t))) \dvi x(t) \\ &+ \jacobi_{\vi \xi} \vi f(t, \vi x(t), \vi x(\alpha(t))) \dvi x(\alpha(t))^{\pm} \alpha'(t), \end{aligned} \tag{1.18}\] therefore by (1.17) \[ \begin{aligned} \ddot{\vi x}(s_{1,i})^{+} ={}& \jacobi_t \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \\ &+ \jacobi_{\vi x} \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \dvi x(s_{1,i}) \\ &+ \jacobi_{\vi \xi} \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \dvi x(t_0)^{+} \alpha'(s_{1,i}) \end{aligned} \tag{1.19}\] and \[ \begin{aligned} \ddot{\vi x}(s_{1,i})^{-} ={}& \jacobi_t \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \\ &+ \jacobi_{\vi x} \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \dvi x(s_{1,i}) \\ &+ \jacobi_{\vi \xi} \vi f(s_{1,i}, \vi x(s_{1,i}), \vi x(t_0)) \dvi \phi(t_0)^{-} \dot{\alpha}(s_{1,i}). \end{aligned} \tag{1.20}\]

By comparing (1.19) and (1.20) using (1.15), we show that \(\ddot{\vi x}(s_{1, i})^+ \neq \ddot{\vi x}(s_{1,2})^-\), i.e., the second derivative of \(\vi x(\cdot)\) has a jump discontinuity at \(s_{1,i}\) as we claimed. Furthermore, by differentiating (1.18) we obtain that 1-level primary discontinuities propagate to 2-level primary discontinuities in \(\dddot{\vi x}(\cdot)\) at any point \(s_{2,j} > s_{1, i}\), such that \(s_{2,j}\) is a simple root of \(\alpha(s) - s_{1,i} = 0\) for some \(i\).

In other words, if \(\vi f\) is smooth and \(\vi \phi\) is differentiable for (1.14), then \(t_0 + r\) is the only 1-level primary discontinuity, \(t_0 + 2r\) is the 2-level primary discontinuity, etc. Therefore, the solution \(\vi x(t)\) of (1.14) is progressively smoother as the primary discontinuity level increases, i.e., as we go forward in time. This phenomenon is often called the smoothing of the solution of a DDE.

1.2.3 Notes on Delay Differential Equations

Consider a DDE of the form \[ \dvi x(t) = \vi f(t, \vi x(t), \vi x(t - \tau_1), \dots, \vi x(t - \tau_H)), \tag{1.21}\] with \(0 = \tau_0 < \tau_1 < \dots < \tau_H\). For brevity, we will denote \(\tau := \tau_H\) and \[ \vi x_t(s) := \vi x(t+s), \; -\tau \leq s \leq 0, \] which, informally, prescribes the value of solution \(\vi x\) up to time \(t\) delayed by \(s\). In other words, we can view the trajectory of our solution as the curve \(t \to \vi x_t\) in the state space \(\contStateSpace = \contf{[-\tau, 0], \R^n}{}\) with the supremum norm \[ \norm{\vi \phi} = \sup_{-\tau \leq s \leq 0} \norm{\vi \phi(s)}, \] a Banach space of continuous functions mapping the interval \([-\tau, 0]\) to \(\R^n\) with the topology of uniform convergence, see [15]. Thus, we can reformulate (1.21) as \[ \dvi x(t) = \vi F(t, \vi x_t), \tag{1.22}\] which can generally represent any RFDE, with \(\vi F : \timeSet \times \contStateSpace \to \R^n\) defined as \[ \vi F(t, \vi \phi) = \vi f(t, \vi \phi(-\tau_0), \vi \phi(-\tau_1), \dots, \vi \phi(-\tau_H)). \tag{1.23}\]

Most often, we will encounter the following autonomous initial-value problem \[ \begin{aligned} \dvi x(t) &= \vi f(\vi x(t), \vi x(t - \tau_1), \dots, \vi x(t - \tau_H)), \\ \vi x_{\sigma} &= \vi \phi, \end{aligned} \tag{1.24}\] where \(\sigma \in \R\) is the initial time and \(\vi \phi \in \contStateSpace\) is the state of system at time \(\sigma\), i.e., the \(\tau\)-long history function corresponding to the interval \([\sigma - \tau, \sigma]\). For completeness’ sake, we also add that general the RFDE initial-value problem reads as follows \[ \begin{aligned} \dvi x(t) &= \vi F(t, \vi x_t), \\ \vi x_{\sigma} &= \vi \phi. \end{aligned} \tag{1.25}\]

Last, but not least, we shall denote by \[ \jacobi_{\vi \tau_i} \vi f = \partialOp{\vi \xi_{i}} \vi f(t, \overbrace{\vi x(t - \tau)}^{\vi \xi_0}, \overbrace{\vi x(t - \tau_1)}^{\vi \xi_1}, \dots, \overbrace{\vi x(t - \tau_H)}^{\vi \xi_H}) \] the Jacobian matrix of \(\vi f\) corresponding to the \(i\)-th delayed state input. In other words, we may write the linearization of (1.24) around equilibrium \(\vi x^*\), i.e., \(\vi f(\vi x^*, \dots, \vi x^*) = \vi 0\), as \[ \vi f(\vi x(t)) \approx \vi f(\vi x^*, \dots, \vi x^*) + \sum_{i = 0}^H \jacobi_{\vi \tau_i} \vi f(\vi x^*, \dots, \vi x^*) \vi x(t - \tau_i), \tag{1.26}\] which follows from the Taylor expansion of \(\vi f\).

Backward Extension

Let us for a moment return to (1.25). So far, we have discussed its solution for \(t > \sigma\), but, sometimes, we may solve it backward in time for \(t < \sigma\), i.e., perform a backward extension of the initial history \(\phi\). Importantly, it should be evident there is a fundamental asymmetry between the past and the future for DDEs (recall they are only a special case of RFDEs), which is non-existent in the case of ODEs. Indeed, as one can see from, for example, [10], there is no fundamental distinction between solving forward or backward in time.

Per [6] and [15], we say that \(\vi x : [\sigma - \tau - \alpha, \sigma] \to \R^n\) for \(\alpha > 0\) is a (backward) solution of the initial-value problem given by (1.25) if \(\vi x\) is continuous, \(\vi x\) satisfies the initial condition and \[ \dvi x(t) = \vi F(t, \vi x_t), \; t \in [\sigma - \alpha, \sigma]. \] Because \(\vi x_{\sigma} = \vi \phi\) must hold as well, due to the initial condition, we get that \(\vi x(t) = \vi \phi(t - \sigma)\) must be continuously differentiable for \(t \in [\sigma - \alpha, \sigma]\). As \(\vi \phi \in \contStateSpace\), we can equivalently say that \(\vi \phi\) must be continuously differentiable on \([-\minOf{\alpha, \tau}, 0]\). This significantly constrains the space of all admissible elements in \(\contStateSpace\). Furthermore, because the equality must hold at \(t = \sigma\), we get \[ \dvi \phi(0) = \dvi x(\sigma) = \vi f(\sigma, \vi x_{\sigma}) = \vi f(\sigma, \vi \phi), \] where \(\dvi \phi(0)\) denotes the left-hand derivative at \(0\). In total, the history function \(\vi \phi\) must thus belong to a very limited submanifold \(\manifold\) of \(\contStateSpace\) where \[ \manifold = \set{\vi \psi \in \contStateSpace \cap \contf{[-\minOf{\tau, \alpha}, 0], \R^n}{1} : \dvi \psi(0) = \vi f(\sigma, \vi \psi)}. \]

It is reasonable to assume that for a typical history function \(\vi \phi \in \contStateSpace\), these conditions will not be met, and therefore backward extension will not be possible. However, if \(\vi x: [\sigma - \tau, \sigma + A)\), \(A > 0\) is a (forward) solution of (1.25) and if \(\sigma_1 \in (\sigma, \sigma + A)\), then the initial-value problem corresponding to this intermezzo initial condition \((\sigma_1, \vi \psi)\), where \(\vi \psi = \vi x_{\sigma_1}\), has a backward solution \(\vi x\). It can be shown that many (but not all) solutions of autonomous systems (1.24) extend to all \(t \in \R\) — steady-state and periodic solutions can be given as examples.

Stability

We have already discussed stability for dynamical systems in Definition 1.11. Stability for DDEs can be defined analogously, which will be summarized in the following definition (per [6] and [15]).

Definition 1.26 (Stability in Delay Differential Equations) Consider (1.25) and suppose that \(\vi F(t, \vi 0) = \vi 0\), \(t \in \R\), is satisfied so that \(\vi x(t) = \vi 0\) is a solution. The solution \(\vi x = \vi 0\) is called

stable if for any \(\sigma \in \R\) and \(\ve > 0\), there exists \(\delta = \delta(\sigma, \ve) > 0\) such that \(\vi \phi \in \contStateSpace\) and \(\norm{\vi \phi} < \delta\) implies that \(\norm{\vi x_t(\sigma, \vi \phi)} < \ve\), \(t \geq \sigma\);¹⁰
asymptotically stable if it is stable and if there exists \(b(\sigma) > 0\) such that whenever \(\vi \phi \in \contStateSpace\) and \(\norm{\vi \phi} < b(\sigma)\), then \(\vi x(t, \sigma, \vi \phi) \to 0\) as \(t \to \infty\);
unstable if it is not stable.

For a non-zero solution \(\vi y(t)\) of (1.25) defined on \(t \in \R\), its stability properties mimic those of the zero solution of \[ \dvi z(t) = \vi f(t, \vi z_t + \vi y_t) - \vi f(t, \vi y_t). \tag{1.27}\] Indeed, let \(\vi \xi(t)\) be another solution of (1.25) and put \(\vi z(t) = \vi \xi(t) - \vi y(t)\). Then \(\vi z_t = \vi \xi_t - \vi y_t\) and thus (by using (1.25) on \(\vi \xi_t\) or \(\vi y_t\)) \[ \dvi \xi(t) - \dvi y(t) = \vi f(t, \vi \xi_t - \vi y_t + \vi y_t) - \vi f (t, \vi y_t), \] which yields that \(\vi z\) is a solution to (1.27), which was to be shown.

Moreover, we shall also define a solution operator (or solution map) for autonomous RFDEs. This will become useful later when discussing the stability of equilibria in the case of DDEs from the perspective of their linearization, see Section 3.1.2.1.

Definition 1.27 Consider the autonomous DDE problem (1.24). The solution operator \(\solOp(t, \sigma) : \contStateSpace \to \contStateSpace\) from time \(\sigma\) to \(t\), such that \(t \geq \sigma\), is defined as \[ \solOp(t,\sigma) \vi \phi \letDef \vi x_t(\sigma, \vi \phi). \tag{1.28}\]

In the case of \(\sigma = 0\), we abuse the notation to write \(\solOp(t)\).

In other words, the solution operator \(\solOp(t)\) maps the initial segment \(\vi \phi\) onto the solution segment \(\vi x_t\) at time \(t\). Note that (1.28) is equivalent to (considering \(\sigma = 0\)) \[ (\solOp(t) \vi \phi)(\tht) = \vi x(t + \tht), \; - \tau \leq \tht \leq 0, t \geq 0. \tag{1.29}\] It can be shown that, see [15], \[ \begin{aligned} \solOp(0) &= \id, \\ \solOp(t) \solOp(s) &= \solOp(t + s), \; t,s \geq 0, \\ \solOp(t)\vi \phi &\text{ is continuous in } (t, \vi \phi), \end{aligned} \tag{1.30}\] thus \(\set{\solOp(t)}\), \(t \geq 0\), is a strongly continuous semigroup of transformations on a subset of \(\contStateSpace\) (see [15] or [19] for more information on the subject). Let us remark we assume \(t,s\) are allowed to range over an interval which may be dependent on \(\vi \phi \in \contStateSpace\).

1.2.4 Semidynamical Systems Induced by Delay Differential Equations

So far, we have only seen examples of continuous-time dynamical systems, which have inherently been symmetric in time, i.e., invertible. In other words, these systems did not distinguish between the future and the past. As an example, consider the following ODE and its induced continuous-time dynamical system \[ \dvi x(t) = \vi f(t, \vi x), \; \vi x(s) = \vi x_0. \] When solving such ODE, it is entirely up to us to integrate forward in time, but backward integration is equally possible. On the other hand, as we have seen in Section 1.2.3.1, delay differential equations are not always solvable backward in time.

Definition 1.28 Semidynamical system is a triple¹¹ \((\widetilde{\timeSet}, \contStateSpace, \evolOp)\), where \[ \widetilde{\timeSet} = \set{(t,s) \in \timeSet \times \timeSet \divider t \geq s}, \] \(\contStateSpace\) is a metric space and \(\evolOp : \widetilde{\timeSet} \times \contStateSpace \to \contStateSpace\) is continuous map, which satisfies the deterministic and composition properties from Definition 1.1.

The difference between Definition 1.1 and Definition 1.28 is rather subtle. It lies in the restriction to \(\widetilde{\timeSet}\), which permits only projection forward in time. Indeed, by using \(\widetilde{\timeSet} = \timeSet \times \timeSet\), we get the usual dynamical system. Note that an autonomous semidynamical system is defined analogously as in Definition 1.2. Also, basic concepts defined for dynamical systems, see Section 1.1.1, mostly extend to semidynamical systems (for example, we only consider \(\omega\)-limit point and set).

Lemma 1.2 Semidynamical system \((\widetilde{\timeSet}, \contStateSpace, \evolOp)\) is autonomous if and only if whenever \(\vi x(t)\) is a solution defined on subset \(I \subseteq \timeSet\) and \(\tau \in \timeSet_G\), then \(\vi x(t + \tau)\) is a solution on \(I - \tau\), where \(I - \tau = \set{t - \tau \divider t \in I}\).

Proof. Can be adapted from [6], page 63.

As an example of an autonomous system, one can consider the delayed logistic equation, see [20], \[ \dot{n}(t) = a \cdot \brackets{1 - \frac{n(t - T)} K} n(t), \] which assumes, among other facts, a constant growth rate \(a\). Should the growth rate \(a = a(t)\) be subject to change in time (for example by external influences), yielding \[ \dot{n}(t) = a(t) \cdot \brackets{1 - \frac{n(t - T)} K} n(t), \] then it prescribes a non-autonomous semidynamical system. Analogously to the dynamical system case, the evolution operator \(\evolOp\) of an autonomous semidynamical system is called a semiflow.

For compatibility with the source material, see [6], we will now consider an initial-value problem (1.25) for an RFDE, though for our purposes a DDE would suffice, \[ \dvi x(t) = \vi F(t, \vi x_t), \; \vi x_\sigma = \vi \phi, \] where \(\sigma \in \R\), \(\vi F\) is continuous, and \(\vi \phi \in \contStateSpace\). Let us assume there is a unique solution \(\vi x(t, \sigma, \vi \phi)\) of (1.25) defined for \(t \geq \sigma\). While this condition is hard to ensure in practice, it is nonetheless necessary for the definition of the induced semidynamical system. Denote by \(\vi x_t(\sigma, \vi \phi) \in \contStateSpace\) the state of the system at time \(t\), defined, as usual, by \[ \vi x_t(\sigma, \vi \phi)(\tht) = \vi x(t + \theta, \sigma, \vi \phi), \; -\tau \leq \tht \leq 0. \]

Proposition 1.1 The map \(\evolOp(t,\sigma,\vi \phi) = \vi x_t(\sigma, \vi \phi)\) defines a semidynamical system on \(\contStateSpace\) with \(\timeSet = [\sigma - \tau, \infty)\) and its corresponding \(\widetilde{\timeSet}\).

Proof. See [6], page 65.

1.3 Numerical Methods

As the title may suggest, computational aspects of ODEs, DDEs, and other mathematical problems will play a key role in this thesis. For completeness’ sake, a brief overview of numerical integrators (also called solvers) for ODEs, DDEs, and stochastic differential equations (SDEs) will be given, as well as basic methods of one-dimensional optimization. Lastly, the Newton-Raphson method will be discussed.

1.3.1 Numerical integration of ODEs

Solving, or numerically integrating, ordinary differential equations can be seen as the backbone of this thesis. As such, we will introduce two approaches, the Euler method and the Runge-Kutta method. This subsection is mostly based on [21], [22], and [23].

Consider an ODE of the form \[ \dvi x(t) = \vi F(t, \vi x(t)), \; \vi x(t_0) = \vi x_0, \tag{1.31}\] which we want to integrate on the interval \([t_0, t_N]\). By Taylor expansion at \(t_0\), we can write \[ \vi x(t) = \vi x(t_0) + \dvi x(t_0) (t - t_0) + O(t^2). \tag{1.32}\] Setting \(t = t_0 + \diff t_1\) yields \[ \vi x(t_0 + \diff t_1) \approx \vi x(t_0) + \dvi x(t_0) (t_0 + \diff t_1 - t_0) = \vi x(t_0) + \vi F(t_0, \vi x(t_0)) \diff t_1. \]

Repeating this process on the entire interval \([t_0, t_N]\) gives us Algorithm 1.1 called the explicit Euler method. Let us note that often an equidistant stepsize is used, e.g., \(\diff t_k = \diff t = \frac {t_N - t_0} N\).

Algorithm 1.1: Euler method (explicit)

Input: ODE in form (1.31), stepsizes \(\seqnc{\diff t}{i}{1}{N}\)
Output: Sequence of \(N+1\) points \(\seqnc{\vi x}{i}{0}{N}\)
begin
for \(i = 1\) to \(N\) do
\(\vi x_i \gets \vi x_{i - 1} + \vi F(t_{i - 1}, \vi x_{i - 1}) \diff t_i\)
end
end

Whilst the Euler method is simple, it does have its drawbacks mainly in its accuracy. For quantification of this issue, we shall define several notions of errors of a numerical integration method.

Definition 1.29 Let us consider an ODE problem of the form (1.31) and the solution \(\vi x(t)\) to the initial-value problem. Assume \(\vi x_k\) is an approximate solution at time \(t_k\) and that \(\vi x_{k+1}\) is given by \(\vi x_{k+1} = \vi x_k + \diff t_{k+1} \cdot \incrementF{t_k, \vi x (t_k), \diff t_{k+1}, \vi F}\), where \(\incrementFunc\) is the increment function¹², which propagates the solution from \(t_k\) to \(t_{k+1}\) and represents the chosen method. Local truncation error (LTE) is defined as \[ \locTrErr_{k} = \vi x(t_{k+1}) - \vi x(t_k) - \diff t_{k+1} \cdot \incrementF{t_k, \vi x (t_k), \diff t_{k+1}, \vi F} \tag{1.33}\] and represents the error the increment function causes when performing a single step. Similarly, global error is defined as the cumulative LTE from the initial condition, \[ \globErr_k = \vi x(t_k) - \vi x_k. \tag{1.34}\]

In general, we seek such a method, which fulfills given accuracy while using as few evaluations of \(\vi F\) as possible — the assumption is that the evaluation of \(\vi F\) is so costly the remaining operations are almost negligible.

By using (1.33) on Algorithm 1.1 to obtain LTE for the Euler method, we get (recalling the Taylor expansion (1.32)) \[ \locTrErr_k = \vi x(t_{k+1}) - \vi x(t_k) - \diff t_{k+1} \cdot \vi F(t_k, \vi x(t_k)) \implies \locTrErr_k = O(\diff t_{k+1}^2). \] We say that a method is of order \(p\) if \(\locTrErr = O(\diff t^{p+1})\), i.e., the Euler method is of first order.

This can surely be improved by using more terms of the Taylor series expansion, see (1.32), but the necessity of computing the derivatives of \(\vi F\) is a major drawback, often outweighing any potential benefits. Another possibility would be to allow the computation of the next value to be dependent on multiple previous values, leading to multistep methods, see Section 1.3.2. The last option — and the one we shall use — is to modify a one-step method to produce intermediate estimates of the solution at multiple different points (called stages), which are then combined into the final approximation of the solution at the next step — this way, we obtain so-called Runge-Kutta methods.

Algorithm 1.2: \(s\)-stage explicit Runge-Kutta (ERK) method

Input: ODE in form (1.31), stepsizes \(\seqnc{\diff t}{i}{1}{N}\), \(s \in \N\) (the “number of stages”), real coefficients \(a_{21}, a_{31}, a_{32},\dots, a_{s1}, a_{s2}, \dots, a_{s,s-1}\), \(b_1, \dots, b_s\), and \(c_2, \dots, c_s\)
Output: Sequence of \(N+1\) points \(\seqnc{\vi x}{i}{0}{N}\)
begin
for \(i = 1\) to \(N\) do
calculate \[\begin{align*} \vi k_1 &\gets \vi F (t_0, \vi x_{i-1}), \\ \vi k_2 &\gets \vi F (t_0 + c_2 \diff t_i, \vi x_{i-1} + \diff t_i a_{21} \vi k_1), \\ \vi k_3 &\gets \vi F (t_0 + c_3 \diff t_i, \vi x_{i-1} + \diff t_i (a_{31} \vi k_1 + a_{32} \vi k_2)), \\ &\;\vdots \\ \vi k_s &\gets \vi F (t_0 + c_s \diff t_i, \vi x_{i-1} + \diff t_i (a_{s1} \vi k_1 + \dots + a_{s, s-1} \vi k_{s-1})); \end{align*}\] set \(\vi x_i \gets \vi x_{i-1} + \diff t_i (b_1 \vi k_1 + \dots + b_s \vi k_s)\)
end
end

Usually, we constrain the coefficients to fulfill \[ c_m = \sum_{n = 1}^{m - 1} a_{mn}, \tag{1.35}\] which represents the assumption that all points, where \(\vi F\) is evaluated, are first-order approximations of the solution. It can be shown that (1.35) significantly simplifies the derivation of order conditions (1.36), but for low order, it is not necessary.

Our task is now to choose the coefficients in Algorithm 1.2 so that the order of the method is maximized. This corresponds to computing the Taylor series expansion for each \(\vi k_j\) in the state variable of \(\vi F\). By combining these results, we can compare them with the Taylor expansion of \(\vi x_i\). Hence, we obtain the following conditions (given for a 4-stage method), using \(c_1 = 0\): \[ \begin{aligned} \sum_{j} b_j &= 1, & \sum_{j} b_j c_j &= \frac 1 2, \\ \sum_{j} b_j c_j^2 &= \frac 1 3, & \sum_{j,m} b_j a_{jm} c_m &= \frac 1 6, \\ \sum_{j} b_j c_j^3 &= \frac 1 4, & \sum_{j,m} b_j c_j a_{jm} c_m &= \frac 1 8, \\ \sum_{j,m} b_j a_{jm} c_m^2 &= \frac 1 {12}, & \sum_{j,m,n} b_j a_{jm} a_{mn} c_n &= \frac 1 {24}. \end{aligned} \tag{1.36}\]

In other words, if the coefficients of the 4-stage ERK method comply with conditions (1.36), the method is of fourth order, i.e., \(\locTrErr_i = O(\diff t_i^5)\). Typically, “the Runge-Kutta” method is used: \[ \rcases{ \vi k_1 &= \vi F\brackets{t_0, \vi x_0} \\ \vi k_2 &= \vi F\brackets{t_0 + \frac 1 2 \diff t, \vi x_0 + \frac 1 2 \vi k_1 \diff t} \\ \vi k_3 &= \vi F\brackets{t_0 + \frac 1 2 \diff t, \vi x_0 + \frac 1 2 \vi k_2 \diff t} \\ \vi k_4 &= \vi F\brackets{t_0 + \diff t, \vi x_0 + \vi k_3 \diff t} } \; \vi x_1 = \vi x_0 + \diff t \brackets{\frac 1 6 \vi k_1 + \frac 2 6 \vi k_2 + \frac 2 6 \vi k_3 + \frac 1 6 \vi k_4}. \tag{1.37}\]

Throughout this thesis, we will use solvers available in DifferentialEquations.jl, see [24], for Julia programming language, see [25], and ode45 (or ode23 for stiff ODEs) for MATLAB, see [26].

1.3.2 Numerical Integration of DDEs

As invested readers can surely recall, we have already discussed the method of steps in Section 1.2.2 as an analytical way to solve DDEs. There, we reformulated the problem of integrating a DDE forward in time as a sequence of ODE initial-value problems, which needed to be integrated on smaller intervals. A detailed treatment of the numerical integration of DDEs can be found in [18].

Consider the following DDE IVP¹³, \[ \rcases{ \dvi x(t) = \vi f(t, \vi x(t), \vi x(t - \tau)), && t_0 \leq t \leq t_f, \\ \vi x_{\tau} = \phi, && } \tag{1.38}\] where \(\vi x(\cdot) \in \R^n\) and \(\vi f: [t_0, t_f] \times \R^n \times \R^n \to \R^n\), and a set of mesh points \(\mcal T = \set{t_0, t_1, \dots, t_N = t_f}\), such that either \(t_n - \tau < t_0\) or \(t_n - \tau \in \mcal T\). Note that the presented methods can be extended to IVPs with multiple delays, e.g., (1.24).

Once such a method is available, any ODE numerical integrator that operates strictly on the mesh \(\mcal T\) may be used — these are typically called linear multistep (LMS) methods. For instance, Algorithm 1.3 shows the explicit (forward) Euler method adapted for (1.38).

Algorithm 1.3: Euler method (explicit) for DDEs

Input: DDE in form (1.38), time discretization mesh \(\mcal T = \set{t_0, \dots, t_N = t_f}\)
Output: Sequence of \(N+1\) points \(\seqnc {\vi x} i 0 N\)
begin
for \(i = 1\) to \(N\) do
\(\vi x_i \gets \vi x_{i - 1} + \vi f(t_{i - 1}, \vi x_{i - 1}, \vi x_j) \diff t_i\)
end
end

However, this approach puts strict requirements on the mesh \(\mcal T\), which renders the method unusable in many cases. For example, for the IVP \[ \begin{aligned} \dot{x}(t) &= x(t/2), && 0 \leq t \leq 1, \\ x(0) &= 1, && \end{aligned} \] no finite mesh \(\mcal T\) can be found, as for any \(t \in \mcal T\) also \(t/2\) must belong to the mesh (thus no initial stepsize exists). Nevertheless, in general, one can construct a suitable finite mesh \(\mcal T\) for any bounded interval \([t_0, t_f]\) if there is no vanishing delay point in \([t_0, t_f]\), see Definition 1.25 and Theorem 1.11. Bellen and Zennaro [18], see page 38, outline a possible strategy for doing so, along with its pitfalls. Also, note that this approach can be rather easily extended to Adams-like methods and even some Runge-Kutta formulae (such as the Heun method).

Another possible approach is to replace the strict constraints on the time mesh \(\mcal T\) by an interpolation method. Nonetheless, most often the discontinuity points belonging to \([t_0, t_f]\) are included in \(\mcal T\), as ODE solvers generally require smoothness of the solution (or at least such property is highly beneficial for accuracy).

Throughout the years, the theory of continuous ODE methods, see Definition 1.30, has been developed. Continuous ODE methods build on a very general class of \(k\)-step numerical ODE integrators for (1.31) of the form \[ \vi x_{i+1} = \alpha_{i,1} \vi x_n + \dots + \alpha_{i,k} \vi x_{i - k+1} + \diff t_{i+1} \incrementF{\vi x_i, \dots, \vi x_{i - k + 1}; \mcal T_i, \vi F}, \tag{1.39}\] \(i \geq k - 1\), where \(\mcal T_i = \set{t_{i - k + 1}, \dots, t_i, t_{i+1}}\) and where the increment function \(\incrementFunc\) is globally Lipschitz continuous with respect to \(\vi x\) arguments (\(\vi F\) is also assumed to be globally Lipschitz continuous in \(\vi x\)).

Definition 1.30 (Continuous ODE method [18]) We call continuous extension, or interpolant, of the ODE method (1.39) the piecewise polynomial function \(\vi \eta(t)\) given by the restrictions on each interval \([t_i, t_{i+1}]\) of any interpolant based on values computed in a possibly larger interval \([t_{i - k_i}, t_{i+l_i + 1}]\), \(k_i, l_i \geq 0\), of the form \[ \begin{aligned} \vi \eta(t_i + \tht \diff t_{i+1}) ={}& \beta_{i,1}(\tht) \vi x_{i + l_i} + \dots + \beta_{i, l_i + k_i + 1}(\tht) \vi x_{i - k_i} \\ &+ \diff t_{i+1} \incrementF{\vi x_{i + l_i}, \dots, \vi x_{i - k_i}; \tht, \vi F, \mcal T'_i}, \quad 0 \leq \tht \leq 1, \end{aligned} \tag{1.40}\] where \(\mcal T'_i = \set{t_{i - k_i}, \dots, t_{i + j_i}, t_{i + j_i + 1}}\), satisfying the continuity conditions \[ \vi \eta(t_i) = \vi x_i \quad \& \quad \vi \eta(t_{i+1}) = \vi x_{i+1}. \tag{1.41}\]

If \(l_i = k_i = 0\), then the interpolation procedure is based on values belonging to the sole interval \([t_i, t_{i+1}]\) and is referred to as one-step interpolation. Otherwise, the interpolation procedure is called multistep interpolation. In short, any ODE method (1.39) endowed with its continuous extension (1.40) will be called a continuous ODE method.

An arbitrary ODE solver can then be used as a continuous ODE method (with \(j_i = 0\) for computational reasons). Note that even explicit ODE methods in general lead to implicit continuous ODE methods. For brevity, discussion on convergence and other numerical properties is omitted. In implementations, we will predominantly use DelayDiffEq.jl package by [27] in Julia or dde23 in MATLAB.

Tip 1.2: Improvements to dde23

Throughout the work on this thesis and related publications, we have noticed a significantly worse performance of the dde23 method compared to the equivalent DDE solver in Julia, see this Git repository. Humusoft, the distributor of MATLAB for the Czech Republic and Slovakia, determined that the issue lies in the code to find appropriate index \(i\) such that \(t - \tau \in [t_i, t_{i+1}]\) for a given \(t\) (necessary for determining, e.g., \(\mcal T'_i\)). The main inefficiency was caused by searching for \(i\) starting from the most distant past, see [28] for reference on the implementation. This problem is fixed in MATLAB version R2024b onward.

1.3.3 Numerical Integration of SDEs

In Section 1.1.5, we have presented a framework for dealing with stochastic dynamical systems. These very often arise from noisy measurements of the underlying system. Using the discussed concepts, we can represent this as a discrete-time approximation of an Itô process. In particular, we will use the Euler-Maruyama approximation. Consider an Itô process \(\rX = \set{\rX(t) \divider t_0 \leq t \leq T}\) satisfying the scalar SDE \[ \dd \rX(t) = a(t, \rX(t)) \dd t + b(t, \rX(t)) \dd \rW(t), \; \rX(t_0) = \rX_0 \tag{1.42}\] on \(t_0 \leq t \leq T\).

For a given discretization \(t_0 < t_1 < \dots < t_n = T\) of the time interval \([t_0, T]\), an Euler-Maruyama approximation is a continuous-time stochastic process \(\rY = \set{\rY(t) \divider t_0 \leq t \leq T}\) satisfying the following iterative scheme (using \(\rY(t_n) = \rY_n\)) \[ \rY_{n+1} = \rY_n + a(t_n, \rY_n) (t_{n+1} - t_n) + b(t_n, \rY_n)(\rW(t_{n+1}) - \rW(t_n)) \tag{1.43}\] for \(n = 0, 1, \dots, N-1\) with initial value \(\rY_0 = \rX_0\). For conciseness, we shall denote \(\diff t_n = t_{n+1} - t_n\) and call \(\delta = \max_n \diff t_n\) the maximum time step. Most often, we shall use equidistant discretization, i.e., \(t_n = t_0 + n \delta\), and in such case we will use \(\diff t = \delta \equiv \diff t_n = (T - t_0) / N\). Similarly, let \(\diff \rW_n \letDef \rW(t_{n+1}) - \rW(t_n)\) be the random independent increments of the Wiener process, see Definition 1.24, corresponding to the discretization. From the definition immediately follows that \[ \diff \rW_n \sim \normalD{0, \diff t_n} \quad \& \quad \diff \rW \sim \normalD{0, \diff t}. \] Thus, for equidistant discretization, (1.43) becomes \[ \rY_{n+1} = \rY_n + a(t_n, \rY_n) \diff t + b(t_n, \rY_n) \diff \rW, \tag{1.44}\] and as such \(\set{\rY_n}_n\) can be interpreted as a Markov chain. This can be easily extended to a vector-valued SDE, \[ \dd \vr X(t) = \vi a(t, \vr X(t)) \dd t + \vi B(t, \vr X(t)) \dd \vr W(t), \; \vr X(t_0) = \vr X_0, \tag{1.45}\] where, assuming \(\vr X(t, \omega) \in \R^n\), we have \(\vi a(t, \vr X(t, \omega)) \in \R^n\), \(\vi B(t, \vr X(t, \omega)) \in \R^{n \times n}\), and \(\dd \vr W(t)\) is a differential corresponding to a vector of Wiener processes with prescribed correlation (and hence \(\diff \vr W_n \in \R^{n \times n}\) has a matrix structure relating to the underlying correlation). Finally, this is captured in Algorithm 1.4.

Algorithm 1.4: Euler-Maruyama method

Input: SDE in form (1.45), time discretization \(\seqnc{\diff t}{i}{1}{N}\)
Output: Discretized random process \(\vr Y\) to \(\seqnc t i 0 N\)
begin
set \(\vr Y_0 \gets \vr X_0\) by (1.45)
for \(i = 0\) to \(N - 1\) do
\(\vr Y_{n+1} \gets \vr Y_n + \vi a(t_n, \vr Y_n) \diff t_n + \vi B(t_n, \vr Y_n) \diff \vr W_n\)
end
end

Let us note that for implementation, we use the EM method from DifferentialEquations.jl by [24] for Julia and a custom implementation for MATLAB, see [26]. Furthermore, the relevant theory of stochastic functional differential equations (SFDEs) will not be covered in this document to ensure a focused and concise scope. To an interested reader, we recommend [29] for a general overview and [30] for more details on linear and affine SFDEs. As an example of actual use, in [31], we have employed the Euler-Maruyama method to compare stochastic and deterministic simulations of the Pinsky-Rinzel model, see [32] and [33], for specific values of parameters, deduced from preceding bifurcation analysis, producing notable signal patterns comparable with those measured experimentally.

1.3.4 One-Dimensional Optimization

Optimization is nigh ubiquitous in both nature and mathematics. For us, it will prove most fruitful to study optimization in one dimension, where the situation is, fortunately, the easiest. Most of the material is taken from [34] and [35].

The methods we will present can be collectively called bracketing, as they rely on iteratively shrinking an interval inside which a local minimum lies. Furthermore, a single-minimum requirement shall be placed on functions to be optimized. For functions on \(\R\), this translates to so-called unimodality.

Definition 1.31 (Unimodal function) Let \(I \subset \R\) be an interval and \(f: I \to \R\) a function. We say \(f\) is unimodal if there exists \(x^* \in I\) such that

\(f(x_1) > f(x_2)\) for arbitrary \(x_1, x_2 \in I\) satisfying \(x_1 < x_2 < x^*\);
\(f(x_1) < f(x_2)\) for arbitrary \(x_1, x_2 \in I\) satisfying \(x^* < x_1 < x_2\).

Note 1.5: Unimodality

As a careful reader might notice, a unimodal function is necessarily monotone decreasing on \((-\infty, x^*) \cap I\) and monotone increasing on \(I \cap (x^*, \infty)\). Moreover, unimodality implies only quasi-convexity¹⁴ and, on the other hand, only strict convexity implies unimodality of a function on \(f: I \to \R\).

Lemma 1.3 Let \(f: I \to \R\) be unimodal on \(I\) and consider \(x_1, x_2 \in I\) such that \(x_1 < x_2\), then

\(f(x_1) \leq f(x_2) \implies x^* \leq x_2\);
\(f(x_1) \geq f(x_2) \implies x_1 \leq x^*\).

Proof. Follows directly from Definition 1.31.

Consider now the following optimization problem \[ \min_{x \in I} f(x), \quad \text{resp.} \quad \argmin_{x \in I} f(x), \tag{1.46}\] for an unimodal function \(f: I \to \R\), \(I = [a,b]\). Furthermore, let \(N\) denote the allowed number of evaluations of \(f\) and assume the accuracy of presented methods is \(\absval{\bar x - x^*}\), where \(x^*\) is the exact solution of (1.46) and \(\bar x\) its approximation found by the used method. Using Lemma 1.3, one can easily find the initial bracket to be successively shrunk if it is not provided in the definition of the optimization problem (1.46).

Bisection Method

Consider now \(N = 2k\), \(k \in \N\), choose \(\delta \in \left(0, \frac{b - a} 2\right]\) sufficiently small and let \(a_0 \letDef a, b_0 \letDef b\) set the initial bracket \([a_0, b_0]\). We then define \[ x_i^- \letDef \frac {a_{i - 1} + b_{i - 1}} 2 - \delta \; \& \; x_i^+ \letDef \frac {a_{i - 1} + b_{i - 1}} 2 + \delta \tag{1.47}\] for \(i = 1, \dots, k\). Now we can examine in which “half” of the interval (including the offset \(\delta\)) the minimum lies, i.e., we evaluate \(f(x_i^-), f(x_i^+)\) and adjust the bracket accordingly (by Lemma 1.3)

\(f(x_i^-) < f(x_i^+) \implies [a_i, b_i] = [a_{i - 1}, x_i^+]\);
\(f(x_i^-) > f(x_i^+) \implies [a_i, b_i] = [x_i^-, b_{i - 1}]\).

The center \(\bar x_k\) of the \(k\)-th bracket is assumed to approximate the exact solution \(x^*\) with accuracy \(l_k/2\), where \[ l_k = \frac {b - a} {2^k} + \frac {(2^k - 1)\delta} {2^{k - 1}} \] denotes the length of the \(k\)-th bracket. It can be seen that \(l_k \to 2 \delta\) for \(k \to \infty\).

Algorithm 1.5: Bisection method

Input: Optimization problem (1.46), initial interval \(I = [a,b]\), \(\delta \in \left(0, \frac {b - a} 2\right]\), number of steps \(k \in \N\)
Output: Approximation of minimum \(\bar{x}_k\)
begin
set \(a_0 \gets a, b_0 \gets b\)
for \(i = 1\) to \(k\) do
calculate \(x_i^- \gets \frac {a_{i - 1} + b_{i - 1}} 2 - \delta\) & \(x_i^+ \gets \frac {a_{i - 1} + b_{i - 1}} 2 + \delta\)
if \(f(x_i^-) < f(x_i^+)\) do
set \([a_i, b_i] \gets [a_{i - 1}, x_i^+]\)
else if \(f(x_i^-) > f(x_i^+)\) do
set \([a_i, b_i] \gets [x_i^-, b_{i - 1}]\)
else do
set¹⁵ \([a_i, b_i] \gets [x_i^-, b_{i - 1}]\)
end
set \(\bar{x}_i \gets \frac {a_i + b_i} 2\)
end
end

Golden Section Method

The bisection method we have presented above has one significant flaw. Every iteration, i.e., every bracket improvement, requires two separate evaluations which are, in general, never used again. Thus, the Golden section method was constructed, requiring only a single evaluation.

Tip 1.3: Golden ratio

Before delving into the technical details, let us note in this subsection \(\goldRatio \approx 1.618\) denotes the positive solution of \[ \goldRatio^2 - \goldRatio - 1 = 0, \tag{1.48}\] i.e., \(\frac 1 \goldRatio \approx 0.618\), and is generally referred to as the golden ratio. Equivalently, it is also given as \[ \frac {a + b} a = \frac a b = \goldRatio \tag{1.49}\] for \(a > b > 0\) which satisfy the above equality.

Consider a unimodal function \(f: I \to \R\) on the interval \(I = [a,b]\) subject to an optimization problem (1.46) and evaluation-count restriction \(N \geq 2\) (or desired accuracy \(\ve\)). Let us relax the bisection method such that \[ x_i^- = a_{i-1} + \delta_- (b_{i-1} - a_{i-1}), \; x_i^+ = a_{i-1} + \delta_+ (b_{i-1} - a_{i-1}) \tag{1.50}\] and we shall attempt to find a values of \(\delta_-, \delta_+ \in (0,1)\) such that only 1 iteration is needed. Assume \(f(x_i^-) < f(x_i^+)\), then (by Lemma 1.3 as in Algorithm 1.5) \[ [a_i, b_i] = [a_{i-1}, x_i^+] = \parentheses{a_{i-1}, a_{i-1} + \delta_+ (b_{i-1} - a_{i-1})} \] and \[ x_{i+1}^- = a_i + \delta_- (b_i - a_i) \; x_{i+1}^+ = a_i + \delta_+ (b_i - a_i). \] If we want to reuse the \(f(x_i^-)\) evaluation, then necessarily \[ \begin{aligned} x_{i+1}^+ &= x_i^- \\ a_i + \delta_+ (b_i - a_i) &= a_{i-1} + \delta_- (b_{i-1} - a_{i-1}) \\ a_{i-1} + \delta_+ (a_{i-1} + \delta_+ (b_{i-1} - a_{i-1}) - a_{i-1}) &= a_{i-1} + \delta_- (b_{i-1} - a_{i-1}) \\ \delta_+^2 (b_{i-1} - a_{i-1}) &= \delta_- (b_{i-1} - a_{i-1}) \\ &\implies \delta_+^2 = \delta_-. \end{aligned} \] Now we shall focus on \(f(x_i^-) \geq f(x_i^+)\), denoting \(\delta = \delta_+\), which by the same argument leads to \[ [a_i, b_i] = [x_i^-, b_{i-1}] = \parentheses{a_{i-1} + \delta^2 (b_{i-1} - a_{i-1}), b_{i-1}}, \] which together with the analogous condition \(x_{i+1}^- = x_i^+\) yields (using \(l_i = b_i - a_i\)) \[ \begin{aligned} x_{i+1}^- &= x_i^+ \\ a_i + \delta^2 (b_i - a_i) &= a_{i-1} + \delta (b_{i - 1} - a_{i - 1}) \\ a_{i-1} + \delta^2 (b_{i - 1} - a_{i - 1}) + \delta^2 \brackets{b_{i - 1} - a_{i-1} - \delta^2 (b_{i - 1} - a_{i - 1})} &= a_{i - 1} + \delta(b_{i - 1} - a_{i - 1}) \\ \delta^2 l_{i - 1} + \delta^2 l_{i - 1} - \delta^4 l_{i-1} &= \delta l_{i - 1} \\ \delta (\delta^3 - 2 \delta + 1) &= 0 \\ \delta (\delta - 1) (\delta^2 + \delta - 1) &= 0. \end{aligned} \] Recall we defined \(\delta_-, \delta_+ \in (0,1)\), thus \(\delta \in (0,1)\) ruling out \(\delta = 0\) and \(\delta = 1\) roots. What remains is \(\delta^2 + \delta - 1 = 0\), which has the solution \(\delta = \frac {\sqrt{5} - 1} 2\).

Moreover, notice that \[ \frac{x_{i+1}^- - a_i} {x_{i+1}^+ - a_i} = \frac {a_i + \delta^2 l_i - a_i} {a_i + \delta l_i - a_i} = \delta = \frac {a_i + \delta l_i - a_i} {l_i} = \frac {x_{i+1}^+ - a_i} {l_i}, \] i.e., we split the interval \([a_i, b_i]\) such that the ratio between the whole \([a_i, b_i]\) and the bigger part \([a_i, x_{i+1}^+]\) is the same as between the smaller part \([a_i, x_{i+1}^-]\) and the bigger part. Rephrasing this ratio equality as (using \(\delta^2 = 1 - \delta\)) \[ \begin{aligned} \frac 1 {\delta} = \frac{x_{i+1}^+ - a_i} {x_{i+1}^- - a_i} &= \frac {l_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + b_i - x_{i-1}^+} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + b_i - a_i - \delta l_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + l_i - \delta l_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + (1 - \delta) l_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + \delta^2 l_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + a_i + \delta^2 l_i - a_i} {x_{i+1}^+ - a_i} \\ &= \frac {x_{i-1}^+ - a_i + x_{i-1}^- - a_i} {x_{i+1}^+ - a_i} \end{aligned} \] gives us the relation (1.49) defining the golden ratio, i.e., \(\goldRatio = \frac 1 \delta\). The method bears its name due to this fact. Lastly, see Algorithm A.1 for implementation.

Fibonacci method

Let us now approach the problem from a different direction. Given the minimization problem (1.46) and maximum evaluations \(N\), we want to shrink the initial bracket as much as possible.

Suppose only 2 evaluations are allowed. Then one possibility is to query \(f\) at thirds of the initial interval. The final bracket will have a two-thirds length of the initial interval, but it can be improved. Consider the distance between the evaluations \(\ve\), then the final bracket is guaranteed to shrink to half of the original interval (i.e., shrinkage by a factor of 2) as \(\ve \to 0\) in the limit sense, see Figure 1.4.

Figure 1.4: Shrinkage by \(\ve\)-distance between the only two allowed evaluations.

Suppose \(N = 3\) evaluations are now permitted. Then the best course of action is to divide the initial into thirds, which results in the next bracket having \(\frac 2 3\)-length. If we perform the last \(f\)-query in the \(\ve\)-distance from the interior evaluation of the previous round (one is guaranteed to lie inside the next-iteration bracket), then by \(\ve \to 0\), in the limit sense, we get a shrinkage by a factor of 3, see Figure 1.5.

Figure 1.5: Sketch of the Fibonacci method bracket shrinkage when \(N = 3\).

In general, this leads to evaluation given by the Fibonacci numbers \[ F_n = \lcases{ 1, & n \leq 2, \\ F_{n-1} + F_{n_2}, & n > 2, } \] which prescribe the queries at \[ x_i^- = a_{i-1} + \frac {F_{N - i - 1}} {F_{N - i + 1}} l_{i - 1}, \quad x_i^+ = a_{i - 1} + \frac {F_{N - i}} {F_{N - i + 1}} l_{i-1}. \tag{1.51}\]

For more details on the derivation, see [35]. Moreover, the Fibonacci series can be written explicitly using Binet’s formula: \[ F_n = \frac {\goldRatio^n - (1 - \goldRatio)^n} {\sqrt{5}}, \tag{1.52}\] where \(\goldRatio\) is the golden ratio, see Tip 1.3, from which can be shown \[ \frac 1 {\rho(i)} = \frac {F_i} {F_{i - 1}} = \goldRatio \frac {1 - s^i} {1 - s^{i - 1}}, \tag{1.53}\] where \(s = \frac {1 - \sqrt{5}} {1 + \sqrt{5}}\). The algorithm of the Fibonacci method, heavily based on implementation from [34], is provided in Algorithm A.2.

1.3.5 Newton’s Method

Last, but not least, we shall discuss Newton’s, or Newton-Raphson, method for finding the roots of a given equation (or a system thereof). The treatment provided here follows [36]. Suppose we are given a system of equations \[ \begin{aligned} f_1(x_1, \dots, x_n) &= 0 \\ f_2(x_1, \dots, x_n) &= 0 \\ &\vdots \\ f_n(x_1, \dots, x_n) &= 0 \end{aligned} \iff \vi F(\vi x) = \vi 0. \tag{1.54}\]

Let \(\vi x^i \in \R^n\) be known and fixed, for example, a previous iteration, then (1.54) can be approximated using Taylor’s series in the neighborhood of \(\vi x^i\). Corresponding first-order Taylor expansion is \[ f_k(\vi x) \approx f_k(\vi x^i) + \sum_{j = 1}^n \pDeriv {f_k(\vi x^i)}{x_j} (x_j - x^i_j). \]

Using the vector notation, we can collectively approximate (1.54) as \[ \vi F(\vi x^i) + \Jacobi_i (\vi x - \vi x^i) = \vi 0, \tag{1.55}\] where \(\Jacobi_i = \jacobi_{\vi x} \vi F(\vi x^i)\), i.e., \((\Jacobi_i)_{k,j} = \pDeriv {f_k(\vi x^i)}{x_j}\). Certainly a solution to (1.55) approximates the true solution to (1.54). Then, by sequentially approximating the true solution from points getting closer, the approximations also get more accurate. In other words, we can write the following iterative process to find the true solution (to an arbitrary accuracy) \[ \vi x^{i+1} = \vi x^i - \Jacobi_i\Inv \vi F(\vi x^i). \tag{1.56}\]

While this formula for Newton’s method is very common, we shall also present the residual form \[ \Jacobi_i \cdot \diff \vi x^{i+1} = \vi R(\vi x^i), \tag{1.57}\] where \(\vi R(\vi x) = - \vi F(\vi x)\) is the residual, and \(\diff \vi x^{i+1} = \vi x^{i+1} - \vi x^i\) is the displacement, which play the role of unknowns in (1.57). As a careful reader might notice, instead of requiring an inverse of the Jacobian matrix \(\Jacobi_i\), we have translated it into a linear system of equations. If \(\Jacobi_i\Inv\) exists and is numerically stable¹⁶, then we can use it to solve (1.57). On the other hand, if \(\Jacobi_i\) is badly conditioned (or singular), so \(\Jacobi_i\Inv\) cannot be reliably found, then one can use iterative methods to find an approximate solution. Assuming \(\diff \vi x^{i+1}\) is the solution to (1.57), next iteration for (1.56) is computed as \[ \vi x^{i+1} = \vi x^i + \diff \vi x^{i+1}. \]

There are many possible stopping conditions for this iterative Newton’s method, but often one can see \[ \norm{\vi x^i - \vi x^{i-1}} \leq \ve_{\text{err}} \; \& \; \norm{\vi R(\vi x^i)} \leq \ve_{\text{res}}, \] where \(\ve_{\text{err}}\) is the given tolerance for the solution error and \(\ve_{\text{res}}\) for the residual.

Note 1.6: Modifications to Newton’s method

Here, we have only discussed the standard Newton’s method, but there exist several modifications including the Newton-chord method, Newton-Broyden method, and various quasi-Newton methods.

1.4 Neuronal Dynamics

Neuron is the fundamental building block of the neural system of many living organisms on Earth. As such, how we understand it directly impacts a significant portion of our (perception of the) world. Naturally, mathematical models and concepts from computational neuroscience play a key role in this understanding. Let us note this section is based on [37] and [38], with relevant publications referenced when needed.

1.4.1 Physiology of a Neuron

Every cell has a cell membrane on its boundary, consisting of a phospholipid bilayer and irregularly spaced ion channels, which separate the internal structures of the cell from the surrounding external environment. However, from our point of view, its most important feature is the selective permeability, which allows it to permit the passage of certain chemicals while restricting others. As a simplification, we may assume it lets the potassium cations \(\ce{K^+}\) go through freely but heavily obstructs the passage of sodium cations \(\ce{Na^+}\). Thanks to this property, a membrane potential emerges for most cells. The potential is induced by the lower concentration of positively charged particles (typically potassium cations \(\ce{K^+}\) diffuse outward) inside the cell than outside. By the interplay of diffusion and Coulomb’s law acting on particles in and outside of the cell, the membrane potential stabilizes at \(-70\) mV (typically called the resting potential).

Certain cells, among which are neurons, possess the ability to control the permeability of their membrane. A local change of the permeability is called the action potential if the change triggers a characteristic sequence of events resulting in the return to the resting state. On the contrary, a small change is generally compensated by slow regulating mechanisms.

Note 1.7: All-or-nothing law

The threshold for activation of the action potential and its constant characteristic behavior induce the so-called all-or-nothing law, which says that a neuron sends a signal if and only if it receives a strong enough impulse and that the signal is de facto the same every time, see [39].

During the action potential, the corresponding part of the cell membrane opens its ion channels permeable to \(\ce{Na^+}\). Driven by the electrochemical gradient, these ions enter the cell, making the membrane potential more positive, a process known as depolarization. This change triggers the opening of voltage-gated potassium (\(\ce{K^+}\)) channels, allowing \(\ce{K^+}\) ions to flow out of the cell, thereby restoring the membrane potential in a process called repolarization. This, in turn, hyperpolarizes the cell membrane (the membrane potential is even lower than the resting value at this point). The entire process is then slowly reversed as sodium cations exit the cell and are replaced by potassium cations. Note that the overpopulation of \(\ce{Na^+}\) inside the cell restricts its ability to undergo the action potential until it sufficiently lowers, limiting the possible spiking frequency of an individual neuron.

Although the action potential is fundamentally a local behavior of the neuronal cell membrane, it triggers a chain reaction around the initial point of excitation, which causes neighboring parts of the membrane to undergo the same process.

1.4.2 Conductance-based Models

As can be seen from the previous subsection, a key quantity for a mathematical model of a single isolated neuron is likely to be the membrane potential, ideally affected by variables describing the permeability of various ion channels. Indeed, conductance-based models, see [40], are established on an equivalent circuit representation of a cell membrane, introduced by Hodgkin and Huxley [41]. Here, ion channels of an excitable cell are modeled as conductances and its lipid bilayer by a capacitor. In its simplest form, a conductance-based model represents a neuron by a single compartment with homogeneous potential, disregarding ion movements between subcellular compartments, and captures only ion movements between the inside and outside of the cell.

The original model by Hodgkin and Huxley [41], constructed based on measurements of the giant axon of a squid, reads as \[ \begin{aligned} I_{\text{ext}} &= C \dot{V} + g_K n^4 (V - V_K) + g_{Na} m^3 h (V - V_{Na}) + g_L(V - V_L), \\ \dot n &= \alpha_n (V) (1 - n) - \beta_n(V)n, \\ \dot m &= \alpha_m (V) (1 - m) - \beta_m(V)m, \\ \dot h &= \alpha_h (V) (1 - h) - \beta_h(V)h, \\ \end{aligned} \tag{1.58}\] where \(I_c = C_m \dot{V}\) describes the current flowing into the lipid bilayer capacitance (\(C\) is the capacity of the membrane) and \(I_{\text{ion}_i} = g_i (V - V_i)\) current through a given ion channel¹⁷ \(i\) as the product of its conductance \(g_i\) and the corresponding driving potential \(V - V_i\). Here, \(V_i\) denotes so-called reverse potentials, which capture associated resting potentials. Lastly, \(m,n,h\) model the probabilities of respective ion channels being open. The exact formulas for functions \(\alpha_i, \beta_i\) with \(i = m,n,h\), and parameter values can be found in the original article.

The several conductance-based models, founded on the Hodgkin–Huxley formalism, appeared after the breakthrough publication of Hodgkin and Huxley. Among the most influential is the Morris–Lecar model, see [42], which they formulated after studying the muscle fiber of a barnacle. Nevertheless, nowadays it is used to model the dynamics of human neurons. It reduces the 4-dimensional Hodgkin–Huxley model into 2 dimensions, see [43], \[ \begin{aligned} C \dot V &= - g_{Ca} m\inSS(V) (V - V_{Ca}) - g_K w(V - V_K) - g_L(V - V_L) + I_{\text{ext}}, \\ \dot w &= \frac {w\inSS(V) - w} {T_w(V)}, \end{aligned} \tag{1.59}\] where \(w\) is the “recovery variable” for the \(\ce{K^+}\) channel prescribing the probability of the corresponding ion channel being open. Moreover, the open-state probability functions \(m\inSS, w\inSS\) are derived from assumptions about the steady-state and its distribution of open and closed ion channels. Finally, \(T_w\) is used to set a correct timescale for \(w\).

The last model we shall consider is the interneuron model proposed by White et al., see [44]. It was used to describe the synchronization of oscillations and the loss thereof in a network of inhibitory neurons. Nonetheless, it allows us to fast-spiking neuron cells. The governing equations read as, see [45], \[ \begin{aligned} C \dot V &= I_{\text{ext}} - \overbrace{g_L (V - V_L)}^{I_L} - \overbrace{g_{Na} m^3 h (V - V_{Na})}^{I_{Na}} - \overbrace{g_K n^4 (V - V_K)}^{I_K}, \\ \dot h &= \frac {h\inSS(V) - h}{T_h(V)}, \\ \dot n &= \frac {n\inSS(V) - n}{T_n(V)}, \end{aligned} \tag{1.60}\] where \(C\) again indicates the capacitance of the neuronal membrane, \(I_{\text{ext}}\) is the external stimulus and \(I_L, I_{Na}, I_K\) incorporate the relationship between the membrane voltage \(V\) (typically in \(mV\)) and currents through respective channels. Analogously, \(g_i\) and \(V_i\) act as the maximum conductance and reverse potential, respectively, for an ion channel \(i\). Researchers noticed that the recovery variable \(m\) in (1.58) almost immediately converges to its corresponding steady state, thus using the slow-fast dimensionality reduction method, they replaced it with its asymptotic value (so-called quasi-equilibrium)¹⁸, \[ m = m\inSS(V) = \frac 1 {1 + \Exp{-0.08(V + 26)}}. \] The dynamics of activation variables \(h,n\) are further related to their corresponding steady-state distributions and correctly scaled in time, i.e., \[ \begin{aligned} h\inSS(V) &= \frac 1 {1 + \Exp{0.13(V + 38)}}, & T_h(V) &= \frac {0.6} {1 + \Exp{-0.12(V + 67)}}, \\ n\inSS(V) &= \frac 1 {1 + \Exp{-0.045(V + 10)}}, & T_n(V) &= 0.5 + \frac {2} {1 + \Exp{0.045(V - 50)}}. \end{aligned} \] Our choice of parameters is based on [46], \[ \begin{aligned} C &= 1, & g_L &= 0.1, & g_{Na} &= 30, & g_K &= 20, \\ I_{\text{ext}} &= 24, & V_L &= -60, & V_{Na} &= 45, & V_K &= -80. \end{aligned} \tag{1.61}\]

1.4.3 Couplings

So far, we have only studied single neurons, but in nature, a far more common occurrence is a (large) coupled network of neurons. However, for simplicity, we shall devote our attention only to a case of 2 weakly coupled neurons. For more information, see [47].

In general, couplings can be either chemical or electrical. In the case of electrical coupling, there is a direct connection from one neuron to another, facilitated via a channel or a gap junction. Although more common is the (slower) chemical synapse (also called the synaptic coupling), per [44] and [48], we will consider solely the electrical coupling (our goal is to study high-frequency oscillations, which would be inhibited by the use of slow coupling).

We model the gap-junctional coupling effect as external currents directly proportional to the differences between membrane potentials for both coupled neurons. In particular, it yields \[ \begin{aligned} C_i \dot V_i(t) &= I_{\text{ext}} - I^{\text{channels}}_i(t) - \lambda (V_i(t) - V_{\hat i}(t - \tau)), \\ &\vdots \end{aligned} \tag{1.62}\] where \(V_i\) marks the membrane potential of the \(i\)-th neuron, \(C_i\) capacity of its membrane, \(\lambda\) is the coupling strength, \(\hat{i}\) denotes the “opposite” coupled neuron, and \(\tau > 0\) a possible delay in the information introduced by the coupling. As such, the system of two \(\lmbd\)-weakly coupled interneurons, see (1.60), becomes \[ \rcases{ C_1 \dot V_1(t) &= I_{\text{ext}} - g_L (V_1(t) - V_L) - g_{Na} m\inSS^3(V_1(t)) h_1(t) (V_1(t) - V_{Na}) \\ &\quad {} - g_K n_1^4(t) (V_1(t) - V_K) - \lmbd (V_1(t) - V_2(t - \tau)), \\ \dot h_1(t) &= \frac {h\inSS(V_1(t)) - h_1(t)}{T_h(V_1(t))}, \\ \dot n_1(t) &= \frac {n\inSS(V_1(t)) - n_1(t)}{T_n(V_1(t))}, \\ C_2 \dot V_2(t) &= I_{\text{ext}} - g_L (V_2(t) - V_L) - g_{Na} m\inSS^3(V_2(t)) h_2(t) (V_2(t) - V_{Na}) \\ &\quad {} - g_K n_2^4(t) (V_2(t) - V_K) - \lmbd (V_2(t) - V_1(t - \tau)), \\ \dot h_2(t) &= \frac {h\inSS(V_2(t)) - h_2(t)}{T_h(V_2(t))}, \\ \dot n_2(t) &= \frac {n\inSS(V_2(t)) - n_2(t)}{T_n(V_2(t))}. } \tag{1.63}\]

In practice, numerical experiments in neuroscience are often hard to set up due to the complicated nature of the formulas of the systems and sheer number of their parameters. As such, throughout the work on this thesis, we developed a Julia package NeuronToolbox.jl, see its Git repository, to aid in this process. We illustrate this on a coupled pair of an interneuron model (1.60) and a Morris-Lecar model (1.59). The coupling is facilitated through a gap junction, see (1.62), with the coupling strength \(\lmbd = 0.01\) and corresponding delay \(\tau = 0.01\).

Code

using NeuronToolbox, ComponentArrays

IN_params = ComponentVector(C=1, I=24, gL=0.1, EL=-60, 
    gNa=30, ENa=45, gK=20, EK=-80)
ML_params = Models.MorrisLecarUtils.example_parameters()

IN = Models.Interneuron(IN_params)
ML = Models.MorrisLecar(ML_params)

lambda = 0.01
tau = 0.01
sync_coupling = Couplings.FirstVarDifference(lambda)
async_coupling = Couplings.OneStepDelayed(sync_coupling, 
    partition(2, [3, 2]), tau)

coupled_pair = Networks.CoupledPair(IN, ML, async_coupling)

The resulting network coupled_pair can then be used as a normal function prescribing a (delay) differential equation with DifferentialEquations.jl. Also note that we provide various different models and network types, e.g., an \(m\times m\) adjacency matrix coupling for a system of \(m\) models.

The evolution map \(\evolOp\) is continuous, if \(t,s \in \timeSet\), \(x \in \stateSpace\) and if \(\set{(t_n, s_n)}_n\) is a sequence in \(\timeSet \times \timeSet\) and \(\Seqnc {x_n} n {}\) is a sequence in \(\stateSpace\) such that \((t_n, s_n) \to (t,s)\) and \(x_n \to x\), then \(\evolOp(t_n, s_n, x_n) \to \evolOp(t,s,x)\).↩︎
The equation (1.2) describes the Verhulst model of a population (and its growth, characterized by \(r_0\)) in some closed environment with some finite capacity (controlled by \(K\)).↩︎
For a careful reader, let us note we assume \(t_k \to \infty\) is well-defined, though it might restrict full generality of Definition 1.1.↩︎
In the context of topology, a homeomorphism (also called a bicontinuous function) is a bijective and continuous function, such that its inverse is also continuous.↩︎
For conciseness, we use the following notation \(\oneToN{n} := \set{1, \dots, n}.\)↩︎
Eigenvalues of fixed points of discrete-time dynamical systems are often called multipliers, see [7].↩︎
We denote \(\borelAlgebra\) the set of all Borel sets, i.e., Borel \(\sigma\)-algebra.↩︎
Let us note \(\acov(s,t) = \cov(\rX(s), \rX(t)) = \Expect{(\rX(s) - \expect \rX(s))(\rX(t) - \expect \rX(t))}\) denotes the autocovariance function and \(\acf(s,t) = \corr(\rX(s), \rX(t)) = \frac {\cov (\rX(s), \rX(t))} {\sqrt{\variance \rX(s) \variance \rX(t)}}\) the autocorrelation function of a stochastic process \(X(t)\).↩︎
A random variable is called Gaussian white noise if it has normal distribution with zero mean.↩︎
For clarity, we use \(\vi x(t, \sigma, \vi \phi)\) to denote the solution of (1.25).↩︎
For semidynamical systems, we will use \(\contStateSpace\) instead of \(\stateSpace\) to emphasize that \(\vi x \in \contStateSpace\) will typically be some (vector-valued) history function.↩︎
In general, the increment function \(\incrementFunc\) at the \(k+1\)-st step may depend on \(t_0, \dots, t_k\), \(\vi x_0, \dots, \vi x_k\), and \(\diff t_1, \dots, \diff t_{k + 1}\).↩︎
In general, methods discussed in this subsection allow a time-dependent delay \(\tau\), i.e., \(\tau(t)\).↩︎
A function \(f: S \to \R\), where \(S\) is a real vector space, is convex on \(S\) if for all \(\vi x, \vi y \in S\) and \(\lmbd \in [0,1]\) holds \(f(\lmbd \vi x + (1 - \lmbd) \vi) \leq \lmbd f(\vi x) + (1 - \lmbd) f(\vi y)\) and strictly convex if the inequality is strict. Furthermore, we call it quasi-convex on \(S\) if for any \(\vi x, \vi y \in S\) and \(\lmbd \in [0,1]\) we have \(f(\lmbd \vi x + (1 - \lmbd) \vi y) \leq \minOf{f(\vi x), f(\vi y)}\).↩︎
In the case of equality, both options for a new bracket are equally valid.↩︎
In such case we can calculate it straight or implicitly via so-called left division, which in Julia relies on pivoted QR decomposition.↩︎
Note that \(L\) denotes the abstract leak ion channel, which aggregates all the smaller, less-significant ion channels into one.↩︎
An attentive reader may notice that a deviation from the quasi-equilibrium is often used for modelling of recovery variables, see, for instance, (1.59) or (1.60).↩︎