JavedA's picture
init
a67ae61
raw
history blame
11.4 kB
% =================================================
% ================ Meet \gls{cnm} =======================
% =================================================
\section{Cluster-based Network Modeling (CNM)}
\label{sec_1_1_2_CNM}
In this subsection, the workflow of \gls{cnm} \cite{Fernex2021} will be elaborated, as well as the previous attempt to expand the algorithm to accommodate a range of model parameter values $\vec{\beta}$.
\gls{cnm} \cite{Fernex2021} is the basis on which \gls{cnmc} is built or rather
\gls{cnmc} invokes \gls{cnm} multiple times for one of its preprocessing steps.
CNM can be split up into 4 main tasks, which are
data collection, clustering, calculating
transition properties and propagation.
The first step is to collect the data, which can be provided from any dynamic system or numerical simulations.
In this study, only dynamical systems are investigated.
Once the data for the dynamical system is passed to \gls{cnm}, the data is clustered, e.g., with k-means++ algorithm \cite{Arthur2006}.
A detailed elaboration about this step is given in section \ref{sec_2_3_Clustering}. \gls{cnm} exploits graph theory for approximating the trajectory as a movement on nodes.
These nodes are equivalent to the centroids, which are acquired through clustering.
Next, the motion, i.e., movement from one centroid to another, shall be clarified.\newline
In order to fully describe the motion on the centroids, the time at which
one centroid is visited is exited, and also the order of movement must be known.
Note, when saying the motion is on the centroids, that
means the centroids or characteristic nodes do not move
at all. The entire approximated motion of the original trajectory
on the nodes is described with the transition
property matrices $\bm Q$ and $\bm T$.
The matrices $\bm Q$ and $\bm T$ are the transition probability and transition time matrices, respectively.
$\bm Q$ is used to apply probability theory for predicting the next following most likely centroid. In other words, if
the current location is at any node $c_i$,
$\bm Q$ will provide all possible successor centroids
with their corresponding transition probabilities.
Thus, the motion on the centroids
through $\bm Q$ is probability-based.
In more detail, the propagation of the motion on the centroids can be described as equation \eqref{eq_34}.
The variables are denoted as the propagated $\vec{x}(t)$ trajectory, time $t$, centroid positions $\vec{c}_k,\, \vec{c}_j$, the time $t_j$ where centroid $\vec{c}_j$ is left and the transition time $T_{k,j}$ from $\vec{c}_j$ to $\vec{c}_k$ \cite{Fernex2021}.
Furthermore, for the sake of a smooth trajectory, the motion between the centroids is interpolated through a spline interpolation.\newline
\begin{equation}
\vec{x}(t) = \alpha_{kj} (t) \, \vec{c}_k + [\, 1 - \alpha_{kj} (t)\,] \, \vec{c}_j, \quad \alpha_{kj} (t) = \frac{t-t_j}{T_{k,j}}
\label{eq_34}
\end{equation}
The $\bm Q$ matrix only contains non-trivial transitions, i.e.,
if after a transition the centroid remains on the same centroid, the transition is not considered to be a real transition in \gls{cnm}.
This idea
is an advancement to the original work of Kaiser et al. \cite{Kaiser2014}.
In Kaiser et al. \cite{Kaiser2014} the transition is modeled
as a Markov model. Markov models enable non-trivial transitions. Consequently,
the diagonals of the resulting non-direct transition matrix $\bm{Q_n}$
exhibits the highest values. The diagonal elements stand for non-trivial
transitions which lead to idling on the same centroid
many times. Such behavior is encountered and described by Kaiser et al. \cite{Kaiser2014}.\newline
There are 3 more important aspects that come along when
adhering to Markov models. First, the propagation of motion is done
by matrix-vector multiplication. In the case of the existence of a
stationary state, the solution
will converge to the stationary state, with an increasing number of iterations, where no change with time happens.
A dynamical system can only survive as long as change with time exists.
In cases where no change with respect to time is encountered, equilibrium
or fixed points are found.
Now, if a stationary state or fixed point
exists in the considered dynamical system, the propagation
will tend to converge to this fixed point. However, the nature of
Markov models must not necessarily be valid for general dynamical systems.
Another way to see that is by applying some linear algebra. The
long-term behavior of the Markov transition matrix can be obtained
with equation \eqref{eq_3_Infinite}. Here, $l$ is the number
of iterations to get from one stage to another. Kaiser et al.
\cite{Kaiser2014} depict in a figure, how the values of
$\bm{Q_n}$ evolves after $1 \mathrm{e}{+3}$ steps. $\bm{Q_n}$ has
become more uniform.
\begin{equation}
\label{eq_3_Infinite}
\lim\limits_{l \to \infty} \bm{Q_n}^l
\end{equation}
If the number of steps is increased even further
and all the rows would have the same probability value,
$\bm{Q_n}$ would converge to a stationary point. What
also can be concluded from rows being equal is that it does not matter
from where the dynamical system was started or what its
initial conditions were. The probability
to end at one specific state or centroid is constant as
the number of steps approaches infinity. Following that,
it would violate the sensitive dependency on initial conditions,
which often is considered to be mandatory for modeling chaotic systems. Moreover, chaotic
systems amplify any perturbation exponentially, whether at time
$t = 0$ or at time $t>>0$. \newline
Thus, a stationary transition matrix $\bm{Q_n}$ is prohibited by chaos at any time step.
This can be found to be one of the main reasons, why
the \textbf{C}luster \textbf{M}arkov based \textbf{M}odeling (\gls{cmm})
often fails to
predict the trajectory.
Li et al. \cite{Li2021} summarize this observation
compactly as after some time the initial condition
would be forgotten and the asymptotic distribution would be reached.
Further, they stated, that due to this fact, \gls{cmm} would
not be suited for modeling dynamical systems.
The second problem which is involved, when deploying
regular Markov modeling is that the future only depends
on the current state. However, \cite{Fernex2021} has shown
with the latest \gls{cnm} version that incorporating also past
centroid positions for predicting the next centroid position
increases the prediction quality. The latter effect is especially
true when systems are complex.\newline
However, for multiple consecutive time steps
the trajectories position still could be assigned to the same
centroid position (trivial transitions).
Thus, past centroids are those centroids that are found when going
back in time through only non-trivial transitions. The number of incorporated
past centroids is given as equation \eqref{eq_5_B_Past}, where $L$ is denoted
as the model order number. It represents the number of all
considered centroids, where the current and all the past centroids are included, with which the prediction of the successor centroid
is made.
\begin{equation}
B_{past} = L -1
\label{eq_5_B_Past}
\end{equation}
Furthermore, in \cite{Fernex2021} it is not simply believed that an
increasing model
order $L$ would increase the outcome quality in every case.
Therefore, a study on the number of $L$ and the clusters $K$
was conducted. The results proved that the choice of
$L$ and $K$ depend on the considered dynamical system.
\newline
The third problem encountered when Markov models are used is
that the time step must be provided. This time step is used
to define when a transition is expected. In case
the time step is too small, some amount of iterations is
required to transit to the next centroid. Thus, non-trivial
transitions would occur. In case the time step is too high,
the intermediate centroids would be missed. Such behavior
would be a coarse approximation of the real dynamics. Visually this can
be thought of as jumping from one centroid to another while
having skipped one or multiple centroids. The reconstructed
trajectory could lead to an entirely wrong representation of the
state-space.
CNM generates the transition time matrix $\bm T$ from data
and therefore no input from the user is required.\newline
A brief review of how the $\bm Q$ is built shall be provided.
Since the concept of
model order, $L$ has been explained, it can be clarified that
it is not always right to call $\bm Q$ and $\bm T$ matrices.
The latter is only correct, if $L = 1$, otherwise it must be
denoted as a tensor. $\bm Q$ and $\bm T$ can always be
referred to as tensors since a tensor incorporates matrices, i.e., a matrix is a tensor of rank 2.
In order to generate $\bm Q$,
$L$ must be defined, such that the shape of $\bm Q$ is
known. The next step is to gather all sequences of clusters
$c_i$. To understand that, we imagine the following scenario,
$L = 3$, which means 2 centroids from the past and the
current one are
incorporated to predict the next centroid.
Furthermore, imagining that two cluster sequence scenarios were found,
$c_0 \rightarrow c_1 \rightarrow c_2 $ and $c_5 \rightarrow c_1 \rightarrow c_2 $.
These cluster sequences tell us that the current centroid is $c_2$ and the remaining centroids belong to the past.
In order to complete the sequence for $L = 3$, the successor cluster also needs
to be added, $c_0 \rightarrow c_1 \rightarrow c_2 \rightarrow c_5 $ and $c_5 \rightarrow c_1 \rightarrow c_2 \rightarrow c_4$.
The following step is to calculate the likelihood
of a transition to a specific successor cluster. This is done with equation \eqref{eq_4_Poss}, where $n_{k, \bm{j}}$
is the amount of complete sequences, where also the successor
is found. The index $j$ is written as a vector in order
to generalize the equation for $L \ge 1$. It then contains
all incorporated centroids from the past and the current centroid.
The index $k$ represents the successor centroid ($\bm{j} \rightarrow k$).
Finally, $n_{\bm{j}}$ counts all the matching incomplete sequences.
\begin{equation}
\label{eq_4_Poss}
P_{k, \bm j} = \frac{n_{k,\bm{j}}}{n_{\bm{j}}}
\end{equation}
After having collected all the possible complete cluster sequences with their corresponding probabilities $\bm Q$, the transition time tensors $\bm T$ can be inferred from the data.
With that, the residence time on each cluster is known and can be
used for computing the transition times for every
single transition. At this stage, it shall be highlighted again,
CNM approximates its data fully with only two
matrices or when $L \ge 2$ tensors, $\bm Q$ and $\bm T$. The
final step is the prorogation following equation \eqref{eq_34}.
For smoothing the propagation between two centroids the B-spline interpolation
is applied.
% It can be concluded that one of the major differences between \gls{cnm} and \gls{cmm} is that {cnm} dismissed Markov modeling.
% Hence, only direct or non-trivial transition are possible.
% Fernex et al. \cite{Fernex2021} improved \cite{Li2021} by
% rejecting one more property of Markov chains, namely
% that the future state could be inferred exclusively from the current state.
% Through the upgrade of \cite{Fernex2021}, incorporating past states for the prediction of future states could be exploited.