	% =================================================
	% ================ Meet \gls{cnm} =======================
	% =================================================
	\section{Cluster-based Network Modeling (CNM)}
	\label{sec_1_1_2_CNM}
	In this subsection, the workflow of \gls{cnm} \cite{Fernex2021} is elaborated, as well as the previous attempt to expand the algorithm to accommodate a range of model parameter values $\vec{\beta}$.
	\gls{cnm} \cite{Fernex2021} is the basis on which \gls{cnmc} is built, or rather,
	\gls{cnmc} invokes \gls{cnm} multiple times for one of its preprocessing steps.
	\gls{cnm} can be split up into four main tasks, which are
	data collection, clustering, calculating
	transition properties and propagation.
	The first step is to collect the data, which can be provided by any dynamical system or numerical simulation.
	In this study, only dynamical systems are investigated.
	Once the data of the dynamical system is passed to \gls{cnm}, the data is clustered, e.g., with the k-means++ algorithm \cite{Arthur2006}.
	A detailed elaboration of this step is given in section \ref{sec_2_3_Clustering}. \gls{cnm} exploits graph theory to approximate the trajectory as a movement on nodes.
	These nodes are equivalent to the centroids, which are acquired through clustering.
	Next, the motion, i.e., the movement from one centroid to another, shall be clarified.\newline

	In order to fully describe the motion on the centroids, the times at which
	a centroid is visited and exited, as well as the order of movement, must be known.
	Note that saying the motion is on the centroids does not
	mean that the centroids or characteristic nodes themselves move;
	they remain fixed. The entire approximated motion of the original trajectory
	on the nodes is described with the transition
	property matrices $\bm Q$ and $\bm T$.
	The matrices $\bm Q$ and $\bm T$ are the transition probability and transition time matrices, respectively.
	$\bm Q$ is used to apply probability theory for predicting the most likely next centroid. In other words, if
	the current location is at any node $c_i$,
	$\bm Q$ provides all possible successor centroids
	with their corresponding transition probabilities.
	Thus, the motion on the centroids
	through $\bm Q$ is probability-based.
	In more detail, the propagation of the motion on the centroids is described by equation \eqref{eq_34}.
	The variables denote the propagated trajectory $\vec{x}(t)$, the time $t$, the centroid positions $\vec{c}_k,\, \vec{c}_j$, the time $t_j$ at which centroid $\vec{c}_j$ is left and the transition time $T_{k,j}$ from $\vec{c}_j$ to $\vec{c}_k$ \cite{Fernex2021}.
	Furthermore, for the sake of a smooth trajectory, the motion between the centroids is interpolated with splines.\newline

	\begin{equation}
	\vec{x}(t) = \alpha_{kj} (t) \, \vec{c}_k + [\, 1 - \alpha_{kj} (t)\,] \, \vec{c}_j, \quad \alpha_{kj} (t) = \frac{t-t_j}{T_{k,j}}
	\label{eq_34}
	\end{equation}
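
	As a brief numerical illustration of equation \eqref{eq_34}, assume the purely hypothetical values $t_j = 2$ and $T_{k,j} = 0.5$, which are not taken from \cite{Fernex2021}. At $t = 2.25$, the weight and the propagated state become
	\[
	\alpha_{kj}(2.25) = \frac{2.25 - 2}{0.5} = 0.5, \qquad \vec{x}(2.25) = 0.5 \, \vec{c}_k + 0.5 \, \vec{c}_j ,
	\]
	i.e., the state lies halfway between both centroids. For $t = t_j$ the weight is $\alpha_{kj} = 0$ and $\vec{x} = \vec{c}_j$, whereas for $t = t_j + T_{k,j}$ the weight is $\alpha_{kj} = 1$ and $\vec{x} = \vec{c}_k$.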


	The $\bm Q$ matrix only contains non-trivial transitions, i.e.,
	if the state remains on the same centroid after a time step, this is not considered to be a real transition in \gls{cnm}.
	This idea
	is an advancement over the original work of Kaiser et al. \cite{Kaiser2014}.
	In Kaiser et al. \cite{Kaiser2014} the transitions are modeled
	with a Markov model. Markov models permit trivial transitions. Consequently,
	the diagonal of the resulting non-direct transition matrix $\bm{Q_n}$
	exhibits the highest values. The diagonal elements stand for trivial
	transitions, which lead to idling on the same centroid
	many times. Such behavior is encountered and described by Kaiser et al. \cite{Kaiser2014}.\newline


	Three more important aspects arise when
	adhering to Markov models. First, the propagation of motion is done
	by matrix-vector multiplication. If a
	stationary state exists, the solution
	converges to it with an increasing number of iterations, after which no change with time happens.
	A dynamical system, however, remains dynamic only as long as change with time exists.
	Where no change with respect to time is encountered, equilibrium
	or fixed points are found.
	Now, if a stationary state or fixed point
	exists in the considered dynamical system, the propagation
	will tend to converge to this fixed point. However, this property of
	Markov models does not necessarily hold for general dynamical systems.
	Another way to see this is by applying some linear algebra. The
	long-term behavior of the Markov transition matrix can be obtained
	with equation \eqref{eq_3_Infinite}. Here, $l$ is the number
	of iterations to get from one stage to another. Kaiser et al.
	\cite{Kaiser2014} depict in a figure how the values of
	$\bm{Q_n}$ evolve after $10^{3}$ steps: $\bm{Q_n}$ has
	become more uniform.

	\begin{equation}
	\label{eq_3_Infinite}
	\lim\limits_{l \to \infty} \bm{Q_n}^l
	\end{equation}

	If the number of steps is increased even further
	until all rows have the same probability values,
	$\bm{Q_n}^l$ has converged to a stationary point. What
	can also be concluded from the rows being equal is that it does not matter
	where the dynamical system was started or what its
	initial conditions were. The probability
	of ending at one specific state or centroid is constant as
	the number of steps approaches infinity. This
	would violate the sensitive dependence on initial conditions,
	which is often considered to be mandatory for modeling chaotic systems. Moreover, chaotic
	systems amplify any perturbation exponentially, whether at time
	$t = 0$ or at time $t \gg 0$. \newline
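
	The loss of the initial condition can be illustrated with a minimal two-centroid example. The matrix below is a hypothetical choice for illustration and is not taken from \cite{Kaiser2014}; it is assumed that a probability vector $\vec{p}^{\,l}$ is propagated as $\vec{p}^{\,l+1} = \bm{Q_n} \, \vec{p}^{\,l}$. For
	\[
	\bm{Q_n} = \left( \begin{array}{cc} 0.9 & 0.1 \\ 0.1 & 0.9 \end{array} \right),
	\qquad
	\bm{Q_n}^l = \frac{1}{2} \left( \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right) + \frac{0.8^{\,l}}{2} \left( \begin{array}{cc} 1 & -1 \\ -1 & 1 \end{array} \right),
	\]
	the second term vanishes for $l \to \infty$, such that all entries of $\bm{Q_n}^l$ approach $0.5$. Any initial probability vector is then mapped to $(0.5, \, 0.5)^{\top}$, regardless of the centroid on which the motion was started.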

	Thus, a stationary transition matrix $\bm{Q_n}$ contradicts chaotic behavior at any time step.
	This can be regarded as one of the main reasons why
	the \textbf{C}luster \textbf{M}arkov based \textbf{M}odeling (\gls{cmm})
	often fails to
	predict the trajectory.
	Li et al. \cite{Li2021} summarize this observation
	compactly: after some time, the initial condition
	is forgotten and the asymptotic distribution is reached.
	Further, they state that, due to this fact, \gls{cmm} is
	not suited for modeling dynamical systems.
	The second problem involved in deploying
	regular Markov modeling is that the future only depends
	on the current state. However, Fernex et al. \cite{Fernex2021} have shown
	with the latest \gls{cnm} version that also incorporating past
	centroid positions for predicting the next centroid position
	increases the prediction quality. This effect is especially
	pronounced for complex systems.\newline


	However, for multiple consecutive time steps,
	the trajectory's position could still be assigned to the same
	centroid (trivial transitions).
	Thus, past centroids are those centroids that are found when going
	back in time through non-trivial transitions only. The number of incorporated
	past centroids is given by equation \eqref{eq_5_B_Past}, where $L$ denotes
	the model order. It represents the number of all
	considered centroids, i.e., the current and all the past centroids, with which the prediction of the successor centroid
	is made.

	\begin{equation}
	B_{past} = L -1
	\label{eq_5_B_Past}
	\end{equation}
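
	As an illustration with hypothetical cluster labels, assume the sampled trajectory is assigned to the sequence $c_0, c_0, c_1, c_1, c_1, c_2, c_2$. Removing the trivial transitions leaves $c_0 \rightarrow c_1 \rightarrow c_2$. For $L = 3$, equation \eqref{eq_5_B_Past} gives $B_{past} = 2$, i.e., the current centroid $c_2$ and the two past centroids $c_1$ and $c_0$ are used to predict the successor centroid.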

	Furthermore, \cite{Fernex2021} does not simply assume that an
	increasing model
	order $L$ improves the outcome quality in every case.
	Therefore, a study on the model order $L$ and the number of clusters $K$
	was conducted. The results showed that the choice of
	$L$ and $K$ depends on the considered dynamical system.
	\newline

	The third problem encountered when Markov models are used is
	that the time step must be provided. This time step is used
	to define when a transition is expected. If
	the time step is too small, several iterations are
	required to transit to the next centroid. Thus, trivial
	transitions occur. If the time step is too large,
	intermediate centroids are missed. Such behavior
	is a coarse approximation of the real dynamics. Visually, this can
	be thought of as jumping from one centroid to another while
	skipping one or multiple centroids. The reconstructed
	trajectory could lead to an entirely wrong representation of the
	state-space.
	\gls{cnm} generates the transition time matrix $\bm T$ from data
	and therefore no input from the user is required.\newline
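
	As a rough illustration of the time step problem with hypothetical numbers, assume the trajectory visits the centroids one after another and resides for about $1$ time unit on each of them. With a time step of $\Delta t = 0.1$, roughly $10$ consecutive snapshots fall on the same centroid, so about $9$ out of $10$ observed transitions are trivial. With $\Delta t = 2.5$, consecutive snapshots are separated by more than two residence times, so at least one intermediate centroid is skipped.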

	A brief review of how $\bm Q$ is built shall be provided.
	Since the concept of the
	model order $L$ has been explained, it can be clarified that
	it is not always right to call $\bm Q$ and $\bm T$ matrices.
	This is only correct if $L = 1$; otherwise, they must be
	denoted as tensors. $\bm Q$ and $\bm T$ can always be
	referred to as tensors since tensors include matrices, i.e., a matrix is a tensor of rank 2.
	In order to generate $\bm Q$,
	$L$ must be defined, such that the shape of $\bm Q$ is
	known. The next step is to gather all sequences of clusters
	$c_i$. To understand this, consider the following scenario:
	$L = 3$, which means that 2 centroids from the past and the
	current one are
	incorporated to predict the next centroid.
	Furthermore, suppose that two cluster sequences were found,
	$c_0 \rightarrow c_1 \rightarrow c_2 $ and $c_5 \rightarrow c_1 \rightarrow c_2 $.
	These cluster sequences tell us that the current centroid is $c_2$ and the remaining centroids belong to the past.
	In order to complete the sequences for $L = 3$, the successor cluster also needs
	to be added, $c_0 \rightarrow c_1 \rightarrow c_2 \rightarrow c_5 $ and $c_5 \rightarrow c_1 \rightarrow c_2 \rightarrow c_4$.
	The following step is to calculate the likelihood
	of a transition to a specific successor cluster. This is done with equation \eqref{eq_4_Poss}, where $n_{k, \bm{j}}$
	is the number of complete sequences, i.e., those in which the successor
	is also found. The index $\bm{j}$ is written as a vector in order
	to generalize the equation for $L \ge 1$. It contains
	all incorporated centroids from the past and the current centroid.
	The index $k$ represents the successor centroid ($\bm{j} \rightarrow k$).
	Finally, $n_{\bm{j}}$ counts all the matching incomplete sequences.

	\begin{equation}
	\label{eq_4_Poss}
	P_{k, \bm j} = \frac{n_{k,\bm{j}}}{n_{\bm{j}}}
	\end{equation}
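
	To illustrate equation \eqref{eq_4_Poss} with hypothetical counts, assume that the incomplete sequence $\bm{j} = (c_0, c_1, c_2)$ is found $n_{\bm{j}} = 4$ times in the data, and that it is followed three times by $c_5$ and once by $c_3$. Then
	\[
	P_{5, \bm{j}} = \frac{3}{4} = 0.75, \qquad P_{3, \bm{j}} = \frac{1}{4} = 0.25,
	\]
	and the probabilities over all successors of $\bm{j}$ sum to one.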

	After having collected all the possible complete cluster sequences with their corresponding probabilities, which form $\bm Q$, the transition time tensor $\bm T$ can be inferred from the data.
	From the data, the residence time on each cluster is known and can be
	used for computing the transition times for every
	single transition. At this stage, it shall be highlighted again that
	\gls{cnm} approximates its data fully with only two
	matrices, or tensors when $L \ge 2$, namely $\bm Q$ and $\bm T$. The
	final step is the propagation following equation \eqref{eq_34}.
	For smoothing the propagation between two centroids, B-spline interpolation
	is applied.

	% It can be concluded that one of the major differences between \gls{cnm} and \gls{cmm} is that {cnm} dismissed Markov modeling.
	% Hence, only direct or non-trivial transition are possible.
	% Fernex et al. \cite{Fernex2021} improved \cite{Li2021} by
	% rejecting one more property of Markov chains, namely
	% that the future state could be inferred exclusively from the current state.
	% Through the upgrade of \cite{Fernex2021}, incorporating past states for the prediction of future states could be exploited.