	### First version of CNMc {#sec-subsec_1_1_3_first_CNMc}
	Prior to this thesis, there has already been an
	attempt to build \glsfirst{cnmc}.
	The procedure, progress and results of the most recent effort are described in [@Max2021].
	As in this thesis, the main idea there was to predict the trajectories
	of dynamical systems with a control term or a model parameter value $\beta$.
	In this subsection, [@Max2021] is reviewed and
	the parts that need to be improved are pointed out. In addition, some distinctions between the previous version of \gls{cnmc} and the most recent version are named.
	Further applied modifications are provided in chapter [-@sec-chap_2_Methodology].\newline

	To avoid confusion between the \gls{cnmc} version described in this thesis and the prior \gls{cnmc} version, the old version will be referred to as first CNMc.
	First CNMc starts by defining a range of model parameter values
	$\vec{\beta}$.
	It was designed specifically to make predictions for the Lorenz attractor [@lorenz1963deterministic] only, which is described by the set of equations @eq-eq_6_Lorenz given in section [-@sec-sec_2_2_Data_Gen].
	An illustrative trajectory of the Lorenz system [@lorenz1963deterministic] with $\beta = 28$ is depicted in figure @fig-fig_2_Lorenz_Example.\newline

	<!-- % ============================================================================== -->
	<!-- % ============================ PLTS ============================================ -->
	<!--% ==============================================================================-->

	![Illustrative trajectory of the Lorenz attractor [@lorenz1963deterministic], $\beta = 28$](../../3_Figs_Pyth/1_Task/2_Lorenz.svg){#fig-fig_2_Lorenz_Example}
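
A trajectory of this kind can be generated with a few lines of Python. The following is a minimal sketch and not the solver used in [@Max2021]. It assumes the common Lorenz parametrization with $\sigma = 10$ and $b = 8/3$ held fixed, and it assumes that the model parameter $\beta$ of this thesis takes the place of the coefficient usually written as $\rho$. The function names, the initial condition, the step size and the hand-written fourth-order Runge-Kutta scheme are illustrative choices.

```python
import numpy as np


def lorenz_rhs(state, beta, sigma=10.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz system (assumed parametrization)."""
    x, y, z = state
    return np.array([
        sigma * (y - x),
        x * (beta - z) - y,
        x * y - b * z,
    ])


def integrate_rk4(beta, state0=(1.0, 1.0, 1.0), dt=0.01, n_steps=5000):
    """Fixed-step classical RK4; returns an array of shape (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for i in range(n_steps):
        s = traj[i]
        k1 = lorenz_rhs(s, beta)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, beta)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, beta)
        k4 = lorenz_rhs(s + dt * k3, beta)
        traj[i + 1] = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj


# One trajectory for the model parameter value shown in the figure.
trajectory = integrate_rk4(beta=28.0)
```

Repeating the call for every entry of a chosen $\vec{\beta}$ yields the set of trajectories that is subsequently clustered.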

	Having chosen a range of model parameter values $\vec{\beta}$, the Lorenz system was solved numerically and its solution was supplied to \gls{cnm} in order to run k-means++ on all received trajectories.
	<!-- % It assigns each data point to a cluster and -->
	<!-- % calculates all the $K$ cluster centroids for all provided trajectories. -->
	<!-- % Each cluster has an identity that in literature is known as a label, with which it can be accessed. -->
	The k-means++ algorithm allocates the centroid labels randomly.
	Therefore, the centroid labels of one model parameter value $\beta_i$ must be matched explicitly to those of another model parameter value $\beta_j$, where $i \neq j$, which is performed in 3 steps.
	The first two steps are ordering the $\vec{\beta}$ in ascending
	order and transforming the Cartesian coordinate system into a spherical coordinate system.
	With the azimuth angle now available, the centroids are labeled in order of increasing azimuth angle.
	The third step is to match the centroids across $\vec{\beta}$, i.e., $\beta_i$ with $\beta_j$.
	For this purpose, the centroid label from the prior model parameter value
	is used as a reference to match its corresponding nearest centroid in the next model parameter value.
	As a result, one label can be assigned to one centroid across the available $\vec{\beta}$.\newline
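
The three steps above can be sketched as follows. This is one possible reading of the procedure and not the code of [@Max2021]: the azimuth ordering is applied to the smallest $\beta$ only, and the nearest-centroid search is made greedy so that every centroid is used exactly once. Both choices, as well as all function names, are assumptions made for illustration.

```python
import numpy as np


def order_by_azimuth(centroids):
    """Relabel centroids (K, 3) so that label k has the k-th smallest azimuth."""
    centroids = np.asarray(centroids, dtype=float)
    azimuth = np.arctan2(centroids[:, 1], centroids[:, 0])
    return centroids[np.argsort(azimuth)]


def match_to_reference(reference, candidates):
    """Greedy matching: row k is the still-unassigned candidate nearest to reference[k]."""
    candidates = np.asarray(candidates, dtype=float)
    dist = np.linalg.norm(reference[:, None, :] - candidates[None, :, :], axis=2)
    matched = np.empty_like(candidates)
    free = np.ones(len(candidates), dtype=bool)
    for k in range(len(reference)):
        j = int(np.argmin(np.where(free, dist[k], np.inf)))
        matched[k] = candidates[j]
        free[j] = False
    return matched


def match_across_betas(betas, centroids_per_beta):
    """Sort the betas ascending and propagate the labels from one beta to the next."""
    order = np.argsort(betas)
    matched = [order_by_azimuth(centroids_per_beta[order[0]])]
    for idx in order[1:]:
        matched.append(match_to_reference(matched[-1], centroids_per_beta[idx]))
    return np.asarray(betas, dtype=float)[order], np.stack(matched)


# Synthetic example: 4 centroids on a circle, labels shuffled for each beta.
rng = np.random.default_rng(0)
angles = np.array([-2.5, -1.0, 0.5, 2.0])
base = np.column_stack([np.cos(angles), np.sin(angles), np.zeros(4)])
centroids_per_beta = [
    (1.1 * base)[rng.permutation(4)],  # belongs to beta = 30
    base[rng.permutation(4)],          # belongs to beta = 28
]
betas_sorted, matched = match_across_betas([30.0, 28.0], centroids_per_beta)
```

In this synthetic example, row $k$ of `matched[0]` and row $k$ of `matched[1]` refer to the same centroid, regardless of the random labels the clustering returned.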


	The described matching procedure has three shortcomings.
	Firstly, [@Max2021] showed that ambiguous regions can
	occur, in which the matching of the centroids across the $\vec{\beta}$ can
	no longer be trusted.
	Secondly, the deployed coordinate transformation is assumed to work properly only in 3 dimensions. It is possible to set one
	or two variables to zero in order to use it in two or one dimension, respectively.
	However, it is not known whether such an artificial reduction of dimensions yields a successful outcome for lower-dimensional (2- and 1-dimensional) dynamical systems. In a 4-dimensional or even higher-dimensional case, the proposed coordinate transformation cannot be used at all.
	In conclusion, the transformation can only be relied upon in 3 dimensions.
	Thirdly, as also acknowledged by [@Max2021], the
	coordinate transformation requires the dynamical system to have
	a circular-like trajectory, e.g., as the Lorenz system depicted in figure @fig-fig_2_Lorenz_Example does.
	Since not every dynamical system has a circular-like trajectory, this is one of the major parts that needs to be improved if first CNMc is to be leveraged for all kinds of dynamical systems.
	Neither the number of dimensions nor the shape of the trajectory should matter for a generalized \gls{cnmc}.\newline


	Once the centroids are matched across all the available $\vec{\beta}$, pySINDy [@Brunton2016; @Silva2020; @Kaptanoglu2022] is used
	to build a regression model. This regression model serves the purpose
	of capturing all centroid positions of the calculated model parameter
	values $\vec{\beta }$ and making predictions for unseen $\vec{\beta}_{unseen}$.
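
The idea of such a regression can be illustrated without calling pySINDy itself. The sketch below is a hand-written sequentially thresholded least-squares fit, i.e., the sparse-regression idea behind SINDy [@Brunton2016], applied to a single centroid coordinate as a function of $\beta$. The polynomial library, the threshold, the synthetic training data and all function names are assumptions and do not reproduce the setup of [@Max2021].

```python
import numpy as np


def polynomial_library(beta, degree=2):
    """Candidate terms [1, beta, beta^2, ...] as columns."""
    beta = np.asarray(beta, dtype=float)
    return np.column_stack([beta**p for p in range(degree + 1)])


def stlsq(theta, target, threshold=1e-3, n_iter=10):
    """Sequentially thresholded least squares for one target column."""
    coef = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0            # discard negligible terms
        active = ~small
        if not active.any():
            break
        # Refit using only the terms that survived the thresholding.
        coef[active] = np.linalg.lstsq(theta[:, active], target, rcond=None)[0]
    return coef


# Synthetic stand-in for one centroid coordinate at the training betas.
betas_train = np.linspace(28.0, 33.0, 11)
x_train = 2.0 + 0.5 * betas_train

coef = stlsq(polynomial_library(betas_train), x_train)

# Prediction of the coordinate for an unseen model parameter value.
x_pred = polynomial_library([30.25]) @ coef
```

One such fit per centroid and per coordinate gives the centroid positions for any $\vec{\beta}_{unseen}$ inside the trained range.
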
	Next, a preprocessing step is performed on the
	transition property tensors $\boldsymbol Q$ and $\boldsymbol T$. Both are
	scaled, which is assumed to reduce the risk of a bias.
	Then, \glsfirst{nmf} [@Lee1999] is
	applied to both.
	Following equation @eq-eq_5_NMF, \gls{nmf} [@Lee1999] returns
	two matrices, i.e., $\boldsymbol W$ and $\boldsymbol H$.
	Both matrices have a physically
	relevant meaning: $\boldsymbol W$ corresponds to a collection of modes and $\boldsymbol H$ contains
	the weighting factor for each corresponding mode.\newline
	$$
	\begin{equation}
	\label{eq_5_NMF}
	\boldsymbol{A}_{i \mu} \approx \boldsymbol{A}^{\prime}_{i \mu} = (\boldsymbol W \boldsymbol H)_{i \mu} = \sum_{a = 1}^{r}
	\boldsymbol W_{ia} \boldsymbol H_{a \mu}
	\end{equation}
	$$ {#eq-eq_5_NMF}

	The number of modes $r$ depends on the underlying dynamical system.
	Firstly, \gls{nmf} is computed by means of an iterative optimization.
	The goal is to bring the deviation between the original matrix and the approximated matrix below a chosen threshold.
	For this purpose, the number of required optimization iterations can easily be
	in the order of $\mathcal{O} (10^{7})$. The major drawback here is that such a high number of iterations is computationally very expensive.
	Secondly, for first CNMc the number of modes $r$ must be known beforehand.
	Since in most cases this demand cannot be fulfilled, two issues arise.
	On the one hand, running \gls{nmf} on a single known $r$ can already be considered to be computationally expensive.
	On the other hand, conducting a study to find the appropriate $r$ involves even more computational effort.
	Pierzyna [@Max2021] acknowledges this issue and identifies it as one of the major limitations.\newline
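
The factorization of equation @eq-eq_5_NMF and the threshold-based stopping criterion can be sketched with the widely used multiplicative update rule for the Frobenius norm. Which solver [@Max2021] employed is not stated here, so the update rule, the tolerance, the iteration cap and the function name are assumptions. Note that the number of modes $r$ has to be passed in explicitly, which mirrors the second limitation named above.

```python
import numpy as np


def nmf_multiplicative(A, r, tol=1e-4, max_iter=10_000, seed=0):
    """Approximate a non-negative A (n x m) by W (n x r) @ H (r x m).

    Stops once the relative Frobenius error is below `tol` or after
    `max_iter` iterations, whichever comes first.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, r)) + 1e-3
    H = rng.random((r, m)) + 1e-3
    eps = 1e-12  # guards against division by zero
    norm_A = np.linalg.norm(A)
    for it in range(1, max_iter + 1):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
        rel_err = np.linalg.norm(A - W @ H) / norm_A
        if rel_err < tol:
            break
    return W, H, rel_err, it


# Synthetic non-negative matrix with exactly two modes.
rng = np.random.default_rng(1)
A = rng.random((6, 2)) @ rng.random((2, 5))
W, H, rel_err, n_iter = nmf_multiplicative(A, r=2)
```

Tightening `tol` quickly increases the number of iterations required, and a study over several candidate values of $r$ multiplies that cost by the number of candidates.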


	The next step is to generate a regression model with \glsfirst{rf}.
	Some introductory words about \gls{rf} are given in subsection [-@sec-subsec_2_4_2_QT].
	As illustrated in [@Max2021], \gls{rf} was able to reproduce the training data reasonably well.
	However, it had difficulties approximating spike-like curves.
	Once the centroid positions and the two transition property tensors $\boldsymbol Q$ and $\boldsymbol T$ are known, they are passed to \gls{cnm} to calculate the predicted trajectories.
	For assessing the prediction quality, two methods are used, i.e., the autocorrelation and the \glsfirst{cpd}.
	\gls{cpd} describes the probability of being in one of the $K$ clusters.
	The autocorrelation given in equation @eq-eq_35 allows comparing two trajectories with a phase mismatch [@protas2015optimal], and it measures how well a point on a trajectory correlates with a point that is some time steps ahead.
	The variables in equation @eq-eq_35 are denoted as time lag $\tau$, state space vector $\boldsymbol x$, time $t$ and the inner product $(\boldsymbol x, \boldsymbol y) = \boldsymbol x \cdot \boldsymbol{y}^T$. \newline
	$$
	\begin{equation}
	R(\tau) = \frac{1}{T - \tau} \int\limits_{0}^{T-\tau}\, (\boldsymbol{x} (t), \boldsymbol{x}(t+ \tau)) dt, \quad \tau \in [\, 0, \, T\,]
	\label{eq_35}
	\end{equation}
	$$ {#eq-eq_35}
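
Both quality measures can be sketched for uniformly sampled data. In the discrete autocorrelation below, the integral of equation @eq-eq_35 is replaced by the mean over the overlapping samples, and the \gls{cpd} is taken as the relative frequency of each cluster label. The uniform sampling, the function names and the synthetic test data are assumptions made for illustration.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Discrete R(tau_k) for the lags k = 0..max_lag of a trajectory x (N, d).

    The time integral is approximated by the mean over the N - k
    overlapping samples, which assumes a constant time step.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([
        np.mean(np.sum(x[: n - k] * x[k:], axis=1))
        for k in range(max_lag + 1)
    ])


def cluster_probability(labels, n_clusters):
    """CPD: fraction of snapshots assigned to each of the K clusters."""
    counts = np.bincount(np.asarray(labels), minlength=n_clusters)
    return counts / counts.sum()


# Unit circle traversed at constant speed: R(tau) equals cos(tau).
dt = 0.01
t = np.arange(2000) * dt
circle = np.column_stack([np.cos(t), np.sin(t)])
R = autocorrelation(circle, max_lag=300)

# Cluster labels of six snapshots distributed over K = 3 clusters.
cpd = cluster_probability([0, 0, 1, 2, 2, 2], n_clusters=3)
```

Since $R(\tau)$ only depends on the time lag, two trajectories that differ by a phase shift yield the same curve, which is why it is suited for comparing a predicted trajectory with a reference trajectory.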

	First CNMc proved to work well for the Lorenz system only for up to $K=10$ centroids and small $\beta$.
	Among the points that need to be improved is the method to match the centroids across the chosen $\vec{\beta}$.
	It causes two of the major problems, i.e., the limitation to 3 dimensions and the requirement that the trajectory be circular-like, similar to the Lorenz system [@lorenz1963deterministic].
	These demands are the main obstacles to the application of first CNMc to all kinds of dynamical systems.
	The modal decomposition with \gls{nmf} is the most computationally intensive part and should be replaced by a faster alternative.