# Methodology {#sec-chap_2_Methodology}
In this chapter, the entire pipeline for designing the proposed
\gls{cnmc} is elaborated. For this purpose, the ideas behind
the individual processes are explained.
Results from the tracking step onwards are presented in chapter [-@sec-ch_3].
Having said that, \gls{cnmc} consists of multiple main process steps or stages.
First, a broad overview of the \gls{cnmc} workflow is given,
followed by a detailed explanation of each major operational step. The
implemented process stages are presented in the same order in which they are
executed in \gls{cnmc}. However, \gls{cnmc} is not forced
to go through each stage: if the output of some steps is
already available, the execution of the respective steps can be skipped. \newline
The main idea behind such an implementation is to avoid computing the same task multiple times.
Computational time can be reduced if the output of some \gls{cnmc} steps is already available.
Consequently, it allows users to be flexible in their explorations.
It could be the case that only one step of \gls{cnmc} is to be examined with different settings, or even with newly implemented functions, without running the full \gls{cnmc} pipeline.
Let this one \gls{cnmc} step be denoted as C; then it is possible to skip the preceding steps A and B if their output has already been calculated and is thus available.
Likewise, the subsequent steps can be skipped or activated depending on whether their respective outcomes are needed.
Simply put, the mentioned flexibility makes it possible to load the data for A and B and execute only C, as sketched below. Executing follow-up steps or loading their data is also made selectable.
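To make the skipping mechanism more concrete, the following minimal sketch shows one possible caching pattern. The function name `run_stage`, the cache directory `output/`, and the toy stage computations are hypothetical illustrations, not the actual \gls{cnmc} implementation.

```python
from pathlib import Path
import numpy as np

def run_stage(name, compute, cache_dir=Path("output"), execute=True):
    """Return the stage output, loading it from disk when possible.

    If a cached result exists and `execute` is False, the computation
    is skipped entirely; otherwise the stage is computed and cached.
    """
    cache = cache_dir / f"stage_{name}.npy"
    if not execute and cache.exists():
        return np.load(cache)   # skip the stage, reuse the old output
    result = compute()          # run the actual stage
    cache_dir.mkdir(exist_ok=True)
    np.save(cache, result)      # store the output for later runs
    return result

# Load cached data for stages A and B, execute only stage C:
a = run_stage("A", compute=lambda: np.arange(10), execute=False)
b = run_stage("B", compute=lambda: a**2, execute=False)
c = run_stage("C", compute=lambda: a + b, execute=True)
```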
<!-- % -->
<!-- %------------------------------- SHIFT FROM INTRODUCTION ---------------------- -->
<!-- % -->
Since the tasks of this thesis required a considerable amount of coding,
it is important to
mention the programming language used and the dependencies.
As the programming language,
*Python 3* [@VanRossum2009] was chosen. Because the number of libraries used is high, only a few important ones are mentioned. Note that each module used is
freely available on the internet and no licenses need to be purchased.
\newline
The important libraries for performing the actual calculations are
*NumPy* [@harris2020array], *SciPy* [@2020SciPy-NMeth], *Scikit-learn* [@scikit-learn] and *pySindy* [@Silva2020; @Kaptanoglu2022]; *sparse* is used for multi-dimensional sparse matrix management, and *plotly* [@plotly] is the only library deployed for plotting. One of the reasons why *plotly* is preferred over *Matplotlib* [@Hunter:2007] is its post-processing capabilities, which are now available. Note that the previous \gls{cnmc} version used *Matplotlib* [@Hunter:2007], which in this work has been fully replaced by *plotly* [@plotly]. More reasons why this modification is useful, as well as the newly implemented post-processing capabilities, are given in the upcoming sections.\newline
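As a brief illustration of the kind of post-processing that motivates this choice, the snippet below exports a *plotly* figure to a standalone HTML file, which can be reopened, zoomed, and inspected interactively long after the run has finished. This is a generic *plotly* usage sketch with made-up data, not the actual \gls{cnmc} plotting code.

```python
import numpy as np
import plotly.graph_objects as go

t = np.linspace(0, 10, 500)
fig = go.Figure(go.Scatter(x=t, y=np.sin(t), name="sin(t)"))
fig.update_layout(title="Interactive example", xaxis_title="t")
fig.write_html("example.html")  # standalone, interactive HTML file
```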
For local coding, the author's Linux-Mint-based laptop with the following hardware was deployed: \gls{cpu}: Intel Core i7-4702MQ @ 2.20 GHz × 4, RAM: 16 GB.
The Institute of Fluid Dynamics of the Technische Universität Braunschweig
also supported this work by providing two more powerful computational resources.
Their hardware specifications are not listed, since all computations and results elaborated in this thesis can be obtained with
the hardware described above (the author's laptop). However, the two provided
resources shall be mentioned, and it shall be explained whether \gls{cnmc} benefits from
faster computers. The first, bigger machine is called *Buran*; it is a
powerful Linux-based workstation, and access to it is provided directly by
the Chair of Fluid Dynamics. \newline
The second resource is *Phoenix*, the high-performance
computing cluster available across the Technische Universität Braunschweig.
The first step, in which the dynamical systems are solved through an \gls{ode} solver,
is written in a parallel manner. If specified in the *settings.py* file, this step can be performed in parallel and thus benefits from
multiple available cores. However, most implemented \gls{ode}s are solved within
a few seconds. There are also some implemented dynamical systems whose
\gls{ode} solution can take a few minutes. Applying \gls{cnmc} to the latter dynamical
systems results in solving their \gls{ode}s for multiple different model parameter values. Thus, deploying the parallelization is advisable for these time-consuming \gls{ode}s, as illustrated below.\newline
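To illustrate how this step can exploit multiple cores, the following sketch integrates the Lorenz system for several model parameter values in parallel. The choice of the Lorenz system, the parameter range, the initial condition, and the pool size are illustrative assumptions; in \gls{cnmc} itself, this behavior is controlled via the *settings.py* file.

```python
from multiprocessing import Pool

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, x, rho, sigma=10.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for one parameter value rho."""
    return [sigma * (x[1] - x[0]),
            x[0] * (rho - x[2]) - x[1],
            x[0] * x[1] - beta * x[2]]

def solve_for_rho(rho):
    """Integrate the ODE for a single model parameter value."""
    sol = solve_ivp(lorenz, (0.0, 50.0), [1.0, 1.0, 1.0],
                    args=(rho,), max_step=0.01)
    return rho, sol.y

if __name__ == "__main__":
    rhos = np.linspace(20.0, 40.0, 8)   # different model parameter values
    with Pool(processes=4) as pool:     # one worker per available core
        results = dict(pool.map(solve_for_rho, rhos))
```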
By far the most time-intensive part of the improved \gls{cnmc} is the clustering step. The main computation for this step is done with
*Scikit-learn* [@scikit-learn]. It is heavily parallelized, and the
computation time can be reduced drastically when multiple threads are available.
Other than that, *NumPy* and *SciPy* are well-optimized libraries and
are assumed to benefit from powerful computers. In summary, a powerful machine is certainly advisable when multiple dynamical
systems with a range of different settings shall be investigated, since parallelization is available. Yet for executing \gls{cnmc} on a single dynamical system, a regular laptop can be regarded as
a sufficient tool.
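For completeness, a minimal clustering sketch with *Scikit-learn* [@scikit-learn] is given below. The placeholder data, the number of clusters, and the remaining parameters are assumptions for illustration only; recent *Scikit-learn* versions multithread the *k*-means inner loops internally, which is why this step benefits from machines with many available threads.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder trajectory data: 10,000 samples of a 3-dimensional state.
rng = np.random.default_rng(seed=0)
data = rng.standard_normal((10_000, 3))

# K-means clustering; the heavy lifting is multithreaded internally.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(data)
labels = kmeans.labels_              # cluster index of each sample
centroids = kmeans.cluster_centers_  # cluster centroids
```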