# Methodology {#sec-chap_2_Methodology}
In this chapter, the entire pipeline for designing the proposed
\gls{cnmc} is elaborated. For this purpose, the ideas behind
the individual processes are explained.
Results from the tracking step onwards are presented in chapter [-@sec-ch_3].
Having said that, \gls{cnmc} consists of multiple main process steps or stages.
First, a broad overview of the \gls{cnmc} workflow is given,
followed by a detailed explanation of each major operational step. The
implemented process stages are presented in the same order in which they are
executed in \gls{cnmc}. However, \gls{cnmc} is not forced
to go through each stage: if the output of some steps is
already available, the execution of the respective steps can be skipped. \newline
The main idea behind such an implementation is to avoid computing the same task multiple times.
Computational time can be reduced if the output of some \gls{cnmc} steps is already available.
Consequently, it allows users to be flexible in their explorations.
It could be the case that only one step of \gls{cnmc} is to be examined with different settings, or even with newly implemented functions, without running the full \gls{cnmc} pipeline.
Let this one \gls{cnmc} step be denoted as C; then it is possible to skip the preceding steps A and B if their output has already been calculated and is thus available.
Likewise, the subsequent steps can be skipped or activated depending on whether their respective outcomes are needed.
Simply put, the mentioned flexibility makes it possible to load the data for A and B and execute only C, as sketched below. Executing follow-up steps or loading their data is also made selectable.
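To make the skipping mechanism more concrete, the following minimal sketch shows one possible caching pattern. The function name `run_stage`, the cache directory `output/`, and the toy stage computations are hypothetical illustrations, not the actual \gls{cnmc} implementation.

```python
from pathlib import Path
import numpy as np

def run_stage(name, compute, cache_dir=Path("output"), execute=True):
    """Return the stage output, loading it from disk when possible.

    If a cached result exists and `execute` is False, the computation
    is skipped entirely; otherwise the stage is computed and cached.
    """
    cache = cache_dir / f"stage_{name}.npy"
    if not execute and cache.exists():
        return np.load(cache)   # skip the stage, reuse the old output
    result = compute()          # run the actual stage
    cache_dir.mkdir(exist_ok=True)
    np.save(cache, result)      # store the output for later runs
    return result

# Load cached data for stages A and B, execute only stage C:
a = run_stage("A", compute=lambda: np.arange(10), execute=False)
b = run_stage("B", compute=lambda: a**2, execute=False)
c = run_stage("C", compute=lambda: a + b, execute=True)
```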
<!-- % -->
<!-- %------------------------------- SHIFT FROM INTRODUCTION ---------------------- -->
<!-- % -->
Since the tasks of this thesis required a considerable amount of coding,
it is important to
mention the programming language used and the dependencies.
As the programming language,
*Python 3* [@VanRossum2009] was chosen. Because the number of libraries used is high, only a few important ones are mentioned. Note that each module used is
freely available on the internet and no licenses need to be purchased.
\newline
The important libraries for performing the actual calculations are
*NumPy* [@harris2020array], *SciPy* [@2020SciPy-NMeth], *Scikit-learn* [@scikit-learn] and *pySindy* [@Silva2020; @Kaptanoglu2022]; *sparse* is used for multi-dimensional sparse matrix management, and *plotly* [@plotly] is the only library deployed for plotting. One of the reasons why *plotly* is preferred over *Matplotlib* [@Hunter:2007] is its post-processing capabilities, which are now available. Note that the previous \gls{cnmc} version used *Matplotlib* [@Hunter:2007], which in this work has been fully replaced by *plotly* [@plotly]. More reasons why this modification is useful, as well as the newly implemented post-processing capabilities, are given in the upcoming sections.\newline
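As a brief illustration of the kind of post-processing that motivates this choice, the snippet below exports a *plotly* figure to a standalone HTML file, which can be reopened, zoomed, and inspected interactively long after the run has finished. This is a generic *plotly* usage sketch with made-up data, not the actual \gls{cnmc} plotting code.

```python
import numpy as np
import plotly.graph_objects as go

t = np.linspace(0, 10, 500)
fig = go.Figure(go.Scatter(x=t, y=np.sin(t), name="sin(t)"))
fig.update_layout(title="Interactive example", xaxis_title="t")
fig.write_html("example.html")  # standalone, interactive HTML file
```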
For local coding, the author's Linux-Mint-based laptop with the following hardware was deployed: \gls{cpu}: Intel Core i7-4702MQ @ 2.20 GHz × 4, RAM: 16 GB.
The Institute of Fluid Dynamics of the Technische Universität Braunschweig
also supported this work by providing two more powerful computational resources.
Their hardware specifications are not listed, since all computations and results elaborated in this thesis can be obtained with
the hardware described above (the author's laptop). However, the two provided
resources shall be mentioned, and it shall be explained whether \gls{cnmc} benefits from
faster computers. The first, bigger machine is called *Buran*; it is a
powerful Linux-based workstation, and access to it is provided directly by
the Chair of Fluid Dynamics. \newline
The second resource is *Phoenix*, the high-performance
computing cluster available across the Technische Universität Braunschweig.
The first step, in which the dynamical systems are solved through an \gls{ode} solver,
is written in a parallel manner. If specified in the *settings.py* file, this step can be performed in parallel and thus benefits from
multiple available cores. However, most implemented \gls{ode}s are solved within
a few seconds. There are also some implemented dynamical systems whose
\gls{ode} solution can take a few minutes. Applying \gls{cnmc} to the latter dynamical
systems results in solving their \gls{ode}s for multiple different model parameter values. Thus, deploying the parallelization is advisable for these time-consuming \gls{ode}s, as illustrated below.\newline
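To illustrate how this step can exploit multiple cores, the following sketch integrates the Lorenz system for several model parameter values in parallel. The choice of the Lorenz system, the parameter range, the initial condition, and the pool size are illustrative assumptions; in \gls{cnmc} itself, this behavior is controlled via the *settings.py* file.

```python
from multiprocessing import Pool

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, x, rho, sigma=10.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for one parameter value rho."""
    return [sigma * (x[1] - x[0]),
            x[0] * (rho - x[2]) - x[1],
            x[0] * x[1] - beta * x[2]]

def solve_for_rho(rho):
    """Integrate the ODE for a single model parameter value."""
    sol = solve_ivp(lorenz, (0.0, 50.0), [1.0, 1.0, 1.0],
                    args=(rho,), max_step=0.01)
    return rho, sol.y

if __name__ == "__main__":
    rhos = np.linspace(20.0, 40.0, 8)   # different model parameter values
    with Pool(processes=4) as pool:     # one worker per available core
        results = dict(pool.map(solve_for_rho, rhos))
```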
By far the most time-intensive part of the improved \gls{cnmc} is the clustering step. The main computation for this step is done with
*Scikit-learn* [@scikit-learn]. It is heavily parallelized, and the
computation time can be reduced drastically when multiple threads are available.
Other than that, *NumPy* and *SciPy* are well-optimized libraries and
are assumed to benefit from powerful computers. In summary, a powerful machine is certainly advisable when multiple dynamical
systems with a range of different settings shall be investigated, since parallelization is available. Yet for executing \gls{cnmc} on a single dynamical system, a regular laptop can be regarded as
a sufficient tool.
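For completeness, a minimal clustering sketch with *Scikit-learn* [@scikit-learn] is given below. The placeholder data, the number of clusters, and the remaining parameters are assumptions for illustration only; recent *Scikit-learn* versions multithread the *k*-means inner loops internally, which is why this step benefits from machines with many available threads.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder trajectory data: 10,000 samples of a 3-dimensional state.
rng = np.random.default_rng(seed=0)
data = rng.standard_normal((10_000, 3))

# K-means clustering; the heavy lifting is multithreaded internally.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(data)
labels = kmeans.labels_              # cluster index of each sample
centroids = kmeans.cluster_centers_  # cluster centroids
```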