## Transition properties modeling {#sec-sec_3_3_SVD_NMF}
In subsection [-@sec-subsec_2_4_2_QT], it has been explained that \gls{cnmc} has two built-in modal decomposition methods for the $\boldsymbol Q / \boldsymbol T$ tensors, i.e., \gls{svd} and \gls{nmf}.
There are two main concerns for which performance measurements are needed.
First, in subsection [-@sec-subsec_3_3_1_SVD_Speed], the computational costs of both methods are examined.
Then, in subsection [-@sec-subsec_3_3_2_SVD_Quality], the \gls{svd} and \gls{nmf} prediction quality is presented and assessed.
### Computational cost {#sec-subsec_3_3_1_SVD_Speed}
In this subsection, the goal is to evaluate the computational cost of the two decomposition methods implemented in \gls{cnmc}.
\gls{nmf} was already used in *first CNMc* and was found to be one of the most computationally expensive tasks.
With an increasing model order $L$, it became by far the most expensive task, which is acknowledged by [@Max2021].
The run time was one of the main reasons why \gls{svd} should be implemented in \gls{cnmc}.
To see whether \gls{svd} can reduce the run time, both methods shall be compared.\newline
First, it is important to mention that \gls{nmf} is executed for one single predefined mode number $r$.
It is possible that a selected $r$ is not optimal, since $r$ is a parameter that depends not only on the chosen dynamical system but also on other parameters, e.g., the number of centroids $K$ and the number of training model parameter values $n_{\beta, tr}$, as well as \gls{nmf}-specific attributes.
The latter are the maximal number of iterations within which the optimizer must converge and the convergence tolerance.
However, to find an appropriate $r$, \gls{nmf} can be executed multiple times with different values for $r$.
Comparing the execution time of \gls{nmf} with multiple invocations against \gls{svd} can be regarded as an unbalanced comparison.
Even though for a new dynamical system and its configuration the optimal $r_{opt}$ for \gls{nmf} is most likely found through a parameter study, as sketched below, for the upcoming comparison the run time of one single \gls{nmf} solution is measured.\newline
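A minimal sketch of such a parameter study is given below, assuming the tensor has been unrolled into a non-negative 2D matrix and using scikit-learn's `NMF` as a stand-in for \gls{cnmc}'s optimizer; the helper name, the candidate list, and the error threshold are hypothetical.

```python
# Hypothetical parameter study for the NMF mode number r: rerun NMF for
# several candidate r and return the smallest one whose reconstruction
# error falls below a user-chosen threshold.
import numpy as np
from sklearn.decomposition import NMF


def smallest_adequate_rank(X, err_threshold, candidates=(2, 4, 6, 8, 9)):
    """X: non-negative 2D array (an unrolled Q or T tensor)."""
    for r in candidates:
        model = NMF(n_components=r, tol=1e-6, max_iter=10_000_000)
        model.fit(X)
        # Frobenius reconstruction error ||X - W H||, stored by sklearn.
        if model.reconstruction_err_ <= err_threshold:
            return r
    return max(candidates)  # fall back to the largest tested rank


# Example with random non-negative data standing in for a real tensor.
X = np.abs(np.random.default_rng(0).normal(size=(100, 7)))
print(smallest_adequate_rank(X, err_threshold=5.0))
```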
The model used for this purpose is *SLS*. Since *SLS* is trained with the output of 7 pairwise distinct model parameter values, $n_{\beta,tr} = 7$, the maximal rank in \gls{svd} is limited to 7.
Nevertheless, to allow \gls{nmf} to find a solution, the mode number was set to $r=9$, the maximal number of iterations in which the optimizer can converge to 10 million, and the convergence tolerance to $1\mathrm{e}{-6}$.
Both methods can work with sparse matrices.
However, the \gls{svd} solver is specifically designed for sparse matrices.
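How such a timing measurement could be set up is sketched below; this is not the \gls{cnmc} implementation. The matrix shape and the random test matrix are illustrative assumptions, `scipy.sparse.linalg.svds` stands in for the sparse \gls{svd} solver, and scikit-learn's `NMF` for the \gls{nmf} optimizer with the settings quoted above.

```python
# Illustrative timing of a sparse SVD against a single NMF run, using the
# quoted settings (r = 9, 10 million iterations, tolerance 1e-6).
import time

import scipy.sparse as sp
from scipy.sparse.linalg import svds
from sklearn.decomposition import NMF

# Random non-negative sparse matrix standing in for an unrolled Q tensor.
A = sp.random(1000, 7, density=0.3, format="csr", random_state=0)

t0 = time.perf_counter()
U, s, Vt = svds(A, k=6)  # sparse SVD; k must be smaller than min(A.shape)
t_svd = time.perf_counter() - t0

t0 = time.perf_counter()
model = NMF(n_components=9, max_iter=10_000_000, tol=1e-6, init="random")
W = model.fit_transform(A)  # sklearn's NMF also accepts sparse input
H = model.components_
t_nmf = time.perf_counter() - t0

print(f"SVD: {t_svd:.2e} s, NMF: {t_nmf:.2e} s")
```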
The measured times for decomposing the $\boldsymbol Q / \boldsymbol T$ tensors for 7 different $L$ are listed in table @tbl-tab_6_NMF_SVD.
It can be observed that for \gls{svd} up to $L=6$, the computational time for both $\boldsymbol Q / \boldsymbol T$ tensors is less than 1 second.
Such run times are fast enough for scientific as well as industrial applications.
With $L=7$, a large jump in time for both $\boldsymbol Q / \boldsymbol T$ is found.
However, even after this increase, the decomposition took around 5 seconds, which is still acceptable.\newline
| **$L$** | **SVD** $\boldsymbol Q$  | **NMF** $\boldsymbol Q$ | **SVD** $\boldsymbol T$  | **NMF** $\boldsymbol T$ |
|---------|--------------------------|-------------------------|--------------------------|-------------------------|
| $1$     | $2 \, \mathrm{e}{-4}$ s  | $64$ s                  | $8 \, \mathrm{e}{-5}$ s  | $3 \, \mathrm{e}{-2}$ s |
| $2$     | $1 \, \mathrm{e}{-4}$ s  | $8 \, \mathrm{e}{-2}$ s | $1 \, \mathrm{e}{-4}$ s  | $1$ h                   |
| $3$     | $2 \, \mathrm{e}{-4}$ s  | $10$ s                  | $2 \, \mathrm{e}{-4}$ s  | $0.1$ s                 |
| $4$     | $4 \, \mathrm{e}{-3}$ s  | $20$ s                  | $7 \, \mathrm{e}{-3}$ s  | $1.5$ h                 |
| $5$     | $6 \, \mathrm{e}{-2}$ s  | $> 3$ h                 | $3 \, \mathrm{e}{-2}$ s  | -                       |
| $6$     | $0.4$ s                  | -                       | $0.4$ s                  | -                       |
| $7$     | $5.17$ s                 | -                       | $4.52$ s                 | -                       |

: Execution times of \gls{svd} and \gls{nmf} for *SLS* at different $L$; a dash marks aborted measurements {#tbl-tab_6_NMF_SVD}
Calculating $\boldsymbol Q$ with \gls{nmf} for $L=1$ already takes 64 seconds.
This is more than \gls{svd} demanded for $L=7$.
The $\boldsymbol T$ tensor, on the other hand, is decomposed much faster, in under a second.
However, as soon as $L=2$ is selected, $\boldsymbol T$ takes 1 full hour, and for $L=4$ even 1.5 hours.
The \gls{nmf} columns of the table are not filled completely, since running $\boldsymbol Q$ for $L=5$ had not finished after more than 3 hours.
Therefore, the time measurement was aborted.
This behavior was expected, since it was already mentioned in [@Max2021].
Overall, the execution time of \gls{nmf} does not follow a clear trend, e.g., computing $\boldsymbol T$ for $L=3$ is faster than for $L=2$, and $\boldsymbol Q$ for $L=4$ is faster than for $L=1$.
In other words, there is no obvious rule as to whether even a small $L$ could lead to hours of run time.\newline
It can be concluded that \gls{svd} is much faster than \gls{nmf} and also shows a clear trend, i.e., the computation time is expected to increase with $L$.
\gls{nmf}, on the other hand, first requires an appropriate mode number $r$, which most likely demands a parameter study.
However, even a single \gls{nmf} solution can take hours.
With increasing $L$, the run time is generally expected to increase, even though no clear rule can be defined.
Furthermore, it needs to be highlighted that \gls{nmf} was tested on a small model, where $n_{\beta,tr} = 7$. The author of this thesis experienced an additional increase in run time when a higher $n_{\beta,tr}$ was selected.
Also, executing \gls{nmf} on multiple dynamical systems or model configurations might become infeasible in terms of time.
Finally, with the implementation of \gls{svd}, the bottleneck in modeling $\boldsymbol Q / \boldsymbol T$ could be eliminated.
### Prediction quality {#sec-subsec_3_3_2_SVD_Quality}
In this subsection, the quality of the \gls{svd} and \gls{nmf} $\boldsymbol Q / \boldsymbol T$ predictions is evaluated.
The model configuration used for this purpose is *SLS*.
First, only the $\boldsymbol Q$ output with \gls{svd} followed by \gls{nmf} shall be analyzed and compared. Then, the same is done for the $\boldsymbol T$ output.\newline
In order to see how many modes $r$ were chosen for \gls{svd}, the two figures @fig-fig_54 and @fig-fig_55 are shown.
It can be derived that with $r = 4$, $99 \%$ of the information content could be captured; a small sketch of this mode selection criterion is given below the figures. The presented results are obtained for $\boldsymbol Q$ and $L =1$.\newline
::: {layout-ncol="2"}
![*SLS*, \gls{svd}, cumulative energy of $\boldsymbol Q$ for $L=1$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/10_lb_Q_Cumlative_E.svg){#fig-fig_54}

![*SLS*, \gls{svd}, singular values of $\boldsymbol Q$ for $L=1$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/11_lb_Q_Sing_Val.svg){#fig-fig_55}
:::
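The $99 \%$ criterion can be sketched as follows, assuming the common convention that the energy content is measured by the squared singular values; the exact convention used by \gls{cnmc} may differ.

```python
# Sketch: smallest mode number r whose cumulative energy (squared
# singular values) reaches a given threshold, here 99 %.
import numpy as np


def modes_for_energy(singular_values, threshold=0.99):
    s = np.asarray(singular_values, dtype=float)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, threshold) + 1)


# Example: a fast-decaying spectrum where r = 4 captures 99 %.
print(modes_for_energy([10.0, 4.0, 2.0, 1.2, 0.5, 0.1, 0.05]))  # -> 4
```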
Figures @fig-fig_56 to @fig-fig_58 depict the original $\boldsymbol{Q}(\beta_{unseen} = 28.5)$, which is generated with \gls{cnm}, the \gls{cnmc} predicted $\boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5)$ and their deviation $| \boldsymbol{Q}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5) |$, respectively.
In the graphs, the probabilities of moving from centroid $c_p$ to $c_j$ are indicated.
Contrasting figures @fig-fig_56 and @fig-fig_57 reveals barely noticeable differences.
To highlight the present deviations, the direct comparison between the \gls{cnm} and \gls{cnmc} predicted $\boldsymbol Q$ tensors is given in figure @fig-fig_58.
It can be observed that the highest value is $\max(| \boldsymbol{Q}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5) |) \approx 0.0697 \approx 0.07$.
Note that all predicted $\boldsymbol Q$ and $\boldsymbol T$ tensors are obtained with \gls{rf} as the regression model.
\newline
:::{#fig-fig_58_Full layout="[[1,1], [1]]"}
![Original $\boldsymbol{Q}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/12_lb_0_Q_Orig_28.5.svg){#fig-fig_56}

![\gls{cnmc} predicted $\boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/13_lb_2_Q_Aprox_28.5.svg){#fig-fig_57}

![Deviation $| \boldsymbol{Q}(\beta_{unseen}) - \boldsymbol{\tilde{Q}}(\beta_{unseen}) |$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/14_lb_4_Delta_Q_28.5.svg){#fig-fig_58}

*SLS*, \gls{svd}, original $\boldsymbol{Q}(\beta_{unseen} = 28.5)$, \gls{cnmc} predicted $\boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5)$ and deviation $| \boldsymbol{Q}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5) |$ for $L=1$
:::
The same procedure shall now be performed with \gls{nmf}.
The results are depicted in figures @fig-fig_59 and @fig-fig_60.
Note that the original \gls{cnm} $\boldsymbol{Q}(\beta_{unseen} = 28.5)$ does not change, thus figure @fig-fig_56 can be reused. From
figure @fig-fig_60, it can be observed that the highest deviation for the \gls{nmf} version is $\max(| \boldsymbol{Q}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5) |) \approx 0.0699 \approx 0.07$.
The maximal error of \gls{nmf} $(\approx 0.0699)$ is slightly higher than that of \gls{svd} $(\approx 0.0697)$.
Nevertheless, both methods have a very similar maximal error, and other significant differences are hard to discern visually.
\newline
:::{layout="[[1,1]]"}
![\gls{cnmc} predicted $\boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/15_lb_2_Q_Aprox_28.5.svg){#fig-fig_59}

![Deviation $| \boldsymbol{Q}(\beta_{unseen}) - \boldsymbol{\tilde{Q}}(\beta_{unseen}) |$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/16_lb_4_Delta_Q_28.5.svg){#fig-fig_60}

*SLS*, \gls{nmf}, \gls{cnmc} predicted $\boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5)$ and deviation $| \boldsymbol{Q}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{Q}}(\beta_{unseen} = 28.5) |$ for $L=1$
:::
In order to have a quantifiable error value, the mean absolute error (MAE) according to equation @eq-eq_23 is leveraged.
The MAE values for \gls{svd} and \gls{nmf} are $MAE_{SVD} = 0.002580628$ and $MAE_{NMF} = 0.002490048$, respectively.
\gls{nmf} is slightly better than \gls{svd} with $MAE_{SVD} - MAE_{NMF} \approx 1 \mathrm{e}{-4}$, which can be considered to be negligibly small.
Furthermore, it must be stated that \gls{svd} was only allowed to use $r_{SVD} = 4$ modes, due to the $99 \%$ energy demand, whereas \gls{nmf} used $r_{NMF} = 9$ modes.
Given that \gls{svd} is also stable in computational time, i.e., even for low $L$ the computational cost is not expected to scale up to hours, \gls{svd} is the clear winner of this single comparison. \newline
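The two error measures used here can be written compactly as below; this sketch assumes equation @eq-eq_23 denotes the standard mean absolute error over all tensor entries, and the function names are hypothetical.

```python
# Maximal absolute deviation and MAE between a CNM reference tensor and
# its CNMc prediction, both given as NumPy arrays of equal shape.
import numpy as np


def max_abs_deviation(Q_true, Q_pred):
    return float(np.max(np.abs(Q_true - Q_pred)))


def mean_absolute_error(Q_true, Q_pred):
    return float(np.mean(np.abs(Q_true - Q_pred)))
```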
For the sake of completeness, the same procedure shall be conducted for the $\boldsymbol T$ tensor.
For this purpose, figures @fig-fig_61 to @fig-fig_65 shall be considered.
It can be inspected that the maximal errors for \gls{svd} and \gls{nmf} are $\max(| \boldsymbol{T}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5) |) \approx 0.126$ and
$\max(| \boldsymbol{T}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5) |) \approx 0.115$, respectively.
The MAE values are $MAE_{SVD} = 0.002275379$ and $MAE_{NMF} = 0.001635510$.
\gls{nmf} is again slightly better than \gls{svd} with $MAE_{SVD} - MAE_{NMF} \approx 6 \mathrm{e}{-4}$, which is a deviation of $\approx 0.06 \%$ and might also be considered negligibly small. \newline
<!--%------------------------------------- SVD T -------------------------------------->
:::{layout="[[1,1], [1]]"}
![Original \gls{cnm} $\boldsymbol{T}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/17_lb_1_T_Orig_28.5.svg){#fig-fig_61}

![\gls{cnmc} predicted $\boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/18_lb_3_T_Aprox_28.5.svg){#fig-fig_62}

![Deviation $| \boldsymbol{T}(\beta_{unseen}) - \boldsymbol{\tilde{T}}(\beta_{unseen}) |$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/19_lb_5_Delta_T_28.5.svg){#fig-fig_63}

*SLS*, \gls{svd}, original $\boldsymbol{T}(\beta_{unseen} = 28.5)$, predicted $\boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5)$ and deviation $| \boldsymbol{T}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5) |$ for $L=1$
:::
<!--%------------------------------------- NMF T -------------------------------------->
::: {layout="[[1,1]]"}
![\gls{cnmc} predicted $\boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5)$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/20_lb_3_T_Aprox_28.5.svg){#fig-fig_64}

![Deviation $| \boldsymbol{T}(\beta_{unseen}) - \boldsymbol{\tilde{T}}(\beta_{unseen}) |$](../../3_Figs_Pyth/3_Task/2_Mod_CPE/21_lb_5_Delta_T_28.5.svg){#fig-fig_65}

*SLS*, \gls{nmf}, \gls{cnmc} predicted $\boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5)$ and deviation $| \boldsymbol{T}(\beta_{unseen} = 28.5) - \boldsymbol{\tilde{T}}(\beta_{unseen} = 28.5) |$ for $L=1$
:::
Additional MAE values for different $L$ and $\beta_{unseen}= 28.5,\, \beta_{unseen}= 32.5$ are collected in table @tbl-tab_7_NMF_SVD_QT.
First, it can be stated that regardless of the chosen method, \gls{svd} or \gls{nmf}, all encountered MAE values are very small.
Consequently, it can be recorded that \gls{cnmc} achieves an overall good approximation of the $\boldsymbol Q / \boldsymbol T$ tensors.
Second, comparing \gls{svd} and \gls{nmf} through their respective MAE values, the deviation between the two is mostly on the order of $1 \mathrm{e}{-4}$ or below.
Such a difference can again be considered insignificantly small.\newline
Despite this, \gls{nmf} required the additional correction given in equation @eq-eq_33, which did not apply to \gls{svd}.
The transition time entries at the indexes where the transition probability is positive should be positive as well. Yet, this is not always the case when \gls{nmf} is executed. To correct that, the corresponding probability entries are manually set to zero.
This rule was also actively applied to the results presented above.
Still, the outcome is very satisfactory, because the modeling errors are found to be small.
\newline
| **$L$** | $\beta_{unseen}$ | $\boldsymbol{MAE}_{SVD, \boldsymbol Q}$ | $\boldsymbol{MAE}_{NMF, \boldsymbol Q}$ | $\boldsymbol{MAE}_{SVD, \boldsymbol T}$ | $\boldsymbol{MAE}_{NMF, \boldsymbol T}$ |
|---------|------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|-----------------------------------------|
| $1$     | $28.5$           | $0.002580628$                           | $0.002490048$                           | $0.002275379$                           | $0.001635510$                           |
| $1$     | $32.5$           | $0.003544923$                           | $0.003650155$                           | $0.011152145$                           | $0.010690052$                           |
| $2$     | $28.5$           | $0.001823848$                           | $0.001776276$                           | $0.000409955$                           | $0.000371242$                           |
| $2$     | $32.5$           | $0.006381635$                           | $0.006053059$                           | $0.002417142$                           | $0.002368680$                           |
| $3$     | $28.5$           | $0.000369228$                           | $0.000356817$                           | $0.000067680$                           | $0.000062964$                           |
| $3$     | $32.5$           | $0.001462458$                           | $0.001432738$                           | $0.000346298$                           | $0.000343520$                           |
| $4$     | $28.5$           | $0.000055002$                           | $0.000052682$                           | $0.000009420$                           | $0.000008790$                           |
| $4$     | $32.5$           | $0.000215147$                           | $0.000212329$                           | $0.000044509$                           | $0.000044225$                           |

: *SLS*, mean absolute error for different $L$ and two $\beta_{unseen}$ {#tbl-tab_7_NMF_SVD_QT}
$$
\begin{aligned}
TGZ &:= \boldsymbol T ( \boldsymbol Q > 0) \leq 0 \\
\boldsymbol Q ( TGZ ) &:= 0
\end{aligned}
$$ {#eq-eq_33}
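A minimal sketch of this correction rule is given below; the helper name is hypothetical and \gls{cnmc}'s actual implementation may differ.

```python
# Equation (33) as a NumPy mask: wherever the predicted probability is
# positive but the predicted transition time is non-positive, the
# probability entry is set to zero.
import numpy as np


def apply_transition_correction(Q, T):
    tgz = (Q > 0) & (T <= 0)  # the index set TGZ
    Q_corrected = Q.copy()
    Q_corrected[tgz] = 0.0
    return Q_corrected
```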
In summary, both methods, \gls{nmf} and \gls{svd}, provide a good approximation of the $\boldsymbol Q / \boldsymbol T$ tensors.
The deviation between the prediction quality of both is negligibly small.
However, since \gls{svd} is much faster than \gls{nmf} and does not require an additional parameter study, the recommended decomposition method is \gls{svd}.
Furthermore, it shall be highlighted that \gls{svd} used only $r = 4$ modes for the $\boldsymbol Q$ case, whereas \gls{nmf} used $r=9$.
Finally, as a side remark, all displayed figures and MAE values are generated and calculated with \gls{cnmc}'s default implemented methods.
\FloatBarrier |