giovanni notes (#74)
* Giovannis corrections + wrote the experiement section finish

* grammarly

* added the jajapy results and finished the section
sebastianbot6969 authored Jan 12, 2025
1 parent fff4d39 commit c683e81
Showing 8 changed files with 249 additions and 116 deletions.
86 changes: 86 additions & 0 deletions report/src/deprecated/04-experiments.tex
@@ -0,0 +1,86 @@
\section{Experiments}\label{sec:experiments}
In this section, we describe the experiments conducted to evaluate the proposed symbolic implementation of the Baum-Welch algorithm in CuPAAL.
The experiments are designed to compare CuPAAL with two other implementations: Jajapy, the original matrix-based implementation, and SUDD, a hybrid symbolic and matrix-based implementation.
The experiments are conducted to answer the following research questions:

\begin{itemize}
\item \textbf{Question 1:} How does the symbolic CuPAAL implementation compare to Jajapy and SUDD in terms of parameter estimation accuracy and runtime?
\item \textbf{Question 2:} How does the scalability of the symbolic CuPAAL implementation compare to Jajapy and SUDD as the number of model states increases?
\end{itemize}

These questions evaluate both the computational efficiency and accuracy of the symbolic implementation in CuPAAL. The insights gained will highlight the strengths and potential trade-offs of symbolic approaches to the Baum-Welch algorithm.

\subsection{Experimental Setup}
Two experiments are conducted to address the research questions.

\textbf{Experiment 1: Parameter Estimation Accuracy.} This experiment compares the runtime, parameter estimation accuracy, and convergence properties of the Baum-Welch implementations.

\textbf{Experiment 2: Scalability.} This experiment evaluates how each implementation scales with an increasing number of model states.

All experiments are conducted on a machine with INSERT\_SPECS\_FOR\_MACHINE\_HERE.
Each experiment is run 10 times to ensure consistency, and the results are averaged to account for variance in runtime and estimation accuracy.
We use the following implementations for the experiments:
\begin{itemize}
\item \textbf{Jajapy}: The original implementation of the Baum-Welch algorithm~\cite{reynouard2023jajapy}.
\item \textbf{SUDD}: The SUDD implementation of the Baum-Welch algorithm~\cite{p7}.
\item \textbf{SUDD\_log}: The SUDD implementation of the Baum-Welch algorithm using a log semiring for numerical stability.
\item \textbf{CuPAAL}: The symbolic implementation of the Baum-Welch algorithm proposed in this paper.
\end{itemize}

The chosen implementations represent a progression from fully matrix-based to hybrid, to fully symbolic approaches, providing a comprehensive comparison of methodologies.

\subsection{Experiment 1: Parameter Estimation Accuracy}
The first experiment measures runtime, accuracy, and convergence properties of the Baum-Welch algorithm across implementations.
We use the following models for the experiment:
\begin{itemize}
\item FIND THE MODELS USED IN THE EXPERIMENT HERE
\end{itemize}
We generate observation sequences from each model and use the Baum-Welch algorithm to estimate the parameters of the model.
For each model, we generate 100 observation sequences, each of length 30.

The experiment proceeds as follows:
\begin{enumerate}
\item Load the model.
\item Generate an observation sequence from the model.
\item Generate initial parameters.
\item Use each implementation of the Baum-Welch algorithm to estimate model parameters, recording parameter estimates and runtime.
\item Repeat steps 2 to 4 ten times, averaging results.
\item Compare the estimated parameters with the true parameters.
\end{enumerate}
All implementations are tested with the same observation sequences and initial parameters to ensure fairness.
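
One way to make the averaging in step 5 precise, assuming the arithmetic mean is intended, is to write $t_i$ for the runtime measured in repetition $i$ (a symbol introduced here only for illustration) and report
\[
\bar{t} = \frac{1}{10} \sum_{i=1}^{10} t_i ,
\]
with the same averaging applied to the other recorded quantities.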

We use the following evaluation metrics to compare the Baum-Welch implementations:
\begin{itemize}
\item \textbf{Relative parameter error}: Accuracy of the estimated parameters compared to the true parameters.
Calculated as $\frac{|r - e|}{e}$, where $r$ is the real parameter and $e$ is the expected value of the parameter.
\item \textbf{Relative formula error}: Accuracy of the formulas evaluated on the learned model. The estimated parameters are provided to Storm, and the relative error of the resulting formula values is computed.
\item \textbf{Runtime}: Total computation time for parameter estimation.
\item \textbf{Iterations}: The number of iterations required to converge.
\end{itemize}
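
As a worked illustration of the relative parameter error, take the formula exactly as stated above and assume the hypothetical values $r = 0.45$ and $e = 0.50$ (these numbers are illustrative and not taken from any experiment):
\[
\frac{|r - e|}{e} = \frac{|0.45 - 0.50|}{0.50} = \frac{0.05}{0.50} = 0.10,
\]
that is, a relative error of 10\%. The expression is undefined when $e = 0$, so parameters with that value would need to be handled separately.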

We use the relative parameter error and relative formula error to compare the accuracy of the estimated parameters and formulas across implementations. This indicates how accurately each implementation of the Baum-Welch algorithm recovers the model.

We use the runtime to compare the computational efficiency of the Baum-Welch algorithm across implementations.
We use the number of iterations to check that all implementations converge after a comparable number of steps, which supports a fair comparison.

The results are presented in tables to facilitate a direct comparison of performance across models and between implementations.

\subsection{Experiment 2: Scalability}
The second experiment assesses scalability by increasing the number of states in a model \textit{MODEL NAME HERE} from 2 to 10,000.
Observation sequences are generated with the same settings as Experiment 1, and the process is repeated as follows:

\begin{enumerate}
\item Load the model.
\item Generate an observation sequence from the model.
\item Generate initial parameters.
\item Use each implementation of the Baum-Welch algorithm to estimate model parameters, recording runtime.
\item Repeat steps 2 to 4 ten times, averaging results.
\item Increase the number of states in the model and repeat the entire process.
\end{enumerate}

Scalability is evaluated by examining runtime trends as model size increases.
This gives insight into how each implementation scales with larger models and is critical for understanding the practical applicability of symbolic versus matrix-based implementations for large-scale problems.

Results will be visualized using plots to illustrate trends and highlight differences in scalability between implementations.
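
As a rough illustration of how such a trend can be read, assume the runtime $t$ grows as a power law $t(n) \approx c\,n^{k}$ in the number of states $n$ (an assumption made here for illustration, not a result of this paper). Using only the first and last Jajapy measurements in \texttt{figures/jajapy\_scalability.data}, $(20,\ 14.31)$ and $(1020,\ 8153.96)$, a two-point estimate of the exponent is
\[
k \approx \frac{\ln(8153.96 / 14.31)}{\ln(1020 / 20)} \approx \frac{6.345}{3.932} \approx 1.61 .
\]
A least-squares fit over all points in log-log space would give a more robust estimate, since a two-point estimate is sensitive to noise in either measurement.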

11 changes: 11 additions & 0 deletions report/src/figures/jajapy_scalability.data
@@ -0,0 +1,11 @@
20 14.313075
120 142.041224
220 344.057947
320 824.494872
420 1161.058289
520 1841.472321
620 2574.744506
720 3432.212629
820 5097.858485
920 5642.201
1020 8153.958789
20 changes: 20 additions & 0 deletions report/src/figures/scalability_graph.tex
@@ -0,0 +1,20 @@
\begin{tikzpicture}
\begin{axis}[
title={Scalability of Baum-Welch Implementations},
xlabel={Number of States},
ylabel={Runtime (s)},
xmin=0, xmax=1050,
xtick={20,220,420,620,820,1020},
legend pos=north west,
ymajorgrids=true,
grid style=dashed,
]
\addplot[
color=red,
mark=square,
line width=1pt
]
table {figures/jajapy_scalability.data};
\legend{Jajapy}
\end{axis}
\end{tikzpicture}
6 changes: 3 additions & 3 deletions report/src/figures/vector_add_example_reduced.tex
@@ -11,7 +11,7 @@
\draw[zeroarrow] (x) -- (a);
\draw[zeroarrow] (a) -- (final-1);
\draw[onearrow] (a) -- (final-2);
-\draw[zeroarrow] (x) -- (b);
-\draw[onearrow] (b) -- (final-3);
-\draw[zeroarrow] (b) -- (final-4);
+\draw[onearrow] (x) -- (b);
+\draw[zeroarrow] (b) -- (final-3);
+\draw[onearrow] (b) -- (final-4);
\end{tikzpicture}