giovanni notes (#74)

* Giovannis corrections + wrote the experiement section finish * grammarly * added the jajapy results and finished the section
AAU-Dat · Jan 12, 2025 · c683e81 · c683e81
1 parent fff4d39
commit c683e81
Show file tree

Hide file tree

Showing 8 changed files with 249 additions and 116 deletions.
diff --git a/report/src/deprecated/04-experiments.tex b/report/src/deprecated/04-experiments.tex
@@ -0,0 +1,86 @@
+\section{Experiments}\label{sec:experiments}
+In this section, we describe the experiments conducted to evaluate the proposed symbolic implementation of the Baum-Welch algorithm in CuPAAL. 
+The experiments are designed to compare CuPAAL with two other implementations: Jajapy, the original matrix-based implementation, and SUDD, a hybrid symbolic and matrix-based implementation.
+The experiments are conducted to answer the following research questions:
+
+\begin{itemize}
+    \item \textbf{Question 1:} How does the symbolic CuPAAL implementation compare to Jajapy and SUDD in terms of parameter estimation accuracy and runtime?
+    \item \textbf{Question 2:} How does the scalability of the symbolic CuPAAL implementation compare to Jajapy and SUDD as the number of model states increases?
+\end{itemize}
+
+These questions evaluate both the computational efficiency and accuracy of the symbolic implementation in CuPAAL. The insights gained will highlight the strengths and potential trade-offs of symbolic approaches to the Baum-Welch algorithm.
+
+\subsection{Experimental Setup}
+Two experiments are conducted to address the research questions.
+
+\textbf{Experiment 1: Parameter Estimation Accuracy.} This experiment compares the runtime, parameter estimation accuracy, and convergence properties of the Baum-Welch implementations.
+
+\textbf{Experiment 2: Scalability.} This experiment evaluates how each implementation scales with an increasing number of model states.
+We conduct two experiments to evaluate the proposed method.
+
+All experiments are conducted on a machine with INSERT\_SPECS\_FOR\_MACHINE\_HERE.
+The experiments are all run 10 times to ensure consitency, with the results averaged to account for any variance in runtime and estimation accuracy.
+We use the following implementations for the experiments:
+\begin{itemize}
+    \item \textbf{Jajapy}: The original implementation of the Baum-Welch algorithm~\cite{reynouard2023jajapy}.
+    \item \textbf{SUDD}: The SUDD implementation of the Baum-Welch algorithm~\cite{p7}.
+    \item \textbf{SUDD\_log}: The SUDD implementation of the Baum-Welch algorithm using a logsemiring for numerical stability.
+    \item \textbf{CuPAAL}: The symbolic implementation of the Baum-Welch algorithm proposed in the paper.
+\end{itemize}
+
+The chosen implementations represent a progression from fully matrix-based to hybrid, to fully symbolic approaches, providing a comprehensive comparison of methodologies.
+
+\subsection{Experiment 1: Parameter Estimation Accuracy}
+The first experiment measures runtime, accuracy, and convergence properties of the Baum-Welch algorithm across implementations.
+We use the following models for the experiment:
+\begin{itemize}
+    \item FIND THE MODELS USED IN THE EXPERIMENT HERE
+\end{itemize} 
+We generate observation sequences from each model and use the Baum-Welch algorithm to estimate the parameters of the model. 
+We make 100 observations with a sequence length of 30 for each model.
+
+The experiment is made with the following steps:
+\begin{enumerate}
+    \item Load the model.
+    \item Generate an observation sequence from the model.
+    \item Generate initial parameters.
+    \item Use each implementation of the Baum-Welch algorithm to estimate model parameters, recording parameter estimates and runtime.
+    \item Repeat steps 2 to 4 ten times, averaging results.
+    \item Compare the estimated parameters with the true parameters.
+\end{enumerate}
+All the implementations are tested with the same observation sequences and initial parameters to ensure a fairness.
+
+We use the following evaluation metrics to compare the Baum-Welch implementations:
+\begin{itemize}
+    \item \textbf{Relative parameter error}: Accuracy of the estimated parameters compared to the true parameters.
+    Calculated as $\frac{|r - e|}{e}$, where $r$ is the real parameter and $e$ is the expected value of the parameter.
+    \item \textbf{Relative formula error}: Accuracy of the estimated formulas using Storm by providing estimated parameters and checking the formula error.
+    \item \textbf{Runtime}: Total computation time for parameter estimation.
+    \item \textbf{Iterations}: The number of iterations required to converge.
+\end{itemize}
+
+We use the relative parameter error and relative formula error to compare the accuracy of the estimated parameters and formulas across implementations, this will give us an idea of how accurate the Baum-Welch algorithm is in each implementation.
+
+We use the runtime to compare the computational efficiency of the Baum-Welch algorithm across implementations.
+We use the number of iterations to ensure that all implementations have converged to the same solution, meaning that the comparison is fair.
+
+The results are presented in tables to facilitate a direct comparison of performance across models and between implementations.
+
+\subsection{Scalability experiment}
+The second experiment assesses scalability by increasing the number of states in a model \textit{MODEL NAME HERE} from 2 to 10,000. 
+Observation sequences are generated with the same settings as Experiment 1, and the process is repeated as follows:
+
+\begin{enumerate}
+    \item Load the model.
+    \item Generate an observation sequence from the model.
+    \item Generate initial parameters.
+    \item Use each implementation of the Baum-Welch algorithm to estimate model parameters, recording runtime.
+    \item Repeat steps 2 to 4 ten times, averaging results.
+    \item Increase the number of states in the model and repeat the entire process.
+\end{enumerate}
+
+Scalability is evaluated by examining runtime trends as model size increases. 
+This gives insight into how each implementation scales with larger models and is critical for understanding the practical applicability of symbolic versus matrix-based implementations for large-scale problems.
+
+Results will be visualized using plots to illustrate trends and highlight differences in scalability between implementations.
+
diff --git a/report/src/figures/jajapy_scalability.data b/report/src/figures/jajapy_scalability.data
@@ -0,0 +1,11 @@
+20     14.313075
+120    142.041224
+220    344.057947
+320    824.494872
+420    1161.058289
+520    1841.472321
+620    2574.744506
+720    3432.212629
+820    5097.858485
+920    5642.201
+1020   8153.958789
diff --git a/report/src/figures/scalability_graph.tex b/report/src/figures/scalability_graph.tex
@@ -0,0 +1,20 @@
+\begin{tikzpicture}
+    \begin{axis}[
+        title={Scalability of Baum-Welch Implementations},
+        xlabel={Number of States},
+        ylabel={Runtime (s)},
+        xmin=0, xmax=1050,
+        xtick={20,220,420,620,820,1020},
+        legend pos=north west,
+        ymajorgrids=true,
+        grid style=dashed,
+    ]
+    \addplot[
+        color=red,
+        mark=square,
+        line width=1pt
+    ]
+    table {figures/jajapy_scalability.data};
+    \legend{Jajapy}
+    \end{axis}
+\end{tikzpicture}
diff --git a/report/src/figures/vector_add_example_reduced.tex b/report/src/figures/vector_add_example_reduced.tex
@@ -11,7 +11,7 @@
     \draw[zeroarrow] (x) -- (a);
     \draw[zeroarrow] (a) -- (final-1);
     \draw[onearrow] (a) -- (final-2);
-    \draw[zeroarrow] (x) -- (b);
-    \draw[onearrow] (b) -- (final-3);
-    \draw[zeroarrow] (b) -- (final-4);    
+    \draw[onearrow] (x) -- (b);
+    \draw[zeroarrow] (b) -- (final-3);
+    \draw[onearrow] (b) -- (final-4);    
 \end{tikzpicture}