
Commit 5bd8f21

Merge branch 'report'
2 parents 4c5a861 + 06ae190

13 files changed: 344 additions & 57 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@ This project aims to explore and analyze metaheuristic search-based algorithms f
 ## Proposal
 This is our [idea](./Project%20Proposal/Project%20Proposal%20-%20Fernando%20and%20Kelvin.pdf).
 
+## Report
+This is a [summary](./report/CSI5186_AI_Testing_Project_Report___Fernando__Kelvin.pdf) of our work with valid justifications.
+
 ## Datasets
 * [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)
 * Object Recognition

report/figures/convergence.png (100 KB)

report/figures/evaluations.png (328 KB)

report/figures/stability.png (43.5 KB)

26.3 KB

report/main.tex

Lines changed: 6 additions & 6 deletions
@@ -12,19 +12,17 @@
 \usepackage{amssymb}
 \usepackage{natbib}
 
-\title{Testing the Effectiveness, Efficiency, and Stability\\of Search-Based Hyperparameter Optimizers}
+\title{Testing the Effectiveness, Convergence, and Stability\\of Search-Based Hyperparameter Optimizers}
 \author{
 Fernando Berti Cruz Nogueira (abert036@uottawa.ca),
 Kelvin Mock (kmock073@uOttawa.ca)
 }
-\date{October 2025}
 
 \usepackage{fancyhdr}
 \setlength{\headheight}{12.5pt}
 \addtolength{\topmargin}{-0.5pt}
 \fancypagestyle{plain}{% the preset of fancyhdr
 \fancyhf{} % clear all header and footer fields
-\fancyfoot[L]{\thedate}
 \fancyhead[L]{CSI 5186 - AI-enabled Software Verification and Testing, Final Report (Fall 2025)}
 }
 \makeatletter
@@ -49,6 +47,8 @@
 \usepackage{booktabs}
 \usepackage{multirow}
 \usepackage{tabularx}
+\usepackage{tikz}
+\usetikzlibrary{positioning}
 \usepackage[colorlinks=true,citecolor=blue,linkcolor=blue]{hyperref}
 \begin{document}
 \maketitle
@@ -59,17 +59,17 @@
 \end{tabular}
 
 \begin{abstract}
-Hyperparameter optimization is a critical but computationally expensive task for developing effective machine learning models. This report presents an empirical study comparing a Randomized Search (RS) baseline against two representative metaheuristics: a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). This selection is made to contrast two primary search strategies: global exploration (GA) and local exploitation (PSO). We evaluate the ability of these algorithms to optimize the hyperparameters of three distinct machine learning models (Decision Tree, k-Nearest Neighbors, and a Convolutional Neural Network) on the grayscale CIFAR-10 dataset~\cite{krizhevsky2009learning}. To ensure a fair and balanced assessment we define a composite fitness function. We evaluate the optimizers across three quality attributes: effectiveness (solution quality), efficiency (computational cost), and stability (consistency across runs). The empirical results will be validated using statistical tests to provide statistically grounded conclusions.
+Hyperparameter optimization is a critical but computationally expensive task for developing effective machine learning models. This report presents an empirical study comparing a Randomized Search (RS) baseline against two representative metaheuristics: a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). This selection contrasts two primary search strategies: global exploration (GA) and local exploitation (PSO). We evaluate the ability of these algorithms to optimize the hyperparameters of three distinct machine learning models (Decision Tree, k-Nearest Neighbors, and a Convolutional Neural Network) on the grayscale CIFAR-10 dataset~\cite{krizhevsky2009learning}. To ensure a fair and balanced assessment, we define a composite fitness function. We evaluate the optimizers across three quality attributes: effectiveness (solution quality), convergence (improvement over a fixed evaluation budget), and stability (consistency across runs). The empirical results are validated using statistical tests to provide statistically grounded conclusions.
 \end{abstract}
 
 \input{sections/1_introduction}
 \input{sections/2_problem_formulation}
 \input{sections/3_experiment}
 \input{sections/4_results}
-
+\input{sections/5_limitations}
+\input{sections/6_conclusion}
 
 % BEFORE END
-\clearpage
 \bibliographystyle{plainnat}
 \bibliography{refs}
 
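The abstract references a composite fitness function, but this commit does not show its definition. As a purely illustrative sketch (the weighting scheme, normalization, and names below are assumptions, not the report's actual definition), such a function typically blends solution quality with a normalized cost penalty:

```python
def composite_fitness(accuracy: float, train_seconds: float,
                      alpha: float = 0.9, budget_seconds: float = 600.0) -> float:
    """Illustrative composite fitness (assumed form, not the report's):
    reward validation accuracy, penalize normalized training cost."""
    cost = min(train_seconds / budget_seconds, 1.0)  # clamp cost to [0, 1]
    return alpha * accuracy - (1.0 - alpha) * cost

# Example: 72% accuracy reached in 120 s scores higher than
# 74% accuracy reached in 580 s under this weighting.
print(composite_fitness(0.72, 120.0), composite_fitness(0.74, 580.0))
```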

report/refs.bib

Lines changed: 58 additions & 0 deletions
@@ -14,3 +14,61 @@ @techreport{krizhevsky2009learning
   year = {2009},
   type = {Technical Report}
 }
+
+@inproceedings{metaheuristics-cookbook,
+  author    = {Victoria Bibaeva},
+  title     = {Using Metaheuristics for Hyper-Parameter Optimization of Convolutional Neural Networks},
+  booktitle = {2018 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)},
+  address   = {Aalborg, Denmark},
+  year      = {2018}
+}
+
+@article{cnn-explained-for-metaheuristics,
+  author  = {Sajjad Nematzadeh and Farzad Kiani and Mahsa Torkamanian-Afshar and Nizamettin Aydin},
+  title   = {Tuning hyperparameters of machine learning algorithms and deep neural networks using metaheuristics: A bioinformatics study on biomedical and biological cases},
+  journal = {Computational Biology and Chemistry},
+  volume  = {97},
+  pages   = {107619},
+  year    = {2022},
+  url     = {https://doi.org/10.1016/j.compbiolchem.2021.107619}
+}
+
+@article{hpo-experiment-on-cnn,
+  author  = {Mohammed Q. Ibrahim and Nazar K. Hussein and David Guinovart and Mohammed Qaraad},
+  title   = {Optimizing Convolutional Neural Networks: A Comprehensive Review of Hyperparameter Tuning Through Metaheuristic Algorithms},
+  journal = {Archives of Computational Methods in Engineering},
+  year    = {2025},
+  url     = {https://doi.org/10.1007/s11831-025-10292-x}
+}
+
+@inproceedings{autonomous-vehicle-appl,
+  author    = {Raja Ben Abdessalem and Shiva Nejati and Thomas Stifter},
+  title     = {Testing Vision-Based Control Systems Using Learnable Evolutionary Algorithms},
+  booktitle = {ICSE '18: Proceedings of the 40th International Conference on Software Engineering},
+  address   = {Gothenburg, Sweden},
+  year      = {2018},
+  url       = {https://doi.org/10.1145/3180155.3180160}
+}
+
+@misc{dt-scikit,
+  title  = {DecisionTreeClassifier},
+  author = {Scikit-Learn},
+  year   = {2025},
+  url    = {https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html}
+}
+
+@misc{knn-scikit,
+  title  = {KNeighborsClassifier},
+  author = {Scikit-Learn},
+  year   = {2025},
+  url    = {https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html}
+}
+
+@misc{mygithub-drugconsumpML,
+  title  = {Drug-Consumption-Machine-Learning-analysis},
+  author = {Kelvin Mock},
+  year   = {2024},
+  url    = {https://github.com/kmock930/Drug-Consumption-Machine-Learning-analysis}
+}
+
+@misc{explainer-linear,
+  title = {shap.LinearExplainer},
+  url   = {https://shap.readthedocs.io/en/latest/generated/shap.LinearExplainer.html}
+}

report/sections/1_introduction.tex

Lines changed: 7 additions & 9 deletions
@@ -1,24 +1,22 @@
 \section{Introduction}
 
-The performance of machine learning models often depends on their hyperparameters' high-level configuration variables like learning rate or batch size that control the training process. Finding the optimal set of these configurations, or Hyperparameter Optimization (HPO), is a significant and resource-intensive bottleneck in model development.
-
-HPO can be framed as a software verification problem. In this context, the model is the software under test and a "defect" being a suboptimal hyperparameter configuration that causes the model to fail its performance specifications, such as by exhibiting high loss, poor generalization or unstable training. HPO thus functions as a automated test drivers, searching the configuration space to find a set of hyperparameters that verifies the model's performance against a pre-defined quality specification.
+The performance of machine learning models relies heavily on hyperparameters: configuration variables, such as learning rate and batch size, that control the training process. Identifying the optimal configuration is a significant bottleneck in model development due to the high computational cost of evaluation. To address this complexity, we frame Hyperparameter Optimization (HPO) as a software verification problem. In this context, the model functions as the ``software under test,'' where a suboptimal configuration is treated as a ``defect'' that causes the system to violate its performance specifications (e.g., high loss or instability). HPO therefore acts as an automated test driver, searching the configuration space to verify model performance against defined quality criteria.
 
 \subsection{Evaluation Criteria}
 
-We evaluate the optimizers across three quality attributes, as defined in the project proposal:
+We evaluate the optimizers across three quality attributes:
 
 \begin{itemize}
 \item \textbf{Effectiveness}: The quality of the final solution found (i.e., the best fitness score achieved).
-\item \textbf{Efficiency}: The computational cost required to find a solution, measured in both fitness evaluations and wall-clock time.
-\item \textbf{Stability (Consistency)}: The consistency and reliability of the algorithm's performance across multiple independent runs.
+\item \textbf{Convergence}: The rate at which the algorithm improves its best-found solution over the course of the fixed evaluation budget.
+\item \textbf{Stability}: The consistency and reliability of the algorithm's performance across multiple independent runs (measured by variance).
 \end{itemize}
 
 \subsection{Research Questions}
 
-This report seeks to answer the following research questions from the project proposal:
+This report seeks to answer the following research questions:
 
 \begin{itemize}
-\item \textbf{RQ1}: How do representative metaheuristic algorithms compare against a randomized search baseline in terms of effectiveness and efficiency when performing HPO prior to training?
-\item \textbf{RQ2}: What is the difference in performance stability between the selected metaheuristic algorithms and traditional solutions like the randomized search baseline?
+\item \textbf{RQ1}: How do representative metaheuristic algorithms compare against a randomized search baseline in terms of effectiveness and convergence rate, given a fixed evaluation budget?
+\item \textbf{RQ2}: What is the difference in performance stability between the selected metaheuristic algorithms and the randomized search baseline?
 \end{itemize}
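The three criteria above map directly onto per-run optimization traces. Below is a minimal, hypothetical sketch (function and field names are ours, not the project's) assuming each run logs its best-so-far fitness after every evaluation of a shared fixed budget:

```python
import numpy as np

def summarize_runs(histories: list[list[float]]) -> dict:
    """Summarize effectiveness, convergence, and stability from per-run
    best-so-far fitness histories (higher fitness = better).

    histories[r][e] = best fitness seen by run r after evaluation e;
    all runs are assumed to share the same fixed evaluation budget.
    """
    curves = np.asarray(histories, dtype=float)   # shape: (runs, budget)
    finals = curves[:, -1]                        # best fitness per run

    return {
        # Effectiveness: quality of the final solution, averaged over runs.
        "effectiveness_mean": float(finals.mean()),
        # Convergence: mean height of the best-so-far curve over the budget;
        # higher means the optimizer improved earlier within the budget.
        "convergence_auc": float(curves.mean()),
        # Stability: spread of final fitness across independent runs.
        "stability_std": float(finals.std(ddof=1)),
    }

# Example with two toy runs of five evaluations each:
print(summarize_runs([[0.2, 0.5, 0.5, 0.6, 0.6],
                      [0.3, 0.3, 0.4, 0.4, 0.7]]))
```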
report/sections/2_problem_formulation.tex

Lines changed: 37 additions & 7 deletions
@@ -1,20 +1,50 @@
-
 \section{Problem Formulation}
 
-\subsection{Objective Function}
+\subsection{Representation and Objective Function}
 
-HPO is a black-box optimization problem. The objective function $f(\theta)$, which represents the model's performance for a given hyperparameter configuration $\theta$, presents many challenges: it is computationally expensive to evaluate, it is non-differentiable, and the search space $\Theta$ is often complex and of mixed-types (continuous, discrete, and categorical). These properties make HPO suitable for search-based metaheuristic techniques.
+HPO is a black-box optimization problem that takes place \textbf{prior to} the actual training loop. The problem is represented by arrays of the possible values of each hyperparameter type, listed in Table~\ref{tab:hparam_space}. The objective function $f(\theta)$, which represents the model's performance for a given hyperparameter configuration $\theta$, presents many challenges: it is computationally expensive to evaluate, it is non-differentiable, and the search space $\Theta$ is often complex and of mixed types (continuous, discrete, and categorical). These properties make HPO well suited to search-based metaheuristic techniques.
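To make the mixed-type search space concrete, the following sketch shows one way such a representation, and a single random draw from it (the core step of the RS baseline discussed next), could look. The hyperparameter names, ranges, and helper are illustrative assumptions, not the report's actual hyperparameter table:

```python
import random

# Hypothetical mixed-type search space: continuous, discrete, categorical.
SEARCH_SPACE = {
    "learning_rate": ("continuous", (1e-4, 1e-1)),    # uniform range
    "batch_size":    ("discrete",    [16, 32, 64, 128]),
    "criterion":     ("categorical", ["gini", "entropy"]),
}

def sample_configuration(space: dict) -> dict:
    """Draw one random configuration theta from the space (one RS step)."""
    theta = {}
    for name, (kind, domain) in space.items():
        if kind == "continuous":
            lo, hi = domain
            theta[name] = random.uniform(lo, hi)
        else:  # discrete and categorical domains are plain choices
            theta[name] = random.choice(domain)
    return theta

print(sample_configuration(SEARCH_SPACE))
```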
 
 \subsection{Algorithm Selection}
 
 \subsubsection{Baseline: Randomized Search}
 
-RS is the standard scientific baseline for HPO. \citet{bergstra2012random} demonstrated empirically that RS is more efficient than Grid Search for HPO, as it does not waste evaluations on unimportant parameters. Therefore, any intelligent algorithm must demonstrate superiority over RS to be considered effective.
+Random Search (RS) is the standard scientific baseline for HPO. \citet{bergstra2012random} demonstrated empirically that RS is more efficient than Grid Search for HPO. Therefore, any intelligent algorithm must demonstrate superiority over RS to be considered effective.
 
-\subsubsection{Genetic Algorithm}
+\subsubsection{Evolutionary Genetic Algorithm}
 
-TODO
+Inspired by Darwinian evolution, the Genetic Algorithm (GA) searches for optimal solutions using \textit{selection}, \textit{crossover}, and \textit{mutation}. We implement a \textbf{Memetic Algorithm} variant, which includes a local search component to escape fitness plateaus. As described in \cite{metaheuristics-cookbook}, radius-based elitism is applied before crossover to refine the fittest individuals.
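The radius-based local refinement can be pictured with a short hill-climbing sketch. This is our own illustration under stated assumptions (genes normalized to [0, 1], maximization), not the repository's implementation:

```python
import random

def local_search(individual, fitness, radius=0.15, tries=5):
    """Memetic refinement: hill-climb an elite individual by sampling
    neighbours within `radius` (genes assumed normalized to [0, 1])."""
    best, best_fit = individual, fitness(individual)
    for _ in range(tries):
        neighbour = [min(1.0, max(0.0, g + random.uniform(-radius, radius)))
                     for g in best]
        fit = fitness(neighbour)
        if fit > best_fit:                 # keep only improving moves
            best, best_fit = neighbour, fit
    return best

# Toy usage: maximize the negated distance to the point [0.5, 0.5].
f = lambda ind: -sum((g - 0.5) ** 2 for g in ind)
print(local_search([0.1, 0.9], f))
```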
 
 \subsubsection{Particle Swarm Optimization}
 
-PSO models a swarm where individuals are strongly influenced by the single best-found solution. This behaviour leads to rapid convergence, often finding a "good-enough" solution quickly. This same strength can also be a weakness, as it may converge prematurely to a suboptimal solution. The swarm can rapidly cluster around the first local optimum it finds, losing diversity and becoming "stuck" before the true global optimum is found.
+PSO models a swarm where individuals are influenced by both their personal best (\texttt{p\_best}) and the global best (\texttt{g\_best}) solutions. The velocity of each particle is updated using an inertia weight ($w$) and acceleration coefficients ($c_1$, $c_2$), balancing exploration and exploitation.
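For reference, the velocity and position updates the paragraph describes follow the standard PSO formulation, with $r_1, r_2 \sim U(0,1)$ drawn per dimension (the report's exact variant may differ):

```latex
v_i \leftarrow w\,v_i
  + c_1 r_1 \left(p^{\text{best}}_i - x_i\right)
  + c_2 r_2 \left(g^{\text{best}} - x_i\right),
\qquad
x_i \leftarrow x_i + v_i
```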
+
+\begin{table}[htbp]
+\centering
+\caption{Optimizer Configuration Parameters}
+\label{tab:algo_params}
+\small
+\begin{tabularx}{\textwidth}{lllX}
+\toprule
+\textbf{Algorithm} & \textbf{Parameter} & \textbf{Value} & \textbf{Description} \\
+\midrule
+\multirow{4}{*}{Genetic Alg.}
+ & Population & 30 & Number of individuals per generation. \\
+\cmidrule{2-4}
+ & Generations & 10 & Maximum total evolutionary iterations. \\
+\cmidrule{2-4}
+ & Elitism & 50\% & Proportion of the population preserved/selected. \\
+\cmidrule{2-4}
+ & Radius & 0.0 & Memetic local-search radius (0.15 when memetic search is enabled; 0 for standard GA runs). \\
+\midrule
+\multirow{4}{*}{PSO}
+ & Particles & 10 & Size of the swarm. \\
+\cmidrule{2-4}
+ & $w$ (Inertia) & 0.5 & Inertia weight controlling velocity retention. \\
+\cmidrule{2-4}
+ & $c_1$ (Cognitive) & 1.5 & Weight for personal best influence. \\
+\cmidrule{2-4}
+ & $c_2$ (Social) & 1.5 & Weight for global best influence. \\
+\bottomrule
+\end{tabularx}
+\end{table}
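Assuming the table's parameters map one-to-one onto optimizer constructor arguments (the names below are hypothetical; the repository's actual API may differ), the two configurations could be expressed as:

```python
# Hypothetical configuration objects mirroring the table
# "Optimizer Configuration Parameters"; names are illustrative.
GA_CONFIG = {
    "population": 30,   # individuals per generation
    "generations": 10,  # maximum evolutionary iterations
    "elitism": 0.5,     # proportion of population preserved/selected
    "radius": 0.15,     # memetic local-search radius (0.0 for standard GA)
}

PSO_CONFIG = {
    "particles": 10,    # swarm size
    "w": 0.5,           # inertia weight (velocity retention)
    "c1": 1.5,          # cognitive coefficient (personal best)
    "c2": 1.5,          # social coefficient (global best)
}
```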