A reinforcement learning solution to the multi-armed bandit problem, applied to ad click-through rate optimization.
Given 10 ads and 10,000 simulated user rounds, the UCB algorithm learns which ad maximizes clicks by balancing exploration (trying less-known ads) with exploitation (favoring high-performing ones).
At each round, the algorithm selects the ad with the highest upper confidence bound:
UCB(i) = avg(i) + sqrt( (3/2) * ln(n) / N(i) )
- avg(i) -- average reward of ad i so far
- n -- current round number
- N(i) -- number of times ad i has been selected
Ads with no selections yet are assigned an effectively infinite upper bound, ensuring every ad is tried at least once. Over time the exploration bonus shrinks and the algorithm converges on the best ad.
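The selection rule above can be sketched as follows. This is a minimal illustration, not the repository's exact script: the `rewards` matrix and its click probabilities are synthetic stand-ins for the CSV.

```python
import math
import random

random.seed(0)

# Synthetic stand-in for the dataset: 10,000 rounds x 10 ads,
# where ad index 7 has the highest click probability (assumed values).
probs = [0.05, 0.13, 0.09, 0.11, 0.16, 0.04, 0.20, 0.30, 0.10, 0.05]
N, d = 10_000, 10
rewards = [[1 if random.random() < p else 0 for p in probs] for _ in range(N)]

selections = [0] * d   # N(i): times each ad has been selected
sums = [0] * d         # cumulative reward of each ad
ads_selected = []

for n in range(1, N + 1):
    best_ad, best_ucb = 0, -1.0
    for i in range(d):
        if selections[i] == 0:
            ucb = float("inf")  # untried ads get infinite confidence
        else:
            avg = sums[i] / selections[i]
            ucb = avg + math.sqrt(1.5 * math.log(n) / selections[i])
        if ucb > best_ucb:
            best_ad, best_ucb = i, ucb
    reward = rewards[n - 1][best_ad]  # replay the logged outcome
    selections[best_ad] += 1
    sums[best_ad] += reward
    ads_selected.append(best_ad)

# Index of the most frequently selected ad after all rounds
print(max(range(d), key=lambda i: selections[i]))
```

Note how the first `d` rounds try every ad once (each untried ad has an infinite bound), after which the bonus term decays as `N(i)` grows.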
Ads_CTR_Optimisation.csv -- 10,000 rows x 10 columns. Each row is a user round; each column is an ad. Values are binary (1 = click, 0 = no click). The file must be in the same directory as the scripts.
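The CSV acts as a logged reward table: choosing ad `i` at round `n` yields the value at row `n`, column `i`. A loading sketch is below; a synthetic frame of the same shape stands in for the file so the snippet runs anywhere, while the commented line shows the actual load.

```python
import numpy as np
import pandas as pd

# In the repository: dataset = pd.read_csv("Ads_CTR_Optimisation.csv")
# Synthetic stand-in with the same dimensions (values are not the real data):
rng = np.random.default_rng(42)
dataset = pd.DataFrame(rng.integers(0, 2, size=(10_000, 10)),
                       columns=[f"Ad {i + 1}" for i in range(10)])

N, d = dataset.shape  # rounds and ads derived from the data, not hard-coded
print(N, d)           # 10000 10

# Replaying round n: the reward for selecting ad i is dataset.values[n, i]
reward = int(dataset.values[0, 0])
```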
| | Tool | Purpose |
|---|---|---|
| 🐍 | Python 3 | Primary implementation |
| 📈 | Matplotlib | Histogram visualization |
| 🗂️ | Pandas | CSV data loading |
| 📉 | R | Alternative implementation (base R) |
```
pip install matplotlib pandas
```

Run either implementation:

```
python upper_confidence_bound.py
Rscript upper_confidence_bound.R
```

Both scripts output a histogram showing how often each ad was selected; the best-performing ad dominates the distribution.
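The plotting step can be sketched like this. The `ads_selected` list here is hypothetical (a weighted random stand-in for the history the real scripts accumulate during the UCB loop), and the headless backend is only so the sketch runs without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt

# Hypothetical selection history; the real scripts build this list
# round by round, and the winning ad accumulates most selections.
random.seed(1)
weights = [1, 1, 1, 1, 1, 1, 1, 8, 1, 1]
ads_selected = [random.choices(range(10), weights=weights)[0]
                for _ in range(10_000)]

plt.hist(ads_selected, bins=range(11), align="left", rwidth=0.8)
plt.title("Histogram of ad selections")
plt.xlabel("Ad index")
plt.ylabel("Number of times selected")
plt.savefig("ads_histogram.png")
```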
```
upper_confidence_bound.py    # Python implementation
upper_confidence_bound.R     # R implementation
Ads_CTR_Optimisation.csv     # Simulated CTR dataset
LICENSE                      # MIT License
README.md
```
- The simulation uses a fixed dataset rather than a live reward signal -- this is a batch replay, not true online learning.
- Hard-coding the number of rounds (N) or ads (d) beyond the dataset dimensions would cause index errors; the code derives both from the CSV automatically.
MIT