Experiments running offline LLMs locally in Python and Rust, using Ollama and llama.cpp
A collection of local AI experiments that should run on a recent home computer. The examples use local Ollama and llama.cpp servers to run completion and chat tasks with Gemma4 and Mistral models. Code is written in Python and Rust, and each example has a short description detailing how to download the model and run the code.
Ollama is an open-source tool for running large language models locally, well suited to rapid prototyping, education, and research in artificial intelligence (AI). Ollama:
- has a simple API;
- does not require a Python environment; and
- has a model library, making it easy to discover and download new models.
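As an illustration of how simple the API is, here is a minimal Python sketch of a completion request to a local Ollama server. It assumes Ollama is running on its default port (11434) and that a model such as mistral has already been pulled; it is a sketch, not code from this repository's examples.

```python
import json
import urllib.request

# Ollama listens on localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "mistral",  # any model tag you have pulled with Ollama
    "prompt": "Explain quantisation in one sentence.",
    "stream": False,  # return the whole completion in a single JSON object
}

request = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read())

print(body["response"])
```

Setting stream to False keeps the sketch short; the examples in this repository may stream tokens instead.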
llama.cpp is a C/C++ inference library originally built for LLaMA-family models. It:
- is extremely memory-efficient thanks to quantisation;
- works well on CPU-only setups; and
- is available as a library (and bundled HTTP server) for integration with other applications.
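By way of illustration, the sketch below posts a completion request to the bundled HTTP server's /completion endpoint from Python. It assumes llama-server is already running on its default port (8080) with a GGUF model loaded; it is only a sketch, not code from this repository.

```python
import json
import urllib.request

# llama-server listens on localhost:8080 by default.
LLAMA_CPP_URL = "http://localhost:8080/completion"

payload = {
    "prompt": "The three main benefits of quantisation are",
    "n_predict": 64,  # maximum number of tokens to generate
}

request = urllib.request.Request(
    LLAMA_CPP_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read())

print(body["content"])
```

The same server also exposes an OpenAI-compatible /v1/chat/completions endpoint, which many client libraries can target directly.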
Source: Running Local LLMs
- Introduction
- Setup
- Examples
- Why Run Local LLMs?
- Issues and Support
- Contributions
- Acknowledgements
- License
The examples run on Ollama or llama.cpp. Here’s a quick guide to getting them set up on macOS with Homebrew. Follow the links for more detailed instructions and for other operating systems. You will also need Rust or Python set up on your system (depending on which examples you want to run).
brew install ollama
For other operating systems, or more details, see the Official Ollama Quickstart Guide.
brew install llama.cpp
For other operating systems, or more details, see the LLaMA.cpp HTTP Server Quick Start Guide.
Nothing to install beyond the prerequisites.
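Before running the examples, it can be useful to confirm the local servers are reachable. The following Python sketch assumes the default local ports (11434 for Ollama, 8080 for llama-server); adjust the URLs if you start the servers differently.

```python
import urllib.error
import urllib.request

# Default local endpoints; change these if you run the servers on other ports.
SERVERS = {
    "Ollama": "http://localhost:11434/",
    "llama.cpp server": "http://localhost:8080/health",
}

for name, url in SERVERS.items():
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            print(f"{name}: reachable (HTTP {response.status})")
    except OSError as error:  # includes urllib.error.URLError
        print(f"{name}: not reachable ({error})")
```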
- llamacpp-gemma4-e4b-completion
- Gemma4 LLM completion demo calling a local llama.cpp server from Rust code.
- llamacpp_tts
- Large Language Model text-to-speech (TTS) demo with voice cloning.
- ollama-mistral-instruct-chat
- Mistral instruct chat demo calling a local Ollama server (a rough request sketch follows the examples list).
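As a rough illustration of the kind of request the chat example above makes (not its actual code), here is a minimal Python sketch posting a single chat turn to Ollama's /api/chat endpoint. It assumes a Mistral instruct model has already been pulled locally.

```python
import json
import urllib.request

# Ollama's chat endpoint accepts an OpenAI-style list of messages.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

payload = {
    "model": "mistral",  # use whichever Mistral instruct tag you pulled locally
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is a GGUF file?"},
    ],
    "stream": False,
}

request = urllib.request.Request(
    OLLAMA_CHAT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read())

print(body["message"]["content"])
```

The messages list follows the familiar role/content format, so multi-turn chat just means appending the assistant's reply and the next user message before the following request.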
Why run local LLMs?
- Data sovereignty: you have more control over your data.
- Offline support: great if you have an unstable connection or are temporarily offline.
- Model fine-tuning: you also have more control over the model you run.
You don't need the latest GPU: with llama.cpp or Ollama, smaller models (up to around 7 billion parameters) can run comfortably on a typical home computer.
For balance, though, running locally you pay the one-off cost of downloading the model you want to run, and you might not be able to run the largest models, depending on your machine’s spec. A cloud service would also be more scalable if you needed to step up model usage.
Open an issue if something does not work as expected or if you have suggestions for improvements.
Feel free to jump into the Rodney Lab matrix chat room.
New feature suggestions are always welcome and will be considered, though please keep in mind that some of them may be out of scope for what the project is trying to achieve (or is reasonably capable of). If you have an idea for a new feature and would like to share it, you can create a feature request.
Feature requests are tagged with one of the following:
- Roadmap - will be implemented in a future release
- Backlog - may be implemented in the future but needs further feedback or interest from the community
- Icebox - no plans to implement as it doesn't currently align with the project's goals or capabilities; may be revised at a later date

Contributions are welcome; write a short issue with your idea before spending too much time on more involved additions.
- Before working on a new feature, it's preferable to submit a feature request first and state that you'd like to implement it yourself
- Please don't submit PRs for feature requests that are either in the roadmap[1], backlog[2] or icebox[3]
- Avoid introducing new dependencies
- Avoid making backwards-incompatible configuration changes
[1] The feature likely already has work put into it that may conflict with your implementation
[2] The demand, implementation or functionality for this feature is not yet clear
[3] No plans to add this feature for the time being
Inspired by:
The project is licensed under the BSD 3-Clause License; see the LICENSE file for details.
