Commit 99abb9a

Merge pull request #1 from leanEthereum/snappy
implement snappy and tests
2 parents 4f2037d + c6953d0 commit 99abb9a

27 files changed

Lines changed: 39056 additions & 0 deletions

.github/workflows/ci.yml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+  workflow_dispatch:
+
+permissions:
+  contents: read
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  lint:
+    name: Lint and format checks - Python ${{ matrix.python-version }}
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.12", "3.13", "3.14"]
+    steps:
+      - name: Checkout py-snappy
+        uses: actions/checkout@v4
+
+      - name: Install uv and Python ${{ matrix.python-version }}
+        uses: astral-sh/setup-uv@v4
+        with:
+          enable-cache: true
+          cache-dependency-glob: "pyproject.toml"
+          python-version: ${{ matrix.python-version }}
+
+      - name: Run all quality checks via tox
+        run: uvx tox -e all-checks
+
+  test:
+    name: Tests - Python ${{ matrix.python-version }} on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, macos-latest]
+        python-version: ["3.12", "3.13", "3.14"]
+    steps:
+      - name: Checkout py-snappy
+        uses: actions/checkout@v4
+
+      - name: Install uv and Python ${{ matrix.python-version }}
+        uses: astral-sh/setup-uv@v4
+        with:
+          enable-cache: true
+          cache-dependency-glob: "pyproject.toml"
+          python-version: ${{ matrix.python-version }}
+
+      - name: Run tests via tox
+        run: uvx tox -e pytest

.gitignore

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+.python-version
+
+# pipenv
+Pipfile.lock
+
+# poetry
+poetry.lock
+
+# pdm
+.pdm.toml
+.pdm-python
+.pdm-build/
+
+# PEP 582
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# IDEs
+.idea/
+.vscode/
+*.swp
+*.swo
+*~
+.DS_Store
+
+# UV
+.uv/
+
+# Ruff
+.ruff_cache/

README.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
+# py-snappy
+
+Pure Python implementation of Google's Snappy compression algorithm.
+
+## Features
+
+- **Pure Python**: No external dependencies or C extensions required
+- **Full compatibility**: Produces output compatible with Google's Snappy format
+- **Well documented**: Extensive inline documentation explaining the algorithm
+- **Thoroughly tested**: Comprehensive test suite using C++ Snappy test data
+
+## Installation
+
+```bash
+uv sync
+```
+
+## Usage
+
+```python
+from src import compress, decompress
+
+# Compress data
+data = b"Hello, World!" * 100
+compressed = compress(data)
+
+# Decompress data
+original = decompress(compressed)
+assert original == data
+```
+
+## API
+
+### Core Functions
+
+- `compress(data: bytes) -> bytes`: Compress data using Snappy
+- `decompress(data: bytes) -> bytes`: Decompress Snappy-compressed data
+
+### Utilities
+
+- `max_compressed_length(size: int) -> int`: Maximum possible compressed size
+- `get_uncompressed_length(data: bytes) -> int`: Read uncompressed length from header
+- `is_valid_compressed_data(data: bytes) -> bool`: Quick validation check
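
For orientation, here is a minimal sketch of what the first two utilities listed above typically compute in a Snappy implementation. It is not taken from this commit: the function names with the `_sketch` suffix are hypothetical, the size bound is the one used by the reference C++ Snappy (`32 + n + n/6`), and the header is assumed to be the little-endian base-128 varint described in Snappy's format description. The repository's own functions may differ in validation and error types.

```python
def max_compressed_length_sketch(size: int) -> int:
    """Worst-case output size, using the reference C++ bound 32 + n + n/6."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return 32 + size + size // 6


def read_uncompressed_length_sketch(data: bytes) -> int:
    """Decode the varint length prefix at the start of a Snappy block.

    Each byte contributes 7 payload bits, least significant group first;
    the high bit signals that another byte follows.
    """
    result = 0
    for i, byte in enumerate(data[:5]):  # a 32-bit length fits in 5 bytes
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if result > 0xFFFFFFFF:
                raise ValueError("length prefix exceeds 32 bits")
            return result
    raise ValueError("truncated or overlong varint length prefix")
```

A caller could use the bound to pre-size an output buffer before compressing, and the header reader to reject inputs that claim an unreasonably large uncompressed size before doing any decoding work.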
+
+### Exceptions
+
+- `SnappyDecompressionError`: Raised when decompression fails
+
+## Development
+
+```bash
+# Install dependencies
+uv sync
+
+# Run tests
+uv run pytest
+
+# Run linter
+uv run ruff check src/ tests/
+
+# Format code
+uv run ruff format src/ tests/
+```
+
+## Algorithm
+
+Snappy is an LZ77-variant compression algorithm that prioritizes speed over compression ratio. Key characteristics:
+
+- **Block-based**: Data is processed in 64KB blocks
+- **Hash table**: O(1) match lookup using a simple hash function
+- **Greedy matching**: No optimal parsing or lazy evaluation
+- **Wire format**: Varint length prefix followed by literals and copy references
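
To make the "literals and copy references" part of the wire format concrete, the sketch below expands the tagged elements that follow the varint length prefix, as laid out in Snappy's format description. It is an illustration under stated assumptions, not the code added by this commit: `decode_elements_sketch` is a hypothetical name, the length prefix is assumed to have been stripped already, and a real decoder would also check the result against the declared uncompressed length.

```python
def decode_elements_sketch(body: bytes) -> bytes:
    """Expand Snappy literal/copy elements (length prefix already removed)."""
    out = bytearray()
    pos = 0
    while pos < len(body):
        tag = body[pos]
        pos += 1
        kind = tag & 0x03  # low two bits select the element type
        if kind == 0:  # literal: raw bytes follow
            length = (tag >> 2) + 1
            if length > 60:  # 61..64: real length-1 is in the next 1..4 bytes
                extra = length - 60
                if pos + extra > len(body):
                    raise ValueError("truncated literal length")
                length = int.from_bytes(body[pos:pos + extra], "little") + 1
                pos += extra
            if pos + length > len(body):
                raise ValueError("literal runs past end of input")
            out += body[pos:pos + length]
            pos += length
            continue
        if kind == 1:  # copy with 11-bit offset, length 4..11
            width = 1
            length = 4 + ((tag >> 2) & 0x07)
            high_bits = (tag >> 5) << 8
        else:  # copy with 2-byte (kind 2) or 4-byte (kind 3) offset
            width = 2 if kind == 2 else 4
            length = (tag >> 2) + 1
            high_bits = 0
        if pos + width > len(body):
            raise ValueError("truncated copy offset")
        offset = high_bits | int.from_bytes(body[pos:pos + width], "little")
        pos += width
        if offset == 0 or offset > len(out):
            raise ValueError("copy offset out of range")
        for _ in range(length):  # byte by byte so overlapping copies repeat
            out.append(out[-offset])
    return bytes(out)
```

For example, the bytes `0x0c "abcd" 0x01 0x04` encode a 4-byte literal followed by a copy of length 4 at offset 4, which expands to `abcdabcd`. Copying one byte at a time keeps the overlapping case correct, where the offset is smaller than the copy length and the output repeats a short pattern.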
+
+## References
+
+- [Google Snappy](https://github.com/google/snappy)
+- [Format Description](https://github.com/google/snappy/blob/main/format_description.txt)
+
+## License
+
+MIT License
