Commit c0500b1

Merge pull request #1 from DanielArndt0/main
Refactor project into a reusable Python library with CLI, docs, and release workflows
2 parents a671df5 + fd13c04 commit c0500b1

39 files changed

Lines changed: 1910 additions & 99 deletions

.github/workflows/ci.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
name: CI

on:
  push:
    branches: [main, master]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest

    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11"]

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install -e .[dev]

      - name: Run tests
        run: python -m pytest -v

      - name: Build package
        run: python -m build

.github/workflows/publish.yml

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
name: Publish Python Package

on:
  release:
    types: [published]

jobs:
  publish:
    if: ${{ !github.event.release.prerelease }}
    runs-on: ubuntu-latest

    permissions:
      contents: write
      id-token: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install build tools
        run: |
          python -m pip install --upgrade pip
          pip install build

      - name: Build distributions
        run: python -m build

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

      - name: Upload dist files to GitHub Release
        uses: softprops/action-gh-release@v2
        with:
          files: dist/*

.github/workflows/release.yml

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
name: Release Build

on:
  release:
    types: [published]

jobs:
  build-release-artifacts:
    if: ${{ !github.event.release.prerelease }}
    runs-on: ubuntu-latest

    permissions:
      contents: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install build dependencies
        run: |
          python -m pip install --upgrade pip
          python -m pip install build

      - name: Build distributions
        run: python -m build

      - name: Upload artifacts to GitHub Release
        uses: softprops/action-gh-release@v2
        with:
          files: dist/*

.gitignore

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
*.egg
MANIFEST

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage / pytest
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache/
.pytest_cache/
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/

# Type check / lint caches
.mypy_cache/
.ruff_cache/
.pyre/
.dmypy.json
dmypy.json

# Virtual environments
.venv/
venv/
env/
ENV/

# Jupyter Notebook
.ipynb_checkpoints/

# IDEs / editors
.vscode/
.idea/

# OS files
.DS_Store
Thumbs.db

# Local environment files
.env
.env.*
*.local

# Logs
*.log

# Temporary files
tmp/
temp/
*.tmp

# Project generated files
output/
temp_uploads/
generated/
reports/

# Excel / export artifacts
*.xlsx

# Database / local data
*.db
*.sqlite3

# Python build metadata
.pybuild/

# Packaging tools
pip-wheel-metadata/

# PyInstaller
*.manifest
*.spec
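
Several of the patterns above rely on shell-style wildcards and character classes (for example `*.py[cod]`). The sketch below uses Python's standard `fnmatch` module to approximate how such patterns match plain file names. It is only an approximation under stated assumptions: Git applies its own matching rules, and directory patterns ending in `/`, negation, and path anchoring are not modeled here. The `is_ignored` helper and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

# A small subset of the file-name patterns from the .gitignore above.
PATTERNS = ["*.py[cod]", "*.egg", ".env.*", "*.log"]


def is_ignored(file_name, patterns=PATTERNS):
    """Return True if the bare file name matches any of the wildcard patterns."""
    return any(fnmatchcase(file_name, pattern) for pattern in patterns)


# Hypothetical file names, chosen to exercise each pattern.
for name in ["module.pyc", "module.py", ".env.production", ".env", "app.log"]:
    print(f"{name}: {'ignored' if is_ignored(name) else 'tracked'}")
```

Note that `.env.*` does not match a bare `.env`, which is why the ignore file lists `.env` as a separate entry.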

LICENSE

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Daniel Arndt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 137 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,137 @@
# Pydf

[PT-BR documentation](docs/README.pt-BR.md) | [English docs](docs/README.en.md)

`pydf` is a lightweight Python library for reading invoice PDFs, extracting metadata with regex, exporting to Excel, and optionally persisting to MySQL.

This version reorganizes the original project as a library and a CLI while keeping the script's core idea: **PDF -> extraction -> Excel -> optional MySQL**.

## Quick overview

- Reusable Python library
- Simple CLI for terminal use
- Configurable regex for the invoice number and date
- Export to `.xlsx`
- Optional MySQL persistence
- Documentation in PT-BR and English
- CI and release workflows for GitHub Actions
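
To make the regex extraction step more concrete, here is a minimal sketch of what it might look like. This is not `pydf`'s actual implementation: the two patterns, the `extract_invoice_fields` helper, and the sample text are assumptions chosen for illustration, and real invoices will likely need different expressions.

```python
import re
from typing import Optional

# Hypothetical patterns; adapt them to the layout of your own invoices.
INVOICE_NUMBER_PATTERN = re.compile(
    r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
)
INVOICE_DATE_PATTERN = re.compile(r"Date\s*:?\s*(\d{2}/\d{2}/\d{4})", re.IGNORECASE)


def extract_invoice_fields(text: str) -> dict[str, Optional[str]]:
    """Return the first invoice number and date found in the text (None when missing)."""
    number = INVOICE_NUMBER_PATTERN.search(text)
    date = INVOICE_DATE_PATTERN.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        "invoice_date": date.group(1) if date else None,
    }


# Sample text standing in for the text layer of a PDF page.
sample = "ACME Ltd.\nInvoice No: INV-2024-001\nDate: 15/03/2024\nTotal: 100.00"
fields = extract_invoice_fields(sample)
print(fields)
```

Returning `None` for a missing field, rather than raising, lets a caller record a per-file status and keep processing the remaining PDFs.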

## Requirements

- Python **3.10 or later**
- `pip`
- Recommended: a virtual environment (`venv`)
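
A script that depends on the library could verify the interpreter version up front. The sketch below is illustrative only: the `meets_minimum` helper is hypothetical and not part of `pydf`, and only the 3.10 minimum comes from the requirements listed here.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # minimum version stated in the requirements


def meets_minimum(version=None):
    """Compare (major, minor) as integers; defaults to the running interpreter."""
    current = sys.version_info[:2] if version is None else version[:2]
    return tuple(current) >= MINIMUM_PYTHON


if not meets_minimum():
    found = sys.version.split()[0]
    print(f"Python {MINIMUM_PYTHON[0]}.{MINIMUM_PYTHON[1]}+ is required, found {found}")
```

Comparing integer tuples avoids the pitfall of comparing version strings, where `"3.9"` sorts after `"3.10"`. A related pitfall applies to YAML configuration: an unquoted `3.10` is read as the number `3.1`, so the version should be quoted there.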

## Local installation

From the project root:

```bash
pip install -e .
```

To install with the development dependencies:

```bash
pip install -e .[dev]
```

## Installing the CLI from GitHub

Because GitHub Packages does not offer a Python registry that `pip` supports, the recommended way to install the CLI from GitHub is to install directly from the Git repository.

### Install from the default branch

```bash
pip install "git+https://github.com/DanielArndt0/pydf.git"
```

### Install from a specific tag or release

```bash
pip install "git+https://github.com/DanielArndt0/pydf.git@v1.0.0"
```

After that, the CLI is available as:

```bash
pydf --help
```

## Getting started with venv on Windows

If you have more than one Python version installed, list the available versions:

```powershell
py -0p
```

Create and activate a virtual environment with Python 3.10:

```powershell
py -3.10 -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -e .[dev]
```

## Running the CLI

```bash
pydf --help
pydf examples/pdf_invoices --output output/invoices.xlsx
```

## Using it as a library

```python
from pydf import InvoiceProcessor, ProcessorConfig

config = ProcessorConfig(
    input_dir="examples/pdf_invoices",
    output_excel="output/invoices.xlsx",
)

result = InvoiceProcessor(config).process()

print(result.output_excel)
for record in result.records:
    print(record.file_name, record.invoice_number, record.invoice_date, record.status)
```

## Running the tests

```bash
pytest -v
```

If the environment is not set up yet:

```bash
pip install -e .[dev]
pytest -v
```

## Local build

```bash
python -m build
```

## CI and releases on GitHub

This repository includes three workflows:

- `ci.yml`: runs the tests and a build on pushes to `main` or `master` and on every pull request
- `release.yml`: builds the artifacts and attaches `dist/*` to a manually published release
- `publish.yml`: builds the distributions, publishes them to PyPI, and uploads `dist/*` to the published release

Detailed documentation:

- [Main documentation guide](docs/README.pt-BR.md)
- [CLI guide](docs/CLI.pt-BR.md)
- [API guide](docs/API.pt-BR.md)
- [Architecture](docs/ARCHITECTURE.pt-BR.md)
- [CI/CD and releases](docs/CI-CD.pt-BR.md)
- [Python environment, venv, and troubleshooting](docs/ENVIRONMENT.pt-BR.md)
- [Examples](examples/README.md)
