A production-style Python automation project that simulates a complete business workflow using Selenium, pandas, and openpyxl, with environment-based configuration and pytest tests.
This project demonstrates a realistic automation pipeline:
- log into a web application
- navigate to a secure reporting page
- extract operational data
- save raw JSON output
- transform data with pandas
- generate an Excel report
- prepare an email notification payload
- support batch execution for Windows automation scenarios
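The email step prepares a notification payload rather than sending mail. A minimal sketch using only the standard library follows. The helper name `build_email_payload` and the report file name are hypothetical, chosen for illustration; the project's own `email_sender.py` may be organised differently.

```python
from email.message import EmailMessage
from pathlib import Path


def build_email_payload(to_addr: str, subject: str, report_path: Path, row_count: int) -> EmailMessage:
    """Build (but do not send) a message describing a finished run.

    Hypothetical helper for illustration; not the project's actual API.
    """
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(
        "The automation run finished.\n"
        f"Rows processed: {row_count}\n"
        f"Report: {report_path.name}\n"
    )
    return msg


# Example values; the report path is an assumed name, not one the project guarantees.
payload = build_email_payload(
    "demo@example.com",
    "RPA Automation Report",
    Path("data/output/report.xlsx"),
    row_count=3,
)
```

Keeping the payload as a plain `EmailMessage` means an SMTP sender could be added later without changing how the message is built.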
Tech stack:
- Python
- Selenium
- pandas
- openpyxl
- python-dotenv
- pytest
Workflow:
- Load configuration from environment variables
- Start a Selenium browser session
- Log into the target web page
- Navigate to the secure page
- Extract page data
- Save raw data as JSON
- Process and structure the data with pandas
- Export processed data to CSV
- Generate an Excel report
- Prepare an email payload
- Save detailed logs for each execution
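The processing and CSV export steps can be sketched with pandas as follows. The column names and sample rows are invented for illustration, and `process_rows` is a hypothetical function, so treat this as one possible shape of the transformation rather than the project's actual logic.

```python
import pandas as pd

# Hypothetical raw records, shaped like rows extracted from a report page.
raw_rows = [
    {"order_id": "A-1", "status": " open ", "amount": "120.50"},
    {"order_id": "A-2", "status": "CLOSED", "amount": "80"},
    {"order_id": "A-2", "status": "CLOSED", "amount": "80"},  # exact duplicate
    {"order_id": "A-3", "status": "open", "amount": "n/a"},   # unparseable amount
]


def process_rows(rows: list[dict]) -> pd.DataFrame:
    """Normalise text, coerce numbers, and drop duplicate or unusable rows."""
    df = pd.DataFrame(rows)
    df["status"] = df["status"].str.strip().str.lower()
    # Values that cannot be parsed become NaN instead of raising.
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    return df.drop_duplicates().dropna(subset=["amount"]).reset_index(drop=True)


processed = process_rows(raw_rows)
# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = processed.to_csv(index=False)
```

Passing a path such as `data/processed/<name>.csv` to `to_csv` would write the file instead of returning a string.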
Project structure:
rpa_process_automation_pipeline/
|
|-- .env
|-- .env.example
|-- .gitignore
|-- README.md
|-- requirements.txt
|-- main.py
|-- run_pipeline.bat
|
|-- logs/
|-- data/
|   |-- raw/
|   |-- processed/
|   |-- downloads/
|   `-- output/
|
|-- tests/
|   |-- conftest.py
|   |-- test_settings.py
|   |-- test_data_processor.py
|   `-- test_excel_report.py
|
`-- src/
    |-- exceptions.py
    |-- browser/
    |   `-- driver_factory.py
    |-- config/
    |   `-- settings.py
    |-- pages/
    |   |-- login_page.py
    |   `-- report_page.py
    |-- workflows/
    |   `-- process_runner.py
    |-- processing/
    |   `-- data_processor.py
    |-- reporting/
    |   `-- excel_report.py
    |-- notifications/
    |   `-- email_sender.py
    `-- utils/
        `-- logger.py
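Configuration lives in `src/config/settings.py` and is read from environment variables. A minimal sketch of such a loader is shown here, assuming a hypothetical `Settings` dataclass and `load_settings` function; the real module may expose different names and fields. With python-dotenv, calling `load_dotenv()` first would copy values from `.env` into the environment.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'True', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical subset of the project's settings, for illustration only.
    base_url: str
    browser: str
    headless: bool
    log_level: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ) and validate them."""
    env = os.environ if env is None else env
    base_url = env.get("BASE_URL", "")
    if not base_url:
        raise ValueError("BASE_URL is required")
    return Settings(
        base_url=base_url,
        browser=env.get("BROWSER", "chrome").lower(),
        headless=_as_bool(env.get("HEADLESS", "False")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Accepting the mapping as a parameter keeps the loader easy to test, since a plain dict can stand in for the process environment.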
Create a virtual environment:
python -m venv .venv
.\.venv\Scripts\activate

Install dependencies:
pip install -r requirements.txt

Example .env configuration:
BASE_URL=https://the-internet.herokuapp.com/login
DOWNLOAD_DIR=data/downloads
LOG_LEVEL=INFO
BROWSER=chrome
HEADLESS=False
LOGIN_USERNAME=tomsmith
LOGIN_PASSWORD=SuperSecretPassword!
EMAIL_ENABLED=False
EMAIL_TO=demo@example.com
EMAIL_SUBJECT=RPA Automation Report

Standard run:
python main.py

Run in headless mode:
python main.py --headless

Run with demo email payload:
python main.py --demo-email

Run through the batch file:
.\run_pipeline.bat

The batch runner starts the pipeline in headless mode, appends console output to logs/batch_output.log, and returns a non-zero exit code if the run fails.
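For the batch runner to report failures, the Python entry point has to turn an exception into a non-zero exit code. One way to sketch this, together with the `--headless` and `--demo-email` flags, is shown below. The function names and the `runner` parameter are assumptions made for illustration, not the contents of the actual `main.py`.

```python
import argparse
import logging


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run the automation pipeline")
    parser.add_argument("--headless", action="store_true", help="run the browser without a window")
    parser.add_argument("--demo-email", action="store_true", help="prepare a demo email payload")
    return parser.parse_args(argv)


def run_pipeline(headless: bool, demo_email: bool) -> None:
    """Placeholder for the real workflow; it is expected to raise on failure."""


def main(argv=None, runner=run_pipeline) -> int:
    """Return 0 on success and 1 on any failure, so callers can check the exit code."""
    args = parse_args(argv)
    try:
        runner(headless=args.headless, demo_email=args.demo_email)
    except Exception:
        logging.exception("Pipeline run failed")
        return 1
    return 0


# In a script, the last line would typically be: raise SystemExit(main())
```

A batch file can then inspect `%ERRORLEVEL%` after the Python call to decide whether the run succeeded.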
Run the test suite:
pytest

The pipeline generates:
- raw JSON file in data/raw/
- processed CSV file in data/processed/
- Excel report in data/output/
- batch output log in logs/batch_output.log
- execution logs in logs/
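As an illustration of how one of these artifacts might be written, the sketch below saves extracted rows as a timestamped JSON file under a `raw/` directory. The function name `save_raw_json` and the `raw_<timestamp>.json` naming pattern are assumptions; the project may name its files differently.

```python
import json
from datetime import datetime
from pathlib import Path


def save_raw_json(rows, base_dir="data", now=None) -> Path:
    """Write rows to <base_dir>/raw/raw_<timestamp>.json and return the path.

    Hypothetical helper; the file naming pattern is an assumption.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    raw_dir = Path(base_dir) / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first run
    path = raw_dir / f"raw_{stamp}.json"
    path.write_text(json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

A timestamp in the file name keeps each run's output separate, so a failed run never overwrites the data from an earlier one.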
This project currently uses the public demo site https://the-internet.herokuapp.com/login to simulate a secure business login flow.
Demo credentials:
- username: tomsmith
- password: SuperSecretPassword!
This makes the project easy to run, test, and demonstrate without requiring access to a private internal system.
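A login against the demo site could be sketched with Selenium as follows. The element locators (`username`, `password`, and the submit button) and the `/secure` redirect reflect the demo page's markup at the time of writing and are assumptions here; the function names are hypothetical and do not mirror the project's `login_page.py`.

```python
from urllib.parse import urljoin

LOGIN_URL = "https://the-internet.herokuapp.com/login"


def secure_url(login_url: str) -> str:
    """URL the demo site is assumed to redirect to after a successful login."""
    return urljoin(login_url, "/secure")


def login(driver, login_url: str, username: str, password: str, timeout: float = 10.0) -> None:
    """Fill in the login form and wait until the secure page has loaded."""
    # Imported lazily so the module can be loaded without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(login_url)
    driver.find_element(By.ID, "username").send_keys(username)
    driver.find_element(By.ID, "password").send_keys(password)
    driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()
    # Raises TimeoutException if the redirect does not happen in time.
    WebDriverWait(driver, timeout).until(EC.url_to_be(secure_url(login_url)))


# Usage (requires Chrome and Selenium):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   options.add_argument("--headless=new")
#   driver = webdriver.Chrome(options=options)
#   try:
#       login(driver, LOGIN_URL, "tomsmith", "SuperSecretPassword!")
#   finally:
#       driver.quit()
```

Waiting on the URL change rather than sleeping for a fixed time makes the step fail fast with a clear timeout when the credentials are rejected.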
Use the batch file for scheduled execution:
- Open Windows Task Scheduler
- Create a new basic task
- Set the trigger you want, for example daily at 09:00
- Choose Start a program
- Select run_pipeline.bat from the project root
- Save the task and test it manually
Recommended setup:
- run whether the user is logged in or not
- use the same Windows account that has access to Chrome and the project files
- keep .venv installed and available in the project root
- review logs/batch_output.log and the timestamped logs in logs/ after each test run
Features:
- Selenium-based login automation
- secure page validation
- raw data extraction
- pandas data transformation
- Excel report generation
- email payload preparation
- CLI arguments for headless and demo email modes
- timestamped execution logs
- Windows batch runner
- pytest coverage for core modules
Possible improvements:
- real data export from downloadable reports
- SMTP integration for actual email sending
- richer Excel styling and KPI sections
- retry logic for unstable UI elements
- screenshots for documentation
- Windows Task Scheduler integration
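Retry logic for unstable UI elements could take the form of a small decorator. The sketch below is one possible design and is not part of the project; the `retry` name and its parameters are assumptions. In a Selenium context, `exceptions` would typically be narrowed to errors such as `StaleElementReferenceException`.

```python
import functools
import time


def retry(attempts: int = 3, delay: float = 0.5, exceptions=(Exception,)):
    """Re-run the wrapped function up to `attempts` times on the given exceptions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)

        return wrapper

    return decorator
```

Restricting `exceptions` to known transient errors avoids hiding real bugs behind repeated attempts.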
