ScrapeGraph-AI currently uses Chrome/Playwright for fetching pages. For many use cases (especially content extraction and data scraping from server-rendered pages), the full Chrome rendering pipeline is overkill.
Plasmate is an open-source browser engine (Rust, Apache 2.0) that parses HTML and outputs structured semantic content. No rendering, no GPU, no 300MB Chrome process.
For scraping workflows:
- 30MB memory instead of 300MB per instance
- 16.6x fewer tokens per page (saves LLM costs in AI-powered extraction)
- Works as a single binary, installed with:
    pip install plasmate
Plasmate could work as an alternative Fetcher for static pages, with Chrome as the fallback for SPAs.
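To make the fallback idea concrete, here is a minimal sketch of the routing logic only. It does not use any real Plasmate or ScrapeGraph-AI API: `light_fetch` and `browser_fetch` are hypothetical callables standing in for a Plasmate-backed fetcher and the existing Chrome/Playwright one, and the `min_chars` threshold is an assumed heuristic for spotting an empty SPA shell.

```python
from typing import Callable

# Hypothetical type for illustration: a fetcher takes a URL and returns extracted text.
FetchFn = Callable[[str], str]


def looks_client_rendered(content: str, min_chars: int = 200) -> bool:
    """Crude, assumed heuristic: very little extracted text suggests a JS-rendered shell."""
    return len(content.strip()) < min_chars


def fetch_with_fallback(
    url: str,
    light_fetch: FetchFn,
    browser_fetch: FetchFn,
    min_chars: int = 200,
) -> str:
    """Try the lightweight fetcher first; use the browser on error or thin content."""
    try:
        content = light_fetch(url)
    except Exception:
        # Any failure in the lightweight path falls back to the full browser.
        return browser_fetch(url)
    if looks_client_rendered(content, min_chars):
        return browser_fetch(url)
    return content
```

With this shape, static pages never start a browser, and SPAs cost one extra lightweight request before Chrome takes over. A real integration would likely want a smarter SPA check than a character count.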
Not a sales pitch - it's free and open source. Just think it could be useful for the project.
https://github.com/plasmate-labs/plasmate