From 79bffe6eef2fd49829108cb5289df7eab6b6061f Mon Sep 17 00:00:00 2001
From: Vlada Dusek
Date: Wed, 10 Jun 2026 15:19:51 +0200
Subject: [PATCH 1/8] docs: polish and modernize README

---
 README.md | 177 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 125 insertions(+), 52 deletions(-)

diff --git a/README.md b/README.md
index 24cef390..227f5e5c 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,113 @@
-

Apify SDK for Python

+

Apify SDK for Python

- PyPI package version
- PyPI package downloads
- Codecov report
- PyPI Python version
- Chat on Discord
+ The official Python SDK for building Apify Actors.

-The Apify SDK for Python is the official library to create [Apify Actors](https://docs.apify.com/platform/actors) -in Python. It provides useful features like Actor lifecycle management, local storage emulation, and Actor -event handling. +

+ PyPI version
+ PyPI downloads
+ Python versions
+ Coverage
+ License
+ Chat on Discord
+

+ +`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It builds on the [Crawlee](https://crawlee.dev/python) web scraping framework and bundles the [Apify API client](https://docs.apify.com/api/client/python). + +> If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK. + +## Table of contents -If you just need to access the [Apify API](https://docs.apify.com/api/v2) from your Python applications, -check out the [Apify Client for Python](https://docs.apify.com/api/client/python) instead. +- [Installation](#installation) +- [Quick start](#quick-start) +- [Features](#features) +- [Usage examples](#usage-examples) +- [What are Actors?](#what-are-actors) +- [Documentation](#documentation) +- [Related projects](#related-projects) +- [Support and community](#support-and-community) +- [Contributing](#contributing) +- [License](#license) ## Installation -The Apify SDK for Python is available on PyPI as the `apify` package. -For default installation, using Pip, run the following: +The Apify SDK for Python requires **Python 3.11 or higher**. 
It is published on [PyPI](https://pypi.org/project/apify/) as the `apify` package and can be installed with [pip](https://pip.pypa.io/): ```bash pip install apify ``` -For users interested in integrating Apify with Scrapy, we provide a package extra called `scrapy`. -To install Apify with the `scrapy` extra, use the following command: +or with [uv](https://docs.astral.sh/uv/): ```bash -pip install apify[scrapy] +uv add apify ``` -## Documentation +To use the Scrapy integration, install the `scrapy` extra: + +```bash +pip install 'apify[scrapy]' +``` -For usage instructions, check the documentation on [Apify Docs](https://docs.apify.com/sdk/python/). +## Quick start + +An Actor is a Python program that runs inside the `async with Actor:` context. The context initializes the Actor when it starts and tears it down when it finishes. Here's a minimal Actor that reads its input and stores a result: + +```python +from apify import Actor + + +async def main() -> None: + async with Actor: + actor_input = await Actor.get_input() + Actor.log.info('Actor input: %s', actor_input) + await Actor.set_value('OUTPUT', 'Hello, world!') +``` -## Examples +The quickest way to scaffold a full Actor project, with the `.actor` configuration, input schema, and Dockerfile already in place, is the [Apify CLI](https://docs.apify.com/cli): -Below are few examples demonstrating how to use the Apify SDK with some web scraping-related libraries. +1. Install the CLI: -### Apify SDK with HTTPX and BeautifulSoup + ```bash + npm install -g apify-cli + ``` -This example illustrates how to integrate the Apify SDK with [HTTPX](https://www.python-httpx.org/) and [BeautifulSoup](https://pypi.org/project/beautifulsoup4/) to scrape data from web pages. +2. Create a new Actor from the Python "getting started" template: + + ```bash + apify create my-actor --template python-start + ``` + +3. 
Run it locally: + + ```bash + cd my-actor + apify run + ``` + +To create, run, and deploy your first Actor step by step, see the [Quick start guide](https://docs.apify.com/sdk/python/docs/quick-start). + +## Features + +- **Actor lifecycle management** — `async with Actor:` initializes the Actor, then handles exit, failure, status messages, and reboots ([Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle)). +- **Typed Actor input** — read input validated against your input schema with `Actor.get_input()` ([Actor input](https://docs.apify.com/sdk/python/docs/concepts/actor-input)). +- **Storage access** — read and write datasets, key-value stores, and request queues, both locally and on the platform ([Working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages)). +- **Platform events** — react to system info, migration, and abort events streamed over a WebSocket ([Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events)). +- **Proxy management** — route requests through Apify Proxy with residential or datacenter groups, country targeting, and rotation ([Proxy management](https://docs.apify.com/sdk/python/docs/concepts/proxy-management)). +- **Actor orchestration** — start, call, abort, and metamorph other Actors and tasks, and register webhooks for run events ([Interacting with other Actors](https://docs.apify.com/sdk/python/docs/concepts/interacting-with-other-actors), [Webhooks](https://docs.apify.com/sdk/python/docs/concepts/webhooks)). +- **Pay-per-event monetization** — charge for the events your Actor emits ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)). +- **Direct Apify API access** — reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured [`ApifyClient`](https://docs.apify.com/api/client/python) ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)). 
+- **Built on Crawlee** — combine the SDK with [Crawlee](https://crawlee.dev/python) crawlers, or any HTTP or browser library you prefer ([Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). +- **Scrapy integration** — run existing Scrapy spiders as Apify Actors through the `apify[scrapy]` extra ([Scrapy guide](https://docs.apify.com/sdk/python/docs/guides/scrapy)). + +## Usage examples + +The SDK works with whatever scraping stack you prefer. The examples below show two common setups. For more, see the [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx). + +### HTTPX with BeautifulSoup + +Scrape pages with [HTTPX](https://www.python-httpx.org/) and [BeautifulSoup](https://pypi.org/project/beautifulsoup4/), using the Actor's request queue to track URLs: ```python from bs4 import BeautifulSoup @@ -88,9 +154,9 @@ async def main() -> None: await Actor.push_data(data) ``` -### Apify SDK with PlaywrightCrawler from Crawlee +### PlaywrightCrawler from Crawlee -This example demonstrates how to use the Apify SDK alongside `PlaywrightCrawler` from [Crawlee](https://crawlee.dev/python) to perform web scraping. +Scrape pages with [Crawlee](https://crawlee.dev/python)'s `PlaywrightCrawler`, which handles queueing, concurrency, and browser automation for you: ```python from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext @@ -143,40 +209,47 @@ async def main() -> None: ## What are Actors? -Actors are serverless cloud programs that can do almost anything a human can do in a web browser. -They can do anything from small tasks such as filling in forms or unsubscribing from online services, -all the way up to scraping and processing vast numbers of web pages. +Actors are serverless cloud programs that can do almost anything a human can do in a web browser. They range from small tasks, such as filling in forms or unsubscribing from online services, all the way up to scraping and processing vast numbers of web pages. 
+ +They run either locally or on the [Apify platform](https://docs.apify.com/platform/), where you can run them at scale, monitor them, schedule them, or publish and monetize them. If you're new to Apify, learn [what Apify is](https://docs.apify.com/platform/about) in the platform documentation. + +## Documentation + +The full documentation lives at **[docs.apify.com/sdk/python](https://docs.apify.com/sdk/python)**. + +| Section | What you'll find | +|---|---| +| [Overview](https://docs.apify.com/sdk/python/docs/overview) | What the SDK is, what Actors are, and how the pieces fit together. | +| [Quick start](https://docs.apify.com/sdk/python/docs/quick-start) | Create, run, and deploy your first Python Actor. | +| [Concepts](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle) | Actor lifecycle, input, storages, events, proxy management, interacting with other Actors, webhooks, accessing the Apify API, logging, configuration, and pay-per-event. | +| [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx) | Integrations with BeautifulSoup, Parsel, Playwright, Selenium, Crawlee, Scrapy, Crawl4AI, and Browser Use, plus running a web server and using uv. | +| [Upgrading](https://docs.apify.com/sdk/python/docs/upgrading/upgrading-to-v4) | Migrating between major versions. | +| [API reference](https://docs.apify.com/sdk/python/reference) | Generated reference for every class and method. | +| [Changelog](https://docs.apify.com/sdk/python/docs/changelog) | Release history and breaking changes. | -They can be run either locally, or on the [Apify platform](https://docs.apify.com/platform/), -where you can run them at scale, monitor them, schedule them, or publish and monetize them. +## Related projects -If you're new to Apify, learn [what is Apify](https://docs.apify.com/platform/about) -in the Apify platform documentation. 
+- **[Apify API client for Python](https://docs.apify.com/api/client/python)** — talk to the Apify API directly from Python (bundled with this SDK). +- **[Crawlee for Python](https://crawlee.dev/python)** — the web scraping and browser automation framework the SDK builds on. +- **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — the equivalent SDK for Node.js. +- **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — the equivalent API client for Node.js. +- **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — the original Node.js implementation of Crawlee. +- **[Apify CLI](https://docs.apify.com/cli)** — command-line tool for creating, running, and deploying Actors locally and on the platform. -## Creating Actors +## Support and community -To create and run Actors through Apify Console, -see the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template). +- **Discord** — chat with the team and other users on the [Apify Discord server](https://discord.gg/jyEM2PRvMU). +- **GitHub issues** — report a bug or request a feature in the [issue tracker](https://github.com/apify/apify-sdk-python/issues). -To create and run Python Actors locally, check the documentation for -[how to create and run Python Actors locally](https://docs.apify.com/sdk/python/docs/quick-start). +## Contributing -## Guides +Bug reports, fixes, and improvements are welcome! See [CONTRIBUTING.md](./CONTRIBUTING.md) for the development setup, coding standards, testing, and release process. 
The project uses [uv](https://docs.astral.sh/uv/) for project management and [Poe the Poet](https://poethepoet.natn.io/) as a task runner; the typical loop is: -To see how you can use the Apify SDK with other popular libraries used for web scraping, -check out our guides for using -[BeautifulSoup with HTTPX](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx), -[Parsel with Impit](https://docs.apify.com/sdk/python/docs/guides/parsel-impit), -[Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright), -[Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), -[Crawlee](https://docs.apify.com/sdk/python/docs/guides/crawlee), -or [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy). +```bash +uv run poe install-dev # install dev dependencies and git hooks +uv run poe check-code # lint, type-check, and unit tests +``` -## Usage concepts +## License -To learn more about the features of the Apify SDK and how to use them, -check out the Usage Concepts section in the sidebar, -particularly the guides for the [Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle), -[working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages), -[handling Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events) -or [how to use proxies](https://docs.apify.com/sdk/python/docs/concepts/proxy-management). +Released under the [Apache License 2.0](./LICENSE). From 58dc2836c7ad0985afa14101563422db72175f12 Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Wed, 10 Jun 2026 15:41:50 +0200 Subject: [PATCH 2/8] Improvement --- README.md | 93 +++++++++++++++++-------------------------------------- 1 file changed, 28 insertions(+), 65 deletions(-) diff --git a/README.md b/README.md index 227f5e5c..5975401d 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ Chat on Discord

-`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It builds on the [Crawlee](https://crawlee.dev/python) web scraping framework and bundles the [Apify API client](https://docs.apify.com/api/client/python). +`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It's built on the [Apify API client](https://docs.apify.com/api/client/python). > If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK. @@ -90,16 +90,16 @@ To create, run, and deploy your first Actor step by step, see the [Quick start g ## Features -- **Actor lifecycle management** — `async with Actor:` initializes the Actor, then handles exit, failure, status messages, and reboots ([Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle)). 
-- **Typed Actor input** — read input validated against your input schema with `Actor.get_input()` ([Actor input](https://docs.apify.com/sdk/python/docs/concepts/actor-input)). -- **Storage access** — read and write datasets, key-value stores, and request queues, both locally and on the platform ([Working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages)). -- **Platform events** — react to system info, migration, and abort events streamed over a WebSocket ([Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events)). -- **Proxy management** — route requests through Apify Proxy with residential or datacenter groups, country targeting, and rotation ([Proxy management](https://docs.apify.com/sdk/python/docs/concepts/proxy-management)). -- **Actor orchestration** — start, call, abort, and metamorph other Actors and tasks, and register webhooks for run events ([Interacting with other Actors](https://docs.apify.com/sdk/python/docs/concepts/interacting-with-other-actors), [Webhooks](https://docs.apify.com/sdk/python/docs/concepts/webhooks)). -- **Pay-per-event monetization** — charge for the events your Actor emits ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)). -- **Direct Apify API access** — reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured [`ApifyClient`](https://docs.apify.com/api/client/python) ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)). -- **Built on Crawlee** — combine the SDK with [Crawlee](https://crawlee.dev/python) crawlers, or any HTTP or browser library you prefer ([Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). -- **Scrapy integration** — run existing Scrapy spiders as Apify Actors through the `apify[scrapy]` extra ([Scrapy guide](https://docs.apify.com/sdk/python/docs/guides/scrapy)). 
+- Run the full Actor lifecycle inside `async with Actor:`, covering init, exit, failures, status messages, and reboots ([Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle)). +- Read Actor input validated against your input schema with `Actor.get_input()` ([Actor input](https://docs.apify.com/sdk/python/docs/concepts/actor-input)). +- Read and write datasets, key-value stores, and request queues, locally or on the platform ([Working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages)). +- React to platform events such as system info, migration, and abort ([Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events)). +- Route requests through Apify Proxy with group selection, country targeting, and rotation ([Proxy management](https://docs.apify.com/sdk/python/docs/concepts/proxy-management)). +- Start, call, abort, and metamorph other Actors and tasks, and attach webhooks to run events ([Interacting with other Actors](https://docs.apify.com/sdk/python/docs/concepts/interacting-with-other-actors), [Webhooks](https://docs.apify.com/sdk/python/docs/concepts/webhooks)). +- Monetize your Actor with pay-per-event charging ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)). +- Reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured `ApifyClient` ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)). +- Fully compatible with [Crawlee](https://crawlee.dev/python), so Apify is a natural place to deploy and scale your Crawlee projects ([Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). +- Run existing Scrapy spiders as Actors through the `apify[scrapy]` extra ([Scrapy guide](https://docs.apify.com/sdk/python/docs/guides/scrapy)). 
## Usage examples @@ -118,45 +118,31 @@ from apify import Actor async def main() -> None: async with Actor: - # Retrieve the Actor input, and use default values if not provided. actor_input = await Actor.get_input() or {} start_urls = actor_input.get('start_urls', [{'url': 'https://apify.com'}]) - # Open the default request queue for handling URLs to be processed. + # Enqueue the start URLs into the default request queue. request_queue = await Actor.open_request_queue() - - # Enqueue the start URLs. for start_url in start_urls: - url = start_url.get('url') - await request_queue.add_request(url) + await request_queue.add_request(start_url['url']) - # Process the URLs from the request queue. + # Process the queue until it's empty. while request := await request_queue.fetch_next_request(): Actor.log.info(f'Scraping {request.url} ...') - - # Fetch the HTTP response from the specified URL using HTTPX. async with AsyncClient() as client: response = await client.get(request.url) - - # Parse the HTML content using Beautiful Soup. soup = BeautifulSoup(response.content, 'html.parser') - # Extract the desired data. - data = { + # Push the extracted data to the default dataset. + await Actor.push_data({ 'url': request.url, - 'title': soup.title.string, - 'h1s': [h1.text for h1 in soup.find_all('h1')], - 'h2s': [h2.text for h2 in soup.find_all('h2')], - 'h3s': [h3.text for h3 in soup.find_all('h3')], - } - - # Store the extracted data to the default dataset. 
- await Actor.push_data(data) + 'title': soup.title.string if soup.title else None, + }) ``` ### PlaywrightCrawler from Crawlee -Scrape pages with [Crawlee](https://crawlee.dev/python)'s `PlaywrightCrawler`, which handles queueing, concurrency, and browser automation for you: +Scrape pages with [Crawlee](https://crawlee.dev/python)'s `PlaywrightCrawler`, which handles queueing, concurrency, and the browser for you: ```python from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext @@ -166,44 +152,21 @@ from apify import Actor async def main() -> None: async with Actor: - # Retrieve the Actor input, and use default values if not provided. actor_input = await Actor.get_input() or {} - start_urls = [url.get('url') for url in actor_input.get('start_urls', [{'url': 'https://apify.com'}])] - - # Exit if no start URLs are provided. - if not start_urls: - Actor.log.info('No start URLs specified in Actor input, exiting...') - await Actor.exit() + start_urls = [url['url'] for url in actor_input.get('start_urls', [{'url': 'https://apify.com'}])] - # Create a crawler. - crawler = PlaywrightCrawler( - # Limit the crawl to max requests. Remove or increase it for crawling all links. - max_requests_per_crawl=50, - headless=True, - ) + crawler = PlaywrightCrawler(max_requests_per_crawl=50, headless=True) - # Define a request handler, which will be called for every request. @crawler.router.default_handler - async def request_handler(context: PlaywrightCrawlingContext) -> None: - url = context.request.url - Actor.log.info(f'Scraping {url}...') - - # Extract the desired data. 
- data = { + async def handler(context: PlaywrightCrawlingContext) -> None: + Actor.log.info(f'Scraping {context.request.url} ...') + await context.push_data({ 'url': context.request.url, 'title': await context.page.title(), - 'h1s': [await h1.text_content() for h1 in await context.page.locator('h1').all()], - 'h2s': [await h2.text_content() for h2 in await context.page.locator('h2').all()], - 'h3s': [await h3.text_content() for h3 in await context.page.locator('h3').all()], - } - - # Store the extracted data to the default dataset. - await context.push_data(data) - - # Enqueue additional links found on the current page. + }) + # Follow links found on the page. await context.enqueue_links() - # Run the crawler with the starting URLs. await crawler.run(start_urls) ``` @@ -215,7 +178,7 @@ They run either locally or on the [Apify platform](https://docs.apify.com/platfo ## Documentation -The full documentation lives at **[docs.apify.com/sdk/python](https://docs.apify.com/sdk/python)**. +The full SDK documentation lives at **[docs.apify.com/sdk/python](https://docs.apify.com/sdk/python)**. For the Apify platform itself, see the [Apify documentation](https://docs.apify.com/). | Section | What you'll find | |---|---| @@ -230,7 +193,7 @@ The full documentation lives at **[docs.apify.com/sdk/python](https://docs.apify ## Related projects - **[Apify API client for Python](https://docs.apify.com/api/client/python)** — talk to the Apify API directly from Python (bundled with this SDK). -- **[Crawlee for Python](https://crawlee.dev/python)** — the web scraping and browser automation framework the SDK builds on. +- **[Crawlee for Python](https://crawlee.dev/python)** — web scraping and browser automation framework; Apify is a natural place to host and scale Crawlee projects. - **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — the equivalent SDK for Node.js. 
- **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — the equivalent API client for Node.js. - **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — the original Node.js implementation of Crawlee. From d13fb7369fa6ab600c964113f30f1403c1dfdb9b Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Thu, 11 Jun 2026 12:03:04 +0200 Subject: [PATCH 3/8] docs: expand README features with libraries, AI agents, and MCP --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 5975401d..f818726b 100644 --- a/README.md +++ b/README.md @@ -99,7 +99,11 @@ To create, run, and deploy your first Actor step by step, see the [Quick start g - Monetize your Actor with pay-per-event charging ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)). - Reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured `ApifyClient` ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)). - Fully compatible with [Crawlee](https://crawlee.dev/python), so Apify is a natural place to deploy and scale your Crawlee projects ([Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). -- Run existing Scrapy spiders as Actors through the `apify[scrapy]` extra ([Scrapy guide](https://docs.apify.com/sdk/python/docs/guides/scrapy)). +- Works with popular Python web scraping libraries such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://github.com/D4Vinci/Scrapling), and [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). +- Automate browsers with tools such as [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright), [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), and [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use). 
+- Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor, and manage your project with [uv](https://docs.apify.com/sdk/python/docs/guides/uv). +- Host AI agents with ready-made templates for [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents). +- Deploy Python [MCP servers](https://apify.com/templates/python-mcp-server) as Actors. ## Usage examples From e60a5b4926755a396a89364dad91cace1c7b2a8e Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Thu, 11 Jun 2026 12:15:56 +0200 Subject: [PATCH 4/8] docs: move README ecosystem coverage into a What you can build section --- README.md | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index f818726b..c7eabc7f 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ Chat on Discord

-`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It's built on the [Apify API client](https://docs.apify.com/api/client/python). +`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It's built on top of the [Apify API client](https://docs.apify.com/api/client/python). > If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK. 
@@ -22,6 +22,7 @@ - [Installation](#installation) - [Quick start](#quick-start) - [Features](#features) +- [What you can build](#what-you-can-build) - [Usage examples](#usage-examples) - [What are Actors?](#what-are-actors) - [Documentation](#documentation) @@ -98,16 +99,26 @@ To create, run, and deploy your first Actor step by step, see the [Quick start g - Start, call, abort, and metamorph other Actors and tasks, and attach webhooks to run events ([Interacting with other Actors](https://docs.apify.com/sdk/python/docs/concepts/interacting-with-other-actors), [Webhooks](https://docs.apify.com/sdk/python/docs/concepts/webhooks)). - Monetize your Actor with pay-per-event charging ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)). - Reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured `ApifyClient` ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)). -- Fully compatible with [Crawlee](https://crawlee.dev/python), so Apify is a natural place to deploy and scale your Crawlee projects ([Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). -- Works with popular Python web scraping libraries such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://github.com/D4Vinci/Scrapling), and [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). -- Automate browsers with tools such as [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright), [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), and [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use). -- Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor, and manage your project with [uv](https://docs.apify.com/sdk/python/docs/guides/uv). 
-- Host AI agents with ready-made templates for [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents). -- Deploy Python [MCP servers](https://apify.com/templates/python-mcp-server) as Actors. + +## What you can build + +An Actor is just a Python program, so almost any Python project can become one. The SDK doesn't lock you into a particular framework. Bring the libraries you already use, and let Apify handle running, scaling, scheduling, and monetization. + +**Web scraping and crawling.** The SDK is fully compatible with [Crawlee](https://crawlee.dev/python), which makes Apify a natural place to deploy and scale your Crawlee projects (see the [Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). It also works with other popular scraping libraries, such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://github.com/D4Vinci/Scrapling), and [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). + +**Browser automation.** Drive a real browser with [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright) or [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), or with higher-level tools such as [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use). + +**AI agents.** Host AI agents built with your framework of choice. Ready-made Actor templates cover [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents). 
+ +**MCP servers.** Deploy a Python [MCP server](https://apify.com/templates/python-mcp-server) as an Actor and make its tools available to any MCP client. + +**Web servers and APIs.** Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor to serve HTTP requests, for example to expose your scraper as a live API. + +Whatever you build, you can manage the project with [uv](https://docs.apify.com/sdk/python/docs/guides/uv). To start from a working example, browse the ready-made [Python Actor templates](https://apify.com/templates/categories/python). ## Usage examples -The SDK works with whatever scraping stack you prefer. The examples below show two common setups. For more, see the [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx). +The examples below show two common setups. For more, see the [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx). ### HTTPX with BeautifulSoup From 743480ee4a5556fef4aa905480396520aa00ce5a Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Thu, 11 Jun 2026 12:38:24 +0200 Subject: [PATCH 5/8] docs: refine README structure and wording, add CI badge --- README.md | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index c7eabc7f..08997d9e 100644 --- a/README.md +++ b/README.md @@ -8,12 +8,13 @@ PyPI version PyPI downloads Python versions + Build status Coverage License Chat on Discord

-`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK manages the Actor lifecycle, gives you access to [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), handles platform events, configures [Apify Proxy](https://docs.apify.com/platform/proxy), and supports pay-per-event monetization. It's built on top of the [Apify API client](https://docs.apify.com/api/client/python). +`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK handles the Actor lifecycle, [storage](https://docs.apify.com/platform/storage) access, platform events, [Apify Proxy](https://docs.apify.com/platform/proxy), and pay-per-event charging. > If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK. @@ -21,10 +22,10 @@ - [Installation](#installation) - [Quick start](#quick-start) +- [What are Actors?](#what-are-actors) - [Features](#features) - [What you can build](#what-you-can-build) - [Usage examples](#usage-examples) -- [What are Actors?](#what-are-actors) - [Documentation](#documentation) - [Related projects](#related-projects) - [Support and community](#support-and-community) @@ -89,6 +90,12 @@ The quickest way to scaffold a full Actor project, with the `.actor` configurati To create, run, and deploy your first Actor step by step, see the [Quick start guide](https://docs.apify.com/sdk/python/docs/quick-start). 
+## What are Actors? + +Actors are serverless cloud programs that can do almost anything a human can do in a web browser. They range from small tasks, such as filling in forms or unsubscribing from online services, all the way up to scraping and processing vast numbers of web pages. + +They run either locally or on the [Apify platform](https://docs.apify.com/platform/), where you can run them at scale, monitor them, schedule them, or publish and monetize them. If you're new to Apify, learn [what Apify is](https://docs.apify.com/platform/about) in the platform documentation. + ## Features - Run the full Actor lifecycle inside `async with Actor:`, covering init, exit, failures, status messages, and reboots ([Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle)). @@ -102,7 +109,7 @@ To create, run, and deploy your first Actor step by step, see the [Quick start g ## What you can build -An Actor is just a Python program, so almost any Python project can become one. The SDK doesn't lock you into a particular framework. Bring the libraries you already use, and let Apify handle running, scaling, scheduling, and monetization. +Almost any Python project can become an Actor. The SDK doesn't lock you into a particular framework, so bring the libraries you already use and let Apify run your project in the cloud. **Web scraping and crawling.** The SDK is fully compatible with [Crawlee](https://crawlee.dev/python), which makes Apify a natural place to deploy and scale your Crawlee projects (see the [Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). It also works with other popular scraping libraries, such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://github.com/D4Vinci/Scrapling), and [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). 
@@ -118,7 +125,7 @@ Whatever you build, you can manage the project with [uv](https://docs.apify.com/ ## Usage examples -The examples below show two common setups. For more, see the [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx). +The examples below show two common setups, but the same `async with Actor:` pattern works with any stack. For more, see the [guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx). ### HTTPX with BeautifulSoup @@ -155,7 +162,7 @@ async def main() -> None: }) ``` -### PlaywrightCrawler from Crawlee +### Crawlee with Playwright Scrape pages with [Crawlee](https://crawlee.dev/python)'s `PlaywrightCrawler`, which handles queueing, concurrency, and the browser for you: @@ -185,12 +192,6 @@ async def main() -> None: await crawler.run(start_urls) ``` -## What are Actors? - -Actors are serverless cloud programs that can do almost anything a human can do in a web browser. They range from small tasks, such as filling in forms or unsubscribing from online services, all the way up to scraping and processing vast numbers of web pages. - -They run either locally or on the [Apify platform](https://docs.apify.com/platform/), where you can run them at scale, monitor them, schedule them, or publish and monetize them. If you're new to Apify, learn [what Apify is](https://docs.apify.com/platform/about) in the platform documentation. - ## Documentation The full SDK documentation lives at **[docs.apify.com/sdk/python](https://docs.apify.com/sdk/python)**. For the Apify platform itself, see the [Apify documentation](https://docs.apify.com/). @@ -208,7 +209,7 @@ The full SDK documentation lives at **[docs.apify.com/sdk/python](https://docs.a ## Related projects - **[Apify API client for Python](https://docs.apify.com/api/client/python)** — talk to the Apify API directly from Python (bundled with this SDK). 
-- **[Crawlee for Python](https://crawlee.dev/python)** — web scraping and browser automation framework; Apify is a natural place to host and scale Crawlee projects. +- **[Crawlee for Python](https://crawlee.dev/python)** — web scraping and browser automation framework; fully compatible with this SDK. - **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — the equivalent SDK for Node.js. - **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — the equivalent API client for Node.js. - **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — the original Node.js implementation of Crawlee. From 63062ec3c1a26d790750acf230dcd276c2cdcdc3 Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Thu, 11 Jun 2026 13:15:03 +0200 Subject: [PATCH 6/8] update --- README.md | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index 08997d9e..23b3f109 100644 --- a/README.md +++ b/README.md @@ -109,19 +109,15 @@ They run either locally or on the [Apify platform](https://docs.apify.com/platfo ## What you can build -Almost any Python project can become an Actor. The SDK doesn't lock you into a particular framework, so bring the libraries you already use and let Apify run your project in the cloud. +Almost any Python project can become an Actor, including projects for: -**Web scraping and crawling.** The SDK is fully compatible with [Crawlee](https://crawlee.dev/python), which makes Apify a natural place to deploy and scale your Crawlee projects (see the [Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). It also works with other popular scraping libraries, such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://github.com/D4Vinci/Scrapling), and [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). 
+- **Web scraping and crawling** — The SDK is fully compatible with [Crawlee](https://crawlee.dev/python), which makes Apify a natural place to deploy and scale your crawlers (see the [Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). It also works with other popular scraping libraries, such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://docs.apify.com/sdk/python/docs/guides/scrapling), or [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai). +- **Browser automation** — Drive a real browser with [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright) or [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), or with higher-level tools such as [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use). +- **Web servers and APIs** — Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor to serve HTTP requests, for example to expose your scraper as a live API. +- **AI agents** — Host agents built with your framework of choice. Ready-made Actor templates cover [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents). +- **MCP servers** — Deploy a Python MCP server as an Actor and make its tools available to any MCP client. See the [MCP server](https://apify.com/templates/python-mcp-empty) and [MCP proxy](https://apify.com/templates/python-mcp-proxy) templates. -**Browser automation.** Drive a real browser with [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright) or [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), or with higher-level tools such as [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use).
- -**AI agents.** Host AI agents built with your framework of choice. Ready-made Actor templates cover [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents). - -**MCP servers.** Deploy a Python [MCP server](https://apify.com/templates/python-mcp-server) as an Actor and make its tools available to any MCP client. - -**Web servers and APIs.** Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor to serve HTTP requests, for example to expose your scraper as a live API. - -Whatever you build, you can manage the project with [uv](https://docs.apify.com/sdk/python/docs/guides/uv). To start from a working example, browse the ready-made [Python Actor templates](https://apify.com/templates/categories/python). +Whatever you build, the Apify SDK doesn't lock you into a particular framework. Bring the libraries you already use, and let Apify run your project in the cloud. ## Usage examples From 48d72734b1ccc77c5b7a8ce07ccbee4298b03538 Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Thu, 11 Jun 2026 13:25:41 +0200 Subject: [PATCH 7/8] docs: use compact static python version badge --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 23b3f109..560124f6 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@

PyPI version PyPI downloads - Python versions + Python versions Build status Coverage License From cc1ff0569661385ddd1912354ded9abd7e62f4a6 Mon Sep 17 00:00:00 2001 From: Vlada Dusek Date: Fri, 12 Jun 2026 13:46:20 +0200 Subject: [PATCH 8/8] docs: remove duplicated Actor description from README intro --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 560124f6..54d32498 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ Chat on Discord

-`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK handles the Actor lifecycle, [storage](https://docs.apify.com/platform/storage) access, platform events, [Apify Proxy](https://docs.apify.com/platform/proxy), and pay-per-event charging. +`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. It handles the Actor lifecycle, [storage](https://docs.apify.com/platform/storage) access, platform events, [Apify Proxy](https://docs.apify.com/platform/proxy), pay-per-event charging, and more. > If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK.