Collect Telegram chat messages and channel comments along with author profile details in a structured, analysis-ready format. This project automates Telegram message parsing so teams can stop manual copy/paste and start working with clean datasets for research, moderation, or reporting.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
This project parses messages from a Telegram channel or chat and enriches them with author account information, producing downloadable files for analysis. It solves the problem of manually collecting large volumes of Telegram messages and participant details, especially when you need consistent fields across many runs. It’s built for analysts, community managers, researchers, and automation teams who want Telegram messages exported into structured outputs for workflows and dashboards.
- Runs an asynchronous parsing task and publishes progress updates via webhook.
- Supports resume from the last checkpoint to extend collection across multiple runs.
- Caps the duration of each run, while total collection can keep growing across as many runs as you need.
- Produces both a reusable text list and a rich Excel export for downstream filtering.
- Keeps results organized by most recent activity for faster review and prioritization.
| Feature | Description |
|---|---|
| Channel / chat message parsing | Collect messages from a target Telegram channel or chat and include message metadata. |
| Author account enrichment | Attach author profile fields (name, username, bio, language, premium status, online status, etc.). |
| Channel comments coverage | Parse comments alongside messages when available to capture conversation context. |
| Resume from checkpoint | Continue data collection from the last known point to avoid reprocessing. |
| Asynchronous task execution | Offloads parsing to a background task and reports state changes through webhook events. |
| Webhook progress updates | Sends structured progress payloads (counts, keys, timestamps, errors) to your webhook endpoint. |
| Duplicate handling control | Optionally keep duplicates in the output when your analysis requires repeated participants. |
| Excel + text outputs | Generates an Excel file with enriched records plus a text file of usernames for reuse. |
| Message range controls | Use min/max message IDs to narrow parsing to specific historical windows. |
| Error transparency | Returns explicit error states for restricted output, permissions, or parsing issues. |
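The duplicate-handling and recency-ordering rows above can be pictured with a short sketch. The helper name `order_authors`, the `keep_duplicates` flag, and the ISO 8601 format for `message_date` are assumptions made for illustration; the project's real option names and date format are not documented here.

```python
from datetime import datetime


def order_authors(rows, keep_duplicates=False):
    """Sort rows newest-first and, unless asked otherwise, keep one row per user_id.

    Hypothetical helper: each row is a dict with `user_id` and an
    ISO 8601 `message_date` string (an assumed format).
    """
    ordered = sorted(
        rows,
        key=lambda r: datetime.fromisoformat(r["message_date"]),
        reverse=True,
    )
    if keep_duplicates:
        return ordered

    seen = set()
    unique = []
    for row in ordered:
        if row["user_id"] not in seen:  # first hit is the most recent message
            seen.add(row["user_id"])
            unique.append(row)
    return unique


rows = [
    {"user_id": 1, "message_date": "2024-01-01T10:00:00"},
    {"user_id": 2, "message_date": "2024-01-03T09:30:00"},
    {"user_id": 1, "message_date": "2024-01-02T08:00:00"},
]
deduped = order_authors(rows)
```

With `keep_duplicates=True` all three rows come back in recency order, which suits analyses that count how often the same participant posts.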
| Field Name | Field Description |
|---|---|
| result | Task result status (e.g., success or empty). |
| error | Error message if the task fails or output is restricted. |
| task_id | Unique identifier for the parsing task. |
| channel_telegram_id | Telegram ID of the parsed channel/chat target. |
| execution_time | Duration of the parsing task execution. |
| state | Task status (started, in_progress, completed, error). |
| users_count | Current number of users parsed from the channel/chat. |
| users_with_usernames_count | Current number of users with usernames parsed. |
| user_ids_s3_key | Storage key/reference for a file containing collected user IDs. |
| xlsx_s3_key | Storage key/reference for the generated Excel export. |
| timestamp | RFC3339 timestamp of a webhook update event. |
| channel_name | Human-readable channel name discovered during parsing. |
| user_id | Unique numeric identifier of the message author. |
| first_name | Author first name (if available). |
| last_name | Author last name (if available). |
| username | Author username (if available). |
| has_profile_photo | Whether the author has a profile photo (Yes/No). |
| phone_number | Author phone number if available/accessible. |
| premium_status | Whether the author is premium (Yes/No). |
| online_status | Author online status representation at capture time. |
| language | Author language field if available. |
| bio | Author bio/about text for profiling and keyword filtering. |
| message_date | Date/time of the message captured. |
| message_text | Full message text content. |
{
  "result": "success",
  "error": null
}
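A consumer of this result document might check it as follows. This is a minimal sketch that relies only on the two fields shown in the sample (`result` and `error`); the function name `read_result` is hypothetical.

```python
import json


def read_result(raw):
    """Return (ok, detail) for a task result document.

    Assumes only the documented `result` and `error` fields are present.
    """
    doc = json.loads(raw)
    if doc.get("error"):
        # A non-empty error means the task failed or output was restricted.
        return False, doc["error"]
    return True, doc.get("result")


ok, detail = read_result('{"result": "success", "error": null}')
```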
telegram-get-chat-messages-channel-comments-scraper/
├── src/
│   ├── main.py
│   ├── cli.py
│   ├── core/
│   │   ├── task_manager.py
│   │   ├── checkpoint_store.py
│   │   ├── validators.py
│   │   └── constants.py
│   ├── telegram/
│   │   ├── client.py
│   │   ├── api_methods.py
│   │   ├── message_parser.py
│   │   ├── user_enricher.py
│   │   └── ranges.py
│   ├── webhooks/
│   │   ├── dispatcher.py
│   │   ├── payloads.py
│   │   └── retry.py
│   ├── exporters/
│   │   ├── excel_writer.py
│   │   ├── text_writer.py
│   │   └── schema.py
│   ├── utils/
│   │   ├── http.py
│   │   ├── time.py
│   │   └── logging.py
│   └── config/
│       ├── settings.py
│       └── settings.example.json
├── data/
│   ├── input.example.json
│   └── samples/
│       ├── sample_webhook_payload.json
│       └── sample_result.json
├── tests/
│   ├── test_validators.py
│   ├── test_ranges.py
│   ├── test_webhook_payloads.py
│   └── test_exporters.py
├── .env.example
├── .gitignore
├── pyproject.toml
├── requirements.txt
├── LICENSE
└── README.md
- Market researchers use it to export Telegram messages and author profiles, so they can identify trends and segment audiences by bio keywords and activity.
- Community managers use it to monitor channel comments and frequent posters, so they can prioritize moderation and engagement based on recent activity.
- Security and compliance teams use it to investigate suspicious behavior patterns, so they can validate identities and flag potential automation or fraud signals.
- Sales and partnerships teams use it to find high-intent prospects, so they can filter for premium users, active accounts, and relevant bios at scale.
- Data teams use it to feed Telegram message exports into analytics pipelines, so they can build dashboards and retention/engagement reporting.
How do I provide the channel username? You can provide a channel identifier in multiple formats: a plain username, an @username, or a full Telegram URL. The parser normalizes the input before launching the task.
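One way such normalization could work is sketched below. The function name `normalize_channel` and the character check are assumptions, not the project's actual validator; invite links and other special URL paths are not handled.

```python
import re
from urllib.parse import urlparse


def normalize_channel(value):
    """Reduce 'name', '@name', or a t.me URL to a bare username.

    Hypothetical helper: the accepted character set below is an assumption.
    """
    value = value.strip()
    if "://" in value or value.lower().startswith("t.me/"):
        if "://" not in value:
            value = "https://" + value  # let urlparse see a proper URL
        # The first path segment is the username; later segments
        # (such as a message ID) are dropped.
        value = urlparse(value).path.strip("/").split("/")[0]
    value = value.lstrip("@")
    if not re.fullmatch(r"[A-Za-z0-9_]+", value):
        raise ValueError(f"cannot extract a channel username from {value!r}")
    return value
```

For example, `normalize_channel("@example_channel")` and `normalize_channel("https://t.me/example_channel/42")` both reduce to `example_channel`.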
Why do I need a webhook URL? Parsing is asynchronous and can take several minutes, so progress and results are delivered to your webhook endpoint as the task moves through started → in_progress → completed (or error). This makes it easy to integrate with any system that can receive HTTP requests.
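On the receiving side, the state sequence can be tracked with a small lookup table. This is a sketch under assumptions: the class name `TaskTracker` is hypothetical, repeated `in_progress` updates are assumed to be normal, and a task is assumed to be able to jump from `started` straight to `completed` or `error`. It uses only the documented `task_id` and `state` fields.

```python
# Allowed next states for each current state (None = task not seen yet).
ALLOWED = {
    None: {"started"},
    "started": {"in_progress", "completed", "error"},
    "in_progress": {"in_progress", "completed", "error"},
}
TERMINAL = {"completed", "error"}


class TaskTracker:
    """Remember the last state per task and reject unexpected transitions."""

    def __init__(self):
        self.states = {}  # task_id -> last seen state

    def apply(self, payload):
        """Record a webhook update; return True once the task has finished."""
        task_id = payload["task_id"]
        new_state = payload["state"]
        current = self.states.get(task_id)
        if current in TERMINAL or new_state not in ALLOWED.get(current, set()):
            raise ValueError(f"unexpected transition {current!r} -> {new_state!r}")
        self.states[task_id] = new_state
        return new_state in TERMINAL
```

A real endpoint may receive updates out of order or more than once, so a production handler would probably log and skip such payloads rather than raise.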
How do I collect more than a single run’s worth of messages? Each run is time-limited, but the project supports resuming from the last checkpoint. Run it repeatedly to extend coverage without reprocessing earlier data, optionally using message ID ranges for tighter control.
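The resume logic can be sketched as a checkpoint file plus a range builder. Everything here is an assumption for illustration: the JSON layout, the `last_message_id` key, the `min_id`/`max_id` names, and the choice to collect in ascending message-ID order. The project's own checkpoint store may work differently.

```python
import json
from pathlib import Path


def load_checkpoint(path):
    """Return the last processed message ID, or 0 when no checkpoint exists."""
    p = Path(path)
    if not p.exists():
        return 0
    return json.loads(p.read_text())["last_message_id"]


def save_checkpoint(path, last_message_id):
    """Persist the highest message ID handled in this run."""
    Path(path).write_text(json.dumps({"last_message_id": last_message_id}))


def next_range(path, max_id=None):
    """Build the ID bounds for the next run, starting just after the checkpoint."""
    min_id = load_checkpoint(path) + 1
    if max_id is not None and max_id < min_id:
        raise ValueError("max_id is below the checkpoint; nothing left to collect")
    return {"min_id": min_id, "max_id": max_id}
```

Each run would call `next_range` before parsing and `save_checkpoint` after the last message it handled, so a later run picks up where the previous one stopped.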
What causes “restricted output” or missing profile fields? Restrictions can occur due to access limitations, permission requirements, or plan constraints in the upstream environment. Some fields (like phone number) may be unavailable depending on access rules and what the platform exposes for a given account.
Primary Metric: Processes a typical channel parsing task in ~5 minutes for standard volumes, returning a completed state with downloadable export keys.
Reliability Metric: Maintains stable progress delivery with webhook updates across task states, with consistent completion behavior when the target is accessible.
Efficiency Metric: Uses checkpoint-based continuation to avoid duplicate work, reducing repeated parsing overhead when collecting large histories over multiple runs.
Quality Metric: Produces enriched author profiles alongside message text, enabling high-confidence filtering by profile photo presence, online status signals, and bio keyword matching.
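Filtering of this kind could be done on the exported rows as in the sketch below. The function name `filter_authors` and its parameters are hypothetical; the field names and the `Yes`/`No` values follow the field table, and rows are assumed to be loaded as plain dicts.

```python
def filter_authors(rows, keywords=(), require_photo=False, require_premium=False):
    """Keep rows whose bio mentions any keyword and that pass the optional flags."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for row in rows:
        bio = (row.get("bio") or "").lower()  # bio may be missing or None
        if wanted and not any(k in bio for k in wanted):
            continue
        if require_photo and row.get("has_profile_photo") != "Yes":
            continue
        if require_premium and row.get("premium_status") != "Yes":
            continue
        matches.append(row)
    return matches


authors = [
    {"username": "a", "bio": "Crypto trader", "has_profile_photo": "Yes", "premium_status": "No"},
    {"username": "b", "bio": None, "has_profile_photo": "No", "premium_status": "Yes"},
    {"username": "c", "bio": "crypto news", "has_profile_photo": "No", "premium_status": "Yes"},
]
with_photo = filter_authors(authors, keywords=["crypto"], require_photo=True)
```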
