Collect detailed hotel listings from Check24 travel search results, including price, rating, location, and key amenities. It turns large hotel search pages into clean, structured data you can analyze for pricing intelligence, market research, and travel comparisons.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
This project extracts hotel information from Check24 travel search result pages and returns a structured dataset per query URL. It helps eliminate manual copy/paste and enables consistent hotel price monitoring across destinations and dates. It’s built for travel analysts, pricing teams, researchers, and developers who need reliable Check24 hotel search data at scale.
- Parses hotel list pages and captures core listing metadata (title, rating, price, location, badges).
- Supports multiple search URLs in one run with per-URL extraction limits.
- Includes retry controls for unstable pages and temporary network failures.
- Supports proxy configuration to improve stability and reduce blocking.
- Preserves the source query URL for traceability and easy debugging.
| Feature | Description |
|---|---|
| Multi-URL batch runs | Process multiple hotel search result URLs in a single execution. |
| Item caps per query | Limit extracted hotels per URL to control cost and runtime. |
| Retry handling | Automatically retries failed requests per URL for higher completion rates. |
| Proxy support | Optional proxy settings to reduce detection and improve stability. |
| Structured outputs | Produces consistent JSON objects ready for BI tools and pipelines. |
| Location-ready data | Includes latitude/longitude when available for mapping and clustering. |
| Traceable results | Stores from_url so every hotel record maps back to its query. |
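The retry behaviour listed in the table can be pictured with a small sketch. This is not the project's actual `retries.py`; the function name `fetch_with_retries`, its parameters, and the exponential backoff schedule are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[str], T],
    url: str,
    max_retries: int = 3,        # hypothetical setting name
    base_delay: float = 1.0,     # hypothetical setting name
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(url); on any exception wait and retry with exponential backoff."""
    attempt = 0
    while True:
        try:
            return fetch(url)
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
            attempt += 1
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting; a production version would likely also restrict which exception types are retried.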
| Field Name | Field Description |
|---|---|
| id | Unique hotel identifier used for tracking and deduplication. |
| title | Full hotel title text (often includes location/distance lines). |
| url | Deep link to the specific hotel offer or hotel details page. |
| rating | Numeric rating score for quality analysis. |
| rating_count | Number of ratings behind the score, useful for judging how much weight to give it. |
| rating_text | Human-readable rating label (e.g., “Fabelhaft”). |
| badges | Listing badges or labels (e.g., exclusives or special tags). |
| distances | Distance to center/landmarks as shown on the listing. |
| location | Destination / region text shown on the listing card. |
| lat | Latitude (when available) for geo analytics and mapping. |
| lng | Longitude (when available) for geo analytics and mapping. |
| details | Short summary line (e.g., duration, people count, package type). |
| price | Price value shown for the offer at the time of extraction. |
| image_urls | Array of hotel image URLs for media and previews. |
| from_url | The exact search URL that produced this hotel result. |
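For downstream code, the fields above could be mirrored in a typed record. The sketch below is one possible shape, not a class shipped with the project: `HotelRecord` is a hypothetical name, and the choice of which fields are required versus optional is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class HotelRecord:
    # Assumed to be always present on a scraped record.
    id: int
    title: str
    url: str
    from_url: str
    # Assumed optional: may be missing on some listing cards.
    price: Optional[float] = None
    rating: Optional[float] = None
    rating_count: Optional[int] = None
    rating_text: Optional[str] = None
    badges: list[str] = field(default_factory=list)
    distances: Optional[str] = None
    location: Optional[str] = None
    lat: Optional[float] = None
    lng: Optional[float] = None
    details: Optional[str] = None
    image_urls: list[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "HotelRecord":
        """Build a record from a raw JSON object, ignoring unknown keys."""
        known = {name: raw[name] for name in cls.__dataclass_fields__ if name in raw}
        return cls(**known)  # raises TypeError if a required field is absent
```

Ignoring unknown keys means a record with extra fields still loads, while a missing required field fails loudly instead of producing a half-empty row.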
[
{
"id": 11445,
"title": "Ferien- Und Freizeitpark Weissenhäuser Strand\nWeissenhäuser Strand, Schleswig-Holsteinische Ostseeküste 0,5 km vom Zentrum entfernt\n<100 m vom Strand entfernt",
"url": "https://urlaub.check24.de/suche/angebot?adult=2&airport=BER%2CBRE%2CCGN%2CDRS%2CDTM%2CDUS%2CERF%2CFDH%2CFKB%2CFMM%2CFMO%2CFRA%2CGWT%2CHAJ%2CHAM%2CHHN%2CKSF%2CLBC%2CLEJ%2CMUC%2CNRN%2CNUE%2CPAD%2CRLG%2CSCN%2CSTR&areaId=869&areaSort=topregion&days=exact&departureDate=2025-08-16&oceanView=0&offerSort=default&pageArea=package&returnDate=2025-08-17&roomAllocation=A-A&roomCount=1&sorting=categoryDistribution&transportType=flight&hotelId=11445&hotelListId=bac055b2-cd1a-4869-bf72-9fc5fdb061d4",
"rating": 8.699999809265137,
"rating_count": 22,
"rating_text": "Fabelhaft",
"badges": [
"Nur bei CHECK24"
],
"distances": "0,5 km vom Zentrum entfernt",
"location": "Weissenhäuser Strand, Schleswig-Holsteinische Ostseeküste",
"lat": 54.31019592285156,
"lng": 10.801201820373535,
"details": "2 Tage | 2 Pers. | Flug + Unterkunft",
"price": 649,
"image_urls": [
"https://ctsassets1.check24.de/size=400c400/di=3/nfc=200/source=aHR0cHM6Ly9jZG4ud29ybGRvdGEubmV0L3QvMTAyNHg3NjgvY29udGVudC9mNi96ei9mNmFkN2UxZTNjNGM0NTk5MGZjNzJmYTQzMGFlZmNlYWMzMjQzOTgyLmpwZWc=!3ae35d/picture.jpg?cts_do=DESKTOP&cts_p=PR&cts_s=s3",
"https://ctsassets1.check24.de/size=400c400/di=3/nfc=200/source=aHR0cHM6Ly9jZG4ud29ybGRvdGEubmV0L3QvMTAyNHg3NjgvY29udGVudC80Ni83Yy80NjdjNjcxM2JmZWI1Y2IxNDlhMDRlMGY3OWFmNTZhM2EzYzUzZjM3LmpwZWc=!328792/picture.jpg?cts_do=DESKTOP&cts_p=PR&cts_s=s3"
],
"from_url": "https://urlaub.check24.de/suche/hotel?airport=BER,BRE,CGN,DRS,DTM,DUS,ERF,FDH,FKB,FMM,FMO,FRA,GWT,HAJ,HAM,HHN,KSF,LBC,LEJ,MUC,NRN,NUE,PAD,RLG,SCN,STR&roomCount=1&adult=2&roomAllocation=A-A&hotelDestination=Ostsee+(Deutschland)&departureDate=2025-08-16&returnDate=2025-08-17&days=exact&dpCom=86381374309&areaId=869&sorting=categoryDistribution&offerSort=default&areaSort=topregion&oceanView=0&referrerSourceHotelExecuted=1"
}
]
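The sample record shows three things that usually need light post-processing: `title` is multi-line, `distances` uses a German decimal comma, and `rating` carries what looks like single-precision float noise. The helpers below are a hedged sketch of that cleanup. The function names are hypothetical, and rounding to one decimal assumes ratings are published on a one-decimal scale.

```python
import re
from typing import Optional

# Matches "0,5 km" or "100 m" (German decimal comma allowed).
_DISTANCE_RE = re.compile(r"(\d+(?:,\d+)?)\s*(km|m)\b")


def hotel_name(title: str) -> str:
    """Return the first line of the multi-line title (the hotel name)."""
    return title.split("\n", 1)[0].strip()


def distance_km(text: str) -> Optional[float]:
    """Parse the first distance found in text into kilometres, or None."""
    match = _DISTANCE_RE.search(text)
    if match is None:
        return None
    value = float(match.group(1).replace(",", "."))
    return value if match.group(2) == "km" else value / 1000


def clean_rating(rating: float) -> float:
    """Round to one decimal place (assumed rating scale)."""
    return round(rating, 1)
```

Note that a string such as "<100 m vom Strand entfernt" parses to 0.1, which is an upper bound rather than an exact distance.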
check24-reisen-search-scraper/
├── src/
│   ├── main.py
│   ├── runner.py
│   ├── config/
│   │   ├── schema.json
│   │   └── settings.example.json
│   ├── clients/
│   │   ├── browser_client.py
│   │   └── http_client.py
│   ├── extractors/
│   │   ├── hotels_list_parser.py
│   │   ├── normalize.py
│   │   └── geo.py
│   ├── pipelines/
│   │   ├── collect_hotels.py
│   │   └── validate_output.py
│   ├── utils/
│   │   ├── logger.py
│   │   ├── retries.py
│   │   └── url_tools.py
│   └── outputs/
│       ├── dataset_writer.py
│       └── exporters.py
├── tests/
│   ├── test_url_tools.py
│   ├── test_normalize.py
│   └── fixtures/
│       └── sample_hotel_card.html
├── data/
│   ├── input.sample.json
│   └── output.sample.json
├── .gitignore
├── .env.example
├── pyproject.toml
├── requirements.txt
└── README.md
- Pricing analysts use it to track hotel prices by destination/date filters, so they can spot spikes, discounts, and pricing gaps early.
- Travel startups use it to populate comparison dashboards, so they can build faster search experiences with consistent hotel metadata.
- Market researchers use it to study rating-to-price relationships, so they can quantify value tiers across regions.
- BI teams use it to feed scheduled snapshots into warehouses, so they can monitor trends and automate reporting.
- Hotel groups use it to benchmark competitors in target areas, so they can adjust positioning and offers.
How do I choose good search URLs?
Use the travel search interface to apply your filters (dates, destination, airports, sort order) and copy the resulting hotel list URL. Make sure the URL contains the full set of query parameters so results are stable and reproducible.
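A quick pre-flight check can catch search URLs that were copied without their full query string. The sketch below uses only the standard library. The list of required parameters is an assumption drawn from the sample `from_url` in this README, not a documented requirement of the site or of this project.

```python
from typing import List, Sequence
from urllib.parse import parse_qs, urlsplit

# Assumption: parameter names taken from the sample from_url shown in this README.
REQUIRED_PARAMS = ("departureDate", "returnDate", "adult", "roomCount", "areaId")


def missing_params(search_url: str, required: Sequence[str] = REQUIRED_PARAMS) -> List[str]:
    """Return the required query parameters that are absent or empty in the URL."""
    query = parse_qs(urlsplit(search_url).query)  # blank values are dropped by default
    return [name for name in required if name not in query]


if __name__ == "__main__":
    url = "https://urlaub.check24.de/suche/hotel?adult=2&roomCount=1&areaId=869"
    print(missing_params(url))  # lists the date parameters that are not set
```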
What does max_items_per_url control?
It caps how many hotel cards are extracted from each search URL. This is useful for fast sampling, cost control, and rate-limited environments. For full coverage, increase it carefully and monitor runtime.
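One way to picture the cap is as a lazy slice over the stream of parsed cards, so that parsing stops as soon as the limit is reached. This is an illustrative sketch, not the project's implementation; `cap_items` is a hypothetical helper name.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def cap_items(cards: Iterable[T], max_items_per_url: int) -> Iterator[T]:
    """Yield at most max_items_per_url cards without consuming the rest."""
    return islice(cards, max_items_per_url)
```

Because `islice` is lazy, a generator that parses cards on demand does no work beyond the cap, which is where the runtime and cost savings would come from.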
Why do I need proxy settings?
Hotel search pages can be rate-limited or protected by bot detection. A proxy can reduce request failures and improve stability, especially when running many URLs or collecting data repeatedly.
How do I avoid duplicate hotels across runs?
Use the id field as your primary key for deduplication. If you store historical snapshots, use a composite key of (id, departureDate, returnDate, from_url) for more precise tracking. departureDate and returnDate are not output fields; read them from the query string of from_url.
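Since `departureDate` and `returnDate` are not among the output fields listed earlier, a composite key has to read them from the `from_url` query string. The sketch below shows one way to do that with the standard library. The names `snapshot_key` and `dedupe` are hypothetical, and keeping the last record seen per key is an assumption about the desired behaviour.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

Key = Tuple[Any, Optional[str], Optional[str], str]


def snapshot_key(record: Dict[str, Any]) -> Key:
    """Build (id, departureDate, returnDate, from_url) for one hotel record."""
    query = parse_qs(urlsplit(record["from_url"]).query)
    departure = query.get("departureDate", [None])[0]
    return_date = query.get("returnDate", [None])[0]
    return (record["id"], departure, return_date, record["from_url"])


def dedupe(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep one record per composite key; later records overwrite earlier ones."""
    latest: Dict[Key, Dict[str, Any]] = {}
    for record in records:
        latest[snapshot_key(record)] = record
    return list(latest.values())
```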
Primary Metric: Extracts ~20–40 hotel cards per minute per URL under typical conditions (depends on page weight and proxy latency).
Reliability Metric: 93–98% successful item capture across multi-URL runs when retries are enabled and a stable proxy is used.
Efficiency Metric: Memory usage stays steady at roughly 250–450 MB for moderate workloads (1–5 URLs, 20–50 items each) and scales mainly with the number of browser sessions.
Quality Metric: 95%+ field completeness for core fields (id, title, url, price, rating, location), with geo coordinates present when available on the listing cards.
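Field completeness can be checked on your own output with a few lines of Python. The definition used here (the share of non-empty core-field values across all records) is an assumption; the README does not state how its figure was measured.

```python
from typing import Any, Dict, Sequence

CORE_FIELDS = ("id", "title", "url", "price", "rating", "location")


def field_completeness(
    records: Sequence[Dict[str, Any]],
    fields: Sequence[str] = CORE_FIELDS,
) -> float:
    """Fraction of (record, field) pairs whose value is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(
        1
        for record in records
        for name in fields
        if record.get(name) not in (None, "", [])
    )
    return filled / (len(records) * len(fields))
```

A value of 0.95 or higher on a dataset would correspond to the 95%+ figure quoted above, under this definition.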
