GSMArena Mobile Data Scraper

Overview

A Python-based web crawler leveraging Playwright to scrape detailed information about mobile phones from GSMArena. This project is designed for structured data extraction and comes in two versions:

Text-only scraping (current branch): Extracts specifications, comments, and metadata.
Image-inclusive scraping (separate branch): Downloads associated images alongside textual data.

Features

Human-like Interactions: Simulates mouse movements and actions to evade bot detection.
CSV Storage: Saves data in a structured CSV format with robust error handling.
Asynchronous Operations: Ensures high efficiency and performance.
Configurable Scraping Range: Easily adjust the range of pages to be scraped.

Requirements

Python 3.x
pandas
openpyxl
playwright
httpx

Installation

Clone the repository:

git clone https://github.com/mahdi-mosafa/mobile-specs-crawler.git
cd mobile-specs-crawler

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GSMArena Mobile Data Scraper

Overview

Features

Requirements

Installation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

GSMArena Mobile Data Scraper

Overview

Features

Requirements

Installation

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages