TeachersPayTeachers Web Scraper

This is a web scraper for TeachersPayTeachers.com, built using Selenium. The script allows users to input a search keyword, and then it scrapes the first 300 results from the site. The script captures essential information about each resource, including:

Name of Resource
Overall Rating
Number of Ratings
Description from Seller
Keywords
Grade Levels
Price

The collected data is then saved to a CSV file.

This web scraper was created as part of a paid project for educuational research under Lydia Beahm, a professor in the Department of Education and Human Development at Clemson University.

Features

User Input: The script prompts the user to enter a search keyword.
Pagination: The script handles pagination to collect up to the first 300 results (12 pages with 25 results per page).
Data Capturing: Captures detailed information such as ratings, description, price, and more for each resource.
CSV Export: Stores all extracted information in a well-structured CSV file for easy access and analysis.

Requirements

Prerequisites

Before running the script, ensure you have the following installed:

Python 3.x: You can download it from Python's official website.
Selenium: Install it via pip:
```
pip install selenium
```
Chrome WebDriver: Download the appropriate version of Chrome WebDriver for your system from here, and ensure it is accessible in your system's PATH.

Additional Libraries

CSV: Used for reading and writing to CSV files (part of Python's standard library).
Selenium WebDriver: Used to control the web browser.

How to Run

Clone the repository:

git clone https://github.com/adamruns/teachers4teachers-scraper.git
cd teachers4teachers-scraper

Install dependencies (if not done already):
```
pip install selenium
```
Download Chrome WebDriver and ensure it's in your system's PATH.
Run the script:
```
python extract_info.py
```
Input the Search Keyword when prompted:
- Example: classroom management
- The scraper will then fetch the first 300 results related to this keyword.
Wait for the script to complete: It will scrape all the relevant data and save it in a CSV file (product_info_with_grades_ratings_price.csv).

Captured Information

The script captures the following information for each resource:

Name of Resource: The title of the educational resource.
Overall Rating: The average rating out of 5.
Number of Ratings: The total number of ratings the resource has received.
Description from Seller: The detailed description provided by the seller.
Keywords: Any tags or keywords associated with the resource.
Grade Levels: The grade levels (e.g., PreK, 1st Grade) for which the resource is suitable.
Price: The listed price of the resource.

Output

The script generates a CSV file named product_info_with_grades_ratings_price.csv with the following columns:

Title	Short Description	Long Description	Grades Used	Overall Rating	Number of Ratings	Price	URL	Keyword
...	...	...	...	...	...	...	...	...

Customization

You can modify the script to:

Change the number of pages scraped by adjusting the total_pages variable.
Adjust search parameters such as filtering by grade level, subject, or resource type.

Troubleshooting

Common Issues

Stale Element Reference: If you encounter a "Stale Element Reference" error, it means the page updated dynamically while the script was trying to access a specific element. The script handles some retries to mitigate this, but you can further increase the waiting time between page loads.
WebDriver Issues: Ensure that your version of Chrome WebDriver matches your installed version of Chrome. You can check this by going to chrome://settings/help in your Chrome browser and downloading the appropriate WebDriver from here.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.gitignore		.gitignore
README.md		README.md
combined_script.py		combined_script.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TeachersPayTeachers Web Scraper

Features

Requirements

Prerequisites

Additional Libraries

How to Run

Captured Information

Output

Customization

Troubleshooting

Common Issues

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

TeachersPayTeachers Web Scraper

Features

Requirements

Prerequisites

Additional Libraries

How to Run

Captured Information

Output

Customization

Troubleshooting

Common Issues

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages