newscrape

A Python-based news scraper that collects articles from Canadian news outlets, using user-supplied keywords to filter article titles and body text.

How to Use

When you run the executable, a GUI window opens with some introductory text.

What are Keywords and Entrywords?

Keywords are words that the scraper uses to filter article titles. Entrywords are words that the scraper uses to filter the text inside the article.
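The two filters can be pictured with a minimal sketch. This is not the scraper's actual code: the function name `article_matches` is hypothetical, and it assumes that a word list matches when at least one of its words appears as a case-insensitive substring, that an empty list accepts everything, and that an article must pass both filters.

```python
def article_matches(title, body, keywords, entrywords):
    """Return True if an article passes both the title and the body filter.

    Assumptions (not confirmed by the project):
    - matching is a case-insensitive substring test,
    - one matching word per list is enough,
    - an empty list places no restriction.
    """
    title_lower = title.lower()
    body_lower = body.lower()

    # Keywords are checked against the article title only.
    title_ok = not keywords or any(k.lower() in title_lower for k in keywords)

    # Entrywords are checked against the article body only.
    body_ok = not entrywords or any(e.lower() in body_lower for e in entrywords)

    return title_ok and body_ok
```

Under these assumptions, an article titled "Housing prices rise in Toronto" passes the keyword "housing", and it is kept only if its body also contains one of the entrywords.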

To add/remove a Keyword/Entryword:

  • Type your word in the entry box (case-insensitive)
  • Click the "Keyword" or "Entryword" radio button
  • Click "Add"
  • If removing, select the item in the list and click "Remove"
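The add/remove steps above could be modelled roughly as follows. The `WordLists` class and its methods are hypothetical stand-ins for whatever the GUI does internally; the sketch assumes that "case-insensitive" means words are lowercased before being stored and that duplicates are ignored.

```python
class WordLists:
    """Holds the keyword and entryword lists (illustrative sketch only)."""

    def __init__(self):
        self.keywords = []
        self.entrywords = []

    def _target(self, kind):
        # `kind` mirrors the radio button selection: "keyword" or "entryword".
        if kind == "keyword":
            return self.keywords
        if kind == "entryword":
            return self.entrywords
        raise ValueError(f"unknown kind: {kind!r}")

    def add(self, word, kind):
        """Add a word; return True if it was added, False if empty or duplicate."""
        normalized = word.strip().lower()
        target = self._target(kind)
        if not normalized or normalized in target:
            return False
        target.append(normalized)
        return True

    def remove(self, word, kind):
        """Remove a word; return True if it was present."""
        normalized = word.strip().lower()
        target = self._target(kind)
        if normalized in target:
            target.remove(normalized)
            return True
        return False
```

Normalizing on insert keeps "Housing" and "HOUSING" from appearing as two separate list entries.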

Once you have added all keywords and entrywords, click "Submit" to start the scraping process.

The output CSV is written to output/news_articles.csv
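To inspect the results from Python, a small sketch like the one below may help. It assumes the file is UTF-8 encoded and starts with a header row; the column names are not documented here, so the code prints whatever header it finds rather than assuming any. The helper name `load_articles` is hypothetical.

```python
import csv
from pathlib import Path


def load_articles(path="output/news_articles.csv"):
    """Read the scraper's CSV into a list of dicts keyed by the header row.

    Assumes UTF-8 text with a header line; adjust if the file differs.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    default_path = Path("output/news_articles.csv")
    if default_path.exists():
        articles = load_articles(default_path)
        print(f"{len(articles)} articles loaded")
        if articles:
            # Show the column names actually present in the file.
            print("columns:", list(articles[0].keys()))
    else:
        print(f"{default_path} not found; run the scraper first")
```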

Current list of news sites parsed

More will be added if there is demand.