Incorporate and fix scraper issues #81

@mkao006

Incorporate the implementation by Marco and Luca and make it functional.

A few issues came up along the way.

  1. The filter_links_already_seen method doesn't actually work: the scraper takes the same amount of time on every run, even after multiple iterations.
    [screenshot: task_duration]

  2. The implementation doesn't seem to write back to the database. The table isn't even created, which causes everything downstream to fail.
    [screenshot: downstream_fail]
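
For reference, here is a minimal sketch of the behaviour both points expect, not the actual implementation by Marco and Luca. It assumes a SQLite store; the `seen_links` table, its `url` column, the `ensure_table` and `mark_links_seen` helpers, and the signature given to `filter_links_already_seen` are all hypothetical. The idea is that the table is created up front if missing, and scraped links are written back so the next run can skip them.

```python
import sqlite3


def ensure_table(conn):
    # Hypothetical schema: create the table up front so downstream reads never hit a missing table.
    conn.execute("CREATE TABLE IF NOT EXISTS seen_links (url TEXT PRIMARY KEY)")
    conn.commit()


def filter_links_already_seen(conn, links):
    # Assumed signature: return only the links not yet recorded, preserving input order.
    ensure_table(conn)
    seen = {row[0] for row in conn.execute("SELECT url FROM seen_links")}
    return [link for link in links if link not in seen]


def mark_links_seen(conn, links):
    # Write scraped links back; without this step the filter can never shrink the workload.
    ensure_table(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO seen_links (url) VALUES (?)",
        [(link,) for link in links],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    first_run = filter_links_already_seen(conn, ["a", "b"])
    mark_links_seen(conn, first_run)
    second_run = filter_links_already_seen(conn, ["a", "b", "c"])
    print(first_run, second_run)
```

If the real scraper never performs the write-back step, the filter would return every link on every run, which would be consistent with the constant task duration described in point 1.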
