bash create_datasets_from_start.sh throws 404 error when downloading Wikipedia dataset #16

@ghost

Description

bash create_datasets_from_start.sh fails while downloading the Wikipedia dataset whenever https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2 is being updated, because the URL returns a 404 error during that time.
On top of that, the URL cannot be changed in WikiDownloader.py alone: it is hardcoded further down in /opt/conda/lib/python3.8/site-packages/lddl/download/wikipedia.py. I had to edit that file to point to another available dump.
Dump used: https://dumps.wikimedia.your.org/enwiki/20220820/enwiki-20220820-pages-articles.xml.bz2
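
For anyone hitting the same problem, here is a minimal sketch of how a fallback could look: probe the "latest" URL first and fall back to a dated dump if it is not reachable. This is not lddl's code; `url_is_available` and `pick_dump_url` are hypothetical helper names, and the two URLs are simply the ones mentioned in this issue.

```python
import urllib.request

# URLs taken from this issue; the dated mirror is only one possible fallback.
LATEST_URL = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
DATED_URL = "https://dumps.wikimedia.your.org/enwiki/20220820/enwiki-20220820-pages-articles.xml.bz2"


def url_is_available(url, timeout=10):
    """Return True if a HEAD request to `url` ends in a 2xx response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib.error.URLError / HTTPError (e.g. the 404 seen while the
        # dump is updating) and socket timeouts are all OSError subclasses.
        return False


def pick_dump_url(candidates, is_available=url_is_available):
    """Return the first candidate URL for which `is_available` is true."""
    for url in candidates:
        if is_available(url):
            return url
    raise RuntimeError("none of the candidate dump URLs is reachable")
```

Calling `pick_dump_url([LATEST_URL, DATED_URL])` would return the dated dump while the latest one is answering 404. The result would still have to be wired into wherever the URL is hardcoded, which this sketch does not attempt.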
