Add Sofascore referee scraping methods #85
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -177,6 +177,63 @@ def get_match_dict(self, match_id: str | int) -> dict: | |
| data = response["event"] | ||
| return data | ||
|
|
||
| # ============================================================================================== | ||
| def get_match_referee(self, match_id: str | int) -> dict | None: | ||
| """ Get the referee dict for a single match | ||
|
|
||
| :param match_id: Sofascore match URL or match ID | ||
| :type match_id: str | int | ||
| :return: Referee dict for the match, or None if the match does not have a referee field. | ||
| :rtype: dict | None | ||
| """ | ||
| match_dict = self.get_match_dict(match_id) | ||
| return match_dict["referee"] if "referee" in match_dict else None | ||
|
|
||
| # ============================================================================================== | ||
| def get_referee(self, referee_id: str | int) -> dict: | ||
|
Owner
Do you think there's any value in creating a SofascoreReferee object similar to what I've done for SofascorePlayer? Class attributes would be things like id, name, and the matches scraped by the next function in your PR. This function would then return an instance of the SofascoreReferee object?
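One possible shape for such an object, as a minimal sketch only. The class below is hypothetical and not part of the library; the real `SofascorePlayer` design is not shown in this diff, so this does not claim to mirror it. It assumes the referee dict carries `id` and `name` keys, which is an assumption about the API response.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SofascoreReferee:
    """Hypothetical container for referee data (illustration only)."""

    id: int
    name: str
    # Event dicts, e.g. whatever get_referee_matches() would return.
    matches: list[dict] = field(default_factory=list)

    @classmethod
    def from_dict(
        cls, referee_dict: dict, matches: list[dict] | None = None
    ) -> SofascoreReferee:
        # Assumption: the referee dict exposes "id" and "name" keys.
        return cls(
            id=int(referee_dict["id"]),
            name=referee_dict["name"],
            matches=list(matches or []),
        )


# Made-up sample data standing in for a real API response.
sample_referee = {"id": 123, "name": "Example Referee"}
referee = SofascoreReferee.from_dict(sample_referee)
```

With something like this, `get_referee` could build the object from the response dict and attach the matches afterwards, though whether that is worth the extra class is the open question here.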
||
| """ Get a referee dict from a referee ID | ||
|
|
||
| :param referee_id: Sofascore referee ID | ||
| :type referee_id: str | int | ||
| :raises TypeError: If ``referee_id`` is not a string or int. | ||
| :rtype: dict | ||
| """ | ||
| if not isinstance(referee_id, int) and not isinstance(referee_id, str): | ||
| raise TypeError("`referee_id` must be a string or int.") | ||
|
|
||
| response = botasaurus_browser_get_json(f"{API_PREFIX}/referee/{int(referee_id)}") | ||
| return response["referee"] | ||
|
|
||
| # ============================================================================================== | ||
| def get_referee_matches(self, referee_id: str | int, max_pages: int=10) -> list[dict]: | ||
|
Owner
I've never really used referee data before, so a few questions:
Again, I've never used ref data, so I'm not familiar with the use case or with what info is and isn't valuable.
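For context on what this method returns: the loop in the diff requests pages `0, 1, 2, ...` and stops at the first page with no events or after `max_pages` requests, then returns one flat list. A minimal sketch of that behavior follows. `collect_pages`, `fetch_page`, and the fake payloads are hypothetical stand-ins for illustration; the real code calls `botasaurus_browser_get_json` against the Sofascore API.

```python
from typing import Callable


def collect_pages(fetch_page: Callable[[int], dict], max_pages: int = 10) -> list[dict]:
    """Gather events page by page until a page is empty or the cap is hit."""
    events: list[dict] = []
    for page in range(max_pages):
        response = fetch_page(page)
        page_events = response.get("events", [])
        if not page_events:
            # A missing or empty "events" key means there is nothing further to read.
            break
        events.extend(page_events)
    return events


# Made-up payloads standing in for the paginated endpoint.
FAKE_PAGES = {
    0: {"events": [{"id": 1}, {"id": 2}]},
    1: {"events": [{"id": 3}]},
}


def fake_fetch(page: int) -> dict:
    # Pages beyond the fake data come back empty, which ends the loop.
    return FAKE_PAGES.get(page, {"events": []})


all_events = collect_pages(fake_fetch)
first_page_only = collect_pages(fake_fetch, max_pages=1)
```

Passing the fetcher in as an argument is only a convenience for the sketch, so the loop can be exercised without network access.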
||
| """ Get recent match dicts for a referee | ||
|
|
||
| :param referee_id: Sofascore referee ID | ||
| :type referee_id: str | int | ||
| :param max_pages: Maximum number of pages to request. Defaults to 10. | ||
| :type max_pages: int | ||
| :raises TypeError: If ``referee_id`` is not a string or int. | ||
| :raises TypeError: If ``max_pages`` is not an int. | ||
| :return: Flat list of event dicts for the referee's recent matches. | ||
| :rtype: list[dict] | ||
| """ | ||
| if not isinstance(referee_id, int) and not isinstance(referee_id, str): | ||
| raise TypeError("`referee_id` must be a string or int.") | ||
| if not isinstance(max_pages, int): | ||
| raise TypeError("`max_pages` must be an int.") | ||
|
|
||
| matches = list() | ||
| referee_id = int(referee_id) | ||
| for page in range(max_pages): | ||
| response = botasaurus_browser_get_json( | ||
| f"{API_PREFIX}/referee/{referee_id}/events/last/{page}" | ||
| ) | ||
| if "events" not in response or len(response["events"]) == 0: | ||
| break | ||
| matches += response["events"] | ||
|
|
||
| return matches | ||
|
|
||
| # ============================================================================================== | ||
| def get_team_names(self, match_id: str | int) -> tuple[str, str]: | ||
| """ Get the team names for the home and away teams | ||
|
|
||
Is this function necessary if it's just accessing a key returned by
get_match_dict()?
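For comparison, a caller could get the same result straight from the dict returned by `get_match_dict()` with `dict.get`, which returns `None` when the key is absent. A small sketch, using made-up dicts in place of real responses:

```python
# Made-up match dicts standing in for get_match_dict() output.
match_with_referee = {"id": 1, "referee": {"id": 99, "name": "Example Referee"}}
match_without_referee = {"id": 2}

# dict.get returns None for a missing key, matching the
# `x["referee"] if "referee" in x else None` expression in the diff.
referee = match_with_referee.get("referee")
missing = match_without_referee.get("referee")
```

A dedicated method may still be useful for discoverability, but it adds no behavior beyond this lookup.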