57 changes: 57 additions & 0 deletions src/ScraperFC/sofascore.py
@@ -177,6 +177,63 @@ def get_match_dict(self, match_id: str | int) -> dict:
        data = response["event"]
        return data

    # ==============================================================================================
    def get_match_referee(self, match_id: str | int) -> dict | None:
Owner

Is this function necessary if it's just accessing a key returned by get_match_dict()?
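
To make the question concrete: the wrapper amounts to a single `dict.get` on the result of `get_match_dict()`. A minimal sketch of that equivalence, assuming made-up match dicts rather than real Sofascore responses (`referee_from_match` is a hypothetical helper, not part of ScraperFC):

```python
from __future__ import annotations

# Hypothetical match dicts; real ones would come from get_match_dict().
match_with_ref = {"id": 1, "referee": {"id": 42, "name": "Example Referee"}}
match_without_ref = {"id": 2}


def referee_from_match(match_dict: dict) -> dict | None:
    """Return the "referee" entry, or None when the key is absent."""
    # Same behavior as: match_dict["referee"] if "referee" in match_dict else None
    return match_dict.get("referee")
```

A caller could write `get_match_dict(match_id).get("referee")` directly and get the same result without the extra method.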

        """ Get the referee dict for a single match

        :param match_id: Sofascore match URL or match ID
        :type match_id: str | int
        :return: Referee dict for the match, or None if the match does not have a referee field.
        :rtype: dict | None
        """
        match_dict = self.get_match_dict(match_id)
        return match_dict["referee"] if "referee" in match_dict else None

    # ==============================================================================================
    def get_referee(self, referee_id: str | int) -> dict:
Owner

Do you think there's any value in creating a SofascoreReferee object similar to what I've done for SofascorePlayer? Class attributes would be things like id, name, and then the matches scraped by the next function in your PR. This function would then return an instance of SofascoreReferee.
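
A minimal sketch of what such an object could look like, as a plain dataclass. The attribute names, the `from_dict` constructor, and the `"id"` and `"name"` keys are assumptions for illustration; this does not mirror how SofascorePlayer is actually implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SofascoreReferee:
    """Hypothetical referee container (not the real ScraperFC class)."""

    id: int
    name: str
    # Filled in later, e.g. from the output of get_referee_matches().
    matches: list[dict] = field(default_factory=list)

    @classmethod
    def from_dict(cls, referee_dict: dict) -> "SofascoreReferee":
        """Build an instance from a referee dict (keys assumed: "id", "name")."""
        return cls(id=int(referee_dict["id"]), name=referee_dict["name"])
```

With something like this, `get_referee()` could return `SofascoreReferee.from_dict(response["referee"])` instead of the raw dict, and the match list could be attached afterwards.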

        """ Get a referee dict from a referee ID

        :param referee_id: Sofascore referee ID
        :type referee_id: str | int
        :raises TypeError: If ``referee_id`` is not a string or int.
        :rtype: dict
        """
        if not isinstance(referee_id, int) and not isinstance(referee_id, str):
            raise TypeError("`referee_id` must be a string or int.")

        response = botasaurus_browser_get_json(f"{API_PREFIX}/referee/{int(referee_id)}")
        return response["referee"]

    # ==============================================================================================
    def get_referee_matches(self, referee_id: str | int, max_pages: int=10) -> list[dict]:
Owner

I've never really used referee data before so a few questions:

  • Is there any reason to default to 10 pages and not just scrape all of the matches?
  • Is there any concern with returning a list of match dicts being "too much"? Either too much data/RAM or too much info? Thoughts on just returning a list of match IDs?

Again, I've never used ref data so I'm not familiar with the use case and what info is valuable or not valuable.
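
One way both questions could be addressed, sketched under assumptions: a generator that yields only match IDs and treats `max_pages=None` as "scrape until an empty page". `iter_referee_match_ids` and the injected `fetch_page` callable are hypothetical stand-ins for the `botasaurus_browser_get_json` call, and the `"id"` key on each event is assumed.

```python
from __future__ import annotations

from typing import Callable, Iterator


def iter_referee_match_ids(
    referee_id: int,
    fetch_page: Callable[[int, int], dict],
    max_pages: int | None = None,
) -> Iterator[int]:
    """Yield match IDs page by page; stop on an empty page or at max_pages."""
    page = 0
    while max_pages is None or page < max_pages:
        response = fetch_page(referee_id, page)
        events = response.get("events", [])
        if not events:
            break
        for event in events:
            yield event["id"]  # assumed key on each event dict
        page += 1


def fake_fetch_page(referee_id: int, page: int) -> dict:
    """Stand-in for the real request: two pages of events, then an empty one."""
    pages = [
        {"events": [{"id": 1}, {"id": 2}]},
        {"events": [{"id": 3}]},
        {"events": []},
    ]
    return pages[page] if page < len(pages) else {}
```

Because the IDs are produced lazily, memory use stays small regardless of how many pages exist, and callers who want full match dicts can still pass each ID to `get_match_dict()`.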

        """ Get recent match dicts for a referee

        :param referee_id: Sofascore referee ID
        :type referee_id: str | int
        :param max_pages: Maximum number of pages to request. Defaults to 10.
        :type max_pages: int
        :raises TypeError: If ``referee_id`` is not a string or int.
        :raises TypeError: If ``max_pages`` is not an int.
        :return: Flat list of event dicts for the referee's recent matches.
        :rtype: list[dict]
        """
        if not isinstance(referee_id, int) and not isinstance(referee_id, str):
            raise TypeError("`referee_id` must be a string or int.")
        if not isinstance(max_pages, int):
            raise TypeError("`max_pages` must be an int.")

        matches = list()
        referee_id = int(referee_id)
        for page in range(max_pages):
            response = botasaurus_browser_get_json(
                f"{API_PREFIX}/referee/{referee_id}/events/last/{page}"
            )
            if "events" not in response or len(response["events"]) == 0:
                break
            matches += response["events"]

        return matches

    # ==============================================================================================
    def get_team_names(self, match_id: str | int) -> tuple[str, str]:
        """ Get the team names for the home and away teams