fix and improve the identification of ARD Mediathek identifiers #4915

Merged
dvikan merged 1 commit into RSS-Bridge:master from h4b4n3r0:fix-and-improve-ard-mediathek
Apr 2, 2026

Conversation

@h4b4n3r0
Contributor

This fixes #4478

Fix show ID extraction for ARD-Mediathek URLs with trailing numbers

Problem

The current code in ARDMediathekBridge.php assumes the show ID is always the last segment of the URL:

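$pathComponents = explode('/', $this->getInput('path'));
// Paraphrased: the bridge then takes the last non-empty segment as the show ID.
$nonEmpty = array_filter($pathComponents, 'strlen');
$showID = end($nonEmpty);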

However, some ARD-Mediathek URLs append a numeric segment (e.g., /1) after the actual show ID, which relates to the selected season.

Example URL:

https://www.ardmediathek.de/serie/tage-die-es-nicht-gab-oder-staffel-2/staffel-1/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2UtZGllLWVzLW5pY2h0LWdhYg/1

explode('/') produces:

[
  "https:",
  "",
  "www.ardmediathek.de",
  "serie",
  "tage-die-es-nicht-gab-oder-staffel-2",
  "staffel-1",
  "Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2UtZGllLWVzLW5pY2h0LWdhYg",
  "1"
]

The current logic incorrectly picks the last segment 1 instead of the actual ID:

Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2UtZGllLWVzLW5pY2h0LWdhYg

Solution: Regex-based extraction

ARD IDs are always Base64-encoded strings whose decoded form starts with 'crid://'.

The Base64 encoding of 'crid:/' is Y3JpZDov, so every ARD ID starts with that prefix and a regex can extract it directly.

preg_match('~(Y3JpZDov[^/]+)~', $this->getInput('path'), $matches);

if (empty($matches[1])) {
    throwClientException('Could not extract show ID');
}

$showID = $matches[1];

  • Works with full URLs and plain IDs
  • Independent of where the ID sits in the path, so extra segments before or after it do not break extraction
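
To make the prefix argument and the two input forms concrete, here is a minimal standalone sketch. The helper name extractArdShowId is hypothetical and not part of RSS-Bridge; it only wraps the regex from this PR so it can be run outside the bridge.

```php
<?php
// Hypothetical helper (not part of RSS-Bridge): wraps the regex from this PR.
// Returns the show ID, or null if the input contains no 'Y3JpZDov...' segment.
function extractArdShowId(string $input): ?string
{
    if (preg_match('~(Y3JpZDov[^/]+)~', $input, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// 'crid:/' is 6 bytes, so it encodes to exactly 8 Base64 characters without padding.
var_dump(base64_encode('crid:/') === 'Y3JpZDov'); // bool(true)

$id = 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2UtZGllLWVzLW5pY2h0LWdhYg';
$inputs = [
    // plain ID
    $id,
    // full URL with a trailing season segment
    "https://www.ardmediathek.de/serie/tage-die-es-nicht-gab-oder-staffel-2/staffel-1/$id/1",
    // the bridge's default show, as quoted later in this conversation
    'https://www.ardmediathek.de/sendung/45-min/Y3JpZDovL25kci5kZS8xMzkx',
];

foreach ($inputs as $input) {
    echo extractArdShowId($input), PHP_EOL;
}
```

The first two inputs yield the same ID, and the third yields Y3JpZDovL25kci5kZS8xMzkx. Note that [^/]+ stops only at a slash, so a query string or fragment appended directly to the ID would be captured as part of it; whether such URLs occur in practice is not established here.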

Impact

  • Fixes broken feeds caused by extra numeric URL segments
  • Works for all ARD-Mediathek show URLs
  • Prevents Could not extract show ID errors

@h4b4n3r0
Contributor Author

This was manually tested. It is ready to review.

@github-actions

Pull request artifacts

Bridge | Context | Status
ARDMediathek | 1 untitled (current) | Bridge returned error 404! (20512); Type: HttpException; Message: https://api.ardmediathek.de/page-gateway/widgets/ard/asset/None?pageSize=29 resulted in 404 Not Found
ARDMediathek | 1 untitled (pr) | Bridge returned error 400! (20512); Type: ClientException; Message: Could not extract show ID

last change: Saturday 2026-02-28 16:15:11

@Bockiii
Contributor

Bockiii commented Mar 21, 2026

The PR test failed and couldn't extract the ID. The test always uses the bridge's defaults, in this case "https://www.ardmediathek.de/sendung/45-min/Y3JpZDovL25kci5kZS8xMzkx".

Can you manually check if it works for you with that show?

@h4b4n3r0
Contributor Author

Dear @Bockiii, thanks for the response. Strangely, the code works perfectly on my end.

docker build -t rss-bridge .
docker create --name rss-bridge --publish 3000:80 --volume ./config:/config rss-bridge
docker start rss-bridge
[screenshot]

For debugging, I also added var_dump calls before and after the preg_match call; their output is visible in the screenshot:

$path = $this->getInput('path');
var_dump($path);
preg_match('~(Y3JpZDov[^/]+)~', $this->getInput('path'), $matches);
var_dump($matches);

I'm not sure what the problem is here.

@h4b4n3r0
Contributor Author

I think I found the issue. I assume it's on the testing end:
[screenshot]

The test calls the endpoint with path=None.
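
A minimal sketch of why that input would produce the reported error, assuming the test run really passes the literal string 'None' as path (the asset/None URL in the "current" run above is consistent with this, but the test harness itself is not shown here):

```php
<?php
// Assumption: the test harness sends the literal string 'None' as the path input.
$path = 'None';

// Same regex as in the PR; 'None' does not contain the 'Y3JpZDov' prefix.
$result = preg_match('~(Y3JpZDov[^/]+)~', $path, $matches);

var_dump($result);            // int(0): no match
var_dump(empty($matches[1])); // bool(true): the bridge would throw "Could not extract show ID"
```

Under that assumption, the PR run rejects the input up front with a 400, while the previous code passed 'None' through to the ARD API and got a 404.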

@dvikan merged commit 8cb8f73 into RSS-Bridge:master on Apr 2, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

ARD-Mediathek Bridge failed with error 404

3 participants