feature(extract-worker): implement the markdown extraction workflow#46

Merged
ClemDoum merged 9 commits into
mainfrom
feature(extract-worker)/extract-worker
Jun 8, 2026
@ClemDoum ClemDoum commented Jun 5, 2026

Contributor

Changes

Implement a first version of the content extraction worker, which extracts content in Markdown format and saves it in the artifact cache.

extract-worker

Added

  • implemented an ExtractMarkdownContentWorkflow workflow with the extract_worker_config, create_markdown_extract_batches and extract_markdown_content activities
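
For readers unfamiliar with the shape of this workflow, here is a minimal sketch of how the three activities might fit together. It is not the code from this PR: plain asyncio functions stand in for the real workflow engine, and the signatures, the `batch_size` setting, the `raw_docs` input, the dict used as an artifact cache and the `run_extract_markdown_content_workflow` helper are all assumptions made for illustration. Only the activity names come from the description above.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractWorkerConfig:
    # Hypothetical setting: number of documents per batch.
    batch_size: int = 2


async def extract_worker_config() -> ExtractWorkerConfig:
    # Assumed behavior: return the worker configuration.
    return ExtractWorkerConfig()


async def create_markdown_extract_batches(doc_ids, batch_size):
    # Assumed behavior: split document ids into fixed-size batches.
    return [doc_ids[i:i + batch_size] for i in range(0, len(doc_ids), batch_size)]


async def extract_markdown_content(batch, raw_docs, artifact_cache):
    # Assumed behavior: "extract" each document as Markdown and store it
    # in the cache (a plain dict here, standing in for the artifact cache).
    for doc_id in batch:
        artifact_cache[doc_id] = f"# {doc_id}\n\n{raw_docs[doc_id].strip()}\n"
    return len(batch)


async def run_extract_markdown_content_workflow(raw_docs, artifact_cache):
    # Hypothetical orchestration: config, then batching, then one
    # extraction call per batch, run concurrently.
    config = await extract_worker_config()
    batches = await create_markdown_extract_batches(sorted(raw_docs), config.batch_size)
    counts = await asyncio.gather(
        *(extract_markdown_content(batch, raw_docs, artifact_cache) for batch in batches)
    )
    return sum(counts)
```

Batching before extraction keeps each activity call small and lets batches run independently, which is presumably why batch creation is a separate activity.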

datashare-python

Fixed

  • fixed several bugs in worker project initialization

@ClemDoum ClemDoum force-pushed the feature(extract-worker)/extract-worker branch from d697ea0 to a210513 June 5, 2026 16:46
@ClemDoum ClemDoum force-pushed the feature(extract-worker)/extract-worker branch from c8f52f1 to 0e7ae65 June 8, 2026 11:59
@ClemDoum ClemDoum force-pushed the feature(extract-worker)/extract-worker branch from fb3fd03 to a285ffa June 8, 2026 12:10
@ClemDoum ClemDoum self-assigned this Jun 8, 2026
@ClemDoum ClemDoum marked this pull request as ready for review June 8, 2026 12:16
@ClemDoum ClemDoum merged commit 7a1b819 into main Jun 8, 2026
11 checks passed
@ClemDoum ClemDoum deleted the feature(extract-worker)/extract-worker branch June 8, 2026 12:37
ClemDoum commented Jun 8, 2026

Contributor Author

Addresses #37 #38
