Kontext. Kern. Karte. (Context. Core. Card.)
Kardenwort is an intelligent command-line utility designed to accelerate language learning by deconstructing text and automatically creating context-rich flashcards for Anki. It serves as a powerful offline companion to your study materials, transforming any text—books, articles, or AI-generated content—into a structured vocabulary list ready for efficient learning.
This tool is not just a word collector; it's an intelligent pipeline powered by two NLP libraries, large dictionaries, semantic rules, and a user-trainable override system to achieve high-accuracy lemmatization and word deconstruction, especially for grammatically complex languages like German.
- Kardenwort
- Map of Contents
- The Kardenwort Philosophy in Brief
- Key Features
- Key Advantages and Differences from Alternatives
- Project Structure
- Installation and Setup
- Usage and Workflows
- Core Functionality: The Two Main Modes
- Understanding Input Processing
- The Processing Pipeline in Detail
- The Anki Card Template
- Command-Line Arguments Reference
- Configuration
- Flexible Anki Field Mapping
- Important Notes
- Our Ecosystem
- Development and Testing
- My Personal Motivation
- Kardenwort Ecosystem
- License and Acknowledgements
The goal of Kardenwort is to reduce the complexity of language learning, particularly for synthetic languages like German where words are heavily inflected and compounded. It achieves this by automating the difficult task of deconstructing words to their base form (lemma).
Our core principles are:
- Separating Reading from Study: Reduce cognitive load by splitting content consumption and vocabulary acquisition into two distinct, focused activities.
- Medium Independence: Kardenwort is a companion to your learning material, not a replacement. Use it with physical books, PDFs, or any other media without losing the original context (diagrams, formatting, etc.).
- Offline First & Privacy: The entire process runs locally. Your data is never sent to the cloud, ensuring privacy and reliability.
- Simple is Not Easy: We do the complex work of linguistic analysis to provide you with a simple, clean, and actionable list of words, making your learning process easy.
- Intelligent Lemmatization: Uses `spaCy` to accurately find the base form of words.
- Advanced German Deconstruction: Employs `german-compound-splitter` (GCS) to break down long German compound words into their components.
- User-Trainable: Fine-tune the lemmatization for your specific texts using a simple `lemma_override.tsv` file. Corrections are saved forever and automatically reapplied.
- Rich Context: Each word card includes the original sentence and surrounding context.
- Dual Card Types: Generates both vocabulary cards (`word` type) and full sentence cards (`sentence` type) in a single run with `mixed-triple` mode.
- Hierarchical Deck Creation: Automatically build nested Anki decks from Markdown headers (`#`, `##`) in your source text.
- Automatic Deck Descriptions: Populates Anki deck descriptions with the full source text and translations, providing valuable context directly within the deck browser.
- Granular Deck Control: Generate sentence-level subdecks for highly organized study sets.
- Fully Configurable Field Mapping: Decouple your Anki Note Type from the source code. Map any field (e.g., `Quotation`, `WordSource`) to internal linguistic data via `config.ini`.
- Multi-Language Support: Currently supports English (`en`) and German (`de`).
- Direct Anki Integration: Automatically imports generated cards into Anki via a runner script.
- GoldenDict-ng Integration: Create vocabulary lists on the fly directly from your favorite dictionary application.
- Auditory-Focused Cards: The template is designed to work with audio, helping you practice listening and pronunciation.
- Configuration-Driven Intelligence: Extraction features (wordlists, sorting, indexing) are automatically enabled based on your Anki field mapping, reducing CLI complexity and ensuring consistent output.
While many text-processing tools for language learners exist (e.g., LingQ, Readlang, LanguageCrush, Lute, LWT, FLTR, alexandria-reader, Lemmatize, LinguaCafe, VocabSieve, AnkiMorphs, FrequencyMan, Watch Foreign Language Movies with Anki (movies2anki), Vocab Tracker, Language Reactor, asbplayer, Yet Another Language Learning Media Player (yallmp), subs2srs, Dualsub, YouTube™ Dual Subtitles, Smart Book, ReadEra, Yomitan (Yomichan), Local Audio Server for Yomichan, GoldenDict-ng), Kardenwort offers a unique combination of capabilities:
- Superior German Language Processing: No other tool provides this level of German vocabulary deconstruction. Kardenwort correctly parses compound nouns, finds verbs with separable prefixes, and handles capitalization properly—a common pain point in other systems.
- Complete Freedom After Export: Unlike integrated readers where a flashcard is tied to the source text, our output is a fully autonomous TSV file. You have complete control to edit any field in Anki on any device, truly freeing your data.
- Quality You Can Influence: While the initial analysis relies on `spaCy`, you can directly influence the results. By training the system through the `lemma_override.tsv` file, you can achieve perfect processing for your specific texts and domain.
20250913122858-kardenwort/
├── data/
│ ├── de/
│ │ ├── deu-mixed-typical-2011-1m-words.csv
│ │ ├── german.dic
│ │ └── lemma_override_de.tsv
│ └── en/
│ ├── en-news-2023-1m-words.csv
│ └── lemma_override_en.tsv
├── docs/
│ ├── assets/
│ │ └── ...
│ └── kardenwort-goldendict-config.txt
├── results/
│ ├── 20251115160000-morgen-faehrt-der-neue.triple.sentence.de.json
│ ├── 20251115160000-morgen-faehrt-der-neue.triple.sentence.de.tsv
│ └── 20251115160030-morgen-faehrt-der-neue.triple.word.de.tsv
├── source_texts/
│ ├── text1.txt
│ ├── text2.txt
│ └── text3.txt
├── src/
│ └── kardenwort/
│ └── core/
│ ├── kardenwort.py
│ └── kardenwort_runner.py
├── tests/
│ ├── cases/
│ └── source_texts/
│ ├── de/
│ └── en/
├── .gitignore
├── config.ini
├── config.ini.template
├── LICENSE
└── README.md
Follow these steps to get the entire Kardenwort ecosystem up and running.
Prerequisites:
- Python 3.9: It is strongly recommended to use this specific version.

  Important for Windows Users: Versions of Python higher than 3.9 (e.g., 3.10+) may require a C++ compiler (such as Visual Studio Build Tools) to install dependencies like `spaCy`. To avoid these compilation issues, we recommend installing Python 3.9 directly from the Microsoft Store, which provides a hassle-free setup.
- Anki Desktop: Must be installed and running.
- AnkiConnect Add-on: Install the AnkiConnect add-on in Anki.
⚠️ Important Dependency for Deck Descriptions: The new feature for adding automatic descriptions to Anki decks (`--anki-deck-content`) requires a specific, modified version of AnkiConnect. Please download and install it from this repository: https://github.com/voothi/20251110002755-kardenwort-ankiconnect
If you use the standard AnkiConnect add-on, all other features will work correctly, but deck descriptions will not be updated.
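For context, the importer talks to Anki through AnkiConnect's standard JSON API, served locally on port 8765. A minimal request helper might look like the sketch below; the helper names are ours, but the request shape follows AnkiConnect's documented version-6 protocol.

```python
import json
import urllib.request

def build_request(action, **params):
    """Serialize one AnkiConnect request (API version 6)."""
    return json.dumps({"action": action, "version": 6, "params": params})

def anki_connect(action, **params):
    """Send a request to a locally running AnkiConnect add-on and return its result."""
    payload = build_request(action, **params).encode("utf-8")
    with urllib.request.urlopen("http://127.0.0.1:8765", payload) as resp:
        reply = json.loads(resp.read())
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]

# Example (requires Anki + AnkiConnect running):
# deck_names = anki_connect("deckNames")
```

The modified fork mentioned above adds an extra action for deck descriptions on top of this same protocol.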
- Clone the Repositories: Clone all three projects into a common parent directory. For example, create a folder named `kardenwort-ecosystem` and clone the repositories inside it.

  ```shell
  mkdir kardenwort-ecosystem
  cd kardenwort-ecosystem
  git clone https://github.com/kardenwort/20250913122858-kardenwort.git
  git clone https://github.com/kardenwort/20250913123240-kardenwort-anki-csv-importer.git
  git clone https://github.com/kardenwort/20250913123501-kardenwort-anki-templates.git
  ```

  Your final structure will be:

  ```
  kardenwort-ecosystem/
  ├── 20250913122858-kardenwort/
  ├── 20250913123240-kardenwort-anki-csv-importer/
  └── 20250913123501-kardenwort-anki-templates/
  ```
- Import the Anki Template: In the `20250913123501-kardenwort-anki-templates` project, navigate to the `decks-for-first-initialize-templates` directory. Choose the latest version folder (e.g., `v1.0.0`), select one of the `.apkg` deck files inside, and import it into Anki Desktop. This will automatically add and configure the required note type.
- Set up a Shared Python Environment: We will create a single virtual environment one level above the project folders. This keeps the project directories clean and allows all scripts to use the same set of installed packages.

  ```shell
  # First, navigate into the main project directory
  cd 20250913122858-kardenwort

  # Create the virtual environment in the parent directory (../)
  python -m venv ../20250914043440-kardenwort-spacy-env

  # Activate it
  ../20250914043440-kardenwort-spacy-env/Scripts/Activate.ps1   # Windows (PowerShell)
  # source ../20250914043440-kardenwort-spacy-env/bin/activate  # macOS/Linux

  # Now that the environment is active, install dependencies from the requirements file
  pip install -r requirements.txt

  # Download spaCy language models
  python -m spacy download en_core_web_lg
  python -m spacy download de_core_news_lg
  ```
- Configure Kardenwort:
  - While still inside the `20250913122858-kardenwort` directory, copy `config.ini.template` to `config.ini`.
  - Open `config.ini` and verify the paths under `[environment]`. The default relative paths are designed for this structure and should work without changes.
- Run a Test:
  - Add some German text to `source_texts/text1.txt`.
  - Ensure Anki is running with the modified AnkiConnect add-on.
  - From the root of the `20250913122858-kardenwort` project, execute the runner script. Important: Your virtual environment must be active.

  ```shell
  # This creates vocabulary (word) cards from a single German text file
  python src/kardenwort/core/kardenwort_runner.py --type word --mode single --language de
  ```

  If successful, a new deck will appear in Anki. Your setup is complete.
The primary way to use the utility is via the `kardenwort_runner.py` script, which automates the entire process of text analysis and Anki import.
For a comprehensive and up-to-date list of command-line examples for various scenarios, please refer to the configuration file:
docs/kardenwort-goldendict-config.txt
Examples:

```shell
# Create German vocabulary cards from text1.txt and text2.txt with compound splitting
python src/kardenwort/core/kardenwort_runner.py --type word --mode dual --language de --de-gcs

# Create English sentence cards from text1.txt and text2.txt
python src/kardenwort/core/kardenwort_runner.py --type sentence --mode dual --language en

# Process a single string of text directly, suspend new cards
python src/kardenwort/core/kardenwort_runner.py --type word --mode single --language de --text "Das ist ein Test." --suspend-cards

# NEW: Process a markdown file in a single pass, creating both sentence and word cards in a
# shared hierarchical deck, and add the source text to the parent deck's description.
python src/kardenwort/core/kardenwort_runner.py --mode mixed-triple --language de --anki-markdown-decks --anki-deck-content parent-source --suspend-cards
```

For Windows users, we provide a collection of ready-to-use batch scripts (`.cmd`) that cover all common processing scenarios. You can find them in the `scripts/run/cmd/` directory (e.g., `kardenwort_run_de_ws_t3_s_anki_v3.cmd`).
These scripts offer a convenient way to run the tool without typing out all the arguments. However, they come with a significant limitation.
Please be aware that these `.cmd` scripts have a limitation when used for on-the-fly text processing: they can only handle a single line of input. This restriction applies when text is passed directly via the `--text` argument or from standard input (stdin), which is a common method for integration with tools like GoldenDict.

To process multi-line text in GoldenDict, you must bypass these convenient `.cmd` scripts. The correct approach is to configure GoldenDict to call the `kardenwort_runner.py` script directly, using the `--multi-text` flag. You can find the correct commands for this in the provided configuration file: `docs/kardenwort-goldendict-config.txt`.
Create vocabulary lists or Anki cards instantly from any word or phrase you look up in GoldenDict. This is a powerful workflow for on-the-fly analysis.
You can configure multiple "program" dictionaries in GoldenDict to run Kardenwort with different settings. For example, for German, you could have three modes:
- Simple (S): Fast analysis without compound splitting.
- Medium (M): Analysis with compound splitting for common word types.
- Large (L): Deepest analysis, splitting compounds for almost all word types.
For detailed instructions and ready-to-use command-line examples, see the configuration file:
docs/kardenwort-goldendict-config.txt
The utility's primary goal is to extract material from text to create two types of cards, determined by the `--type` parameter:

- `--type word` (Vocabulary Cards):
  - Goal: To create cards for studying individual words.
  - Mechanism: The script analyzes the entire input text, extracts unique words based on the chosen deduplication scope, reduces them to their base form (lemma), and creates a separate row for each unique lemma.
  - Specialty: This mode includes advanced logic such as German compound splitting (GCS) and handling of separable verbs.
- `--type sentence` (Sentence Cards):
  - Goal: To create cards with full sentences for studying phrases and grammar in context.
  - Mechanism: The script processes input files line by line. For each content line from the first file, one record is created. If parallel texts are provided, the corresponding lines are added to the same record.
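The separable-verb handling mentioned for `word` mode can be pictured with a small sketch. spaCy's German models mark a separated prefix with the dependency label `svp`; the helper below is our illustration, operating on pre-extracted token attributes rather than a live spaCy `Doc`, and is not the tool's actual code.

```python
def rejoin_separable(tokens):
    """Rejoin separated German verb prefixes with their verbs.

    `tokens` is a list of dicts with 'text', 'lemma', 'dep', and 'head_lemma',
    mirroring what one would read off a parsed spaCy Doc. For a sentence like
    "Er faehrt morgen ab.", the prefix "ab" (dep "svp") attaches to "fahren",
    yielding the lemma "abfahren".
    """
    lemmas = []
    for t in tokens:
        if t["dep"] == "svp":
            # Prefix + lemma of the head verb, e.g. "ab" + "fahren" -> "abfahren".
            lemmas.append(t["text"].lower() + t["head_lemma"])
        else:
            lemmas.append(t["lemma"])
    return lemmas
```

A real pipeline would additionally suppress the bare verb lemma once the rejoined form is emitted; that bookkeeping is omitted here for brevity.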
The result of the script's execution is a TSV file and an optional companion JSON metadata file (for deck descriptions), ready for import into Anki.
How the utility receives and interprets input data is key to its effective use.
- Ways to Provide Data: You can provide text via a command-line string (`--text "..."`), a file path (`--text1-file ...`), an environment variable, or piped through standard input.
- File Format: Input files must be plain text (`.txt`) with UTF-8 encoding. For parallel texts, line-by-line correspondence is crucial.
This is a critical feature. The utility automatically chooses how to split text into "processing units":
- Line-by-Line Mode: If the input text contains at least one newline character (`\n`), each line is treated as a separate, complete unit. This is ideal for subtitles or pre-formatted parallel texts.
- Sentence Tokenization Mode: If the input text is a single block without newlines, `spaCy`'s sentence tokenizer is used to grammatically split it into sentences. This is perfect for prose from articles or books.
This mechanism directly determines what you will see as `SentenceSource` and context on your Anki card.
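The selection rule above can be sketched in a few lines. To keep the example self-contained, spaCy's sentence tokenizer is passed in as a plain callable; the function name is ours, not the tool's.

```python
def split_into_units(text, sentencize):
    """Choose processing units: lines if any newline is present, otherwise sentences.

    `sentencize` stands in for spaCy's sentence tokenizer
    (e.g. lambda t: [s.text for s in nlp(t).sents]).
    """
    if "\n" in text:
        # Line-by-line mode: each non-empty line is one complete unit.
        return [line for line in text.splitlines() if line.strip()]
    # Sentence tokenization mode: grammatical splitting of a single block.
    return sentencize(text)
```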
Using the `--multi-text` flag, you can provide up to three parallel texts (source, translation 1, translation 2) from a single source such as `--text` or standard input. Simply separate the texts with `---`. This is especially useful for integration with tools like GoldenDict.
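The splitting itself amounts to partitioning the input on the `---` separator; a minimal sketch (the function name is ours, and how the real tool treats surplus segments is an assumption):

```python
def parse_multi_text(raw):
    """Split one input blob into (source, translation1, translation2) on '---'.

    Missing parts come back as empty strings; parts beyond three are ignored.
    """
    parts = [p.strip() for p in raw.split("---")]
    parts = (parts + ["", ""])[:3]
    return tuple(parts)
```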
```shell
# Example with multi-text
echo "Source text. --- First translation. --- Second translation." | python ... --multi-text
```

- Initialization: The script loads the `spaCy` model, the GCS dictionary, the user-defined `lemma_override.tsv`, and a word frequency index.
- Text Ingestion: Input text is read from a file, argument, environment variable, or stdin.
- Tokenization & Lemmatization: The text is broken into words (tokens). Each token undergoes a series of steps: GCS, separable verb handling, lemma correction, and application of user override rules.
- Collection & Sorting: Unique lemmas are collected based on the deduplication scope and sorted. Known words (from the frequency index) are listed first, followed by unknown words.
- TSV & JSON Generation: A structured TSV file is created. If `--anki-deck-content` is used, a companion `.json` file containing deck descriptions is also generated.
- Anki Import: The runner script passes the TSV and JSON files to the `kardenwort-anki-csv-importer`, which creates/updates decks and cards in Anki.
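The "known words first" ordering from the Collection & Sorting step can be sketched as follows, assuming the frequency index maps each lemma to its rank (lower rank = more frequent). This is an illustration of the described behavior, not the tool's actual code.

```python
def sort_lemmas(lemmas, frequency_rank):
    """Order lemmas for output: known words first, unknown words after.

    Known lemmas (present in the frequency index) are sorted by rank;
    unknown lemmas follow, alphabetically.
    """
    known = sorted((l for l in lemmas if l in frequency_rank), key=frequency_rank.get)
    unknown = sorted(l for l in lemmas if l not in frequency_rank)
    return known + unknown
```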
The generated TSV files are designed for our feature-rich Anki template, which organizes the information into a clean and interactive layout.
Template Features:
- Interactive Collapsible Sections: Keep cards uncluttered by hiding and revealing information groups.
- Dynamic Fields: Fields only appear if they contain data. The 82-column TSV format includes special fields like `SentenceSourceIndex` for chronological sorting and `Deck` for dynamic, hierarchical deck assignment.
- Context Display: Shows the word in its original sentence, plus the preceding and succeeding sentences.
- Full Word List: Displays all unique words (lemmas) found in the source sentence.
Below is a detailed list of all available arguments for the core processing script (`kardenwort.py`) and its runner (`kardenwort_runner.py`).
| Argument | Description | Example |
|---|---|---|
| `--type` | The type of cards to create (`word` or `sentence`). Not needed for `mixed-triple` mode. | `--type word` |
| `--lemmas-per-line` | A special mode that outputs one line of sorted lemmas per input line. Mutually exclusive with `--type`. | `--lemmas-per-line` |
| `--language` | The source language of the text (`de` or `en`). | `--language de` |
| `--mode` | (Runner only) Processing mode (`single`, `dual`, `triple`, `mixed-triple`). `mixed-triple` runs sentence and word modes sequentially for a shared deck. | `--mode mixed-triple` |
| `--anki-csv-header` | (Runner only) JSON list of Anki field names. Overrides `[anki_fields]` from `config.ini`. | `--anki-csv-header '["FieldA", "FieldB"]'` |
| `--anki-field-mapping` | (Runner only) JSON object mapping Anki fields to data sources. Overrides `[anki_field_mapping.*]` from `config.ini`. | `--anki-field-mapping '{"FieldA": "lemma"}'` |
| Argument | Description | Example |
|---|---|---|
| `--text` | Process a string directly. Mutually exclusive with `--text1-file`. | `--text "This is a test."` |
| `--multi-text` | Parse `--text` or stdin as up to three texts separated by `---`. | `--multi-text` |
| `--text1-file` | Path to the primary source text file. | `--text1-file "source.txt"` |
| `--text2-file` | Path to the second text file (e.g., translation). | `--text2-file "target.txt"` |
| `--text3-file` | Path to the third text file. | `--text3-file "extra.txt"` |
| `--output-file` | Path for the output `.tsv` file. If omitted, prints to standard output. | `--output-file "out/my_deck.tsv"` |
| `--basename-add-timestamp` | Prepend a `YYYYMMDDHHMMSS-` timestamp to the output filename. | `--basename-add-timestamp` |
| `--basename-add-first-words` | Append a slug built from the first N words (default: 4) to the filename. | `--basename-add-first-words 3` |
| `--stdout-print-output-basename` | Print the final output filename to standard output. | `--stdout-print-output-basename` |
| Argument | Description | Example |
|---|---|---|
| `--anki-create-subdecks` | Generates a parent deck with a subdeck for each mode (e.g., `My-Text::My-Text.word.de`). | `--anki-create-subdecks` |
| `--anki-markdown-decks` | Parses Markdown headers in the source text to create a hierarchical deck structure. | `--anki-markdown-decks` |
| `--anki-sentence-subdecks` | Creates a final subdeck level for each sentence. Requires `--anki-markdown-decks`. | `--anki-sentence-subdecks` |
| `--anki-parent-deck` | Manually specifies a parent deck name for shared deck creation. | `--anki-parent-deck "My-Book"` |
| `--anki-deck-content` | Populates Anki deck descriptions. Choices: `parent-source`, `parent-translations`, `subdeck-source`, `subdeck-translations`. | `--anki-deck-content parent-source` |
| `--strip-headers` | Strip Markdown headers from text fields in the final output. Choices: `all`, `source`, `translations`. Defaults to `all` if no argument is given. | `--strip-headers source` |
| `--suspend-cards` | Suspends all newly imported or updated cards in Anki. | `--suspend-cards` |
| Argument | Description | Example |
|---|---|---|
| `--sentence-context-size` | Sets the number of preceding and succeeding sentences (N) to include as context. Runner default is 4. | `--sentence-context-size 2` |
| `--tts-destination-lang` | The destination language for TTS field activation (e.g., `ru`, `en`). | `--tts-destination-lang ru` |
| `--add-wordlist-col` | (Auto-enabled) Include a list of unique words in `SentenceSourceWordlist`. Driven by mapping. | `--add-wordlist-col` |
| `--wordlist-use-br` | Use `<br>` tags for the wordlist. Can be set in `config.ini` `[output_format]`. | `--wordlist-use-br` |
| `--add-header` | Include a TSV header row. Can be set in `config.ini` `[output_format]`. | `--add-header` |
| `--add-source-word-col` | (Auto-enabled) Add the inflected word to `WordSourceInflectedForm`. Driven by mapping. | `--add-source-word-col` |
| `--add-sentence-index-col` | (Auto-enabled) Add an index for sorting to `SentenceSourceIndex`. Driven by mapping. | `--add-sentence-index-col` |
| Argument | Description | Example |
|---|---|---|
| `--lemma-override-file` | Path to a TSV file for context-aware lemma overrides. | `--lemma-override-file "data/overrides.tsv"` |
| `--lemma-index-file` | Path to a word frequency CSV file for sorting. | `--lemma-index-file "data/frequency.csv"` |
| `--deduplication-scope` | Sets the scope for lemma deduplication. `global`: unique lemmas across the entire text. `sentence`: unique per sentence. `none`: no deduplication. | `--deduplication-scope sentence` |
| `--prefer-shortest-form` | With global deduplication, prefer the shortest word form of a lemma instead of the first one encountered. | `--prefer-shortest-form` |
| `--force-proper-noun-capitalization` | Force capitalization of proper noun lemmas (PROPN). | `--force-proper-noun-capitalization` |
| Argument | Description | Example |
|---|---|---|
| `--de-gcs` | Enable German Compound Splitting. | `--de-gcs` |
| `--de-dictionary-file` | Path to the dictionary file used by GCS for validation. | `--de-dictionary-file "data/de/german.dic"` |
| `--de-gcs-preserve-compound-word` | Include the original compound word in the card list along with its split parts. | `--de-gcs-preserve-compound-word` |
| `--de-gcs-add-parts-to-wordlist` | Also add the split components to the `SentenceSourceWordlist` field. | `--de-gcs-add-parts-to-wordlist` |
| `--de-gcs-split-mode` | Set splitting mode: `only-nouns` (safe), `any` (aggressive), or `combined`. | `--de-gcs-split-mode combined` |
| `--de-gcs-pos-tags` | Specify which part-of-speech tags to apply splitting to (e.g., `NOUN PROPN` or `!VERB`). | `--de-gcs-pos-tags "NOUN PROPN"` |
| `--de-fix-genitive` | Attempts to correct German genitive noun lemmas (e.g., 'Hauses' -> 'Haus'). | `--de-fix-genitive` |
| `--de-force-noun-capitalization` | Force capitalization of all German noun lemmas (NOUN, PROPN). | `--de-force-noun-capitalization` |
| Argument | Description | Example |
|---|---|---|
| `--show-success-message` | Display a user-friendly success message on standard output upon completion. | `--show-success-message` |
| `--play-sound-on-completion` | Play a system beep upon successful completion of the entire process. | `--play-sound-on-completion` |
These flags are for direct console output when `--output-file` is not used.
| Argument | Description | Example |
|---|---|---|
| `--stdout-format` | Format for console output: `list`, `context`, `tsv`, `html`. | `--stdout-format html` |
The behavior of the `kardenwort_runner.py` script is controlled by `config.ini`.
- Copy `config.ini.template` to `config.ini`.
- Open `config.ini` and edit the paths under the `[environment]` section to match your system's setup:
  - `python_executable`: Path to the Python executable inside your virtual environment.
  - `kardenwort_workspace`: Path to this project's root folder.
  - `importer_workspace`: Path to the `kardenwort-anki-csv-importer` project folder.
Relative paths are supported and are calculated from the location of the config.ini file, making the setup portable.
Kardenwort follows a strict hierarchy for resolving settings:
1. Command-Line Arguments: Any argument passed directly to `kardenwort_runner.py` or `kardenwort.py` takes the highest priority. This allows you to override global defaults for specific runs.
2. `config.ini` Settings: If an argument is not provided via the CLI, the script falls back to the values defined in your configuration file.
3. Internal Defaults: If neither a CLI argument nor a config setting is present, the script uses safe, built-in defaults.
Note
Since version 2.0.0, output formatting options like `--wordlist-use-br` and `--add-header` should primarily be managed in the `[output_format]` section of `config.ini` for a cleaner CLI experience.
Kardenwort uses a configuration-driven system to map linguistic analysis results to your specific Anki Note Type. This allows you to use any Note Type without modifying the source code.
In the `[anki_fields]` section of `config.ini`, list the fields of your Anki Note Type in the exact order they appear:
```ini
[anki_fields]
Quotation
WordSource
SentenceSource
SentenceSourceWordlist
...
```

Tip
You no longer need to number your fields (e.g., `1 = Quotation`). A simple list is preferred; the system automatically calculates indices based on the line order.
Use the `[anki_field_mapping.word]` and `[anki_field_mapping.sentence]` sections to assign internal data to these fields.
```ini
[anki_field_mapping.word]
WordSource = lemma
Quotation = source_word
SentenceSource = source_sentence
```

| Key | Description | Mode |
|---|---|---|
| `lemma` | The base form of the word (lemmatized). | Word |
| `source_word` | The original inflected word from the text. | Word |
| `source_sentence` | The current sentence/unit being processed. | Both |
| `source_context_left` | Preceding context sentence(s). | Both |
| `source_context_right` | Succeeding context sentence(s). | Both |
| `target_sentence` | Primary translation of the source sentence. | Both |
| `target_context_left` | Preceding translation context. | Both |
| `target_context_right` | Succeeding translation context. | Both |
| `tertiary_sentence` | Tertiary translation (if available). | Both |
| `tertiary_context_left` | Preceding tertiary translation context. | Both |
| `tertiary_context_right` | Succeeding tertiary translation context. | Both |
| `cloze` | The source sentence, intended for cloze deletion. | Both |
| `wordlist` | A list of all unique lemmas found in the sentence. | Both |
| `sentence_index` | The serial index of the sentence (e.g., `000001`). | Both |
| `deck_name` | The final computed Anki deck name. | Both |
| `tts_source_[lang]` | TTS flag (e.g., `tts_source_de`), set to "1" on match. | Both |
| `tts_dest_[lang]` | TTS flag (e.g., `tts_dest_en`), set to "1" on match. | Both |
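Conceptually, generating one TSV row is just a lookup: for each Anki field (in `[anki_fields]` order), emit the mapped internal value, or an empty column if the field is unmapped. A minimal sketch of that idea (ours, not the tool's code):

```python
def build_row(fields, mapping, data):
    """Produce one tab-separated row for the configured Anki Note Type.

    `fields`  -- Anki field names in [anki_fields] order.
    `mapping` -- Anki field name -> internal data key (e.g. "lemma").
    `data`    -- internal values computed for the current word/sentence.
    Unmapped fields become empty columns, so column order always matches
    the Note Type.
    """
    return "\t".join(str(data.get(mapping.get(f, ""), "")) for f in fields)
```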
- TSV File Persistence: The generated TSV export files in the `results/` directory are not automatically deleted or rotated. You can use them for your own analysis or manually re-import them into Anki at any time.
- Data Privacy: This utility is designed for offline use. Your text data is processed locally and is not sent to any external servers by this program. However, be aware that if you use Anki's synchronization feature, your card data will be stored on Anki's servers.
Kardenwort is a suite of integrated tools designed to work together seamlessly:
- Kardenwort (Core Engine):
- The core intelligent engine for text processing and vocabulary extraction.
- 20250913122858-kardenwort
- Anki CSV Importer:
- The bridge that automatically imports the generated vocabulary files into Anki.
- 20250913123240-kardenwort-anki-csv-importer
- Anki Templates:
- The powerful and feature-rich Anki card template that brings your vocabulary to life.
- 20250913123501-kardenwort-anki-templates
- AnkiConnect Fork:
- This is a custom build of the official AnkiConnect add-on, extended with a new API action to enable deeper integration with external tools, specifically the Kardenwort language learning ecosystem.
- 20251110002755-kardenwort-ankiconnect
If you need the latest updates, want to access intermediate versions, or wish to explore the development history and feature branches, please refer to our dedicated development repositories where active development takes place.
Every two weeks, the code is cleanly transferred from these development repos to the main public repositories. A new stable build is then created and tagged with a common version number across all related projects.
The project uses pytest for all testing. The test suite is organized into three distinct tiers:
- `tests/01_smoke/`: Extremely fast, high-level sanity checks to ensure the CLI boots and basic string extractions work without fatal errors.
- `tests/02_unit/`: Granular tests targeting isolated functions, particularly core lexical logic (`kardenwort.py`) and command-line configurations (`kardenwort_runner.py`).
- `tests/03_integration/`: End-to-end tests that process full parallel text files dynamically discovered from the `tests/cases/*` directory. These tests physically generate TSV outputs and perform deep verification of field order, frequency-based sorting, and content matches against reference files.
Tip
Fastest First Logic: The test directories are prefixed with numbers (01_, 02_, 03_) to ensure pytest executes the fastest tests first. This "Fail Fast" approach ensures you catch basic errors in seconds before waiting for the heavy integration analysis.
Commands: Ensure your virtual environment is active before running tests.
```shell
# Run ALL tests (smoke, unit, and integration) using the dedicated virtual environment
U:\voothi\20250825231214-spacy-env\Scripts\python.exe -m pytest tests/ -v

# Run only a specific suite (e.g., unit tests)
U:\voothi\20250825231214-spacy-env\Scripts\python.exe -m pytest tests/02_unit/ -v

# Run tests and generate a code coverage report for the source code
U:\voothi\20250825231214-spacy-env\Scripts\python.exe -m pytest tests/ -v --cov=src --cov-report=term-missing
```

For those who want the latest features, bug fixes, or wish to explore the development history, we maintain a set of active development repositories. Code is periodically merged from these repos into the stable public ones listed above.
- Kardenwort (Core Engine):
- Anki CSV Importer:
- Anki Templates:
- AnkiConnect Fork:
This project was born from my own struggle and eventual success in learning German. With a background in IT and software development, I approached language learning as an engineering problem. This tool is the result of years of refinement, built to solve the real-world problems I faced. My goal is to make a powerful, simple, and reliable tool that can help others on their own language learning journeys. My native languages are Russian and Ukrainian, and I am passionate about creating tools that can help bridge cultural and linguistic divides.
This project is part of the Kardenwort environment, designed to create a focused and efficient learning ecosystem.
This project was created by and is maintained by Denis Novikov (voothi).
It is licensed under the MIT License. See the LICENSE file for details.
This project relies on excellent open-source libraries, including `spaCy` and `german-compound-splitter`.