Belval
diff --git a/.codecov.yml b/.codecov.yml
Lines changed: 1 addition & 1 deletion
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/.travis.yml b/.travis.yml
Lines changed: 1 addition & 0 deletions
diff --git a/Dockerfile b/Dockerfile
Lines changed: 1 addition & 1 deletion
diff --git a/MANIFEST.in b/MANIFEST.in
Lines changed: 7 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 43 additions & 24 deletions
diff --git a/requirements-hw.txt b/requirements-hw.txt
Lines changed: 4 additions & 9 deletions
diff --git a/requirements.txt b/requirements.txt
Lines changed: 1 addition & 6 deletions
diff --git a/setup.cfg b/setup.cfg
Lines changed: 2 additions & 0 deletions
diff --git a/setup.py b/setup.py
Lines changed: 52 additions & 0 deletions
@@ -1,3 +1,3 @@
 ignore:
-  - "TextRecognitionDataGenerator/run.py"
+  - "trdg/run.py"
   - "tests.py"
@@ -99,3 +99,5 @@ ENV/
 
 # mypy
 .mypy_cache/
+
+.vscode/*
@@ -4,6 +4,7 @@ python:
 install:
   - pip install -r requirements-hw.txt
   - pip install codecov
+  - python3 setup.py install
 script:
   - python3 tests.py
   - coverage run tests.py
 
@@ -26,4 +26,4 @@ COPY requirements.txt /app/
 
 RUN pip3 install -r requirements.txt
 
-COPY TextRecognitionDataGenerator/ /app
+COPY trdg/ /app
@@ -0,0 +1,7 @@
+include README.md
+include LICENSE
+include trdg/fonts/latin/*
+include trdg/fonts/cn/*
+include trdg/pictures/*
+include trdg/dicts/*
+include trdg/texts/*
@@ -1,4 +1,4 @@
-# TextRecognitionDataGenerator [![TravisCI](https://travis-ci.org/Belval/TextRecognitionDataGenerator.svg?branch=master)](https://travis-ci.org/Belval/TextRecognitionDataGenerator) [![codecov](https://codecov.io/gh/Belval/TextRecognitionDataGenerator/branch/master/graph/badge.svg)](https://codecov.io/gh/Belval/TextRecognitionDataGenerator) [![Documentation Status](https://readthedocs.org/projects/textrecognitiondatagenerator/badge/?version=latest)](https://textrecognitiondatagenerator.readthedocs.io/en/latest/?badge=latest)
+# TextRecognitionDataGenerator [![TravisCI](https://travis-ci.org/Belval/TextRecognitionDataGenerator.svg?branch=master)](https://travis-ci.org/Belval/TextRecognitionDataGenerator) [![PyPI version](https://badge.fury.io/py/TextRecognitionDataGenerator.svg)](https://badge.fury.io/py/TextRecognitionDataGenerator) [![codecov](https://codecov.io/gh/Belval/TextRecognitionDataGenerator/branch/master/graph/badge.svg)](https://codecov.io/gh/Belval/TextRecognitionDataGenerator) [![Documentation Status](https://readthedocs.org/projects/textrecognitiondatagenerator/badge/?version=latest)](https://textrecognitiondatagenerator.readthedocs.io/en/latest/?badge=latest)
 
 A synthetic data generator for text recognition
 
@@ -8,19 +8,9 @@ Generating text image samples to train an OCR software. Now supporting non-latin
 
 ## What do I need to make it work?
 
-I use Archlinux so I cannot tell if it works on Windows yet.
+Just install the pip package using `pip install trdg`. Afterwards, you can use `trdg` from the CLI. I recommend using a virtualenv instead of installing with `sudo`.
 
-```
-Python 3.X
-OpenCV 4 (Works with 3.2, probably works with 2.4)
-Pillow
-Numpy
-Requests
-BeautifulSoup
-tqdm
-```
-
- You can simply use `pip install -r requirements.txt` too.
+If you want to add another language, you can clone the repository instead. Simply run `pip install -r requirements.txt`.
 
 ## Docker image
 
@@ -35,21 +25,51 @@ docker run /output/path/:/app/out/ -t belval/trdg:latest python3 run.py [args]
 The path (`/output/path/`) must be absolute.
 
 ## New
+- Add python module
+- Move `run.py` to an executable python file ([`trdg/bin/trdg`](trdg/bin/trdg))
 - Add `--font` to use only one font for all the generated images (Thank you @JulienCoutault!)
 - Add `--fit` and `--margins` for finer layout control
 - Change the text orientation using the `-or` parameter
-- Change the space width using the `-sw` parameter
 - Specify text color range using `-tc '#000000,#FFFFFF'`, please note that the quotes are **necessary**
-- Explicit alignment when using `-al` with fixed width (0: Left, 1: Center, 2: Right)
 - Add support for Simplified and Traditional Chinese
 
 ## How does it work?
 
 Words will be randomly chosen from a dictionary of a specific language. Then an image of those words will be generated by using font, background, and modifications (skewing, blurring, etc.) as specified.
 
-### Basic
+### Basic (Python module)
+
+The usage as a Python module is very similar to the CLI, but it is more flexible if you want to include it directly in your training pipeline, and will consume less space and memory. There are 4 generators that can be used.
 
-`python run.py -w 5 -f 64`
+```py
+from trdg.generators import (
+    GeneratorFromDict,
+    GeneratorFromRandom,
+    GeneratorFromStrings,
+    GeneratorFromWikipedia,
+)
+
+# The generators use the same arguments as the CLI, only as parameters
+generator = GeneratorFromStrings(
+    ['Test1', 'Test2', 'Test3'],
+    blur=2,
+    random_blur=True
+)
+
+for img in generator:
+    pass  # Do something with the pillow images here.
+```
+
+You can see the full class definition here:
+
+- [`GeneratorFromDict`](trdg/generators/from_dict.py)
+- [`GeneratorFromRandom`](trdg/generators/from_random.py)
+- [`GeneratorFromStrings`](trdg/generators/from_strings.py)
+- [`GeneratorFromWikipedia`](trdg/generators/from_wikipedia.py)
+
+### Basic (CLI)
+
+`trdg -c 1000 -w 5 -f 64`
 
 You get 1,000 randomly generated images with random text on them like:
 
@@ -59,9 +79,11 @@ You get 1,000 randomly generated images with random text on them like:
 ![4](samples/4.jpg "4")
 ![5](samples/5.jpg "5")
 
+By default, they will be generated to `out/` in the current working directory.
+
 ### Text skewing
 
-What if you want random skewing? Add `-k` and `-rk` (`python run.py -w 5 -f 64 -k 5 -rk`)
+What if you want random skewing? Add `-k` and `-rk` (`trdg -c 1000 -w 5 -f 64 -k 5 -rk`)
 
 ![6](samples/6.jpg "6")
 ![7](samples/7.jpg "7")
@@ -114,16 +136,13 @@ It uses a Tensorflow model trained using [this excellent project](https://github
 
 The text is chosen at random in a dictionary file (that can be found in the *dicts* folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]\_[index].jpg
 
-There are a lot of parameters that you can tune to get the results you want, therefore I recommend checking out `python run.py -h` for more information.
+There are a lot of parameters that you can tune to get the results you want, therefore I recommend checking out `trdg -h` for more information.
 
 ## Create images with Chinese text
 
-It is simple! Just do `python run.py -l cn -c 1000 -w 5`!
+It is simple! Just do `trdg -l cn -c 1000 -w 5`!
 
 Generated texts come both in simplified and traditional Chinese scripts.
-You may have to edit `texts/cn.txt` to include some meaningful words instead of random glyphs.
-
-Here are examples of what I could make with it:
 
 Traditional:
 
@@ -148,7 +167,7 @@ If you want to add a new non-latin language, the amount of work is minimal.
 
 1. Create a new folder with your language [two-letters code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes)
 2. Add a .ttf font in it
-3. Edit `run.py` to add an if statement in `load_fonts()`
+3. Edit `bin/trdg` to add an if statement in `load_fonts()`
 4. Add a text file in `dicts` with the same two-letters code
 5. Run the tool as you normally would but add `-l` with your two-letters code
 
 
@@ -1,9 +1,4 @@
-beautifulsoup4==4.6.0
-numpy==1.15.1
-opencv-python==4.0.0.21
-tqdm==4.23.4
-Pillow==5.1.0
-requests==2.20.0
-tensorflow==1.13.1
-matplotlib==3.0.2
-seaborn==0.9.0
+.
+tensorflow>=1.13.1
+matplotlib>=3.0.2
+seaborn>=0.9.0
@@ -1,6 +1 @@
-beautifulsoup4==4.6.0
-numpy==1.15.1
-opencv-python==4.0.0.21
-tqdm==4.23.4
-Pillow==5.1.0
-requests==2.20.0
+.
@@ -0,0 +1,2 @@
+[metadata]
+description-file = README.md
@@ -0,0 +1,52 @@
+# Always prefer setuptools over distutils
+from setuptools import setup, find_packages
+
+# To use a consistent encoding
+from codecs import open
+from os import path
+
+here = path.abspath(path.dirname(__file__))
+
+with open(path.join(here, "README.md"), encoding="utf-8") as f:
+    long_description = f.read()
+
+setup(
+    name="trdg",
+    version="1.1.0",
+    description="TextRecognitionDataGenerator: A synthetic data generator for text recognition",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    url="https://github.com/Belval/TextRecognitionDataGenerator",
+    author="Edouard Belval",
+    author_email="edouard@belval.org",
+    # Choose your license
+    license="MIT",
+    # See https://pypi.python.org/pypi?%3Aaction=list_classifiers
+    classifiers=[
+        #   3 - Alpha
+        #   4 - Beta
+        #   5 - Production/Stable
+        "Development Status :: 3 - Alpha",
+        "Intended Audience :: Developers",
+        "License :: OSI Approved :: MIT License",
+        "Programming Language :: Python :: 2",
+        "Programming Language :: Python :: 2.7",
+        "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.4",
+        "Programming Language :: Python :: 3.5",
+        "Programming Language :: Python :: 3.6",
+        "Programming Language :: Python :: 3.7",
+    ],
+    keywords="synthetic data text-recognition training-set-generator ocr dataset fake text",
+    packages=find_packages(exclude=["contrib", "docs", "tests"]),
+    include_package_data=True,
+    install_requires=[
+        "pillow>=5.1.0",
+        "numpy>=1.15.1,<1.17",
+        "requests>=2.20.0",
+        "opencv-python>=4.0.0.21",
+        "tqdm>=4.23.0",
+        "beautifulsoup4>=4.6.0"
+    ],
+    scripts=["trdg/bin/trdg"],
+)
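
The `install_requires` list above pins NumPy to the half-open range `numpy>=1.15.1,<1.17`. As a rough illustration of what that range admits, here is a minimal sketch that compares plain dotted release numbers as integer tuples. The helper names `parse_version` and `satisfies_numpy_pin` are hypothetical and not part of this change, and the comparison is a simplification: pip resolves specifiers under the full PEP 440 rules (pre-releases, post-releases, and so on), which this sketch does not attempt to cover.

```python
def parse_version(text):
    """Turn a dotted release string such as '1.16.4' into a tuple of ints.

    Simplifying assumption: only plain numeric releases are handled.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies_numpy_pin(installed):
    """Return True if `installed` lies in the range [1.15.1, 1.17)."""
    version = parse_version(installed)
    lower = parse_version("1.15.1")  # inclusive lower bound from setup.py
    upper = parse_version("1.17")    # exclusive upper bound from setup.py
    return lower <= version < upper


if __name__ == "__main__":
    for candidate in ("1.15.0", "1.15.1", "1.16.4", "1.17.0"):
        print(candidate, satisfies_numpy_pin(candidate))
```

Under these assumptions every 1.16.x release is accepted while 1.17.0 and later are rejected, which is the practical effect of the upper bound.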