pocr — snip & OCR with PaddleOCR

Snip a region of the screen, OCR it with PaddleOCR, copy the result to the clipboard. Two modes:

command	output
`pocr-text`	recognized text (English, line-separated)
`pocr-math`	LaTeX source of a math formula (`PP-FormulaNet`)

Both fire a desktop notification with a short preview when done. Works on X11 (maim + xclip) and Wayland (grim + slurp + wl-copy); the right tools are picked at runtime.

Requirements

uv
One screenshot + clipboard pair, depending on session:
- X11: maim, xclip, libnotify (notify-send)
- Wayland: grim, slurp, wl-clipboard, libnotify

Install — Linux (Ubuntu / Fedora / Arch / …)

./install.sh

Runs uv sync and symlinks pocr-text / pocr-math into ~/.local/bin/. After that the commands are on your PATH like any other binary.

Install — NixOS

A proper Nix package via uv2nix: every Python dep is built into /nix/store, autoPatchelfHook fixes the prebuilt PaddlePaddle / OpenCV wheels so libGL & friends resolve, and the wrapper scripts have the screenshot/clipboard tools on PATH.

Via your NixOS flake (recommended)

Add the flake to your inputs:

inputs.pocr = {
  url = "github:DanyGoT/pocr";
  inputs.nixpkgs.follows = "nixpkgs";
};

Pass it into your modules (via specialArgs on your nixosSystem call) and add the package:

{ pkgs, pocr, ... }: {
  environment.systemPackages = [
    pocr.packages.${pkgs.system}.default
  ];
}

Then sudo nixos-rebuild switch — pocr-text and pocr-math are on PATH.

Ad-hoc

nix run github:DanyGoT/pocr#pocr-text
nix run github:DanyGoT/pocr#pocr-math

Development

Editable, uv-managed venv inside a Nix-provided shell:

nix develop
uv sync

Usage

pocr-text          # draw a rectangle, get text
pocr-math          # draw a rectangle around a formula, get LaTeX

pocr-text --file img.png        # OCR a file, skip the snip
pocr-text --print               # also echo result to stdout

The first call to either command downloads its PaddleOCR model into ~/.paddlex/official_models/ (a few hundred MB; one-time).

Daemon

The client commands talk to a long-lived pocrd process over a UNIX socket at $XDG_RUNTIME_DIR/pocrd.sock. The daemon holds the PaddleOCR engines warm so consecutive snips skip the paddle bootstrap and stay around 500 ms each.

pocrd is auto-started on demand the first time you run pocr-text or pocr-math after boot — no service to manage. Latency:

call	wall-clock
first `pocr-text` (daemon cold)	~4–5 s
every subsequent `pocr-text`	~500–700 ms
first `pocr-math` (formula engine)	~5–6 s
every subsequent `pocr-math`	~700 ms–1 s

The daemon shuts itself down after 2 minutes of idle (no incoming requests), releasing PaddleOCR's ~1.6 GB resident set. The next snip auto-respawns it (~5 s cold). To change the timeout:

export POCR_IDLE_TIMEOUT=600   # ten minutes
export POCR_IDLE_TIMEOUT=0     # disable; daemon stays warm forever

The daemon logs to ~/.cache/pocr/daemon.log. To restart it manually:

pkill pocrd      # client will spawn a fresh one on the next snip

To run the daemon in the foreground (for debugging):

pocrd            # blocks; ctrl-c to stop

Keyboard shortcuts

After install.sh, pocr-text / pocr-math are on PATH and bind like any other command.

i3

bindsym $mod+Shift+s exec --no-startup-id pocr-text
bindsym $mod+Shift+m exec --no-startup-id pocr-math

sxhkd

super + shift + s
    pocr-text

super + shift + m
    pocr-math

Hyprland

bind = SUPER SHIFT, S, exec, pocr-text
bind = SUPER SHIFT, M, exec, pocr-math

GNOME / KDE

Add a custom shortcut in Settings → Keyboard → Shortcuts pointing at pocr-text / pocr-math.

Tuning

Text language. Edit pocr/ocr.py, change lang="en" to e.g. "es", "ch", "fr". Full list in the PaddleOCR docs.
Formula model. Edit pocr/ocr.py, change model_name="PP-FormulaNet_plus-M" to "PP-FormulaNet-S" for a quarter the size and faster cold start (some accuracy lost), or "UniMERNet" for Chinese + older documents.
GPU. Change device="cpu" to "gpu" in pocr/ocr.py if you have CUDA-enabled PaddlePaddle installed.

Paddle-side workarounds (universal, not NixOS-specific)

A few one-liners in the code paper over known PaddlePaddle wheel bugs that hit every Linux user, not just NixOS — each is in one place with a comment:

pocr/ocr.py sets FLAGS_use_mkldnn=0 + enable_mkldnn=False because PaddlePaddle 3.x's PIR executor crashes on OneDNN ops in CPU inference.
pocr/ocr.py sets PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK=True so first-run doesn't hang for ~30s probing model hosters we can't reach.
pocr/cli.py exits via os._exit because paddle's C++ thread pools aren't daemon threads and would otherwise keep the interpreter alive long after the OCR result is already on the clipboard.

Remove any of these once the corresponding upstream issue is fixed.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
pocr		pocr
.gitignore		.gitignore
README.md		README.md
flake.lock		flake.lock
flake.nix		flake.nix
install.sh		install.sh
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pocr — snip & OCR with PaddleOCR

Requirements

Install — Linux (Ubuntu / Fedora / Arch / …)

Install — NixOS

Via your NixOS flake (recommended)

Ad-hoc

Development

Usage

Daemon

Keyboard shortcuts

i3

sxhkd

Hyprland

GNOME / KDE

Tuning

Paddle-side workarounds (universal, not NixOS-specific)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

pocr — snip & OCR with PaddleOCR

Requirements

Install — Linux (Ubuntu / Fedora / Arch / …)

Install — NixOS

Via your NixOS flake (recommended)

Ad-hoc

Development

Usage

Daemon

Keyboard shortcuts

i3

sxhkd

Hyprland

GNOME / KDE

Tuning

Paddle-side workarounds (universal, not NixOS-specific)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages