idna — Internationalized Domain Names encoder/decoder

Version: 3.11 (`__version__`)
Type: Pure Python (with ~13 KLoC of embedded Unicode tables)
SPM target: Bundled in the Python framework (pulled in by requests)
Total Python modules: 8

Lets you handle hostnames with non-ASCII characters (münchen.de, 中国.中国, ดอทคอม.ไทย) by converting them to their ASCII-compatible form, in which each non-ASCII label is Punycode-encoded and prefixed with xn-- (xn--mnchen-3ya.de).
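To make the wire format concrete, here is a minimal sketch that builds the ASCII form of a label by hand with Python's built-in `punycode` codec. It only illustrates the encoding. Unlike `idna.encode`, it performs no IDNA validation, and `naive_alabel` is a hypothetical helper name.

```python
def naive_alabel(label: str) -> str:
    """Hypothetical helper: ASCII form of one label, with NO IDNA validation."""
    if label.isascii():
        return label  # ASCII labels go on the wire unchanged
    # Punycode-encode the label and prepend the ACE prefix "xn--".
    return "xn--" + label.encode("punycode").decode("ascii")


print(naive_alabel("münchen"))  # xn--mnchen-3ya
print(".".join(naive_alabel(part) for part in "münchen.de".split(".")))
```

In real code, prefer `idna.encode`, which also rejects labels that IDNA does not allow.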

requests / urllib3 call this for you when you pass a Unicode URL. You'd call it directly to validate input, normalize stored hostnames, or implement a custom URL parser.

Modules

Module	What it does
`idna.__init__`	Public API: `encode`, `decode`, `IDNAError`, `IDNABidiError`, `InvalidCodepoint`, `InvalidCodepointContext`, `alabel`, `ulabel`, `check_bidi`, `check_hyphen_ok`, `check_initial_combiner`, `check_label`, `check_nfc`, `uts46_remap`, `valid_contextj`, `valid_contexto`, `valid_label_length`, `valid_string_length`, `intranges_contain`
`idna.core`	The encoder/decoder algorithm: IDNA 2008 (RFC 5891) plus optional UTS #46 mapping (437 lines)
`idna.codec`	Python codec registration: after `import idna.codec`, `"name".encode("idna2008")` works (122 lines)
`idna.compat`	Legacy IDNA 2003-style wrappers (`ToASCII`, `ToUnicode`), kept for compatibility (15 lines)
`idna.idnadata`	Unicode tables: IDNA 2008 codepoint classes, joining types, script ranges (4309 lines, ~1.5 MB)
`idna.intranges`	Range-based table compression / lookup (57 lines)
`idna.package_data`	`__version__ = "3.11"`
`idna.uts46data`	UTS #46 case-mapping + character-status table (8841 lines, ~3 MB)

Quick start

import idna

# Encode → ASCII Punycode (for putting on the wire)
print(idna.encode("münchen.de"))          # b'xn--mnchen-3ya.de'
print(idna.encode("ドメイン.テスト"))      # b'xn--eckwd4c7c.xn--zckzah'

# Decode → human-readable Unicode
print(idna.decode("xn--mnchen-3ya.de"))   # 'münchen.de'

# Validate a label (single hostname segment, e.g. "münchen")
try:
    idna.check_label("test_label")        # underscores are not valid in IDNA labels
except idna.IDNAError as e:
    print(f"invalid: {e}")

# Codec registration: `import idna.codec` registers the "idna2008" codec
# (the plain "idna" codec is Python's built-in IDNA 2003 implementation)
import idna.codec
print("münchen".encode("idna2008"))           # b'xn--mnchen-3ya'
print(b"xn--mnchen-3ya".decode("idna2008"))   # 'münchen'

Common patterns

# Useful when normalizing user-typed URLs
from urllib.parse import urlparse
import idna

def canonical_host(url: str) -> str:
    """Return the URL's hostname in ASCII Punycode form."""
    host = urlparse(url).hostname or ""
    try:
        return idna.encode(host).decode("ascii")
    except idna.IDNAError:
        return host  # empty or not a valid IDNA hostname: returned unchanged

When to call directly

Validating hostname input before passing it to a non-Unicode API
Comparing hostnames safely (always normalize via Punycode first)
Building a DNS lookup tool / URL bar
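For the hostname-comparison case, one possible approach is to reduce both names to their ASCII form and compare those. This is a minimal sketch, assuming UTS #46 mapping (`uts46=True`) is acceptable for your input. `host_key` and `same_host` are hypothetical helper names, not part of idna.

```python
import idna


def host_key(host: str) -> str:
    """Hypothetical helper: canonical ASCII form of a hostname for comparisons."""
    # uts46=True maps the input (e.g. lower-casing) before IDNA validation.
    return idna.encode(host, uts46=True).decode("ascii")


def same_host(a: str, b: str) -> bool:
    """Compare two hostnames by their canonical ASCII form."""
    return host_key(a) == host_key(b)


print(same_host("MÜNCHEN.de", "xn--mnchen-3ya.de"))  # True
print(same_host("münchen.de", "munchen.de"))         # False
```

Both helpers raise `idna.IDNAError` on invalid input, so callers comparing untrusted names should catch it.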

Limitations

Strict by default: idna.encode uses uts46=False and applies IDNA 2008 as written, so input that needs mapping first (uppercase letters, full-width ASCII) is rejected with an IDNAError. Pass uts46=True to apply UTS #46 mapping (case folding, width normalization) before validation
Hostnames only: idna does not handle the scheme, path, or query. http://中国.中国/ works in requests because only the hostname is passed to idna; the rest of the URL is processed separately
No emoji domains. IDNA 2008 (RFC 5891, with the codepoint rules in RFC 5892) intentionally disallows emoji. The xn-- encodings of emoji you see in the wild are technically invalid even though they may resolve
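The first and last limitations can be observed directly. The sketch below wraps `idna.encode` in a hypothetical helper, `try_encode`, that reports a rejection instead of raising. The sample names are illustrative only.

```python
import idna


def try_encode(name: str, **options) -> str:
    """Hypothetical helper: the encoded name, or a short note on why it failed."""
    try:
        return idna.encode(name, **options).decode("ascii")
    except idna.IDNAError as exc:
        return f"rejected ({type(exc).__name__})"


# Default mode: the capital "K" is not a valid IDNA 2008 codepoint.
print(try_encode("Königsgäßchen"))

# uts46=True maps the input (lower-casing here) before validating it.
print(try_encode("Königsgäßchen", uts46=True))

# Emoji are rejected because they are not valid IDNA 2008 codepoints.
print(try_encode("\U0001F600.example"))
```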

Bundled tables

idna.idnadata (1.5 MB) holds the IDNA 2008 codepoint classes (PVALID, CONTEXTJ, CONTEXTO), joining types, and script ranges used by the contextual rules. idna.uts46data (3 MB) holds UTS #46 case mappings and per-character status (valid / disallowed / deviation / ignored / mapped). Combined, the embedded data is ~4.5 MB. That is the biggest cost of bundling idna, but it lets the library work with no external dependencies.
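As a rough illustration of how the tables are queried, the sketch below uses the range helpers from `idna.intranges`, first on a toy table and then on the bundled `PVALID` class. The variable names `toy` and `pvalid` are illustrative, and the example assumes the table layout of the bundled version (a `codepoint_classes` dict in `idna.idnadata`).

```python
from idna import idnadata, intranges_contain
from idna.intranges import intranges_from_list

# A toy table: codepoints 97, 98, 99 and 200 collapse into two ranges.
toy = intranges_from_list([97, 98, 99, 200])
print(len(toy))                     # 2
print(intranges_contain(98, toy))   # True
print(intranges_contain(100, toy))  # False

# The same lookup against the bundled IDNA 2008 "PVALID" class.
pvalid = idnadata.codepoint_classes["PVALID"]
for ch in ("a", "A", "_", "ü"):
    print(repr(ch), intranges_contain(ord(ch), pvalid))
```

Storing ranges instead of individual codepoints keeps the tables compact and lets lookups use a binary search rather than a scan.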

How `requests` uses it

When you do requests.get("https://münchen.de"):

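1. While preparing the request, requests parses the URL and the non-ASCII host is passed to idna.encode(...) → b'xn--mnchen-3ya.de'
2. requests hands the ASCII hostname to a urllib3.HTTPSConnectionPool
3. urllib3 resolves that ASCII hostname and opens a TCP socket to it
4. Server responds; requests decodes the body, falling back to charset_normalizer when the response declares no charset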

Both idna and charset_normalizer are essential but silent: you'll only notice them when something goes wrong (a server that sends Content-Type: text/html without a charset, or a hostname with disallowed characters).
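The hostname-encoding step in that flow can be approximated as follows. This is a sketch of the idea, not code taken from requests or urllib3. The helper name `encode_host_for_connection`, the ASCII short-circuit, and the choice of `uts46=True` are assumptions made for illustration.

```python
import idna


def encode_host_for_connection(host: str) -> str:
    """Hypothetical helper: ASCII hostname suitable for opening a socket."""
    if host.isascii():
        return host  # assumption: plain ASCII hosts skip IDNA processing
    try:
        return idna.encode(host, uts46=True).decode("ascii")
    except idna.IDNAError as exc:
        # Surface a URL-level error instead of a codec-level one.
        raise ValueError(f"URL has an invalid label: {host!r}") from exc


print(encode_host_for_connection("münchen.de"))   # xn--mnchen-3ya.de
print(encode_host_for_connection("example.com"))  # example.com
```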

iOS notes

100% pure Python with embedded Unicode tables — no native code, no platform-specific paths
~4.5 MB on disk (mostly the idnadata + uts46data tables) — biggest cost item in this library
Works identically on iOS as on any other platform
