A simple CLI tool for converting large JSON responses into JSON Lines (JSONL) format, which is more convenient for storing structured data that can be processed one record at a time.
git clone https://github.com/jsynowiec/json2jsonl && cd json2jsonl
uv run json2jsonl
json2jsonl [INPUT] [OPTIONS]
Arguments:
INPUT Path to input JSON file. Omit or use '-' for stdin.
Options:
-o, --output PATH Output file path. Omit or use '-' for stdout.
--path JSONPATH JSONPath selecting the element to operate on (default: '$')
--extract JSONPATH JSONPath (relative to --path) pointing to the array to extract.
Required when root element is an object and input is stdin.
--no-parent-keys Omit parent object fields when flattening nested arrays.
--help Show help and exit.
By default, the tool operates on the root element of the input JSON. Use --path to select a subtree with a JSONPath expression (default: $).
The tool extracts records from the selected element, which by default is the root and is expected to be an array. If the element is an object instead, the tool needs a JSONPath, relative to that element, pointing to a key that contains an array (the --extract option). When reading from a file, the array key is auto-detected; if more than one array key exists, the tool prompts you to choose. When reading from stdin, --extract is required.
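The selection rules above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: `find_array_keys` and `resolve_array` are hypothetical names, `extract_key` is a plain key standing in for a JSONPath expression, and the sketch raises an error where the real tool would prompt you to choose.

```python
def find_array_keys(obj: dict) -> list[str]:
    """Return the keys of `obj` whose values are JSON arrays (Python lists)."""
    return [key for key, value in obj.items() if isinstance(value, list)]


def resolve_array(root, extract_key=None, from_stdin=False) -> list:
    """Pick the array to convert (hypothetical helper, simplified rules)."""
    if isinstance(root, list):
        # An array is used as-is.
        return root
    if not isinstance(root, dict):
        raise ValueError("expected an array or an object")
    if extract_key is None:
        if from_stdin:
            # No auto-detection on stdin: the key must be given explicitly.
            raise ValueError("--extract is required when reading an object from stdin")
        candidates = find_array_keys(root)
        if len(candidates) != 1:
            # Assumption: the sketch fails here instead of prompting.
            raise ValueError(f"expected exactly one array key, found: {candidates}")
        extract_key = candidates[0]
    value = root.get(extract_key)
    if not isinstance(value, list):
        raise ValueError(f"{extract_key!r} does not point to an array")
    return value
```

For example, `resolve_array({"trace_id": "x", "spans": [...]})` would pick the `spans` array, while the same call with `from_stdin=True` and no `extract_key` fails.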
If the root element is an array, each value is written to a separate line in the output.
Input:
[
{
"span_id": "b1d4f2a8-3c7e-4b1d-8a2f-9e0c6d4b2a1f",
"operation": "db.query",
"duration_ms": 342,
"tags": {
"db.type": "postgres",
"db.statement": "SELECT * FROM orders WHERE user_id = ?"
}
},
{
"span_id": "c2e5a3b9-4d8f-5c2e-9b3g-0f1d7e5c3b2g",
"operation": "http.request",
"duration_ms": 118,
"tags": {
"http.method": "POST",
"http.status_code": "503"
}
}
]
Output:
{"span_id": "b1d4f2a8-3c7e-4b1d-8a2f-9e0c6d4b2a1f", "operation": "db.query", "duration_ms": 342, "tags": {"db.type": "postgres", "db.statement": "SELECT * FROM orders WHERE user_id = ?"}}
{"span_id": "c2e5a3b9-4d8f-5c2e-9b3g-0f1d7e5c3b2g", "operation": "http.request", "duration_ms": 118, "tags": {"http.method": "POST", "http.status_code": "503"}}
If the root element is an object with one of its keys containing an array of nested objects, the CLI performs a lateral flatten operation. Each output line is an object with the parent fields merged in. This is the default behavior, but the parent keys can be omitted using the --no-parent-keys flag.
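As a rough sketch of what a lateral flatten does, the Python below merges the parent's non-array fields into every array item. It is not the tool's implementation: `lateral_flatten` and its `keep_parent_keys` parameter are hypothetical names, the array is addressed by a plain key rather than a JSONPath, and the sample data is shortened.

```python
import json


def lateral_flatten(parent: dict, array_key: str, keep_parent_keys: bool = True) -> list[dict]:
    """Return one record per array item, optionally prefixed with parent fields."""
    items = parent[array_key]
    if keep_parent_keys:
        # Every parent field except the array itself is copied into each record.
        parent_fields = {k: v for k, v in parent.items() if k != array_key}
    else:
        # Mirrors the idea behind --no-parent-keys: emit the items alone.
        parent_fields = {}
    # Parent fields come first, then the item's own fields.
    return [{**parent_fields, **item} for item in items]


trace = {
    "trace_id": "a3c2",
    "severity": "WARN",
    "spans": [
        {"span_id": "b1d4", "duration_ms": 342},
        {"span_id": "c2e5", "duration_ms": 118},
    ],
}

for record in lateral_flatten(trace, "spans"):
    print(json.dumps(record))
```

Each printed line carries `trace_id` and `severity` followed by the span's own fields; passing `keep_parent_keys=False` yields the spans unchanged.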
Input:
{
"trace_id": "a3c2e1d4-7f6b-4a2e-9c8d-1b0f5e3a7c2d",
"timestamp": "2024-11-15T14:32:07.341Z",
"severity": "WARN",
"spans": [
{
"span_id": "b1d4f2a8-3c7e-4b1d-8a2f-9e0c6d4b2a1f",
"operation": "db.query",
"duration_ms": 342,
"tags": {
"db.type": "postgres",
"db.statement": "SELECT * FROM orders WHERE user_id = ?"
}
},
{
"span_id": "c2e5a3b9-4d8f-5c2e-9b3g-0f1d7e5c3b2g",
"operation": "http.request",
"duration_ms": 118,
"tags": {
"http.method": "POST",
"http.status_code": "503"
}
}
]
}
Output:
{"trace_id": "a3c2e1d4-7f6b-4a2e-9c8d-1b0f5e3a7c2d", "timestamp": "2024-11-15T14:32:07.341Z", "severity": "WARN", "span_id": "b1d4f2a8-3c7e-4b1d-8a2f-9e0c6d4b2a1f", "operation": "db.query", "duration_ms": 342, "tags": {"db.type": "postgres", "db.statement": "SELECT * FROM orders WHERE user_id = ?"}}
{"trace_id": "a3c2e1d4-7f6b-4a2e-9c8d-1b0f5e3a7c2d", "timestamp": "2024-11-15T14:32:07.341Z", "severity": "WARN", "span_id": "c2e5a3b9-4d8f-5c2e-9b3g-0f1d7e5c3b2g", "operation": "http.request", "duration_ms": 118, "tags": {"http.method": "POST", "http.status_code": "503"}}
- Input files larger than 100 MB require confirmation before processing.
- If the output file already exists, confirmation is required before overwriting.
- If a parent key conflicts with an array item key during flattening, the item value wins and a warning is printed to stderr.
- Output is always UTF-8, no BOM, with \n line terminators.
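The key-conflict and output-encoding notes above could look roughly like this in Python. Treat it as a sketch under assumptions: `merge_item` and `to_jsonl_bytes` are hypothetical helpers, the warning text is invented, and `ensure_ascii=False` is a guess at how non-ASCII characters are written.

```python
import json
import sys


def merge_item(parent_fields: dict, item: dict) -> dict:
    """Merge parent fields into an item; the item's value wins on conflict."""
    conflicts = sorted(parent_fields.keys() & item.keys())
    if conflicts:
        # Warnings go to stderr so they never mix with the JSONL on stdout.
        print(f"warning: parent key(s) {conflicts} overridden by item values", file=sys.stderr)
    # Later entries override earlier ones, so item values take precedence.
    return {**parent_fields, **item}


def to_jsonl_bytes(records) -> bytes:
    """Serialize records as UTF-8 JSON Lines with "\\n" terminators and no BOM."""
    lines = (json.dumps(record, ensure_ascii=False) + "\n" for record in records)
    # The plain "utf-8" codec never emits a byte order mark ("utf-8-sig" would).
    return "".join(lines).encode("utf-8")


merged = merge_item({"id": "parent", "severity": "WARN"}, {"id": "child", "duration_ms": 118})
payload = to_jsonl_bytes([merged])
```

Writing the bytes directly (for example to a file opened in `"wb"` mode) avoids platform newline translation, so the terminators stay `\n` on every operating system.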