Commit 1a82cab

Add guide on fixing delimiter vega issues
1 parent 3001287 commit 1a82cab

4 files changed

Lines changed: 316 additions & 0 deletions

Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
# Working with International Number Formats

## Overview

When working with data visualisations across different regional settings, you may encounter issues with how numbers and CSV files are formatted. This guide explains common problems and solutions.

## Understanding the problem
### The two main issues

**1. Number formatting (decimals and thousands separators)**
- **US/UK format**: `1,234.56` (comma for thousands, period for decimal)
- **European (& others) format**: `1.234,56` (period for thousands, comma for decimal)

Most data tools, including Vega-Lite, expect the US/UK convention by default: a period as the decimal separator.


**2. CSV delimiters**

CSV stands for **C**omma **S**eparated **V**alues. However, in regions where commas are used as decimal separators, Excel and other programs often save CSV files using a different delimiter (usually a semicolon `;`) to avoid confusion.

### What Vega-Lite expects

For data to work correctly in Vega-Lite, it should look like this:

```csv
date,value
1971,42.5
1972,43.7
1973,45.8
```

**Key requirements:**
- Columns separated by commas (`,`)
- Decimal points represented by periods (`.`)
- No thousands separators in the data itself
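
As a quick check, here is a minimal sketch of a specification that reads these same rows inline rather than from a URL. The inline `values` string is only a stand-in for a hosted file, and the `description` text is illustrative. Pasting it into the Vega Editor should draw a three-point line if the format is right.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "description": "Sketch only: inline rows stand in for a hosted CSV file.",
  "data": {
    "values": "date,value\n1971,42.5\n1972,43.7\n1973,45.8",
    "format": {"type": "csv"}
  },
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value", "type": "quantitative"}
  }
}
```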

---
## Common issues and symptoms

### Issue 1: CSV opens with all data in one column

**What it looks like in Excel:** Instead of seeing data spread across columns, everything appears in a single column with visible commas or semicolons:

```
date,value
1971,42.5
1972,43.7
```

**Why this happens:** Excel is configured for a different delimiter (typically `;`, taken from your regional settings), but the file uses commas (`,`).


### Issue 2: Saved CSV uses semicolons instead of commas

**What it looks like when you open the saved file in a text editor:**

```csv
date;value
1971;42.5
1972;43.7
```

**Why this happens:** When Excel's regional settings use commas as decimal separators, it automatically switches to semicolons as the CSV delimiter to prevent conflicts.


### Issue 3: Vega-Lite chart doesn't display data

**Symptoms:**
- Chart appears blank or shows no data points
- Console errors about parsing failures
- Data preview in Vega Editor shows empty or incorrectly parsed values

**Why this happens:** Either the delimiter is wrong (semicolons instead of commas) or the decimal separator is wrong (commas instead of periods), preventing Vega-Lite from reading the data correctly.

---

## Solutions

### Solution 1: Fix within Vega-Lite

If your data file already exists with non-standard formatting, you can handle it directly in your Vega-Lite specification.

#### A. Handling custom delimiters (e.g., semicolons)

If your CSV uses semicolons instead of commas, modify the `data` object:

**Before:**
```json
"data": {
  "url": "https://raw.githubusercontent.com/username/repo/main/data.csv"
}
```

**After:**
```json
"data": {
  "url": "https://raw.githubusercontent.com/username/repo/main/data.csv",
  "format": {"type": "dsv", "delimiter": ";"}
}
```

**Explanation:**
- `"type": "dsv"` tells Vega-Lite to expect a delimited text file with a custom delimiter
- `"delimiter": ";"` specifies that semicolons separate the columns
- DSV stands for **D**elimiter **S**eparated **V**alues (the general version of CSV)

> See the [Vega-Lite data documentation](https://vega.github.io/vega-lite/docs/data.html#csv) for more details.
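
If you would rather not rely on type inference, the `format` object also accepts a `parse` property that states how each column should be read. The following is a minimal sketch, assuming columns named `date` and `value` as in the earlier examples; the inline `values` string stands in for a hosted file and the `description` text is illustrative.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "description": "Sketch only: semicolon-delimited inline rows with explicit column types.",
  "data": {
    "values": "date;value\n1971;42.5\n1972;43.7\n1973;45.8",
    "format": {
      "type": "dsv",
      "delimiter": ";",
      "parse": {"date": "date:'%Y'", "value": "number"}
    }
  },
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value", "type": "quantitative"}
  }
}
```

Note that `parse` converts types but does not interpret decimal commas, so a value such as `42,5` would not become a number this way.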

#### B. Handling comma decimal separators

If your numerical data uses commas as decimal separators (e.g., `42,5` instead of `42.5`), Vega-Lite won't parse it as quantitative data. You need to add a transformation to replace commas with periods:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "data": {
    "url": "https://raw.githubusercontent.com/username/repo/main/data.csv",
    "format": {"type": "dsv", "delimiter": ";"}
  },
  "transform": [
    {"calculate": "replace(datum.value, ',', '.')", "as": "value_clean"}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value_clean", "type": "quantitative"}
  }
}
```

**Key points:**
- The `transform` property is an array that can contain multiple transformation objects
- `calculate` creates a new field (i.e. column) using an expression
- `replace(datum.value, ',', '.')` replaces the comma in the `value` field with a period (with a plain string pattern, only the first comma in each value is replaced, which is enough for a single decimal comma)
- `"as": "value_clean"` names the new cleaned field (you can choose any name here)
- Use `datum.columnname` to reference columns in expressions
- If your column name has spaces, use brackets: `datum['column name']`
- Remember to update your encoding to use the new `value_clean` field instead of `value`

> See the [transform documentation](https://vega.github.io/vega-lite/docs/transform.html) and [expression functions](https://vega.github.io/vega/docs/expressions/).
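
`replace` returns text rather than a number. One cautious variant, sketched here with inline sample rows, wraps the result in the `toNumber` expression function so the new field is numeric, and first strips period thousands separators using a global `regexp` pattern. The sample values, the `description` text and the `value_clean` name are illustrative assumptions, not part of the original data.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "description": "Sketch only: text such as 1.234,5 is converted to the number 1234.5.",
  "data": {
    "values": "date;value\n1971;1.234,5\n1972;1.310,7\n1973;1.402,8",
    "format": {"type": "dsv", "delimiter": ";"}
  },
  "transform": [
    {
      "calculate": "toNumber(replace(replace(datum.value, regexp('[.]', 'g'), ''), ',', '.'))",
      "as": "value_clean"
    }
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value_clean", "type": "quantitative"}
  }
}
```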


---

### Solution 2: Fix CSV file opening in Excel

If you've downloaded a CSV that appears all in one column when opened in Excel, you can split it properly using the **Text to Columns** feature.

**Steps:**
1. Open the problematic CSV file in Excel
2. Select the column containing all the data (usually column A)
3. Go to the **Data** tab in the ribbon
4. Click **Text to Columns**
5. In the wizard:
   - Keep **Delimited** selected, click **Next >**
   - Uncheck any pre-selected delimiters
   - Check the delimiter your data uses (usually **Comma** `,` or **Semicolon** `;`)
   - Click **Next >**, then **Finish**

Your data should now be properly separated into columns.

**Alternative: import method**

For more control, import the CSV rather than opening it directly:
1. Open a blank Excel workbook
2. Go to **File** → **Import**
3. Select **CSV file** and locate your file
4. Follow the import wizard to specify the correct delimiter
5. Click **Finish**

This method gives you a preview before importing and allows you to specify all formatting options upfront.

---

### Solution 3: Change Excel settings (recommended for ongoing work)

To avoid these issues consistently, configure Excel to use standard number formats regardless of your system settings.

#### On Mac
1. Open **Excel**
2. Go to **Excel** → **Preferences** (or **Settings** in some versions)
3. Click **Edit** (or **Authoring**)
4. **Uncheck** "Use system separators"
5. Set **Decimal separator** to `.` (period)
6. Set **Thousands separator** to `,` (comma)

#### On Windows ([official guide](https://support.microsoft.com/en-us/office/change-the-character-used-to-separate-thousands-or-decimals-c093b545-71cb-4903-b205-aebb9837bd1e))
1. Go to **File** → **Options** → **Advanced**
2. Scroll to **Editing options**
3. **Uncheck** "Use system separators"
4. Set **Decimal separator** to `.` (period)
5. Set **Thousands separator** to `,` (comma)


After making these changes, Excel will save CSVs with commas as delimiters and periods as decimals, regardless of your system's regional settings.

---

### Solution 4: Change system settings (system-wide fix)

For a system-wide solution that affects all applications, you can change your operating system's number format settings.

#### On Mac
1. Open **System Settings** (or **System Preferences**)
2. Go to **General** → **Language & Region**
3. Under **Number format**, select or customise to `1,234,567.89`

#### On Windows
1. Open **Control Panel** → **Region**
2. Click **Additional settings**
3. Set **Decimal symbol** to `.` (period)
4. Set **Digit grouping symbol** to `,` (comma)
5. Click **OK** on all dialogs

**Note:** This changes the format system-wide, which may affect other applications and how numbers appear in your operating system.

---



## Displaying Numbers in Your Regional Format

After successfully getting your data into Vega-Lite (using periods for decimals), you may want to display the numbers in your chart using your regional format preferences.

You can do this by adding a `locale` configuration to your Vega-Lite specification:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "config": {
    "locale": {
      "number": {
        "decimal": ",",
        "thousands": ".",
        "grouping": [3],
        "currency": ["", ""]
      }
    }
  },
  "data": {"url": "..."},
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value", "type": "quantitative"}
  }
}
```

**Explanation:**
- `"decimal": ","` displays decimals with commas (e.g., `3,14`)
- `"thousands": "."` displays thousands with periods (e.g., `1.234`)
- `"grouping": [3]` groups digits in sets of three
- `"currency": ["", ""]` sets the prefix and suffix for currency values (optional). The example leaves both empty; for a trailing euro sign use `["", " €"]`. This only applies when a currency display format is set, e.g. `"format": "$"`.

You can also customise date and time formats by adding a `"time"` property within the `locale` object.
> See the [Vega locale documentation](https://vega.github.io/vega/docs/api/locale/) and [config usage](https://vega.github.io/vega/docs/config/#usage).

**Important:** This only affects how numbers are *displayed* in the chart. The underlying data must still use standard formatting (periods for decimals) for Vega-Lite to parse it correctly.
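
An explicit axis `format` makes the effect of the locale easy to see. The following is a minimal sketch that assumes a German-style locale purely for illustration: the day and month names, the inline rows, the `description` text and the two format strings are example choices, not requirements. With it, value labels should render in the style of `1.234,50 €` and date labels in the style of `01. Januar 1971`.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",
  "description": "Sketch only: German-style number and time locale applied to axis labels.",
  "config": {
    "locale": {
      "number": {
        "decimal": ",",
        "thousands": ".",
        "grouping": [3],
        "currency": ["", " €"]
      },
      "time": {
        "dateTime": "%A, der %e. %B %Y, %X",
        "date": "%d.%m.%Y",
        "time": "%H:%M:%S",
        "periods": ["AM", "PM"],
        "days": ["Sonntag", "Montag", "Dienstag", "Mittwoch", "Donnerstag", "Freitag", "Samstag"],
        "shortDays": ["So", "Mo", "Di", "Mi", "Do", "Fr", "Sa"],
        "months": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli", "August", "September", "Oktober", "November", "Dezember"],
        "shortMonths": ["Jan", "Feb", "Mrz", "Apr", "Mai", "Jun", "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"]
      }
    }
  },
  "data": {
    "values": "date,value\n1971-01-01,1234.5\n1971-02-01,1310.7\n1971-03-01,1402.8",
    "format": {"type": "csv"}
  },
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal", "axis": {"format": "%d. %B %Y"}},
    "y": {"field": "value", "type": "quantitative", "axis": {"format": "$,.2f"}}
  }
}
```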

---

## Quick Reference

| Issue                      | Quick Fix                                                                             |
| -------------------------- | ------------------------------------------------------------------------------------- |
| Semicolon delimiters       | Add `"format": {"type": "dsv", "delimiter": ";"}` to data object                      |
| Comma decimals             | Add transform: `{"calculate": "replace(datum.value, ',', '.')", "as": "value_clean"}` |
| CSV opens in one column    | Use Excel's **Text to Columns** feature                                               |
| Want to change permanently | Override Excel's separator settings or change your system-wide settings               |
| Want regional display      | Add `locale` to `config` object in Vega-Lite chart                                    |


Summary: Most international formatting issues can be resolved either by preprocessing your data or by adding a few lines to your Vega-Lite specification. Choose the method that works best for your workflow.
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",

  "data": {
    "url": "https://raw.githubusercontent.com/jhellingsdata/jhellingsdata.github.io/refs/heads/main/examples/vega-lite/international-formats/sample-data/ex1_standard.csv"
  },
  "mark": {"type": "line"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value", "type": "quantitative"}
  }
}
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",

  "data": {
    "url": "https://raw.githubusercontent.com/jhellingsdata/jhellingsdata.github.io/refs/heads/main/examples/vega-lite/international-formats/sample-data/ex2.csv",
    "format": {"type": "dsv", "delimiter": ";"}
  },
  "mark": {"type": "line"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value", "type": "quantitative"}
  }
}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
{
  "$schema": "https://vega.github.io/schema/vega-lite/v6.json",

  "data": {
    "url": "https://raw.githubusercontent.com/jhellingsdata/jhellingsdata.github.io/refs/heads/main/examples/vega-lite/international-formats/sample-data/ex3.csv",
    "format": {"type": "dsv", "delimiter": ";"}
  },

  "transform": [
    {"calculate": "replace(datum.value, ',', '.')", "as": "value_clean"}
  ],

  "mark": {"type": "line"},
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "value_clean", "type": "quantitative"}
  }
}
