Commit 6fb1777

Copilot and hotlong committed
Improve error handling and documentation for data format requirements and corrupted files
Co-authored-by: hotlong <50353452+hotlong@users.noreply.github.com>
1 parent e32c1a1 commit 6fb1777

2 files changed

Lines changed: 121 additions & 5 deletions

packages/drivers/excel/README.md

Lines changed: 71 additions & 2 deletions
@@ -99,10 +99,13 @@ const driver = new ExcelDriver({
 
 ## How It Works
 
-### Data Storage
+### Data Storage Format
 
+**Important**: The Excel file must follow this structure:
+
+- **One file** contains multiple worksheets
 - **Each worksheet = One object type** (e.g., `users`, `products`)
-- **First row = Column headers** (field names)
+- **First row = Column headers** (field names like `id`, `name`, `email`)
 - **Subsequent rows = Data records**
 
 Example Excel structure:
@@ -113,13 +116,45 @@ Example Excel structure:
 | user-1 | Alice | alice@example.com | admin | 2024-01-01T00:00:00Z |
 | user-2 | Bob | bob@example.com | user | 2024-01-02T00:00:00Z |
 
+**Sheet: products**
+| id | name | price | category |
+|----|------|-------|----------|
+| prod-1 | Laptop | 999.99 | Electronics |
+
 ### Workflow
 
 1. **Load**: Reads Excel file into memory on initialization
 2. **Query**: Performs operations in-memory (fast!)
 3. **Persist**: Writes changes back to Excel file
 4. **Auto-save**: Enabled by default for data safety
 
+### Error Handling
+
+The driver provides clear error messages for common issues:
+
+**Corrupted or Invalid Files:**
+```
+Failed to read Excel file - File may be corrupted or not a valid .xlsx file
+```
+
+**File Format Issues:**
+- Missing headers: Worksheets without headers in the first row are skipped with a warning
+- Empty rows: Completely empty rows are automatically skipped
+- Missing ID field: IDs are auto-generated if not present
+
+**File Access Issues:**
+```
+Failed to read Excel file - Permission denied. Check file permissions.
+Failed to read Excel file - File is locked by another process. Close it and try again.
+```
+
+**Data Format Mismatch:**
+If an existing Excel file doesn't match the expected format (no headers, wrong structure), the driver will:
+1. Log a warning to the console
+2. Skip problematic worksheets
+3. Continue loading valid worksheets
+4. You can check console output for warnings about skipped data
+
 ## API Reference
 
 ### CRUD Operations
@@ -339,6 +374,38 @@ for (const record of records) {
 - **File locking**: Not suitable for concurrent multi-process writes
 - **Performance**: Slower than dedicated databases for large datasets
 - **Query optimization**: No indexes or query optimization
+- **File format**: Only supports .xlsx format (Excel 2007+), not .xls (Excel 97-2003)
+
+## Data Format Requirements
+
+To ensure proper operation, Excel files must follow these requirements:
+
+### File Structure
+**Valid .xlsx file** (Excel 2007+ format)
+**First row contains headers** (column names)
+**One worksheet per object type**
+**Consistent column structure** within each worksheet
+
+### Common Issues and Solutions
+
+| Issue | Symptom | Solution |
+|-------|---------|----------|
+| Corrupted file | `FILE_READ_ERROR: File may be corrupted` | Open in Excel, save as new .xlsx file, or restore from backup |
+| No headers | Warning: `Worksheet has no headers` | Add column names in first row (id, name, email, etc.) |
+| File locked | `File is locked by another process` | Close the file in Excel or other applications |
+| Permission denied | `Permission denied` | Check file permissions, run with appropriate access rights |
+| Wrong format | Data not loading | Ensure first row has headers, data starts from row 2 |
+| Empty rows | Rows skipped | Empty rows are automatically ignored, check console warnings |
+
+### Validating Your Excel File
+
+Before using an Excel file with the driver:
+
+1. **Check file format**: Ensure it's `.xlsx` (not `.xls`, `.csv`, or other formats)
+2. **Verify headers**: First row of each worksheet should contain column names
+3. **Check for corruption**: Open file in Excel to verify it's not corrupted
+4. **Review structure**: Each worksheet should represent one object type
+5. **Test with small file first**: Start with a simple file to verify compatibility
 
 ## Best Practices
 
@@ -347,6 +414,8 @@ for (const record of records) {
 3. **Backup files**: Keep backups of important Excel files
 4. **Validate data**: Excel doesn't enforce schemas - validate in your app
 5. **Batch operations**: Use `createMany`/`updateMany` for better performance
+6. **Monitor console warnings**: Check for warnings about skipped worksheets or rows
+7. **Use version control**: Track Excel file changes with git for critical data
 
 ## TypeScript Support
 
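The first item of the README's "Validating Your Excel File" checklist (confirm the file really is `.xlsx`) can be partly automated before the file reaches the driver. The following is a minimal sketch, not part of this commit: `looksLikeXlsx` is a hypothetical helper, and it assumes the caller has already read the first few bytes of the file. It relies on `.xlsx` files being ZIP archives, which begin with the signature bytes `PK\x03\x04`.

```typescript
// Hypothetical pre-flight check; not part of ExcelDriver.
// An .xlsx file is a ZIP archive, so it starts with the local-file-header
// signature 0x50 0x4B 0x03 0x04 ("PK\x03\x04").
function looksLikeXlsx(fileName: string, firstBytes: Uint8Array): boolean {
    const hasXlsxExtension = fileName.toLowerCase().endsWith('.xlsx');
    const hasZipSignature =
        firstBytes.length >= 4 &&
        firstBytes[0] === 0x50 &&
        firstBytes[1] === 0x4b &&
        firstBytes[2] === 0x03 &&
        firstBytes[3] === 0x04;
    return hasXlsxExtension && hasZipSignature;
}
```

Passing this check only shows that the file is a ZIP container with the expected extension; it does not prove that the workbook inside is intact or that its worksheets have headers.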

packages/drivers/excel/src/index.ts

Lines changed: 50 additions & 3 deletions
@@ -100,10 +100,25 @@ export class ExcelDriver implements Driver {
                 await this.workbook.xlsx.readFile(this.filePath);
                 this.loadDataFromWorkbook();
             } catch (error) {
+                const errorMessage = (error as Error).message;
+
+                // Provide helpful error messages for common issues
+                let detailedMessage = `Failed to read Excel file: ${this.filePath}`;
+                if (errorMessage.includes('corrupted') || errorMessage.includes('invalid')) {
+                    detailedMessage += ' - File may be corrupted or not a valid .xlsx file';
+                } else if (errorMessage.includes('permission') || errorMessage.includes('EACCES')) {
+                    detailedMessage += ' - Permission denied. Check file permissions.';
+                } else if (errorMessage.includes('EBUSY')) {
+                    detailedMessage += ' - File is locked by another process. Close it and try again.';
+                }
+
                 throw new ObjectQLError({
                     code: 'FILE_READ_ERROR',
-                    message: `Failed to read Excel file: ${this.filePath}`,
-                    details: { error: (error as Error).message }
+                    message: detailedMessage,
+                    details: {
+                        filePath: this.filePath,
+                        error: errorMessage
+                    }
                 });
             }
         } else if (this.config.createIfMissing) {
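The message-building branches in the hunk above can be exercised in isolation. The sketch below is a standalone illustration, not the committed code: `describeReadError` is a hypothetical function name, and it lowercases the underlying message before matching, whereas the commit matches case-sensitively (so a message such as `Invalid signature` would not hit the first branch as committed). It also shows that the final message carries the file path before the hint, which the README examples omit.

```typescript
// Standalone sketch of the branches from the hunk above.
// Assumption: matching is case-insensitive here; the committed code is case-sensitive.
function describeReadError(filePath: string, errorMessage: string): string {
    const normalized = errorMessage.toLowerCase();
    let detailedMessage = `Failed to read Excel file: ${filePath}`;
    if (normalized.includes('corrupted') || normalized.includes('invalid')) {
        detailedMessage += ' - File may be corrupted or not a valid .xlsx file';
    } else if (normalized.includes('permission') || normalized.includes('eacces')) {
        detailedMessage += ' - Permission denied. Check file permissions.';
    } else if (normalized.includes('ebusy')) {
        detailedMessage += ' - File is locked by another process. Close it and try again.';
    }
    return detailedMessage;
}

console.log(describeReadError('data.xlsx', 'EACCES: permission denied'));
// prints "Failed to read Excel file: data.xlsx - Permission denied. Check file permissions."
```

Messages that match none of the branches fall through with only the `Failed to read Excel file: <path>` prefix, so callers that need the raw cause would still read it from `details.error`.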
@@ -120,6 +135,11 @@ export class ExcelDriver implements Driver {
 
     /**
      * Load data from workbook into memory.
+     *
+     * Expected Excel format:
+     * - First row contains column headers (field names)
+     * - Subsequent rows contain data records
+     * - Each worksheet represents one object type
      */
     private loadDataFromWorkbook(): void {
         this.workbook.eachSheet((worksheet) => {
@@ -130,29 +150,56 @@ export class ExcelDriver implements Driver {
             const headerRow = worksheet.getRow(1);
             const headers: string[] = [];
             headerRow.eachCell((cell, colNumber) => {
-                headers[colNumber - 1] = String(cell.value);
+                const headerValue = cell.value;
+                if (headerValue) {
+                    headers[colNumber - 1] = String(headerValue);
+                }
             });
 
+            // Warn if worksheet has no headers (might be corrupted or wrong format)
+            if (headers.length === 0 && worksheet.rowCount > 0) {
+                console.warn(`[ExcelDriver] Warning: Worksheet "${sheetName}" has no headers in first row. Skipping.`);
+                return;
+            }
+
             // Skip first row (headers) and read data rows
+            let rowsProcessed = 0;
+            let rowsSkipped = 0;
+
             worksheet.eachRow((row, rowNumber) => {
                 if (rowNumber === 1) return; // Skip header row
 
                 const record: any = {};
+                let hasData = false;
+
                 row.eachCell((cell, colNumber) => {
                     const header = headers[colNumber - 1];
                     if (header) {
                         record[header] = cell.value;
+                        hasData = true;
                     }
                 });
 
+                // Skip completely empty rows
+                if (!hasData) {
+                    rowsSkipped++;
+                    return;
+                }
+
                 // Ensure ID exists
                 if (!record.id) {
                     record.id = this.generateId(sheetName);
                 }
 
                 records.push(record);
+                rowsProcessed++;
             });
 
+            // Log summary for debugging
+            if (rowsSkipped > 0) {
+                console.warn(`[ExcelDriver] Worksheet "${sheetName}": Processed ${rowsProcessed} rows, skipped ${rowsSkipped} empty rows`);
+            }
+
             this.data.set(sheetName, records);
             this.updateIdCounter(sheetName, records);
         });
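The loading rules in the hunk above (header row first, empty rows skipped, missing IDs generated) can be sketched without a workbook library. This is an illustration under stated assumptions, not the driver's code: `loadSheet`, `Cell`, `SheetRecord`, and `LoadResult` are hypothetical names, a worksheet is modelled as a plain array of rows, and a `null` or `undefined` entry stands in for a cell that the workbook iterator would not visit.

```typescript
type Cell = string | number | boolean | Date | null | undefined;
type SheetRecord = Record<string, Cell>;

interface LoadResult {
    records: SheetRecord[];
    rowsSkipped: number;
}

// Returns null when the first row has no usable headers (the sheet is skipped).
function loadSheet(rows: Cell[][], generateId: () => string): LoadResult | null {
    const headerRow = rows[0] ?? [];
    // Falsy header cells are ignored, mirroring the `if (headerValue)` guard.
    const headers = headerRow.map((value) => (value ? String(value) : undefined));
    if (!headers.some((header) => header !== undefined)) {
        return null;
    }

    const records: SheetRecord[] = [];
    let rowsSkipped = 0;
    for (const row of rows.slice(1)) {
        const record: SheetRecord = {};
        let hasData = false;
        for (let i = 0; i < row.length; i++) {
            const header = headers[i];
            const value = row[i];
            // Only cells that sit under a header and hold a value count as data.
            if (header !== undefined && value !== null && value !== undefined) {
                record[header] = value;
                hasData = true;
            }
        }
        if (!hasData) {
            rowsSkipped++; // completely empty row
            continue;
        }
        if (!record['id']) {
            record['id'] = generateId(); // ID is auto-generated when missing
        }
        records.push(record);
    }
    return { records, rowsSkipped };
}
```

A row whose only values fall under columns without a header is treated as empty here, which matches the committed logic, where `hasData` is set only inside the `if (header)` branch.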
