Commit 0fabb3a

Initial commit: WHOIS Parser for ccTLDs
A comprehensive WHOIS parser supporting 169 country-code TLDs.

Features:
- Support for 52 ccTLDs with full parsing (31% success rate)
- .jp (Japan) support with Japanese field parsing
- .kr (South Korea) support with Korean date formats
- Date normalization for international formats
- 30-second timeout for slow servers
- Comprehensive test suite

Stats:
- ✅ 52 fully supported ccTLDs
- ⚠️ 99 with partial parsing (often missing creation dates)
- ❌ 18 failed (server issues)

Built for DomainDetails.com by Simple Bytes LLC

MIT License
0 parents  commit 0fabb3a

9 files changed

Lines changed: 2143 additions & 0 deletions

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
node_modules/
2+
*.log
3+
.DS_Store
4+
.env
5+
test-results.json

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) 2025 Simple Bytes LLC
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.

README.md

Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
1+
# WHOIS Parser for ccTLDs
2+
3+
A comprehensive, battle-tested WHOIS parser with support for 169 country-code TLDs (ccTLDs). Built for [DomainDetails.com](https://domaindetails.com) by Simple Bytes LLC.
4+
5+
## Overview
6+
7+
This parser copes with the messy reality of WHOIS data across different country registries. Unlike simple regex-based parsers, it handles:
8+
9+
- **Multiple date formats** - from `2005/05/30` (Japan) to `2007. 03. 02.` (South Korea) to natural language dates
10+
- **International field names** - including Japanese (登録年月日), Korean, and other non-English formats
11+
- **Various nameserver formats** - square brackets, multi-line sections, colon-separated with IPs
12+
- **Slow WHOIS servers** - with configurable 30-second timeouts
13+
- **Fallback parsing** - when domain names aren't explicitly listed in responses
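
A 30-second cap like the one mentioned above can be sketched as a generic promise wrapper. This is a minimal illustration, not the code in `whois-parser.js`: `withTimeout` is a hypothetical helper name, and it only stops waiting for the result; it does not close the underlying socket.

```javascript
// Hypothetical helper (not part of whois-parser.js): rejects if `promise`
// has not settled within `ms` milliseconds.
function withTimeout(promise, ms = 30_000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards
  // so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

For instance, `await withTimeout(whoisQuery('google.jp', 'whois.jprs.jp'))` would reject after 30 seconds if the server never answers.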
14+
15+
## Success Rate
16+
17+
**52/169 ccTLDs fully supported** (31% perfect parsing)
18+
19+
- ✅ **Full parsing**: 52 ccTLDs with domain, dates, nameservers, registrar
20+
- ⚠️ **Partial parsing**: 99 ccTLDs with some fields (often missing creation dates - registry limitation)
21+
- ❌ **Failed**: 18 ccTLDs (server offline, timeout >30s, or connection refused)
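
The three buckets above could be assigned roughly as follows. This is a sketch under assumptions: `classifyResult` is a hypothetical name, and treating "dates" as "a creation date is present" is a guess based on the note that partial results often lack creation dates. The test scripts in this repository may use different criteria.

```javascript
// Hypothetical classifier (not taken from the test suite).
// `parsed` is the object returned by parseWhoisData(); `error` is any
// network error raised while querying the WHOIS server.
function classifyResult(parsed, error = null) {
  if (error || !parsed) return 'failed';
  const hasAllFields = Boolean(
    parsed.domainName &&
    parsed.creationDate && // assumption: "dates" means at least a creation date
    parsed.registrar &&
    Array.isArray(parsed.nameservers) &&
    parsed.nameservers.length > 0
  );
  return hasAllFields ? 'full' : 'partial';
}
```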
22+
23+
### ✅ Fully Supported (52 ccTLDs)
24+
25+
`.ac` `.af` `.ag` `.bh` `.bi` `.bj` `.ci` `.cl` `.co` `.dm` `.do` `.ge` `.gg` `.gi` `.gl` `.hr` `.hu` `.ie` `.io` `.it` `.je` `.jp` `.kr` `.kz` `.la` `.ma` `.me` `.mk` `.mn` `.mx` `.my` `.nu` `.nz` `.pk` `.pt` `.ru` `.sc` `.se` `.sg` `.sh` `.sk` `.so` `.st` `.su` `.sx` `.sy` `.tc` `.td` `.tl` `.us` `.ve` `.ws` `.信息`
26+
27+
### ⏱️ Timeout Issues (8 ccTLDs)
28+
29+
These servers take longer than 30 seconds to respond:
30+
31+
`.dz` `.gp` `.mw` `.ng` `.pt` `.sb` `.tk` `.uy`
32+
33+
### 🚫 Connection Refused (4 ccTLDs)
34+
35+
These servers are offline or block automated queries:
36+
37+
`.bo` `.cf` `.hm` `.pf`
38+
39+
### 🌟 Highlighted Format Support
40+
41+
- 🇯🇵 **Japan (.jp)** - Square bracket format with Japanese fields
42+
- 🇰🇷 **South Korea (.kr)** - Korean/English dual format with wide spacing
43+
- 🇬🇬 **Guernsey (.gg)**, 🇯🇪 **Jersey (.je)** - Natural language dates
44+
- 🇷🇺 **Russia (.ru)** - Cyrillic field names with IP-annotated nameservers
45+
- 🇮🇹 **Italy (.it)** - Multi-line nameserver sections
46+
- 🇩🇪 **Germany (.de)** - Nserver format
47+
48+
## Features
49+
50+
- **Date Normalization**: Converts recognized date formats to ISO 8601 (`YYYY-MM-DDTHH:mm:ssZ`)
51+
- **Multi-Format Parsing**: Handles colon-separated, square bracket, and multi-line formats
52+
- **Domain Fallback**: Uses input domain when WHOIS doesn't return it explicitly
53+
- **Robust Error Handling**: Graceful fallbacks for missing fields
54+
- **Comprehensive Testing**: 169 ccTLD test suite included
55+
56+
## Installation
57+
58+
```bash
59+
npm install
60+
```
61+
62+
## Usage
63+
64+
```javascript
65+
import { parseWhoisData, whoisQuery } from './whois-parser.js';
66+
67+
// Query a WHOIS server
68+
const whoisText = await whoisQuery('google.jp', 'whois.jprs.jp');
69+
70+
// Parse the response
71+
const parsed = parseWhoisData(whoisText, 'google.jp');
72+
73+
console.log(parsed);
74+
// {
75+
// domainName: 'GOOGLE.JP',
76+
// registrar: 'Google LLC',
77+
// creationDate: '2005-05-30T00:00:00Z',
78+
// expirationDate: '2026-05-31T00:00:00Z',
79+
// nameservers: ['ns1.google.com', 'ns2.google.com', ...],
80+
// registrant: 'Google LLC',
81+
// status: ['Active', 'DomainTransferLocked', 'AgentChangeLocked'],
82+
// dnssec: null,
83+
// lastModified: '2025-06-01T01:05:04.000Z'
84+
// }
85+
```
86+
87+
## Testing
88+
89+
Run the comprehensive test suite:
90+
91+
```bash
92+
# Quick test (19 popular ccTLDs, ~30 seconds)
93+
npm run test:sample
94+
95+
# Full test (all 169 ccTLDs, ~5 minutes)
96+
npm test
97+
```
98+
99+
### Test Output
100+
101+
```
102+
[1/169] Testing .jp (google.jp)...
103+
✅ OK (domain: GOOGLE.JP, created: 2005-05-30T00:00:00Z)
104+
105+
[2/169] Testing .kr (naver.kr)...
106+
✅ OK (domain: naver.kr, created: 2007-03-02T00:00:00Z)
107+
108+
========== SUMMARY ==========
109+
✅ Successful: 52/169
110+
⚠️ Parsing Issues: 99/169
111+
❌ Failed: 18/169
112+
```
113+
114+
## Supported Formats
115+
116+
### Date Formats
117+
118+
```
119+
2005/05/30 → 2005-05-30T00:00:00Z (.jp)
120+
2007. 03. 02. → 2007-03-02T00:00:00Z (.kr)
121+
30th April 2003 → 2003-04-30T00:00:00Z (.gg, .je)
122+
30th April each year → 2026-04-30T00:00:00Z (recurring)
123+
2005-02-14T20:35:14.765Z → 2005-02-14T20:35:14.765Z (standard)
124+
```
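
The conversions listed above can be sketched in a few lines. This is a minimal illustration and not the logic in `whois-parser.js`: `normalizeDate` is a hypothetical name, and resolving "each year" to the next occurrence on or after the current date is an assumption.

```javascript
// Hypothetical helper (not taken from whois-parser.js).
const MONTHS = ['january', 'february', 'march', 'april', 'may', 'june',
  'july', 'august', 'september', 'october', 'november', 'december'];

function normalizeDate(raw, now = new Date()) {
  const text = raw.trim();

  // Already ISO 8601 with a time component: return unchanged.
  if (/^\d{4}-\d{2}-\d{2}T/.test(text)) return text;

  const pad = (n) => String(n).padStart(2, '0');
  const iso = (y, m, d) => `${y}-${pad(m)}-${pad(d)}T00:00:00Z`;

  // "2005/05/30" (.jp) and "2007. 03. 02." (.kr): year, month, day.
  let m = text.match(/^(\d{4})\s*[\/.]\s*(\d{1,2})\s*[\/.]\s*(\d{1,2})\.?$/);
  if (m) return iso(m[1], m[2], m[3]);

  // "30th April 2003" and "30th April each year" (.gg, .je).
  m = text.match(/^(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]+)\s+(\d{4}|each year)$/i);
  if (m) {
    const month = MONTHS.indexOf(m[2].toLowerCase()) + 1;
    if (month === 0) return null; // unknown month name
    const day = Number(m[1]);
    if (m[3].toLowerCase() !== 'each year') return iso(m[3], month, day);
    // Assumption: a recurring date maps to its next occurrence.
    let year = now.getUTCFullYear();
    if (Date.UTC(year, month - 1, day) < now.getTime()) year += 1;
    return iso(year, month, day);
  }

  return null; // unrecognized format
}
```

Returning `null` for unrecognized input keeps a bad date from being mistaken for a valid one.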
125+
126+
### Field Formats
127+
128+
**Square Brackets** (.jp):
129+
```
130+
[Domain Name] GOOGLE.JP
131+
[Name Server] ns1.google.com
132+
[登録年月日] 2005/05/30
133+
```
134+
135+
**Colon-Separated** (most ccTLDs):
136+
```
137+
Domain Name: google.kr
138+
Registered Date: 2007. 03. 02.
139+
Host Name: ns1.google.com
140+
```
141+
142+
**Dotted Format** (.ax, .kz):
143+
```
144+
domain...............: test.ax
145+
```
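
The three single-line layouts above (square brackets, colon-separated, dotted) can all be reduced to a key/value pair. The sketch below is illustrative only: `parseFieldLine` is a hypothetical name, and splitting at the first colon is an assumption that may not suit every registry.

```javascript
// Hypothetical helper (not taken from whois-parser.js).
// Returns { key, value } for a single WHOIS line, or null if no field is found.
function parseFieldLine(line) {
  // Square brackets, e.g. "[Domain Name] GOOGLE.JP"
  let m = line.match(/^\s*\[([^\]]+)\]\s*(.*)$/);
  if (m) return { key: m[1].trim(), value: m[2].trim() };

  // Colon-separated, e.g. "Domain Name: google.kr", and dotted,
  // e.g. "domain...............: test.ax" (dot padding before the colon is dropped).
  m = line.match(/^\s*([^:]+?)\.*\s*:\s*(.*)$/);
  if (m) return { key: m[1].trim(), value: m[2].trim() };

  return null;
}
```

Because only the first colon is treated as the separator, values that contain colons, such as ISO timestamps, stay intact.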
146+
147+
**Multi-Line** (.gg, .je, .it):
148+
```
149+
Name servers:
150+
ns1.google.com
151+
ns2.google.com
152+
```
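
A multi-line section like the one above needs a small amount of state: once the header is seen, following lines are collected until one stops looking like a hostname. The sketch below is an illustration under assumptions, with `parseNameserverSection` as a hypothetical name and a `Name servers:` header as the only trigger; the actual parser may recognize other headers.

```javascript
// Hypothetical helper (not taken from whois-parser.js).
function parseNameserverSection(whoisText) {
  const nameservers = [];
  let inSection = false;

  for (const line of whoisText.split(/\r?\n/)) {
    // A header with nothing after the colon opens the section.
    if (/^\s*name\s*servers?\s*:\s*$/i.test(line)) {
      inSection = true;
      continue;
    }
    if (!inSection) continue;

    // Take the first token so trailing IP annotations are ignored.
    const host = line.trim().split(/\s+/)[0].toLowerCase();

    // A blank line or anything that is not a hostname ends the section.
    if (!/^[a-z0-9-]+(\.[a-z0-9-]+)+\.?$/.test(host)) break;

    nameservers.push(host.replace(/\.$/, '')); // drop a trailing root dot
  }

  return nameservers;
}
```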
153+
154+
## Known Limitations
155+
156+
### Missing Creation Dates
157+
158+
Some registries (.de, .be, .dk, .at, .im) don't publicly expose creation dates via WHOIS. This is a registry policy, not a parser limitation.
159+
160+
### Server Timeouts
161+
162+
Six ccTLDs still time out even at 30 seconds: .dz, .gp, .mw, .ng, .sb, .tk
163+
164+
### Offline Servers
165+
166+
Some WHOIS servers are permanently offline or block automated queries: .bo, .cf, .ch, .es, .hm, .iq, .mz, .pf, .tr
167+
168+
## Contributing
169+
170+
Found a ccTLD that doesn't parse correctly? We'd love a PR!
171+
172+
1. Run the test suite to identify the failing TLD
173+
2. Query the WHOIS server manually: `whois -h <server> <domain>`
174+
3. Identify the unique format patterns
175+
4. Update `parseWhoisData()` with new patterns
176+
5. Re-run tests to verify
177+
178+
## Credits
179+
180+
- **Built by**: [DomainDetails.com](https://domaindetails.com) team
181+
- **Special thanks**: [@synozeer](https://x.com/synozeer) for spotting .gg and other ccTLD issues
182+
- **Powered by**: Claude Code for the comprehensive parser refactor
183+
184+
## License
185+
186+
MIT License - feel free to use in your projects!
187+
188+
## Related Projects
189+
190+
- [whois](https://www.npmjs.com/package/whois) - Node.js WHOIS client
191+
- [whoiser](https://www.npmjs.com/package/whoiser) - Alternative WHOIS parser
192+
- [DomainDetails.com](https://domaindetails.com) - Free domain lookup tool using this parser
193+
194+
---
195+
196+
**Found this useful?** Give us a star ⭐ at [github.com/simplebytes-com/whois-parser](https://github.com/simplebytes-com/whois-parser) and check out [DomainDetails.com](https://domaindetails.com)!

example.js

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
1+
/**
2+
* Example usage of the WHOIS Parser
3+
*/
4+
5+
import { parseWhoisData, whoisQuery } from './whois-parser.js';
6+
import fs from 'fs';
7+
8+
// Load WHOIS server dictionary
9+
const whoisDict = JSON.parse(fs.readFileSync('./whois_dict.json', 'utf8'));
10+
11+
async function lookupDomain(domain) {
12+
try {
13+
// Extract TLD from domain
14+
const tld = domain.split('.').pop();
15+
16+
// Get WHOIS server for this TLD
17+
const server = whoisDict[tld];
18+
if (!server) {
19+
throw new Error(`No WHOIS server found for .${tld}`);
20+
}
21+
22+
console.log(`\nQuerying ${domain} via ${server}...`);
23+
24+
// Query WHOIS server
25+
const whoisText = await whoisQuery(domain, server);
26+
27+
// Parse the response
28+
const parsed = parseWhoisData(whoisText, domain);
29+
30+
// Display results
31+
console.log('\n========== WHOIS DATA ==========');
32+
console.log(`Domain: ${parsed.domainName || 'N/A'}`);
33+
console.log(`Registrar: ${parsed.registrar || 'N/A'}`);
34+
console.log(`Created: ${parsed.creationDate || 'N/A'}`);
35+
console.log(`Expires: ${parsed.expirationDate || 'N/A'}`);
36+
console.log(`Registrant: ${parsed.registrant || 'N/A'}`);
37+
console.log(`DNSSEC: ${parsed.dnssec || 'N/A'}`);
38+
console.log(`Last Modified: ${parsed.lastModified || 'N/A'}`);
39+
40+
if (parsed.nameservers && parsed.nameservers.length > 0) {
41+
console.log(`Nameservers: ${parsed.nameservers.join(', ')}`);
42+
}
43+
44+
if (parsed.status && parsed.status.length > 0) {
45+
console.log(`Status: ${parsed.status.join(', ')}`);
46+
}
47+
48+
return parsed;
49+
50+
} catch (error) {
51+
console.error(`\n❌ Error: ${error.message}`);
52+
throw error;
53+
}
54+
}
55+
56+
// Example usage
57+
const domains = [
58+
'google.jp', // Japan - square bracket format
59+
'naver.kr', // South Korea - Korean/English format
60+
'google.gg', // Guernsey - natural language dates
61+
'google.de', // Germany - Nserver format
62+
];
63+
64+
console.log('='.repeat(50));
65+
console.log('WHOIS Parser - Example Queries');
66+
console.log('='.repeat(50));
67+
68+
// Query each domain sequentially
69+
for (const domain of domains) {
70+
await lookupDomain(domain).catch(() => {}); // Error is already logged above; continue with the next domain
71+
// Add delay between queries to be respectful to WHOIS servers
72+
await new Promise(resolve => setTimeout(resolve, 1000));
73+
}
74+
75+
console.log('\n' + '='.repeat(50));
76+
console.log('All queries completed!');
77+
console.log('='.repeat(50));

package.json

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
{
2+
"name": "cctld-whois-parser",
3+
"version": "1.0.0",
4+
"description": "Comprehensive WHOIS parser with support for 169 country-code TLDs (ccTLDs)",
5+
"type": "module",
6+
"main": "whois-parser.js",
7+
"scripts": {
8+
"test": "node test-full.js",
9+
"test:sample": "node test-sample.js"
10+
},
11+
"keywords": [
12+
"whois",
13+
"parser",
14+
"ccTLD",
15+
"domain",
16+
"dns",
17+
"registry",
18+
"international",
19+
"japan",
20+
"korea",
21+
"russia",
22+
"europe"
23+
],
24+
"author": "Simple Bytes LLC",
25+
"license": "MIT",
26+
"repository": {
27+
"type": "git",
28+
"url": "https://github.com/simplebytes-com/whois-parser.git"
29+
},
30+
"homepage": "https://github.com/simplebytes-com/whois-parser#readme",
31+
"bugs": {
32+
"url": "https://github.com/simplebytes-com/whois-parser/issues"
33+
},
34+
"engines": {
35+
"node": ">=18.0.0"
36+
}
37+
}
