| 1 | +# WHOIS Parser for ccTLDs |
| 2 | + |
| 3 | +A battle-tested WHOIS parser covering 169 country-code TLDs (ccTLDs), 52 of which currently parse fully. Built for [DomainDetails.com](https://domaindetails.com) by Simple Bytes LLC. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +This parser deals with the messy reality of WHOIS data, which differs from one country registry to the next. Unlike simple regex-based parsers, it handles: |
| 8 | + |
| 9 | +- **Multiple date formats** - from `2005/05/30` (Japan) to `2007. 03. 02.` (South Korea) to natural language dates |
| 10 | +- **International field names** - including Japanese (登録年月日), Korean, and other non-English formats |
| 11 | +- **Various nameserver formats** - square brackets, multi-line sections, colon-separated with IPs |
| 12 | +- **Slow WHOIS servers** - with configurable 30-second timeouts |
| 13 | +- **Fallback parsing** - when domain names aren't explicitly listed in responses |
| 14 | + |
| 15 | +## Success Rate |
| 16 | + |
| 17 | +**52/169 ccTLDs fully supported** (31% perfect parsing) |
| 18 | + |
| 19 | +- ✅ **Full parsing**: 52 ccTLDs with domain, dates, nameservers, registrar |
| 20 | +- ⚠️ **Partial parsing**: 99 ccTLDs with some fields (often missing creation dates - registry limitation) |
| 21 | +- ❌ **Failed**: 18 ccTLDs (server offline, timeout >30s, or connection refused) |
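The three buckets above could be assigned with a check along these lines. This is a minimal sketch, not the test suite's actual logic: `classifyResult` is a hypothetical name, the field names are taken from the Usage example, and "dates" is assumed to mean the creation date only.

```javascript
// Hypothetical classifier: sorts a parsed result into 'full', 'partial', or 'failed'.
// Assumption: "full" means domain, creation date, registrar, and at least one nameserver.
function classifyResult(parsed) {
  // No result at all (timeout, refused connection, offline server).
  if (!parsed) return 'failed';

  const requiredFields = ['domainName', 'creationDate', 'registrar'];
  const hasRequiredFields = requiredFields.every((key) => Boolean(parsed[key]));
  const hasNameservers =
    Array.isArray(parsed.nameservers) && parsed.nameservers.length > 0;

  return hasRequiredFields && hasNameservers ? 'full' : 'partial';
}
```

A registry that withholds creation dates would land in `partial` under this rule, which matches the note about registry limitations.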
| 22 | + |
| 23 | +### ✅ Fully Supported (52 ccTLDs) |
| 24 | + |
| 25 | +`.ac` `.af` `.ag` `.bh` `.bi` `.bj` `.ci` `.cl` `.co` `.dm` `.do` `.ge` `.gg` `.gi` `.gl` `.hr` `.hu` `.ie` `.io` `.it` `.je` `.jp` `.kr` `.kz` `.la` `.ma` `.me` `.mk` `.mn` `.mx` `.my` `.nu` `.nz` `.pk` `.pt` `.ru` `.sc` `.se` `.sg` `.sh` `.sk` `.so` `.st` `.su` `.sx` `.sy` `.tc` `.td` `.tl` `.us` `.ve` `.ws` `.信息` |
| 26 | + |
| 27 | +### ⏱️ Timeout Issues (8 ccTLDs) |
| 28 | + |
| 29 | +These servers take longer than 30 seconds to respond: |
| 30 | + |
| 31 | +`.dz` `.gp` `.mw` `.ng` `.pt` `.sb` `.tk` `.uy` |
| 32 | + |
| 33 | +### 🚫 Connection Refused (4 ccTLDs) |
| 34 | + |
| 35 | +These servers are offline or block automated queries: |
| 36 | + |
| 37 | +`.bo` `.cf` `.hm` `.pf` |
| 38 | + |
| 39 | +### 🌟 Highlighted Format Support |
| 40 | + |
| 41 | +- 🇯🇵 **Japan (.jp)** - Square bracket format with Japanese fields |
| 42 | +- 🇰🇷 **South Korea (.kr)** - Korean/English dual format with wide spacing |
| 43 | +- 🇬🇬 **Guernsey (.gg)**, 🇯🇪 **Jersey (.je)** - Natural language dates |
| 44 | +- 🇷🇺 **Russia (.ru)** - Cyrillic field names with IP-annotated nameservers |
| 45 | +- 🇮🇹 **Italy (.it)** - Multi-line nameserver sections |
| 46 | +- 🇩🇪 **Germany (.de)** - `Nserver:` nameserver fields |
| 47 | + |
| 48 | +## Features |
| 49 | + |
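| 50 | +- **Date Normalization**: Converts recognized date formats to ISO 8601 UTC timestamps (`YYYY-MM-DDTHH:mm:ssZ`); timestamps that are already ISO 8601 pass through unchanged, including any milliseconds |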
| 51 | +- **Multi-Format Parsing**: Handles colon-separated, square bracket, and multi-line formats |
| 52 | +- **Domain Fallback**: Uses input domain when WHOIS doesn't return it explicitly |
| 53 | +- **Robust Error Handling**: Graceful fallbacks for missing fields |
| 54 | +- **Comprehensive Testing**: 169 ccTLD test suite included |
| 55 | + |
| 56 | +## Installation |
| 57 | + |
| 58 | +```bash |
| 59 | +npm install |
| 60 | +``` |
| 61 | + |
| 62 | +## Usage |
| 63 | + |
| 64 | +```javascript |
| 65 | +import { parseWhoisData, whoisQuery } from './whois-parser.js'; |
| 66 | + |
| 67 | +// Query a WHOIS server |
| 68 | +const whoisText = await whoisQuery('google.jp', 'whois.jprs.jp'); |
| 69 | + |
| 70 | +// Parse the response |
| 71 | +const parsed = parseWhoisData(whoisText, 'google.jp'); |
| 72 | + |
| 73 | +console.log(parsed); |
| 74 | +// { |
| 75 | +// domainName: 'GOOGLE.JP', |
| 76 | +// registrar: 'Google LLC', |
| 77 | +// creationDate: '2005-05-30T00:00:00Z', |
| 78 | +// expirationDate: '2026-05-31T00:00:00Z', |
| 79 | +// nameservers: ['ns1.google.com', 'ns2.google.com', ...], |
| 80 | +// registrant: 'Google LLC', |
| 81 | +// status: ['Active', 'DomainTransferLocked', 'AgentChangeLocked'], |
| 82 | +// dnssec: null, |
| 83 | +// lastModified: '2025-06-01T01:05:04.000Z' |
| 84 | +// } |
| 85 | +``` |
| 86 | + |
| 87 | +## Testing |
| 88 | + |
| 89 | +Run the comprehensive test suite: |
| 90 | + |
| 91 | +```bash |
| 92 | +# Quick test (19 popular ccTLDs, ~30 seconds) |
| 93 | +npm run test:sample |
| 94 | + |
| 95 | +# Full test (all 169 ccTLDs, ~5 minutes) |
| 96 | +npm test |
| 97 | +``` |
| 98 | + |
| 99 | +### Test Output |
| 100 | + |
| 101 | +``` |
| 102 | +[1/169] Testing .jp (google.jp)... |
| 103 | + ✅ OK (domain: GOOGLE.JP, created: 2005-05-30T00:00:00Z) |
| 104 | + |
| 105 | +[2/169] Testing .kr (naver.kr)... |
| 106 | + ✅ OK (domain: naver.kr, created: 2007-03-02T00:00:00Z) |
| 107 | + |
| 108 | +========== SUMMARY ========== |
| 109 | +✅ Successful: 52/169 |
| 110 | +⚠️ Parsing Issues: 99/169 |
| 111 | +❌ Failed: 18/169 |
| 112 | +``` |
| 113 | + |
| 114 | +## Supported Formats |
| 115 | + |
| 116 | +### Date Formats |
| 117 | + |
| 118 | +``` |
| 119 | +2005/05/30 → 2005-05-30T00:00:00Z (.jp) |
| 120 | +2007. 03. 02. → 2007-03-02T00:00:00Z (.kr) |
| 121 | +30th April 2003 → 2003-04-30T00:00:00Z (.gg, .je) |
| 122 | +30th April each year → 2026-04-30T00:00:00Z (recurring) |
| 123 | +2005-02-14T20:35:14.765Z → 2005-02-14T20:35:14.765Z (standard) |
| 124 | +``` |
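The conversions in the table above can be approximated with a few regular expressions. The sketch below is illustrative only: `normalizeDate` is a hypothetical helper, not the parser's real function, and it leaves out the recurring "each year" form.

```javascript
// Hypothetical date normalizer covering the formats listed above.
const MONTHS = [
  'january', 'february', 'march', 'april', 'may', 'june',
  'july', 'august', 'september', 'october', 'november', 'december',
];

const pad = (value) => String(value).padStart(2, '0');

function normalizeDate(raw) {
  const text = raw.trim();

  // Already an ISO 8601 timestamp: pass through unchanged.
  if (/^\d{4}-\d{2}-\d{2}T/.test(text)) return text;

  // 2005/05/30 (.jp), 2007. 03. 02. (.kr), or a plain 2005-05-30.
  let m = text.match(/^(\d{4})\s*[\/.-]\s*(\d{1,2})\s*[\/.-]\s*(\d{1,2})\.?$/);
  if (m) return `${m[1]}-${pad(m[2])}-${pad(m[3])}T00:00:00Z`;

  // 30th April 2003 (.gg, .je): day with optional ordinal suffix, month name, year.
  m = text.match(/^(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]+)\s+(\d{4})$/);
  if (m) {
    const month = MONTHS.indexOf(m[2].toLowerCase()) + 1;
    if (month > 0) return `${m[3]}-${pad(month)}-${pad(m[1])}T00:00:00Z`;
  }

  // Unrecognized format: let the caller decide how to fall back.
  return null;
}
```

Returning `null` for unknown input, rather than guessing, keeps a bad date from being reported as a valid one.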
| 125 | + |
| 126 | +### Field Formats |
| 127 | + |
| 128 | +**Square Brackets** (.jp): |
| 129 | +``` |
| 130 | +[Domain Name] GOOGLE.JP |
| 131 | +[Name Server] ns1.google.com |
| 132 | +[登録年月日] 2005/05/30 |
| 133 | +``` |
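One way to read the square-bracket layout is to treat the bracketed text as the key and the rest of the line as the value. The sketch below assumes that shape; `parseBracketFields` is a hypothetical name and may differ from how `parseWhoisData()` does it internally.

```javascript
// Hypothetical parser for "[Key] value" lines, as used by the .jp registry.
// Repeated keys (e.g. several [Name Server] lines) are collected into arrays.
function parseBracketFields(whoisText) {
  const fields = new Map();

  for (const line of whoisText.split(/\r?\n/)) {
    const m = line.match(/^\s*\[([^\]]+)\]\s*(.*)$/);
    if (!m) continue;

    const key = m[1].trim();
    const value = m[2].trim();
    if (!value) continue; // skip bracketed notices that carry no value

    if (!fields.has(key)) fields.set(key, []);
    fields.get(key).push(value);
  }

  return fields;
}
```

Because the key is matched as any text between brackets, non-English keys such as `登録年月日` need no special handling at this stage; mapping them to `creationDate` would be a separate lookup step.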
| 134 | + |
| 135 | +**Colon-Separated** (most ccTLDs): |
| 136 | +``` |
| 137 | +Domain Name: google.kr |
| 138 | +Registered Date: 2007. 03. 02. |
| 139 | +Host Name: ns1.google.com |
| 140 | +``` |
| 141 | + |
| 142 | +**Dotted Format** (.ax, .kz): |
| 143 | +``` |
| 144 | +domain...............: test.ax |
| 145 | +``` |
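Colon-separated and dotted lines differ only in the run of dots before the colon, so a single pattern can cover both. A minimal sketch, assuming one field per line; `parseKeyValueLine` is a hypothetical helper name.

```javascript
// Hypothetical line parser for "Key: value" and "key.......: value" layouts.
function parseKeyValueLine(line) {
  // Key (lazy, no colons), optional dot leaders, optional spaces, colon, value.
  const m = line.match(/^\s*([^:]+?)\.*\s*:\s*(.*)$/);
  if (!m) return null;

  const key = m[1].trim();
  const value = m[2].trim();
  return key && value ? { key, value } : null;
}
```

The key is matched lazily up to the first colon, so values that contain colons themselves (URLs, timestamps) stay intact.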
| 146 | + |
| 147 | +**Multi-Line** (.gg, .je, .it): |
| 148 | +``` |
| 149 | +Name servers: |
| 150 | + ns1.google.com |
| 151 | + ns2.google.com |
| 152 | +``` |
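Multi-line sections need a small amount of state: a header line opens the section, and indented lines below it are the values. The sketch below assumes the section ends at the first blank or non-indented line and that any IP address follows the hostname on the same line; `parseNameserverSection` is a hypothetical name.

```javascript
// Hypothetical parser for a "Name servers:" header followed by indented hostnames.
function parseNameserverSection(whoisText) {
  const nameservers = [];
  let inSection = false;

  for (const line of whoisText.split(/\r?\n/)) {
    // A header such as "Name servers:" or "Nameservers" with nothing after it.
    if (/^\s*name\s*servers?\s*:?\s*$/i.test(line)) {
      inSection = true;
      continue;
    }
    if (!inSection) continue;

    // Assumption: a blank or non-indented line closes the section.
    if (line.trim() === '' || !/^\s/.test(line)) {
      inSection = false;
      continue;
    }

    // Keep the first token only (drops trailing IPs), lowercase, strip a trailing dot.
    const host = line.trim().split(/\s+/)[0].toLowerCase().replace(/\.$/, '');
    nameservers.push(host);
  }

  return nameservers;
}
```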
| 153 | + |
| 154 | +## Known Limitations |
| 155 | + |
| 156 | +### Missing Creation Dates |
| 157 | + |
| 158 | +Some registries (.de, .be, .dk, .at, .im) don't publicly expose creation dates via WHOIS. This is a registry policy, not a parser limitation. |
| 159 | + |
| 160 | +### Server Timeouts |
| 161 | + |
| 162 | +Six ccTLDs still time out even with the 30-second limit: .dz, .gp, .mw, .ng, .sb, .tk |
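Callers who want their own deadline around a slow query could wrap the promise in a generic timeout. This is a sketch under assumptions: `withTimeout` is a hypothetical helper, not part of this package, and it only rejects the wrapper promise; it does not close the underlying socket.

```javascript
// Hypothetical helper: rejects if the wrapped promise has not settled within `ms`.
function withTimeout(promise, ms = 30000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`WHOIS query timed out after ${ms} ms`)),
      ms,
    );
  });

  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

It could be combined with the `whoisQuery` function from the Usage section, for example `withTimeout(whoisQuery(domain, server), 30000)`, so that one unresponsive registry does not stall a batch run.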
| 163 | + |
| 164 | +### Offline Servers |
| 165 | + |
| 166 | +Some WHOIS servers are permanently offline or block automated queries: .bo, .cf, .ch, .es, .hm, .iq, .mz, .pf, .tr |
| 167 | + |
| 168 | +## Contributing |
| 169 | + |
| 170 | +Found a ccTLD that doesn't parse correctly? We'd love a PR! |
| 171 | + |
| 172 | +1. Run the test suite to identify the failing TLD |
| 173 | +2. Query the WHOIS server manually: `whois -h <server> <domain>` |
| 174 | +3. Identify the unique format patterns |
| 175 | +4. Update `parseWhoisData()` with new patterns |
| 176 | +5. Re-run tests to verify |
| 177 | + |
| 178 | +## Credits |
| 179 | + |
| 180 | +- **Built by**: [DomainDetails.com](https://domaindetails.com) team |
| 181 | +- **Special thanks**: [@synozeer](https://x.com/synozeer) for spotting .gg and other ccTLD issues |
| 182 | +- **Powered by**: Claude Code for the comprehensive parser refactor |
| 183 | + |
| 184 | +## License |
| 185 | + |
| 186 | +MIT License - feel free to use in your projects! |
| 187 | + |
| 188 | +## Related Projects |
| 189 | + |
| 190 | +- [whois](https://www.npmjs.com/package/whois) - Node.js WHOIS client |
| 191 | +- [whoiser](https://www.npmjs.com/package/whoiser) - Alternative WHOIS parser |
| 192 | +- [DomainDetails.com](https://domaindetails.com) - Free domain lookup tool using this parser |
| 193 | + |
| 194 | +--- |
| 195 | + |
| 196 | +**Found this useful?** Give us a star ⭐ at [github.com/simplebytes-com/whois-parser](https://github.com/simplebytes-com/whois-parser) and check out [DomainDetails.com](https://domaindetails.com)! |