Commit 8d190fe

Improve compliance with CSV standard
1 parent 89c210e commit 8d190fe

28 files changed

Lines changed: 810775 additions & 804703 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+tab
File renamed without changes.
Lines changed: 29 additions & 31 deletions
@@ -1,31 +1,29 @@
-WordNet 3.1 License
-
-1 This software and database is being provided to you, the LICENSEE, by
-2 Princeton University under the following license. By obtaining, using
-3 and/or copying this software and database, you agree that you have
-4 read, understood, and will comply with these terms and conditions.:
-5
-6 Permission to use, copy, modify and distribute this software and
-7 database and its documentation for any purpose and without fee or
-8 royalty is hereby granted, provided that you agree to comply with
-9 the following copyright notice and statements, including the disclaimer,
-10 and that the same appear on ALL copies of the software, database and
-11 documentation, including modifications that you make for internal
-12 use or for distribution.
-13
-14 WordNet 3.1 Copyright 2011 by Princeton University. All rights reserved.
-15
-16 THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON
-17 UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
-18 IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON
-19 UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANT-
-20 ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE
-21 OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT
-22 INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
-23 OTHER RIGHTS.
-24
-25 The name of Princeton University or Princeton may not be used in
-26 advertising or publicity pertaining to distribution of the software
-27 and/or database. Title to copyright in this software, database and
-28 any associated documentation shall at all times remain with
-29 Princeton University and LICENSEE agrees to preserve same.
+1 This software and database is being provided to you, the LICENSEE, by
+2 Princeton University under the following license. By obtaining, using
+3 and/or copying this software and database, you agree that you have
+4 read, understood, and will comply with these terms and conditions.:
+5
+6 Permission to use, copy, modify and distribute this software and
+7 database and its documentation for any purpose and without fee or
+8 royalty is hereby granted, provided that you agree to comply with
+9 the following copyright notice and statements, including the disclaimer,
+10 and that the same appear on ALL copies of the software, database and
+11 documentation, including modifications that you make for internal
+12 use or for distribution.
+13
+14 WordNet 3.1 Copyright 2011 by Princeton University. All rights reserved.
+15
+16 THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON
+17 UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
+18 IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON
+19 UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANT-
+20 ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE
+21 OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT
+22 INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
+23 OTHER RIGHTS.
+24
+25 The name of Princeton University or Princeton may not be used in
+26 advertising or publicity pertaining to distribution of the software
+27 and/or database. Title to copyright in this software, database and
+28 any associated documentation shall at all times remain with
+29 Princeton University and LICENSEE agrees to preserve same.

Makefile

Lines changed: 5 additions & 1 deletion
@@ -1,12 +1,16 @@
 # WNcsv (c) 2020 Eric Kafe
 # License: CC BY 4.0, https://creativecommons.org/licenses/by/4.0/
 
+.PHONY: tab
+
 tab:
 	@chmod a+x csv2tab
+	@mkdir -p tab
 	@echo Converting CSV databases to TAB
 	./csv2tab
 
 clean:
 	@echo Deleting TAB files
-	@rm csv/*.tab
+	-@rm tab/*.tab
+	-@rmdir tab

README.md

Lines changed: 38 additions & 23 deletions
@@ -9,35 +9,50 @@ WNprolog-3.0 documentation (c) 2012 Princeton University.
 The present release contains the following numbers of
 unique database posts:
 
-- wn_ant.csv: 7988
-- wn_at.csv: 1278
-- wn_cls.csv: 9559
-- wn_cs.csv: 221
-- wn_der.csv: 74781
-- wn_ent.csv: 408
-- wn_fr.csv: 21684
-- wn_g.csv: 117791
-- wn_hyp.csv: 89172
-- wn_ins.csv: 8589
-- wn_mm.csv: 12288
-- wn_mp.csv: 9111
-- wn_ms.csv: 797
-- wn_per.csv: 8074
-- wn_ppl.csv: 73
-- wn_sa.csv: 4054
-- wn_s.csv: 207272
-- wn_sim.csv: 21434
-- wn_sk.csv: 207272
-- wn_syntax.csv: 1054
-- wn_vgp.csv: 1744
-- total: 804644
+7988 wn_ant.csv
+1278 wn_at.csv
+9559 wn_cls.csv
+221 wn_cs.csv
+74781 wn_der.csv
+408 wn_ent.csv
+6053 wn_exc.csv
+21684 wn_fr.csv
+117791 wn_g.csv
+89172 wn_hyp.csv
+8589 wn_ins.csv
+12288 wn_mm.csv
+9111 wn_mp.csv
+797 wn_ms.csv
+8074 wn_per.csv
+73 wn_ppl.csv
+4054 wn_sa.csv
+207272 wn_s.csv
+21434 wn_sim.csv
+207272 wn_sk.csv
+1054 wn_syntax.csv
+1744 wn_vgp.csv
+810697 total
+
+
+## Other versions of WordNet in CSV format
+
+This repository also includes alternative branches, with CSV versions
+of Princeton WordNet 3.0 and 3.1, or Open English Wordnet Editions 2022
+and 2025+.
 
 
 ## Utilities:
 
 For convenient inter-operation with other projects, the included _csv2tab_ script
-converts the CSV databases to tab-separated files, which can be easily imported
+converts the CSV databases to tab-separated files (TSV), which can be easily imported
 into many database systems.
 
 - "make tab" produces a ".tab" file for every ".csv" file in the "csv" directory.
 - "make clean" deletes the ".tab" files.
+
+
+## News (2026):
+
+- fix double-quotes in CSV strings.
+- Separate db records with CRLF, as required by RFC 4180.
+- Output ".tab" files to a separate "tab" directory.
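
The CSV changes named in the News entries (doubled double-quotes inside quoted strings, CRLF record separators) and the CSV-to-tab conversion can be illustrated with a short Python sketch. This is not the repository's _csv2tab_ script, whose implementation is not shown in this commit view. The function names `write_rfc4180` and `csv_to_tab` and the sample record are hypothetical, and replacing embedded tabs with spaces is an assumption made here to keep the output unambiguous.

```python
import csv
import io


def write_rfc4180(rows):
    """Serialize rows as CSV text in the style of RFC 4180.

    Records end with CRLF, and a double quote inside a field is
    escaped by doubling it within a quoted field.
    """
    buf = io.StringIO(newline="")  # newline="" disables newline translation
    writer = csv.writer(
        buf,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        doublequote=True,           # '"' inside a field becomes '""'
        lineterminator="\r\n",      # CRLF between records
    )
    writer.writerows(rows)
    return buf.getvalue()


def csv_to_tab(csv_text):
    """Convert CSV text to tab-separated text (hypothetical helper).

    Assumption: a tab inside a field is replaced by a space so that
    the number of columns stays the same in the output.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    lines = ["\t".join(field.replace("\t", " ") for field in row)
             for row in reader]
    return "\n".join(lines) + "\n"


# Hypothetical sample record, not taken from the WordNet databases.
sample = [["42", 'a "quoted" gloss', "n"]]
csv_text = write_rfc4180(sample)
tab_text = csv_to_tab(csv_text)
```

Reading the CSV back through `csv.reader` rather than splitting on commas is what lets quoted fields containing commas or doubled quotes survive the conversion intact.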
