Commit 69ffc7b

Initial public commit
1 parent 66d9bbe commit 69ffc7b

9 files changed

Lines changed: 1202 additions & 0 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
# DISCOver
2+
DISCOver: an interface to explore the DISCO corpus

about.html

Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
1+
<!DOCTYPE html>
2+
<html lang="en">
3+
<head>
4+
<meta charset="UTF-8">
5+
<title>About (the DISCOurse)</title>
6+
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
7+
<meta name="HandheldFriendly" content="true" />
8+
<meta name="viewport" content="width=device-width, height=device-height, user-scalable=no" />
9+
<title>DISCO</title>
10+
<script src="js/jquery-3.3.1.min.js"></script>
11+
<script src="js/jquery.tokeninput.js"></script>
12+
<script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script>
13+
<script src="https://code.highcharts.com/highcharts.js"></script>
14+
<script src="https://code.highcharts.com/modules/exporting.js"></script>
15+
<link rel="shortcut icon" href="img/favicon.ico" type="image/x-icon">
16+
<link rel="icon" href="img/favicon.ico" type="image/x-icon">
17+
<script src="js/menu.js"></script>
18+
<link rel="stylesheet" href="css/main.css" type="text/css"/>
19+
<link rel="stylesheet" href="css/jquery-ui.css">
20+
<link rel="stylesheet" href="css/token-input.css" type="text/css" />
21+
</head>
22+
<body>
23+
24+
<!--#include virtual="ssi/menu.html"-->
25+
26+
<main>
27+
<h1>About this corpus</h1>
28+
<h2 id="description">Corpus description</h2>
29+
<p>The corpus currently contains 4087 sonnets in Spanish: 2676 from the 19th
30+
century, 323 from the 18th century and 1088 from the Spanish Golden Age (15th
31+
to 17th centuries). They were written by 1204 authors from Spain and Latin
32+
America. The corpus aims to provide a broad sample, inspired by distant reading approaches <a
33+
href="#moretti" target="_blank">(Moretti, 2005)</a>. Most of the raw texts were extracted from the
34+
<a href="#cervantes" target="_blank">Biblioteca Virtual Miguel de Cervantes (1999)</a>, with some
35+
18th-century texts coming from <a href="https://wikisource.org/wiki/Main_Page" target="_blank">Wikisource</a>. Table 1 in the
36+
Data distribution section below summarizes these figures.</p>
37+
<p>The corpus is available in plain-text and TEI formats; XML-TEI P5 was chosen for this
38+
standard’s benefits in terms of reuse, storage, and retrieval. Author metadata (year and
39+
place of birth and death, and gender) were extracted or inferred from unstructured content
40+
in the sources and placed in the TEI header or, for the plain-text version, in a metadata
41+
table. For both TEI and plain-text formats, two versions of the texts
42+
are available: one collecting all sonnets by each author in a single file, the other encoding a single
43+
sonnet per file. For corpus preparation, we closely followed the TEI guidelines and
44+
RIDE’s criteria for Digital Text Collections <a href="#hk-n" target="_blank">(Henny-Krahmer and Neuber,
45+
2017)</a>.</p>
46+
<p>Additionally, authors have been assigned VIAF identifiers and described using RDFa
47+
attributes. This gives the corpus an entry-point to the Linked Open Data cloud,
48+
enhancing its findability. The corpus is available as a GitHub repository and archived on
49+
Zenodo, in line with good practices for data use, reuse, and preservation.</p><h2
50+
id="graphics">Data distribution</h2>
51+
<table class="desc">
52+
<caption><span id="table1">Table 1</span>: Corpus data distribution per period, author gender and primary continent of
53+
literary activity</caption>
54+
<tr>
55+
<th>Period</th>
56+
<th>No. of sonnets</th>
57+
<th colspan="3">No. of authors</th>
58+
<th>Tokens</th>
59+
</tr>
60+
<tr>
61+
<td rowspan="4"><b>19th</b></td>
62+
<td rowspan="4">2676</td>
63+
<td rowspan="4">685</td>
64+
<td>Female</td>
65+
<td>48</td>
66+
<td rowspan="4">252,518</td>
67+
</tr>
68+
<tr>
69+
<td>Male</td>
70+
<td>637</td>
71+
</tr>
72+
<tr>
73+
<td>America</td>
74+
<td>334</td>
75+
</tr>
76+
<tr>
77+
<td>Europe</td>
78+
<td>348 (+3)</td>
79+
</tr>
80+
<tr>
81+
<td rowspan="4"><b>18th</b></td>
82+
<td rowspan="4">323</td>
83+
<td rowspan="4">42</td>
84+
<td>Female</td>
85+
<td>1</td>
86+
<td rowspan="4">29,006</td>
87+
</tr>
88+
<tr>
89+
<td>Male</td>
90+
<td>41</td>
91+
</tr>
92+
<tr>
93+
<td>America</td>
94+
<td>6</td>
95+
</tr>
96+
<tr>
97+
<td>Europe</td>
98+
<td>36</td>
99+
</tr>
100+
<tr>
101+
<td rowspan="4"><b>15th-17th</b><br />(Golden Age)</td>
102+
<td rowspan="4">1088</td>
103+
<td rowspan="4">477</td>
104+
<td>Female</td>
105+
<td>31</td>
106+
<td rowspan="4">99,779</td>
107+
</tr>
108+
<tr>
109+
<td>Male</td>
110+
<td>446</td>
111+
</tr>
112+
<tr>
113+
<td>America</td>
114+
<td>12</td>
115+
</tr>
116+
<tr>
117+
<td>Europe</td>
118+
<td>458 (+7)</td>
119+
</tr>
120+
</table>
121+
<h2>Bibliography</h2>
122+
<p id="cervantes">Biblioteca Virtual Miguel de Cervantes (1999): <em>Biblioteca Virtual
123+
Miguel de Cervantes</em>
124+
<a href="http://www.cervantesvirtual.com" target="_blank">http://www.cervantesvirtual.com</a></p>
125+
<p id="hk-n">Henny-Krahmer, Ulrike, and Frederike Neuber. 2017. “Criteria for Reviewing Digital Text Collections, Version 1.0.” <em>A Review Journal for Digital Editions and Resources</em>, no. 6. <a href="https://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version-1-0/" target="_blank">https://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version-1-0/</a>.</p>
126+
<p id="moretti">Moretti, Franco. 2005. <em>Graphs, Maps, Trees: Abstract Models for a Literary History</em>. Verso.</p>
127+
<hr />
128+
<div style="text-align:center; font-size:smaller; font-style:italic">
129+
<h3>Cálamo currante</h3>
130+
<p>Si escribir te propones un soneto,<br/>
131+
ve haciendo lo que yo, que, a fe, no es harto;<br/>
132+
tras el verso tercero saldrá el cuarto...<br/>
133+
¡Si es coser y cantar! ¡Mira: un cuarteto!</p>
134+
135+
<p>Haz otro igual después, que te prometo<br/>
136+
que si aquesto es parir, es fácil parto;<br/>
137+
van seis versos, y el séptimo ya ensarto;<br/>
138+
otro, y van ocho, y al primer terceto.</p>
139+
140+
<p>Todo es que el verso nono venga al baile<br/>
141+
y el décimo en la rueda esté metido.<br/>
142+
¿Hay consonante a baile y fraile? Haíle.</p>
143+
144+
<p>Pues entonces, ya es esto pan comido,<br/>
145+
y cata a Periquillo hecho fraile,<br/>
146+
y cata el sonetejo concluido.</p>
147+
<p style="font-style:normal;">Francisco de Osuna</p>
148+
</div>
149+
</main>
150+
<!--#include virtual="ssi/footer.html"-->
151+
152+
</body>
153+
</html>
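
The counts in Table 1 of about.html can be cross-checked against the totals given in its corpus description (4087 sonnets, 1204 authors). Below is a minimal sketch in Python, with the figures transcribed by hand from the table. The dictionary layout and the function names are illustrative assumptions and are not part of this commit.

```python
# Per-period figures transcribed from Table 1 in about.html.
# The structure and names below are hypothetical, for illustration only.
PERIODS = {
    "19th": {"sonnets": 2676, "authors": 685, "female": 48, "male": 637},
    "18th": {"sonnets": 323, "authors": 42, "female": 1, "male": 41},
    "15th-17th": {"sonnets": 1088, "authors": 477, "female": 31, "male": 446},
}


def totals(periods):
    """Return (total sonnets, total authors) summed over all periods."""
    return (
        sum(p["sonnets"] for p in periods.values()),
        sum(p["authors"] for p in periods.values()),
    )


def gender_mismatches(periods):
    """Return the periods whose female + male counts differ from the author count."""
    return [
        name
        for name, p in periods.items()
        if p["female"] + p["male"] != p["authors"]
    ]


if __name__ == "__main__":
    print(totals(PERIODS))             # (4087, 1204)
    print(gender_mismatches(PERIODS))  # []
```

A check like this could run whenever the table is updated, so that the prose totals and the per-period rows do not drift apart.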

citation.html

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!DOCTYPE html>
3+
<html xmlns="http://www.w3.org/1999/xhtml">
4+
<head>
5+
<title>About DISCO</title>
6+
<meta charset="utf-8" />
7+
<link href="css/ancillary.css" rel="stylesheet" type="text/css" />
8+
<link href="css/temporary.css" rel="stylesheet" type="text/css" />
9+
</head>
10+
11+
<body>
12+
<!--#include virtual="ssi/menu.html"-->
13+
<h1>Credits</h1>
14+
<p class='cite'>This interface visualizes and analyses the data available at:</p>
15+
<p class="cite">Ruiz Fabo, Pablo, Helena Bermúdez Sabel, Clara Martínez Cantón, and José
16+
Calvo Tello. 2017. <em>Diachronic Spanish Sonnet Corpus</em> (DISCO). Madrid: UNED. <a
17+
href="https://github.com/pruizf/disco" target="_blank"
18+
>https://github.com/pruizf/disco</a>. <a
19+
href="https://zenodo.org/badge/latestdoi/103841064" target="_blank"><img
20+
src="https://zenodo.org/badge/103841064.svg" /></a>
21+
</p>
22+
<p class="cite">That dataset was enhanced, and rhyme annotation was added using the tool <a
23+
href="https://github.com/versotym/rhymeTagger" target="_blank">RhymeTagger</a>,
24+
developed by <a target="_blank" href="http://www.versologie.cz/en/plechac.html">Petr Plecháč</a> (Ústav pro českou literaturu AV ČR, v. v. i.).</p>
25+
</body>
26+
</html>

credits.html

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
1+
<!DOCTYPE html>
2+
<html lang="en">
3+
<head>
4+
<meta charset="UTF-8">
5+
<title>Credits</title>
6+
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
7+
<meta name="HandheldFriendly" content="true" />
8+
<meta name="viewport" content="width=device-width, height=device-height, user-scalable=no" />
9+
<title>DISCO</title>
10+
<script src="js/jquery-3.3.1.min.js"></script>
11+
<script src="js/jquery.tokeninput.js"></script>
12+
<script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script>
13+
<script src="https://code.highcharts.com/highcharts.js"></script>
14+
<script src="https://code.highcharts.com/modules/exporting.js"></script>
15+
<link rel="shortcut icon" href="img/favicon.ico" type="image/x-icon">
16+
<link rel="icon" href="img/favicon.ico" type="image/x-icon">
17+
<script src="js/menu.js"></script>
18+
<link rel="stylesheet" href="css/main.css" type="text/css"/>
19+
<link rel="stylesheet" href="css/jquery-ui.css">
20+
<link rel="stylesheet" href="css/token-input.css" type="text/css" />
21+
</head>
22+
<body>
23+
24+
<!--#include virtual="ssi/menu.html"-->
25+
26+
<main>
27+
<h1>Credits</h1>
28+
<h2>How to cite</h2>
29+
<p><strong>Bermúdez Sabel, Helena, Clara Martínez Cantón, and Pablo Ruiz Fabo. 2019. <em>DISCOver: an interface to explore the DISCO corpus.</em> <a href="http://prf1.org/disco/">http://prf1.org/disco/</a></strong></p>
30+
<h2>Dataset</h2>
31+
<p>This interface visualizes and analyses the <strong>data</strong> available at:</p>
32+
<blockquote>Ruiz Fabo, Pablo, Helena Bermúdez Sabel, Clara Martínez Cantón, and José
33+
Calvo Tello. 2017. <em>Diachronic Spanish Sonnet Corpus</em> (DISCO). Madrid: UNED. <a
34+
href="https://github.com/pruizf/disco" target="_blank"
35+
>https://github.com/pruizf/disco</a>. <a
36+
href="https://zenodo.org/badge/latestdoi/103841064" target="_blank"><img
37+
src="https://zenodo.org/badge/103841064.svg" /></a>
38+
</blockquote>
39+
<p>This dataset was enhanced, and <strong>rhyme annotation</strong> was added using the tool <a
40+
href="https://github.com/versotym/rhymeTagger" target="_blank">RhymeTagger</a>,
41+
developed by <a target="_blank" href="http://versologie.cz/v2/web_content/plechac.php?lang=en">Petr Plecháč</a> (Ústav pro českou literaturu AV ČR, v. v. i.).</p>
42+
<p>The <strong>rhyme database</strong> (including the query and visualization resources) is based on <a href="http://versologie.cz/v2/tool_gunstick" target="_blank">Gunstick</a>, the rhyme database and related tools developed
43+
by the <a href="http://www.versologie.cz/en/" target="_blank">Versologie</a> research group.</p>
44+
<p>The interface was developed thanks to a <a href="https://www.avcr.cz/en/academic-public/support-of-research/josef-dobrovsky-fellowship/" target="_blank">Josef Dobrovský Fellowship</a>, funded by the Akademie věd České republiky (year 2018).</p>
45+
</main>
46+
<!--#include virtual="ssi/footer.html"-->
47+
48+
</body>
49+
50+
</html>
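
The pages in this commit pull in their menu and footer through server-side include directives such as `<!--#include virtual="ssi/menu.html"-->`, so a page opened straight from disk shows neither. Below is a rough Python sketch for previewing a page locally by expanding those directives. `expand_includes` and `demo` are hypothetical helper names. The sketch assumes include paths resolve relative to the site root and does not expand nested includes; a real web server resolves `virtual` as a URL path and may behave differently.

```python
import re
import tempfile
from pathlib import Path

# Matches directives such as <!--#include virtual="ssi/menu.html"-->
INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(html: str, root: Path) -> str:
    """Replace each include directive with the text of the referenced file.

    Paths are resolved relative to `root` (an assumption of this sketch).
    Directives pointing at missing files are left untouched so the gap
    stays visible in the output.
    """
    def _substitute(match: re.Match) -> str:
        target = root / match.group(1)
        if target.is_file():
            return target.read_text(encoding="utf-8")
        return match.group(0)

    return INCLUDE_RE.sub(_substitute, html)


def demo() -> str:
    """Expand a tiny page against a throwaway site root."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "ssi").mkdir()
        (root / "ssi" / "menu.html").write_text("<nav>menu</nav>", encoding="utf-8")
        page = (
            '<body><!--#include virtual="ssi/menu.html"-->'
            '<!--#include virtual="ssi/footer.html"--></body>'
        )
        return expand_includes(page, root)


if __name__ == "__main__":
    print(demo())
```

In `demo`, only the menu file exists, so the menu directive is replaced and the footer directive is passed through unchanged.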
