|
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | 7 | "# Runes\n", |
8 | | - "Note: in order to use this Jupyter notebook, you need at least python 3.6 or above." |
| 8 | + "Note: to use this **Jupyter notebook**, you need **Python 3.6** or above." |
9 | 9 | ] |
10 | 10 | }, |
11 | 11 | { |
12 | 12 | "cell_type": "markdown", |
13 | 13 | "metadata": {}, |
14 | 14 | "source": [ |
| 15 | + "### Configuration\n", |
| 16 | + "\n", |
15 | 17 | "Install required modules.\n", |
16 | 18 | "```bash\n", |
17 | 19 | "$ sudo pip3.6 install requests lxml \n", |
|
22 | 24 | "cell_type": "markdown", |
23 | 25 | "metadata": {}, |
24 | 26 | "source": [ |
25 | | - "Configure ipython.\n", |
| 27 | + "Configure **ipython**.\n", |
26 | 28 | "\n", |
27 | 29 | "```bash\n", |
28 | 30 | "$ ipython profile create\n", |
29 | 31 | "$ ipython locate\n", |
30 | 32 | "$ nano .ipython/profile_default/ipython_config.py\n", |
31 | 33 | "```\n", |
32 | | - " Add it a the end of the file (without '#'):\n", |
| 34 | + " Add the following at the end of the file:\n", |
33 | 35 | "```python\n", |
34 | 36 | "c.InteractiveShellApp.exec_lines = [\n", |
35 | | - " 'import sys; sys.path.append(\"/home/<username>/<path to runes-analyzer>\")'\n", |
| 37 | + " 'import sys; sys.path.append(\"/home/pi/cltk_data\")'\n", |
36 | 38 | "]\n", |
37 | 39 | "```\n", |
38 | | - " And... It's done!" |
| 40 | + "This step is necessary because it makes the data provided by CLTK importable from the notebook. You will see later in the notebook how it is used.\n", |
| 41 | + "\n", |
| 42 | + "And... It's done!" |
39 | 43 | ] |
40 | 44 | }, |
41 | 45 | { |
42 | 46 | "cell_type": "markdown", |
43 | 47 | "metadata": {}, |
44 | 48 | "source": [ |
45 | | - "In a shell, install the module **runesanalyzer**\n", |
46 | | - "```bash\n", |
47 | | - "$ git clone https://github.com/cltk/runes-analyzer.git\n", |
48 | | - "$ cd runes-analyzer\n", |
49 | | - "$ python3.6 setup.py install \n", |
50 | | - "```\n", |
51 | | - "Install the kernel associated with python3.6 [https://ipython.readthedocs.io/en/stable/install/kernel_install.html](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) " |
| 49 | + "Install the **kernel** associated with **python3.6** [https://ipython.readthedocs.io/en/stable/install/kernel_install.html](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) " |
52 | 50 | ] |
53 | 51 | }, |
54 | 52 | { |
|
61 | 59 | "```" |
62 | 60 | ] |
63 | 61 | }, |
| 62 | + { |
| 63 | + "cell_type": "markdown", |
| 64 | + "metadata": {}, |
| 65 | + "source": [ |
| 66 | + "### Runes and CLTK\n", |
| 67 | + "\n", |
| 68 | + "How can we work on runes with CLTK?" |
| 69 | + ] |
| 70 | + }, |
64 | 71 | { |
65 | 72 | "cell_type": "code", |
66 | 73 | "execution_count": 1, |
|
79 | 86 | }, |
80 | 87 | { |
81 | 88 | "cell_type": "code", |
82 | | - "execution_count": 2, |
83 | | - "metadata": {}, |
84 | | - "outputs": [ |
85 | | - { |
86 | | - "name": "stdout", |
87 | | - "output_type": "stream", |
88 | | - "text": [ |
89 | | - "[ᚠ, ᚢ, ᚦ, ᚨ, ᚱ, ᚲ, ᚷ, ᚹ, ᚺ, ᚾ, ᛁ, ᛃ, ᛇ, ᛈ, ᛉ, ᛊ, ᛏ, ᛒ, ᛖ, ᛗ, ᛚ, ᛜ, ᛟ, ᛞ]\n", |
90 | | - "[ᚠ, ᚢ, ᚦ, ᚭ, ᚱ, ᚴ, ᚼ, ᚾ, ᛁ, ᛅ, ᛋ, ᛏ, ᛒ, ᛖ, ᛘ, ᛚ, ᛦ]\n", |
91 | | - "[ᚠ, ᚢ, ᚦ, ᚭ, ᚱ, ᚴ, ᚽ, ᚿ, ᛁ, ᛅ, ᛌ, ᛐ, ᛓ, ᛖ, ᛙ, ᛚ, ᛧ]\n" |
92 | | - ] |
93 | | - } |
94 | | - ], |
| 89 | + "execution_count": null, |
| 90 | + "metadata": {}, |
| 91 | + "outputs": [], |
95 | 92 | "source": [ |
96 | | - "for alphabet in [runes.ELDER_FUTHARK, runes.YOUNGER_FUTHARK, runes.SHORT_TWIG_YOUNGER_FUTHARK]:\n", |
97 | | - " print(alphabet)" |
| 93 | + "\n", |
| 94 | + "from old_norse.text import old_norse_runic_transcriptions\n", |
| 95 | + "from old_norse import *\n", |
| 96 | + "#text.old_norse_runic_transcription\n", |
| 97 | + "#from old_norse_runic_transcriptions import *\n", |
| 98 | + "#old_norse_runic_transcriptions.denmark\n", |
| 99 | + "#help(old_norse_runic_transcriptions)\n", |
| 100 | + "from old_norse.text.old_norse_runic_transcriptions.denmark.data import little_jelling_stone \n", |
| 101 | + "little_jelling_stone \n", |
| 102 | + "#help(text)\n", |
| 103 | + "#import sys\n", |
| 104 | + "#sys.path" |
98 | 105 | ] |
99 | 106 | }, |
100 | 107 | { |
|
104 | 111 | "The oldest runic inscriptions found date from 200 AD. They have always denoted Germanic languages. Until the 8th century, the elder *futhark* alphabet was used. It was composed of 24 characters: ᚠ, ᚢ, ᚦ, ᚨ, ᚱ, ᚲ, ᚷ, ᚹ, ᚺ, ᚾ, ᛁ, ᛃ, ᛇ, ᛈ, ᛉ, ᛊ, ᛏ, ᛒ, ᛖ, ᛗ, ᛚ, ᛜ, ᛟ, ᛞ. The word *futhark* comes from the first six characters of the alphabet: ᚠ (f), ᚢ (u), ᚦ (th), ᚨ (a), ᚱ (r), ᚲ (k). Later, this alphabet was reduced to 16 runes, the *younger futhark*: ᚠ, ᚢ, ᚦ, ᚭ, ᚱ, ᚴ, ᚼ, ᚾ, ᛁ, ᛅ, ᛋ, ᛏ, ᛒ, ᛖ, ᛘ, ᛚ, ᛦ, with more ambiguity in the sounds they denote. The shapes of runes may vary according to the material they are carved on, which is why there is a variant of the *younger futhark*: ᚠ, ᚢ, ᚦ, ᚭ, ᚱ, ᚴ, ᚽ, ᚿ, ᛁ, ᛅ, ᛌ, ᛐ, ᛓ, ᛖ, ᛙ, ᛚ, ᛧ." |
105 | 112 | ] |
106 | 113 | }, |
| 114 | + { |
| 115 | + "cell_type": "markdown", |
| 116 | + "metadata": {}, |
| 117 | + "source": [ |
| 118 | + "Get the available runic alphabets with **RunicAlphabetName**" |
| 119 | + ] |
| 120 | + }, |
107 | 121 | { |
108 | 122 | "cell_type": "code", |
109 | | - "execution_count": 3, |
| 123 | + "execution_count": null, |
110 | 124 | "metadata": {}, |
111 | 125 | "outputs": [], |
112 | 126 | "source": [ |
113 | | - "from cltk.corpus.old_norse import runes" |
| 127 | + "from cltk.corpus.old_norse.runes import RunicAlphabetName" |
| 128 | + ] |
| 129 | + }, |
| 130 | + { |
| 131 | + "cell_type": "code", |
| 132 | + "execution_count": null, |
| 133 | + "metadata": {}, |
| 134 | + "outputs": [], |
| 135 | + "source": [ |
| 136 | + "for name in RunicAlphabetName:\n", |
| 137 | + " print(name.value)" |
114 | 138 | ] |
115 | 139 | }, |
116 | 140 | { |
117 | 141 | "cell_type": "markdown", |
118 | 142 | "metadata": {}, |
119 | 143 | "source": [ |
120 | | - "Get the available runic alphabets" |
| 144 | + "Retrieve the contents of the alphabets:" |
121 | 145 | ] |
122 | 146 | }, |
123 | 147 | { |
124 | 148 | "cell_type": "code", |
125 | | - "execution_count": 4, |
| 149 | + "execution_count": null, |
126 | 150 | "metadata": {}, |
127 | 151 | "outputs": [], |
128 | 152 | "source": [ |
129 | | - "from cltk.corpus.old_norse.runes import RunicAlphabetName" |
| 153 | + "for alphabet in [runes.ELDER_FUTHARK, runes.YOUNGER_FUTHARK, runes.SHORT_TWIG_YOUNGER_FUTHARK]:\n", |
| 154 | + " print(alphabet)" |
| 155 | + ] |
| 156 | + }, |
| 157 | + { |
| 158 | + "cell_type": "markdown", |
| 159 | + "metadata": {}, |
| 160 | + "source": [ |
| 161 | + "### Runic inscriptions\n", |
| 162 | + "\n", |
| 163 | + "Can we get examples from the real world? Of course! For that, we use the **CorpusImporter** class from CLTK to import the data contained in a CLTK project named \"old_norse_runic_transcriptions\"." |
130 | 164 | ] |
131 | 165 | }, |
132 | 166 | { |
133 | 167 | "cell_type": "code", |
134 | | - "execution_count": 5, |
135 | | - "metadata": {}, |
136 | | - "outputs": [ |
137 | | - { |
138 | | - "name": "stdout", |
139 | | - "output_type": "stream", |
140 | | - "text": [ |
141 | | - "elder_futhark\n", |
142 | | - "younger_futhark\n", |
143 | | - "short_twig_younger_futhark\n" |
144 | | - ] |
145 | | - } |
146 | | - ], |
| 168 | + "execution_count": null, |
| 169 | + "metadata": {}, |
| 170 | + "outputs": [], |
147 | 171 | "source": [ |
148 | | - "for name in RunicAlphabetName:\n", |
149 | | - " print(name.value)" |
| 172 | + "import os\n", |
| 173 | + "from cltk.corpus.utils.importer import CorpusImporter\n", |
| 174 | + "\n", |
| 175 | + "onc = CorpusImporter(\"old_norse\")\n", |
| 176 | + "onc.import_corpus(\"old_norse_runic_transcriptions\")" |
150 | 177 | ] |
151 | 178 | }, |
152 | 179 | { |
153 | 180 | "cell_type": "markdown", |
154 | 181 | "metadata": {}, |
155 | 182 | "source": [ |
156 | | - "Some famous runic inscriptions like the Jelling stones." |
| 183 | + "Once the corpus has been downloaded, you can view some famous runic inscriptions, such as the Jelling stones, in the *data.py* file." |
157 | 184 | ] |
158 | 185 | }, |
159 | 186 | { |
160 | 187 | "cell_type": "code", |
161 | | - "execution_count": 6, |
162 | | - "metadata": {}, |
163 | | - "outputs": [ |
164 | | - { |
165 | | - "data": { |
166 | | - "text/plain": [ |
167 | | - "'᛬ᚴᚢᚱᛘᛦ᛬ᚴᚢᚾᚢᚴᛦ᛬ᚴ(ᛅᚱ)ᚦᛁ᛬ᚴᚢᛒᛚ᛬ᚦᚢᛋᛁ᛬ᛅ(ᚠᛏ)᛬ᚦᚢᚱᚢᛁ᛬ᚴᚢᚾᚢ᛬ᛋᛁᚾᛅ᛬ᛏᛅᚾᛘᛅᚱᚴᛅᛦ᛬ᛒᚢᛏ᛬'" |
168 | | - ] |
169 | | - }, |
170 | | - "execution_count": 6, |
171 | | - "metadata": {}, |
172 | | - "output_type": "execute_result" |
173 | | - } |
174 | | - ], |
175 | | - "source": [ |
176 | | - "little_jelling_stone = '᛬ᚴᚢᚱᛘᛦ᛬ᚴᚢᚾᚢᚴᛦ᛬ᚴ(ᛅᚱ)ᚦᛁ᛬ᚴᚢᛒᛚ᛬ᚦᚢᛋᛁ᛬ᛅ(ᚠᛏ)᛬ᚦᚢᚱᚢᛁ᛬ᚴᚢᚾᚢ᛬ᛋᛁᚾᛅ᛬ᛏᛅᚾᛘᛅᚱᚴᛅᛦ᛬ᛒᚢᛏ᛬'\n", |
| 188 | + "execution_count": null, |
| 189 | + "metadata": {}, |
| 190 | + "outputs": [], |
| 191 | + "source": [ |
| 192 | + "from old_norse.text.old_norse_runic_transcriptions.denmark.data import little_jelling_stone, big_jelling_stone\n", |
177 | 193 | "little_jelling_stone" |
178 | 194 | ] |
179 | 195 | }, |
180 | 196 | { |
181 | 197 | "cell_type": "code", |
182 | | - "execution_count": 7, |
| 198 | + "execution_count": null, |
183 | 199 | "metadata": {}, |
184 | 200 | "outputs": [], |
185 | 201 | "source": [ |
186 | | - "big_jelling_stone = 'ᚼᛅᚱᛅᛚᛏᚱ᛬ᚴᚢᚾᚢᚴᛦ᛬ᛒᛅᚦ᛬ᚴᛅᚢᚱᚢᛅ ᚴᚢᛒᛚ᛬ᚦᛅᚢᛋᛁ᛬ᛅᚠᛏ᛬ᚴᚢᚱᛘ ᚠᛅᚦᚢᚱ ᛋᛁᚾ ᛅᚢᚴ ᛅᚠᛏ᛬ᚦᚭᚢᚱᚢᛁ᛬ᛘᚢᚦᚢᚱ᛬ᛋᛁᚾᛅ᛬ᛋᛅ ᚼᛅᚱᛅᛚᛏᚱ(᛬)ᛁᛅᛋ᛬ᛋᚭᛦ᛫ᚢᛅᚾ᛫ᛏᛅᚾᛘᛅᚢᚱᚴ\\nᛅᛚᛅ᛫ᛅᚢᚴ᛫ᚾᚢᚱᚢᛁᚴ\\n᛫ᛅᚢᚴ᛫ᛏ(ᛅ)ᚾᛁ(᛫ᚴᛅᚱᚦᛁ᛫)ᚴᚱᛁᛋᛏᚾᚭ'" |
| 202 | + "big_jelling_stone" |
187 | 203 | ] |
188 | 204 | }, |
189 | 205 | { |
190 | 206 | "cell_type": "markdown", |
191 | 207 | "metadata": {}, |
192 | 208 | "source": [ |
| 209 | + "### Encoding and data format\n", |
| 210 | + "\n", |
193 | 211 | "Runes are encoded in the Unicode Runic block, which spans \\u16A0 (ᚠ) to \\u16FF. See https://en.wikipedia.org/wiki/Runic_(Unicode_block)" |
194 | 212 | ] |
195 | 213 | }, |
|
205 | 223 | }, |
206 | 224 | { |
207 | 225 | "cell_type": "code", |
208 | | - "execution_count": 8, |
209 | | - "metadata": {}, |
210 | | - "outputs": [ |
211 | | - { |
212 | | - "data": { |
213 | | - "text/plain": [ |
214 | | - "ᚠ" |
215 | | - ] |
216 | | - }, |
217 | | - "execution_count": 8, |
218 | | - "metadata": {}, |
219 | | - "output_type": "execute_result" |
220 | | - } |
221 | | - ], |
| 226 | + "execution_count": null, |
| 227 | + "metadata": {}, |
| 228 | + "outputs": [], |
222 | 229 | "source": [ |
223 | 230 | "runes.ELDER_FUTHARK[0]" |
224 | 231 | ] |
|
232 | 239 | }, |
233 | 240 | { |
234 | 241 | "cell_type": "code", |
235 | | - "execution_count": 9, |
236 | | - "metadata": {}, |
237 | | - "outputs": [ |
238 | | - { |
239 | | - "data": { |
240 | | - "text/plain": [ |
241 | | - "ᚠ" |
242 | | - ] |
243 | | - }, |
244 | | - "execution_count": 9, |
245 | | - "metadata": {}, |
246 | | - "output_type": "execute_result" |
247 | | - } |
248 | | - ], |
| 242 | + "execution_count": null, |
| 243 | + "metadata": {}, |
| 244 | + "outputs": [], |
249 | 245 | "source": [ |
250 | 246 | "runes.Rune(runes.RunicAlphabetName.elder_futhark, \"\\u16A0\", \"f\", \"f\", \"fehu\")" |
251 | 247 | ] |
|
254 | 250 | "cell_type": "markdown", |
255 | 251 | "metadata": {}, |
256 | 252 | "source": [ |
257 | | - "Use the **Transcriber** class to get a basic transcription of a runic inscription." |
| 253 | + "### Runic transcription\n", |
| 254 | + "Use the **Transcriber** class to get a basic transcription of a runic inscription. To transcribe a runic inscription correctly, you have to know which runic alphabet it was written in. In the following examples, the inscription was written in the *younger futhark*. Choosing an incorrect alphabet makes the transcription quite useless, as the second example shows." |
258 | 255 | ] |
259 | 256 | }, |
260 | 257 | { |
261 | 258 | "cell_type": "code", |
262 | | - "execution_count": 10, |
263 | | - "metadata": {}, |
264 | | - "outputs": [ |
265 | | - { |
266 | | - "data": { |
267 | | - "text/plain": [ |
268 | | - "'᛫kurmR᛫kunukR᛫k(ar)þi᛫kubl᛫þusi᛫a(ft)᛫þurui᛫kunu᛫sina᛫tanmarkaR᛫but᛫'" |
269 | | - ] |
270 | | - }, |
271 | | - "execution_count": 10, |
272 | | - "metadata": {}, |
273 | | - "output_type": "execute_result" |
274 | | - } |
275 | | - ], |
| 259 | + "execution_count": null, |
| 260 | + "metadata": {}, |
| 261 | + "outputs": [], |
276 | 262 | "source": [ |
277 | 263 | "runes.Transcriber.transcribe(little_jelling_stone, runes.YOUNGER_FUTHARK)" |
278 | 264 | ] |
279 | 265 | }, |
280 | 266 | { |
281 | 267 | "cell_type": "code", |
282 | | - "execution_count": 11, |
| 268 | + "execution_count": null, |
283 | 269 | "metadata": {}, |
284 | 270 | "outputs": [], |
285 | 271 | "source": [ |
286 | | - "from runesanalyzer import scraper" |
| 272 | + "runes.Transcriber.transcribe(little_jelling_stone, runes.ELDER_FUTHARK)" |
| 273 | + ] |
| 274 | + }, |
| 275 | + { |
| 276 | + "cell_type": "code", |
| 277 | + "execution_count": null, |
| 278 | + "metadata": {}, |
| 279 | + "outputs": [], |
| 280 | + "source": [ |
| 281 | + "from old_norse.text.old_norse_runic_transcriptions.sweden import scraper" |
287 | 282 | ] |
288 | 283 | }, |
289 | 284 | { |
|
295 | 290 | }, |
296 | 291 | { |
297 | 292 | "cell_type": "code", |
298 | | - "execution_count": 12, |
| 293 | + "execution_count": null, |
299 | 294 | "metadata": {}, |
300 | | - "outputs": [ |
301 | | - { |
302 | | - "data": { |
303 | | - "text/plain": [ |
304 | | - "<function runesanalyzer.scraper.retrieve_sweden_runic_inscription()>" |
305 | | - ] |
306 | | - }, |
307 | | - "execution_count": 12, |
308 | | - "metadata": {}, |
309 | | - "output_type": "execute_result" |
310 | | - } |
311 | | - ], |
| 295 | + "outputs": [], |
312 | 296 | "source": [ |
313 | | - "scraper.retrieve_sweden_runic_inscription" |
| 297 | + "scraper.retrieve_sweden_runic_inscriptions" |
314 | 298 | ] |
315 | 299 | }, |
316 | 300 | { |
317 | 301 | "cell_type": "markdown", |
318 | 302 | "metadata": {}, |
319 | 303 | "source": [ |
320 | 304 | "Future tasks:\n", |
321 | | - "* normalizing runic inscriptions and transcriptions\n", |
322 | | - "* making a statistics module to analyze frequencies of words, runes, spellings in runic inscriptions.\n", |
323 | | - "* getting more runic inscriptions from Norway, Denmark, etc\n", |
324 | | - "* using phonetical rules [module](https://github.com/cltk/cltk/blob/master/cltk/phonology/utils.py) to get a normalized pronunciation of Old norse inscriptions written with runes.\n", |
| 305 | + "* normalizing runic inscriptions and transcriptions,\n", |
| 306 | + "* tagging runic inscriptions with locations and estimated dates,\n", |
| 307 | + "* making a statistics module to analyze frequencies of words, runes, spellings in runic inscriptions,\n", |
| 308 | + "* getting more runic inscriptions from Norway, Denmark, etc,\n", |
| 309 | + "* using the phonetical rules [module](https://github.com/cltk/cltk/blob/master/cltk/phonology/utils.py) to get a normalized pronunciation of Old Norse inscriptions written with runes.\n", |
325 | 310 | " " |
326 | 311 | ] |
327 | 312 | }, |
|
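The "Encoding and data format" cell in this diff notes that runes live in the Unicode Runic block. A minimal sketch of how that range could be used to sanity-check an inscription before transcribing it is given below. It relies only on the Python standard library; the helper names `is_runic` and `non_runic_characters` are hypothetical and are not part of CLTK.

```python
# Bounds of the Unicode "Runic" block (U+16A0..U+16FF).
RUNIC_BLOCK_START = 0x16A0
RUNIC_BLOCK_END = 0x16FF


def is_runic(char: str) -> bool:
    """Return True if a single character lies in the Unicode Runic block."""
    return RUNIC_BLOCK_START <= ord(char) <= RUNIC_BLOCK_END


def non_runic_characters(inscription: str) -> list:
    """List, in order, the characters of an inscription outside the Runic block."""
    return [char for char in inscription if not is_runic(char)]


# Fragment of the little Jelling stone inscription quoted in the notebook.
# Editorial marks such as parentheses are expected to be reported here.
fragment = "᛬ᚴᚢᚱᛘᛦ᛬ᚴᚢᚾᚢᚴᛦ᛬ᚴ(ᛅᚱ)ᚦᛁ᛬"
print(non_runic_characters(fragment))
```

A check like this may help spot editorial marks (parentheses, spaces, line breaks) that a transcriber would otherwise pass through unchanged.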