Commit 0101bc8

Merge branch 'master' into TheFlash10-patch-1
2 parents f1507bc + b66de59 commit 0101bc8

12 files changed

Lines changed: 760 additions & 358 deletions

1 CLTK Setup.ipynb

Lines changed: 10 additions & 12 deletions
@@ -8,7 +8,7 @@
 "\n",
 "## Mac\n",
 "\n",
-"<https://www.python.org/downloads/> (currently is 3.5.2)\n",
+"See <https://www.python.org/downloads/> (current version is 3.6.4).\n",
 "\n",
 "\n",
 "## Linux\n",
@@ -17,13 +17,11 @@
 "\n",
 "``` bash\n",
 "$ curl -O https://raw.githubusercontent.com/kylepjohnson/python3_bootstrap/master/install.sh\n",
-"\n",
 "$ chmod +x install.sh\n",
-"\n",
 "$ ./install.sh\n",
 "```\n",
 "\n",
-"This Linux build from source will take ~5 mins."
+"This Linux build from source will take around 5 minutes."
 ]
 },
 {
@@ -32,16 +30,16 @@
 "source": [
 "# Install Git\n",
 "\n",
-"CLTK uses Git for corpus management. For Mac, install it from here: <https://git-scm.com/downloads>. For Linux, check if present (`git --version`); if not then use your package manager to get it (e.g., `apt-get install git`)."
+"The CLTK uses Git for corpus management. For Mac, install it from here: <https://git-scm.com/downloads>. For Linux, check if present (`git --version`); if not then use your package manager to get it (e.g., `apt-get install git`)."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Make virtual environment\n",
+"# Create a virtual environment\n",
 "\n",
-"This makes a special environment (a \"sandbox\") just for the cltk. If something goes wrong, you can just delete it and start again.\n",
+"This makes a special environment (a \"sandbox\") just for the CLTK. If something goes wrong, you can just delete it and start again.\n",
 "\n",
 "``` bash\n",
 "$ cd ~/\n",
@@ -83,9 +81,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Test Jupter\n",
+"# Test Jupyter\n",
 "\n",
-"Launch a notebook (such as this one) from the Terminal with `jupyter notebook`. Then open your preferred browser to <http://localhost:8888>."
+"From your `cltk` directory, launch a notebook (such as this one) from the Terminal with `jupyter notebook`. Then open your preferred browser to <http://localhost:8888>."
 ]
 },
 {
@@ -94,7 +92,7 @@
 "source": [
 "# Download these tutorials\n",
 "\n",
-"Now or sometime later, you may find these instructions at <https://github.com/kylepjohnson/notebooks/tree/master/public_talks/2016_12_08_harvard_classics>."
+"You may find these instructions at <https://github.com/cltk/tutorials>."
 ]
 },
 {
@@ -103,7 +101,7 @@
 "source": [
 "# Join GitHub\n",
 "\n",
-"A nice way to share code. Do this later, then come visit us at <https://github.com/cltk/cltk/>."
+"GitHub is a nice way to share code. Come visit us at <https://github.com/cltk/cltk/>!"
 ]
 }
 ],
@@ -123,7 +121,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.1"
+"version": "3.6.4"
 }
 },
 "nbformat": 4,
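
Note on the setup steps in this notebook: the text asks readers to install Python 3 and to check for Git with `git --version`. A minimal sketch of the same checks from Python, using only the standard library. The helper names `python_version_ok` and `git_version` and the 3.6 minimum are illustrative assumptions, not part of the notebook.

```python
import shutil
import subprocess
import sys


def python_version_ok(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum` (assumed threshold)."""
    return sys.version_info[:2] >= minimum


def git_version():
    """Return the output of `git --version`, or None if Git is not on PATH."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print("Python OK:", python_version_ok())
    print("Git:", git_version() or "not found; install it with your package manager")
```

Running the script before creating the virtual environment shows whether either prerequisite is missing.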

2 Import corpora.ipynb

Lines changed: 20 additions & 24 deletions
@@ -6,7 +6,7 @@
 "source": [
 "The CLTK has a distributed infrastructure that lets you download official CLTK texts or other corpora shared by others. For full docs, see <http://docs.cltk.org/en/latest/importing_corpora.html>.\n",
 "\n",
-"To get started, from the Terminal, open a new Jupyter notebook from within your `~/cltk` directory (see notebook 1 for instructions): `jupyter notebook`. Then go to <http://localhost:8888>."
+"To get started, from the Terminal, open a new Jupyter notebook from within your `~/cltk` directory (see notebook 1 \"CLTK Setup\" for instructions): `jupyter notebook`. Then go to <http://localhost:8888>."
 ]
 },
 {
@@ -20,34 +20,31 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
-"metadata": {
-"collapsed": true
-},
+"execution_count": 2,
+"metadata": {},
 "outputs": [],
 "source": [
-"# this is the import of the right part of the CLTK library\n",
+"# This is the import of the right part of the CLTK library\n",
+"\n",
 "from cltk.corpus.utils.importer import CorpusImporter"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 2,
-"metadata": {
-"collapsed": true
-},
+"execution_count": 3,
+"metadata": {},
 "outputs": [],
 "source": [
 "# See https://github.com/cltk for all official corpora\n",
 "\n",
 "my_latin_downloader = CorpusImporter('latin')\n",
 "\n",
-"# 'my_latin_downloader' is the variable by which we now call the CorpusImporter"
+"# Now 'my_latin_downloader' is the variable by which we call the CorpusImporter"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": 4,
 "metadata": {},
 "outputs": [
 {
@@ -70,7 +67,7 @@
 " 'latin_text_poeti_ditalia']"
 ]
 },
-"execution_count": 3,
+"execution_count": 4,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -88,10 +85,8 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
-"metadata": {
-"collapsed": true
-},
+"execution_count": 5,
+"metadata": {},
 "outputs": [],
 "source": [
 "my_latin_downloader.import_corpus('latin_text_latin_library')\n",
@@ -335,9 +330,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Convert TEI XML corpus\n",
+"# Convert TEI XML texts\n",
 "\n",
-"Here we'll convert the 1K Years' Greek corpus from TEI XML to plaintext"
+"Here we'll convert the First 1K Years' Greek corpus from TEI XML to plain text."
 ]
 },
 {
@@ -358,7 +353,7 @@
 "outputs": [],
 "source": [
 "#! If you get the following error: 'Install `bs4` and `lxml` to parse these TEI files.'\n",
-"# then run: `pip install bs4 lxml`\n",
+"# then run: `pip install bs4 lxml`.\n",
 "\n",
 "onekgreek_tei_xml_to_text()"
 ]
@@ -377,15 +372,16 @@
 }
 ],
 "source": [
-"# count the converted plaintext files:\n",
+"# Count the converted plaintext files\n",
+"\n",
 "!ls -l ~/cltk_data/greek/text/greek_text_first1kgreek_plaintext/ | wc -l"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Import local corpus"
+"# Import local corpora"
 ]
 },
 {
@@ -438,7 +434,7 @@
 }
 ],
 "source": [
-"!ls -l /home/kyle/cltk_data/originals/"
+"!ls -l /home/kyle/cltk_data/originals/ # To be adapted to your file system"
 ]
 }
 ],
@@ -458,7 +454,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.1"
+"version": "3.6.4"
 }
 },
 "nbformat": 4,
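
Note on the file-counting and local-path cells in this notebook: `ls -l … | wc -l` typically includes the leading `total` line that `ls -l` prints, so the result can be one higher than the number of files, and `/home/kyle/...` has to be adapted per machine. A minimal sketch of a portable alternative in Python, assuming the `~/cltk_data` layout shown in the notebook's commands. `count_files` is a hypothetical helper, not a CLTK function.

```python
from pathlib import Path


def count_files(directory):
    """Count regular files directly inside `directory` (non-recursive)."""
    directory = Path(directory).expanduser()
    if not directory.is_dir():
        return 0
    return sum(1 for entry in directory.iterdir() if entry.is_file())


if __name__ == "__main__":
    # Directory names copied from the notebook's `ls` command; the path is
    # built from the current user's home directory, so no username is hard-coded.
    plaintext_dir = (
        Path.home() / "cltk_data" / "greek" / "text" / "greek_text_first1kgreek_plaintext"
    )
    print(count_files(plaintext_dir))
```

Building the path from `Path.home()` avoids the per-user edit that the `# To be adapted to your file system` comment asks for.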
