This repository was archived by the owner on Mar 8, 2021. It is now read-only.

Commit 8c95cbd
small adjustments and error corrections
1 parent f6dd43f

2 files changed: 30 additions & 20 deletions

Chapter 9 - Learning from Examples.ipynb

Lines changed: 14 additions & 8 deletions
@@ -1,7 +1,7 @@
 {
 "metadata": {
 "name": "",
-"signature": "sha256:cfdf2e2f87626e61a47356de271a32958896c29bce6ec4af4456a82dc25b3c9b"
+"signature": "sha256:75c7f5b623b7ad7980d3aefb8f0f625a1452e13d82768f661cda9499d7ce63d7"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@@ -70,15 +70,15 @@
 " <td> 0</td>\n",
 " </tr>\n",
 " <tr>\n",
-" <th>ham</th>\n",
+" <th>spam</th>\n",
 " <td> 0</td>\n",
 " <td> 1</td>\n",
 " <td> 1</td>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
 " </tr>\n",
 " <tr>\n",
-" <th>ham</th>\n",
+" <th>spam</th>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
@@ -165,7 +165,8 @@
 ],
 "language": "python",
 "metadata": {},
-"outputs": []
+"outputs": [],
+"prompt_number": 1
 },
 {
 "cell_type": "markdown",
@@ -331,7 +332,7 @@
 " </tbody>\n",
 "</table>\n",
 "\n",
-"What is the probability of $P(y=\\textrm{Walt Whitman}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$? And what is the probability of $P(y=\\textrm{J.K. Rowling}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$?"
+"What is the probability of $P(y=\\textrm{Walt Whitman}|x = [12, 10, 1, 8, 0, 4, 0, 0, 0, 4])$? And what is the probability of $P(y=\\textrm{J.K. Rowling}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$?"
 ]
 },
 {
@@ -472,10 +473,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"**d)** Now that we know how to compute the posterior probability, it is time to implement our naive bayes learner. We will start with implementing the `fit` method. The `fit` method has two core jobs:\n",
+"**d)** Now that we know how to compute the posterior probability, it is time to implement our naive bayes learner. We will start with implementing the `fit` method. The `fit` method has three core jobs:\n",
 "\n",
 "1. extract all the counts of each feature given each class;\n",
-"2. count how often each class label occurs in the training data.\n",
+"2. count how often each class label occurs in the training data;\n",
+"3. count the number of unique features.\n",
 "\n",
 "The `NaiveBayesLearner` below provides the skeleton of our class. Implement the `fit` method."
 ]
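The three `fit` jobs listed in the cell above can be sketched as follows. This is a minimal illustration assembled from the counting code visible elsewhere in this diff, not the notebook's reference solution; the toy documents and labels are made up.

```python
from collections import Counter, defaultdict

class NaiveBayesLearner:
    """Minimal sketch of the three fit steps described in the notebook."""

    def fit(self, X, y):
        # 1. feature counts per class
        self.N = defaultdict(Counter)
        for x, y_x in zip(X, y):
            self.N[y_x] += Counter(x)
        # 2. class label counts
        self.C = Counter(y)
        # 3. number of unique features across all classes
        self.V = len(set(x for y_x in self.N for x in self.N[y_x]))

# toy training data: two tokenized "documents" with their labels
nb = NaiveBayesLearner()
nb.fit([["free", "cash", "cash"], ["hi", "mom"]], ["spam", "ham"])
print(nb.C)  # one example per class
print(nb.V)  # 4 unique features: free, cash, hi, mom
```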
@@ -500,6 +502,8 @@
 " self.C = # insert your code here (class counts)\n",
 " self.N = # insert your code here (feature counts per class)\n",
 " # add the feature counts per class here\n",
+" \n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x])) # number of unique features\n",
 " \n",
 " def predict(self, x):\n",
 " \"\"\"Predict the outcome for example x. Choose the most\n",
@@ -570,6 +574,7 @@
 " self.N = defaultdict(Counter)\n",
 " for x, y_x in zip(X, y):\n",
 " self.N[y_x] += Counter(x)\n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x]))\n",
 "\n",
 " def prior(self, y):\n",
 " \"\"\"Return the prior probability of class y.\"\"\"\n",
@@ -701,6 +706,7 @@
 " self.N = defaultdict(Counter)\n",
 " for x, y_x in zip(X, y):\n",
 " self.N[y_x] += Counter(x)\n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x]))\n",
 "\n",
 " def prior(self, y):\n",
 " \"\"\"Return the prior probability of class y.\"\"\"\n",
@@ -709,7 +715,7 @@
 " def probability(self, x, y):\n",
 " \"\"\"Apply Laplace Smoothing to give a probability\n",
 " estimate of feature x given y.\"\"\"\n",
-" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + len(self.N))\n",
+" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + self.V)\n",
 "\n",
 " def predict(self, x):\n",
 " \"\"\"Predict the outcome for example x. Choose the most\n",

answerbook/Chapter 9 - Learning from Examples.ipynb

Lines changed: 16 additions & 12 deletions
@@ -1,7 +1,7 @@
 {
 "metadata": {
 "name": "",
-"signature": "sha256:0d06f7c0dcbe740f33bc07e6b2f3b75bd99464160080eeb0dc89c4b381ba494a"
+"signature": "sha256:945f626408fb0a4ebaefe616111b44b5fbd70d5fef9361265f942a7392cac81e"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@@ -70,15 +70,15 @@
 " <td> 0</td>\n",
 " </tr>\n",
 " <tr>\n",
-" <th>ham</th>\n",
+" <th>spam</th>\n",
 " <td> 0</td>\n",
 " <td> 1</td>\n",
 " <td> 1</td>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
 " </tr>\n",
 " <tr>\n",
-" <th>ham</th>\n",
+" <th>spam</th>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
 " <td> 0</td>\n",
@@ -332,7 +332,7 @@
 " </tbody>\n",
 "</table>\n",
 "\n",
-"What is the probability of $P(y=\\textrm{Walt Whitman}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$? And what is the probability of $P(y=\\textrm{J.K. Rowling}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$?"
+"What is the probability of $P(y=\\textrm{Walt Whitman}|x = [12, 10, 1, 8, 0, 4, 0, 0, 0, 4])$? And what is the probability of $P(y=\\textrm{J.K. Rowling}|x = [7, 4, 0, 0, 0, 12, 6, 8, 3, 0])$?"
 ]
 },
 {
@@ -473,10 +473,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"**d)** Now that we know how to compute the posterior probability, it is time to implement our naive bayes learner. We will start with implementing the `fit` method. The `fit` method has two core jobs:\n",
+"**d)** Now that we know how to compute the posterior probability, it is time to implement our naive bayes learner. We will start with implementing the `fit` method. The `fit` method has three core jobs:\n",
 "\n",
 "1. extract all the counts of each feature given each class;\n",
-"2. count how often each class label occurs in the training data.\n",
+"2. count how often each class label occurs in the training data;\n",
+"3. count the number of unique features.\n",
 "\n",
 "The `NaiveBayesLearner` below provides the skeleton of our class. Implement the `fit` method."
 ]
@@ -502,6 +503,7 @@
 " self.N = defaultdict(Counter) # insert your code here (feature counts per class)\n",
 " for x, y_x in zip(X, y):\n",
 " self.N[y_x] += Counter(x)\n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x])) # number of unique features\n",
 " \n",
 " def predict(self, x):\n",
 " \"\"\"Predict the outcome for example x. Choose the most\n",
@@ -572,6 +574,7 @@
 " self.N = defaultdict(Counter)\n",
 " for x, y_x in zip(X, y):\n",
 " self.N[y_x] += Counter(x)\n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x]))\n",
 "\n",
 " def prior(self, y):\n",
 " \"\"\"Return the prior probability of class y.\"\"\"\n",
@@ -583,7 +586,7 @@
 " \"\"\"Apply Laplace Smoothing to give a probability\n",
 " estimate of feature x given y.\"\"\"\n",
 " # insert your code here\n",
-" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + len(self.N))\n",
+" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + self.V)\n",
 "\n",
 "# these tests should return True if your code is correct\n",
 "nb = NaiveBayesLearner()\n",
@@ -706,6 +709,7 @@
 " self.N = defaultdict(Counter)\n",
 " for x, y_x in zip(X, y):\n",
 " self.N[y_x] += Counter(x)\n",
+" self.V = len(set(x for y_x in self.N for x in self.N[y_x]))\n",
 "\n",
 " def prior(self, y):\n",
 " \"\"\"Return the prior probability of class y.\"\"\"\n",
@@ -714,7 +718,7 @@
 " def probability(self, x, y):\n",
 " \"\"\"Apply Laplace Smoothing to give a probability\n",
 " estimate of feature x given y.\"\"\"\n",
-" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + len(self.N))\n",
+" return (self.N[y][x] + 1.0) / (sum(self.N[y].values()) + self.V)\n",
 "\n",
 " def predict(self, x):\n",
 " \"\"\"Predict the outcome for example x. Choose the most\n",
@@ -1659,7 +1663,7 @@
 "input": [
 "from IPython.core.display import HTML\n",
 "def css_styling():\n",
-" styles = open(\"styles/custom.css\", \"r\").read()\n",
+" styles = open(\"../styles/custom.css\", \"r\").read()\n",
 " return HTML(styles)\n",
 "css_styling()"
 ],
@@ -1723,13 +1727,13 @@
 ],
 "metadata": {},
 "output_type": "pyout",
-"prompt_number": 370,
+"prompt_number": 10,
 "text": [
-"<IPython.core.display.HTML at 0x117f224e0>"
+"<IPython.core.display.HTML at 0x103fc0ac8>"
 ]
 }
 ],
-"prompt_number": 370
+"prompt_number": 10
 },
 {
 "cell_type": "markdown",
