Commit 517df17

new version of classification tutorial
1 parent 8d26b8e commit 517df17

1 file changed

Lines changed: 10 additions & 10 deletions

notebooks/v2/intro_data_science/Classification_and_Model_Evaluation.ipynb

@@ -143,7 +143,7 @@
 "id": "JYLCmVag2YxO"
 },
 "source": [
-"### Loading the data\n",
+"### Load the data\n",
 "\n",
 "We'll query data using the Data Commons API, storing it in a Pandas DataFrame. To demonstrate the concepts with a more reasonably sized dataset, we'll first analyze the 500 largest cities by population. To do that:\n",
 "* We use the statistical variable `Count_Person`.\n",
@@ -1719,9 +1719,9 @@
 "id": "bFJOOjyOVje8"
 },
 "source": [
-"**1A)** Which model do you think is better, Classifier 1 or Classifier 2? Explain your reasoning.\n",
+"**1.1A)** Which model do you think is better, Classifier 1 or Classifier 2? Explain your reasoning.\n",
 "\n",
-"**1B)** Classifier 2 has a higher accuracy than Classifier 1, but has a more complicated decision boundary. Which do you think would generalize best to new data?\n"
+"**1.1B)** Classifier 2 has a higher accuracy than Classifier 1, but has a more complicated decision boundary. Which do you think would generalize best to new data?\n"
 ]
 },
 {
@@ -1912,9 +1912,9 @@
 "id": "51H3zw11xMnF"
 },
 "source": [
-"**2A)** In light of all the new data points, now which classifier do you think is better, Classifer 1 or Classifier 2? Explain your reasoning.\n",
+"**1.2A)** In light of all the new data points, now which classifier do you think is better, Classifer 1 or Classifier 2? Explain your reasoning.\n",
 "\n",
-"**2B)** In question 1, Classifier 1 had a *lower* accuracy than Classifier 2. After adding more data points, we now see the reverse, with Classifier 1 having a *higher* accuracy than Classifier 2. What happened? Give an explanation (or at least your best guess) for why this is."
+"**1.2B)** In question 1, Classifier 1 had a *lower* accuracy than Classifier 2. After adding more data points, we now see the reverse, with Classifier 1 having a *higher* accuracy than Classifier 2. What happened? Give an explanation (or at least your best guess) for why this is."
 ]
 },
 {
@@ -2425,11 +2425,11 @@
 "\n",
 "> There are two classes, A and B. We have 100 data points in our dataset. Of these 100 data points, 99 points are labeled class A, while only 1 of the data points is labeled class B.\n",
 "\n",
-"**2.1A)** Consider a model that always predicts class A. What is the accuracy of this always-A model?\n",
+"**2.1.1A)** Consider a model that always predicts class A. What is the accuracy of this always-A model?\n",
 "\n",
-"**2.1B)** How well do you expect the always-A model to perform on new, previously unseen data? Assume the new data follows the same distribution as the original 100 data points.\n",
+"**2.1.1B)** How well do you expect the always-A model to perform on new, previously unseen data? Assume the new data follows the same distribution as the original 100 data points.\n",
 "\n",
-"**2.1C)** Run the following code block to calculate the classification accuracy of our large model. Is the accuracy higher or lower than you expected?"
+"**2.1.1C)** Run the following code block to calculate the classification accuracy of our large model. Is the accuracy higher or lower than you expected?"
 ]
 },
 {
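Reviewer note on the hunk above: the 99-to-1 scenario in question 2.1.1A can be checked with a few lines of plain Python. This is only a sketch and not code from the notebook. The `accuracy` helper and the label lists are hypothetical names made up for illustration.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

# Hypothetical dataset matching the prompt: 99 points of class A, 1 of class B.
true_labels = ["A"] * 99 + ["B"]

# A "model" that ignores its input and always predicts class A.
always_a = ["A"] * len(true_labels)

acc = accuracy(always_a, true_labels)

# Count how many class-B points the model actually identifies.
b_found = sum(p == "B" and t == "B" for p, t in zip(always_a, true_labels))

print(f"accuracy: {acc:.2f}, class-B points found: {b_found}")
```

The accuracy is high even though the model never identifies a class-B point, which seems to be the intuition the exercise is building toward.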
@@ -2659,8 +2659,8 @@
 "id": "Sw8y60E1N7AV"
 },
 "source": [
-"#### 2.4.1) What about regression? -- Mean Squared Error\n",
-"Different models and different problems often use different accuracy metrics. You may have noticed that classification accuracy doesn't make much sense for regression problems, where instead of predicting a label, the model predicts a numeric value. In regression, a common accuracy metric is the Mean Squared Error, or MSE.\n",
+"#### 2.4.1) What about regression? -- mean squared error\n",
+"Different models and different problems often use different accuracy metrics. You may have noticed that classification accuracy doesn't make much sense for regression problems, where instead of predicting a label, the model predicts a numeric value. In regression, a common accuracy metric is the mean squared error, or MSE.\n",
 "\n",
 "$ MSE = \\frac{1}{\\text{# total data points}}\\sum_{\\text{all data points}}(\\text{predicted value} - \\text{actual value})^2$\n",
 "\n",
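Reviewer note on the hunk above: the MSE formula in this cell translates directly into a short function. The following is a minimal sketch in plain Python, not the notebook's own code. The function name and the sample values are made up for illustration.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predicted and actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("need at least one data point")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared_errors) / len(squared_errors)

# Made-up values, purely for illustration.
predicted = [2.5, 0.0, 2.0, 8.0]
actual = [3.0, -0.5, 2.0, 7.0]

mse = mean_squared_error(predicted, actual)
print(mse)  # (0.25 + 0.25 + 0.0 + 1.0) / 4 = 0.375
```

Because the errors are squared, a single large miss (the last point here) contributes more to the total than several small ones.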
