Skip to content

Commit 679a735

Browse files
authored
Add files via upload
1 parent 5c63687 commit 679a735

12 files changed

Lines changed: 19207 additions & 0 deletions

File tree

Medical_insurance_codealong/data/insurance.csv

Lines changed: 1339 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 233 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,233 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Medical Costs : How your profile affects your medical charges?\n",
8+
"\n",
9+
"\n",
10+
"Today we will explore a data set dedicated to the cost of treatment of different patients. The cost of treatment depends on many factors: diagnosis, type of clinic, city of residence, age and so on. We have no data on the diagnosis of patients. But we have other information that can help us to make a conclusion about the health of patients and practice regression analysis. In any case, I wish you to be healthy! Let's look at our data."
11+
]
12+
},
13+
{
14+
"cell_type": "markdown",
15+
"metadata": {},
16+
"source": [
17+
"### Columns\n",
18+
"\n",
19+
" - age: age of primary beneficiary\n",
20+
"\n",
21+
" - sex: insurance contractor gender, female, male\n",
22+
"\n",
23+
" - bmi: Body mass index, providing an understanding of body, weights that are relatively high or low relative to height,objective index of body weight (kg / m ^ 2) using the ratio of height to weight, ideally 18.5 to 24.9\n",
24+
"\n",
25+
" - children: Number of children covered by health insurance / Number of dependents\n",
26+
"\n",
27+
" - smoker: Smoking\n",
28+
"\n",
29+
" - region: the beneficiary's residential area in the US, northeast, southeast, southwest, northwest.\n",
30+
"\n",
31+
" - charges: Individual medical costs billed by health insurance\n"
32+
]
33+
},
34+
{
35+
"cell_type": "code",
36+
"execution_count": null,
37+
"metadata": {},
38+
"outputs": [],
39+
"source": [
40+
"# Import Libraries\n",
41+
"\n",
42+
"import numpy as np\n",
43+
"import pandas as pd\n",
44+
"import seaborn as sns\n",
45+
"import matplotlib.pyplot as plt\n",
46+
"import os\n",
47+
"from sklearn.preprocessing import MinMaxScaler\n",
48+
"from sklearn.model_selection import train_test_split\n",
49+
"from sklearn import metrics\n",
50+
"from sklearn.metrics import mean_squared_error\n",
51+
"%matplotlib inline"
52+
]
53+
},
54+
{
55+
"cell_type": "markdown",
56+
"metadata": {},
57+
"source": [
58+
"## Load the data"
59+
]
60+
},
61+
{
62+
"cell_type": "code",
63+
"execution_count": null,
64+
"metadata": {},
65+
"outputs": [],
66+
"source": []
67+
},
68+
{
69+
"cell_type": "markdown",
70+
"metadata": {},
71+
"source": [
72+
"### 1. Exploratory Data Analysis"
73+
]
74+
},
75+
{
76+
"cell_type": "code",
77+
"execution_count": null,
78+
"metadata": {},
79+
"outputs": [],
80+
"source": []
81+
},
82+
{
83+
"cell_type": "markdown",
84+
"metadata": {},
85+
"source": [
86+
"#### 1.2 Examining the Relationship of Charges to the Categorical Features\n",
87+
"\n"
88+
]
89+
},
90+
{
91+
"cell_type": "code",
92+
"execution_count": null,
93+
"metadata": {},
94+
"outputs": [],
95+
"source": []
96+
},
97+
{
98+
"cell_type": "markdown",
99+
"metadata": {},
100+
"source": [
101+
"\n",
102+
"#### 1.3 Charges Between Gender\n"
103+
]
104+
},
105+
{
106+
"cell_type": "code",
107+
"execution_count": null,
108+
"metadata": {},
109+
"outputs": [],
110+
"source": []
111+
},
112+
{
113+
"cell_type": "markdown",
114+
"metadata": {},
115+
"source": [
116+
"#### 1.4 Charges between Smokers and non-Smokers"
117+
]
118+
},
119+
{
120+
"cell_type": "code",
121+
"execution_count": null,
122+
"metadata": {},
123+
"outputs": [],
124+
"source": []
125+
},
126+
{
127+
"cell_type": "markdown",
128+
"metadata": {},
129+
"source": [
130+
"#### 1.5 Charges Among Regions"
131+
]
132+
},
133+
{
134+
"cell_type": "code",
135+
"execution_count": null,
136+
"metadata": {},
137+
"outputs": [],
138+
"source": []
139+
},
140+
{
141+
"cell_type": "markdown",
142+
"metadata": {},
143+
"source": [
144+
"#### 1.6 In Relation to Other Features"
145+
]
146+
},
147+
{
148+
"cell_type": "code",
149+
"execution_count": null,
150+
"metadata": {},
151+
"outputs": [],
152+
"source": []
153+
},
154+
{
155+
"cell_type": "markdown",
156+
"metadata": {},
157+
"source": [
158+
"#### 1.7 Smokers vs Non- Smokers"
159+
]
160+
},
161+
{
162+
"cell_type": "code",
163+
"execution_count": null,
164+
"metadata": {},
165+
"outputs": [],
166+
"source": []
167+
},
168+
{
169+
"cell_type": "markdown",
170+
"metadata": {},
171+
"source": [
172+
"### 2. Pre-Processing the Data"
173+
]
174+
},
175+
{
176+
"cell_type": "code",
177+
"execution_count": null,
178+
"metadata": {},
179+
"outputs": [],
180+
"source": []
181+
},
182+
{
183+
"cell_type": "markdown",
184+
"metadata": {},
185+
"source": [
186+
"### 3. Quantifying the effect of the features to the medical charges"
187+
]
188+
},
189+
{
190+
"cell_type": "code",
191+
"execution_count": null,
192+
"metadata": {},
193+
"outputs": [],
194+
"source": []
195+
},
196+
{
197+
"cell_type": "markdown",
198+
"metadata": {},
199+
"source": [
200+
"\n",
201+
"### 4. Basic Machine Learning: Comparison Between Selected Regression Models\n"
202+
]
203+
},
204+
{
205+
"cell_type": "code",
206+
"execution_count": null,
207+
"metadata": {},
208+
"outputs": [],
209+
"source": []
210+
}
211+
],
212+
"metadata": {
213+
"kernelspec": {
214+
"display_name": "Python 3",
215+
"language": "python",
216+
"name": "python3"
217+
},
218+
"language_info": {
219+
"codemirror_mode": {
220+
"name": "ipython",
221+
"version": 3
222+
},
223+
"file_extension": ".py",
224+
"mimetype": "text/x-python",
225+
"name": "python",
226+
"nbconvert_exporter": "python",
227+
"pygments_lexer": "ipython3",
228+
"version": "3.6.5"
229+
}
230+
},
231+
"nbformat": 4,
232+
"nbformat_minor": 2
233+
}

0 commit comments

Comments
 (0)