|
136 | 136 | "source": [ |
137 | 137 | "# create a dataframe from a Python dict\n", |
138 | 138 | "# (example taken from https://wesmckinney.com/book/pandas-basics.html)\n", |
139 | | - "data = {\"state\": [\"Ohio\", \"Ohio\", \"Ohio\", \"Nevada\", \"Nevada\", \"Nevada\"],\n", |
140 | | - " \"year\": [2000, 2001, 2002, 2001, 2002, 2003],\n", |
141 | | - " \"pop\": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}\n", |
| 139 | + "data = {\n", |
| 140 | + " \"state\": [\"Ohio\", \"Ohio\", \"Ohio\", \"Nevada\", \"Nevada\", \"Nevada\"],\n", |
| 141 | + " \"year\": [2000, 2001, 2002, 2001, 2002, 2003],\n", |
| 142 | + " \"pop\": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],\n", |
| 143 | + "}\n", |
142 | 144 | "\n", |
143 | 145 | "# create the DataFrame\n", |
144 | 146 | "df = pd.DataFrame(data)\n", |
|
353 | 355 | "cell_type": "markdown", |
354 | 356 | "metadata": {}, |
355 | 357 | "source": [ |
356 | | - "The ``read_csv()`` function is highly customisable and allows you to modify the behaviour of the function to suit your needs. At a minimum, you have to specify the file path as a Python string, which can point to a local file or a URL address. Another typical parameter is to set the delimiter type of the file to import. By default, the ``read_csv()`` function assumes a comma, a typical separator for CSV files (meaning comma-separated values). In the example above the separator used in the file \"bulk_composition.csv\" is not a comma but a semicolon and the delimiter is specified inside the function as ``delimiter`` (you can also use the alias ``sep``). If you import a .txt or a .tab (tab-separated) file you will need to specify ``'\\t'`` as the delimiter and so on. Another example would be the ``skiprows`` parameter which allows you to define the number of lines to skip as often text/CSV files contain information in the first few lines that are not the actual tabular data, etc. The ``read_csv()`` function has 50 possible parameters at the time this is written, which gives an idea of the control one can exercise when reading tabular data files. You can read the details of each of them in the following link https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html\n", |
| 358 | + "The ``read_csv()`` function is highly customisable and allows you to modify the behaviour of the function to suit your needs. At a minimum, you have to specify the file path as a Python string, which can point to a local file or a URL address. Another typical parameter is to set the delimiter type of the file to import. By default, the ``read_csv()`` function assumes a comma, a typical separator for CSV files (meaning comma-separated values). In the example above the separator used in the file \"bulk_composition.csv\" is not a comma but a semicolon and the delimiter is specified inside the function as ``delimiter`` (you can also use the alias ``sep``). If you import a .txt or a .tsv (tab-separated) file you will need to specify ``'\\t'`` as the delimiter and so on. Another example would be the ``skiprows`` parameter which allows you to define the number of lines to skip as often text/CSV files contain metadata in the first few lines that are not the actual tabular data, etc. The ``read_csv()`` function has 50 possible parameters at the time this is written, which gives an idea of the control one can exercise when reading tabular data files. You can read the details of each of them in the following link https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html\n", |
357 | 359 | "\n", |
358 | 360 | "> 👉 Pandas also allows you to read data copied into your clipboard using the method ``read_clipboard()``.\n", |
359 | 361 | "\n", |
|
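The ``delimiter``/``sep`` and ``skiprows`` behaviour described in the paragraph above can be sketched as follows. The file contents here are invented for illustration (the actual "bulk_composition.csv" is not shown in this diff), with two metadata lines followed by semicolon-separated data, read from an in-memory buffer instead of a file on disk:

```python
import io

import pandas as pd

# Hypothetical file contents standing in for a semicolon-separated CSV:
# two metadata lines at the top, then the actual tabular data.
raw = (
    "# bulk compositions\n"
    "# units: wt%\n"
    "sample;SiO2;MgO\n"
    "A;45.1;38.2\n"
    "B;44.7;39.0\n"
)

# skiprows=2 skips the two metadata lines; delimiter=";" (alias: sep)
# tells pandas the fields are semicolon-separated, not comma-separated.
df = pd.read_csv(io.StringIO(raw), delimiter=";", skiprows=2)

print(df.shape)         # (2, 3): two data rows, three columns
print(list(df.columns)) # ['sample', 'SiO2', 'MgO']
```

The same call with a real file would take the path string (or URL) in place of the `io.StringIO` buffer.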
523 | 525 | " 5 CO_pyroxenite 11 non-null float64\n", |
524 | 526 | " 6 Unnamed: 6 0 non-null float64\n", |
525 | 527 | "dtypes: float64(6), object(1)\n", |
526 | | - "memory usage: 744.0+ bytes\n" |
| 528 | + "memory usage: 748.0+ bytes\n" |
527 | 529 | ] |
528 | 530 | } |
529 | 531 | ], |
|
2017 | 2019 | "name": "stdout", |
2018 | 2020 | "output_type": "stream", |
2019 | 2021 | "text": [ |
2020 | | - "Notebook tested in 2024-01-09 using:\n", |
2021 | | - "Python 3.10.13 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:15:57) [MSC v.1916 64 bit (AMD64)]\n", |
| 2022 | + "Notebook tested in 2024-04-10 using:\n", |
| 2023 | + "Python 3.11.8 | packaged by Anaconda, Inc. | (main, Feb 26 2024, 21:34:05) [MSC v.1916 64 bit (AMD64)]\n", |
2022 | 2024 | "Pandas 2.1.4\n" |
2023 | 2025 | ] |
2024 | 2026 | } |
|
2050 | 2052 | "name": "python", |
2051 | 2053 | "nbconvert_exporter": "python", |
2052 | 2054 | "pygments_lexer": "ipython3", |
2053 | | - "version": "3.10.13" |
| 2055 | + "version": "3.11.8" |
2054 | 2056 | }, |
2055 | 2057 | "vscode": { |
2056 | 2058 | "interpreter": { |
|