datascience

A Berkeley library for introductory data science.

Written by Professor John DeNero, Professor David Culler, Sam Lau, and Alvin Wan.

For an example of usage, see the Berkeley Data 8 class.

Installation

Use pip to install the package:

pip install datascience

To verify that the package is installed correctly, run:

python -c "import datascience; print(datascience.__version__)"

Quick Start Guide

After installing the package, you can start using datascience by importing it in Python:

from datascience import Table

# Create a simple table
data = Table().with_columns(
    "Name", ["Alice", "Bob", "Charlie"],
    "Age", [25, 30, 35]
)

# Display the table
data.show()

Basic Data Manipulation

Adding a new column

data = data.with_column("Height (cm)", [165, 180, 175])

Sorting the table by age

sorted_data = data.sort("Age", descending=True)
sorted_data.show()

Key Functions and Methods

Table Creation

Table() : Creates an empty table.
Table.with_columns(column_name, values, ...) : Adds multiple columns to a table.

Data Manipulation

Table.with_column(column_name, values) : Adds a single column.
Table.drop(column_name) : Removes a column from the table.
Table.sort(column_name, descending=False) : Sorts rows based on a column.

Data Visualization

Table.plot(column_x, column_y) : Plots a line graph of column_y against column_x.
Table.hist(column) : Generates a histogram.
Table.scatter(column_x, column_y) : Creates a scatter plot.

Troubleshooting Guide

1. Installation Issues

Problem: ModuleNotFoundError: No module named 'datascience'
Solution: Ensure the package is installed using:

pip install --upgrade datascience
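
If the error persists after installing, one possibility (an assumption, not something stated above) is that pip installed the package into a different Python interpreter than the one running your code. The sketch below uses only the standard library to show which interpreter is active and whether it can locate the package; `is_importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter running this script; pip must install into the same one.
print("Interpreter:", sys.executable)
print("datascience importable:", is_importable("datascience"))
```

If the second line prints `False`, installing with `python -m pip install datascience`, using the interpreter shown on the first line, targets the right environment.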

2. Import Errors

Problem: ImportError: cannot import name 'Table' from 'datascience'
Solution: Verify the installation by running:

python -c "import datascience; print(datascience.__version__)"
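
One possible cause, not mentioned above, is a local file or folder named `datascience` that shadows the installed package. The sketch below reports where Python would load the module from, using only the standard library; `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(module_name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# A path inside your own project folder, rather than site-packages,
# suggests a local file is shadowing the installed package.
print(module_origin("datascience"))
```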

3. Display Issues in Jupyter Notebook

Problem: Tables are not displaying correctly in Jupyter Notebook.
Solution: Make sure IPython and Jupyter Notebook are installed:

pip install ipython notebook
