datascience

A Berkeley library for introductory data science.

Written by Professor John DeNero, Professor David Culler, Sam Lau, and Alvin Wan.

For an example of usage, see the Berkeley Data 8 class.


Installation

Use pip to install the package:

pip install datascience

To verify that the package is installed correctly, run:

python -c "import datascience; print(datascience.__version__)"

Quick Start Guide

After installing the package, you can start using datascience by importing it in Python:

from datascience import Table

# Create a simple table
data = Table().with_columns(
    "Name", ["Alice", "Bob", "Charlie"],
    "Age", [25, 30, 35]
)

# Display the table
data.show()

Basic Data Manipulation

Adding a new column

data = data.with_column("Height (cm)", [165, 180, 175])

Sorting the table by age

sorted_data = data.sort("Age", descending=True)
sorted_data.show()

Key Functions and Methods

Table Creation

  • Table() : Creates an empty table.
  • Table.with_columns(column_name, values, ...) : Returns a new table with the given columns added. Arguments alternate between a column label and its values.

Data Manipulation

  • Table.with_column(column_name, values) : Returns a new table with a single column added.
  • Table.drop(column_name) : Returns a new table without the named column.
  • Table.sort(column_name, descending=False) : Returns a new table with rows sorted by the values in a column, ascending by default.

Data Visualization

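  • Table.plot(column_x, column_y) : Draws a line plot with column_x on the horizontal axis and column_y on the vertical axis.
  • Table.hist(column) : Generates a histogram of the values in a column.
  • Table.scatter(column_x, column_y) : Creates a scatter plot of column_y against column_x.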

Troubleshooting Guide

1. Installation Issues

Problem: ModuleNotFoundError: No module named 'datascience'

Solution: Ensure the package is installed, upgrading to the latest release if needed:

pip install --upgrade datascience
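If the error persists after installing, one common cause (an assumption here, not something this guide states) is that pip installed the package into a different Python environment than the one running your code. The sketch below uses only the standard library to report which interpreter is active and whether it can locate the module; `is_importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
if is_importable("datascience"):
    print("datascience is visible to this interpreter.")
else:
    # Running pip through the same interpreter avoids environment mismatches.
    print("Not found. Try:", sys.executable, "-m pip install --upgrade datascience")
```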

2. Import Errors

Problem: ImportError: cannot import name 'Table' from 'datascience'

Solution: Verify the installation by running:

python -c "import datascience; print(datascience.__version__)"
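If a version prints but the import still fails, it can help to see where Python resolves the module from. One possible cause, offered here as an assumption rather than a confirmed diagnosis, is a local file or folder named `datascience` shadowing the installed package. The following standard-library sketch reports the resolved location and the installed distribution version; `locate` is a hypothetical helper, and it assumes the distribution name matches the module name, as it does for `datascience`.

```python
import importlib.util
from importlib import metadata


def locate(module_name):
    """Return (origin, version) for a top-level module; None marks unknown."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        # Looks up the installed distribution with the same name.
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        version = None
    return origin, version


origin, version = locate("datascience")
print("Resolved from:", origin)
print("Installed distribution version:", version)
```

An origin inside your project directory, rather than inside `site-packages`, would suggest a local file is being imported instead of the library.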

3. Display Issues in Jupyter Notebook

Problem: Tables are not displaying correctly in Jupyter Notebook.

Solution: Make sure IPython and Jupyter Notebook are installed:

pip install ipython notebook