A Python package that provides enhanced Jupyter representation capabilities through proxy objects, enabling efficient Apache Arrow-based serialization for pandas DataFrames/Series and pickle-based serialization for generic Python objects in Jupyter environments.
To install the library, run the following command.
pip install jupyter-mimetypes

- Efficient Serialization: Apache Arrow format for pandas DataFrames and Series
- Universal Fallback: Pickle-based serialization for any Python object
- Jupyter Integration: Seamless MIME bundle support for the Jupyter display system
- Type Safety: Complete type annotations and mypy compatibility
import pandas as pd
from jupyter_mimetypes import serialize_object, deserialize_object
# Create a pandas DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'London', 'Tokyo']
})
# Serialize the DataFrame
data, metadata = serialize_object(df)
print(f"Serialized to {len(data)} MIME types: {list(data.keys())}")
# Deserialize back to original object
restored_df = deserialize_object(data, metadata)
print(f"Restored DataFrame shape: {restored_df.shape}")

Example using the jupyter-kernel-client:
import numpy as np
import pandas as pd
from jupyter_kernel_client import KernelClient
from jupyter_mimetypes import get_variable, set_variable

# SERVER_TOKEN is a placeholder for your Jupyter server's authentication token
# Connect to a Jupyter kernel
with KernelClient(server_url="http://localhost:8888", token=SERVER_TOKEN) as client:
    # Execute code in the kernel
    client.execute("""
import pandas as pd
import numpy as np
# Create a large DataFrame with mixed types
np.random.seed(42)
df = pd.DataFrame({
    'values': np.random.randn(1000),
    'categories': np.random.choice(['A', 'B', 'C'], 1000),
    'integers': np.random.randint(1, 100, 1000)
})
""")
    # Retrieve the DataFrame from the kernel
    retrieved_df = client.get_variable("df")
    print(f"Retrieved DataFrame: {retrieved_df.shape}")

    # Build a second DataFrame locally and send it to the kernel
    np.random.seed(42)
    df2 = pd.DataFrame({
        'values': np.random.randn(1000),
        'categories': np.random.choice(['A', 'B', 'C'], 1000),
        'integers': np.random.randint(1, 100, 1000)
    })
    client.set_variable("df2", df2)
    client.execute("print(df2)")

To remove the library, run the following.
pip uninstall jupyter-mimetypes

- Apache Arrow: High-performance serialization for pandas DataFrames and Series
- Pickle: Universal Python object serialization as fallback
- ProxyObject: Wraps objects with custom `_repr_mimebundle_` methods
- MIME Type Registry: Maps object types to appropriate serialization functions
- Base64 Encoding: Ensures safe string transport of binary data
- Type Detection: Automatic selection of optimal serialization backend
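
The proxy and Base64 ideas listed above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the class name `PickleProxy` and the metadata layout are hypothetical and are not taken from the package. The `_repr_mimebundle_(include, exclude)` hook and the `application/x-python-pickle` MIME type are the only parts that come from Jupyter and this README.

```python
import base64
import pickle


class PickleProxy:
    """Hypothetical proxy that exposes a wrapped object as a MIME bundle."""

    MIMETYPE = "application/x-python-pickle"

    def __init__(self, obj):
        self._obj = obj

    def _repr_mimebundle_(self, include=None, exclude=None):
        # Pickle the wrapped object, then Base64-encode the bytes so the
        # payload can travel as plain ASCII text inside a JSON message.
        payload = base64.b64encode(pickle.dumps(self._obj)).decode("ascii")
        data = {self.MIMETYPE: payload, "text/plain": repr(self._obj)}
        # The metadata layout here is an assumption made for illustration.
        metadata = {self.MIMETYPE: {"type": type(self._obj).__name__}}
        return data, metadata


# Round trip: build the bundle, then decode the pickle payload again.
data, metadata = PickleProxy({"a": 1})._repr_mimebundle_()
restored = pickle.loads(base64.b64decode(data[PickleProxy.MIMETYPE]))
```

Jupyter calls `_repr_mimebundle_` when it displays an object, so a front end that understands the custom MIME type can recover the original value while other front ends fall back to `text/plain`.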
- `serialize_object(obj, mimetype=None)` - Serialize any Python object
- `deserialize_object(data, metadata)` - Deserialize from MIME bundle
- `get_variable(name, mimetype=None, globals_dict=None)` - Display variable with custom MIME types
- `set_variable(name, data, metadata, globals_dict)` - Set deserialized variable in namespace
- `application/vnd.apache.arrow.stream` - pandas DataFrames and Series
- `application/x-python-pickle` - Generic Python objects
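
One way to picture how a MIME type selects a serialization backend is a small lookup table. The sketch below is an assumption-laden illustration, not the package's implementation: `ENCODERS`, `DECODERS`, `to_bundle`, and `from_bundle` are hypothetical names, and only the pickle fallback is shown so the example runs without pandas or pyarrow.

```python
import base64
import pickle

PICKLE_MIMETYPE = "application/x-python-pickle"

# Hypothetical registry: (type, (mimetype, encoder)) pairs checked in order.
# A fuller registry would list pandas types with an Arrow encoder first,
# leaving `object` last as the catch-all pickle fallback.
ENCODERS = [
    (object, (PICKLE_MIMETYPE, pickle.dumps)),
]
DECODERS = {PICKLE_MIMETYPE: pickle.loads}


def to_bundle(obj):
    """Return (data, metadata) using the first encoder that matches obj."""
    for registered_type, (mimetype, encode) in ENCODERS:
        if isinstance(obj, registered_type):
            text = base64.b64encode(encode(obj)).decode("ascii")
            return {mimetype: text}, {mimetype: {}}
    raise TypeError(f"no encoder registered for {type(obj).__name__}")


def from_bundle(data, metadata):
    """Decode the first entry in data whose MIME type has a decoder."""
    for mimetype, text in data.items():
        decode = DECODERS.get(mimetype)
        if decode is not None:
            return decode(base64.b64decode(text))
    raise ValueError("bundle contains no known MIME type")


bundle_data, bundle_metadata = to_bundle([1, 2, 3])
roundtrip = from_bundle(bundle_data, bundle_metadata)
```

Because unpickling can execute arbitrary code, a pickle payload should only be decoded when it comes from a kernel you trust.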
# Clone the repository
git clone https://github.com/datalayer/jupyter-mimetypes.git
cd jupyter-mimetypes
# Install in development mode with all dependencies
pip install -e ".[test,lint,typing]"
# Set up pre-commit hooks (optional but recommended)
pre-commit install

The project maintains high code quality standards:
- Type Safety: 100% mypy compliance with strict settings
- Code Formatting: Ruff for linting and formatting
- Documentation: NumPy-style docstrings with numpydoc validation
- Testing: Comprehensive test suite with 100+ tests
# Run all tests
pytest
# Run with coverage
pytest --cov=jupyter_mimetypes
# Run specific test categories
pytest tests/test_api.py # Core API tests
pytest tests/_io/ # Serialization backend tests
pytest tests/test_integration.py # Integration tests (requires Jupyter)

# Run all pre-commit hooks
pre-commit run --all-files
# Individual checks
ruff check . # Linting
ruff format . # Formatting
mypy jupyter_mimetypes/ # Type checking

- All new features must include comprehensive tests
- Documentation must follow NumPy docstring standards
- Type annotations are required for all public APIs
- Integration tests should cover real-world usage scenarios
See RELEASE.md for detailed release instructions.