Commit 0f314b6

Merge pull request #1 from pedrocamargo/pedro/pytables_for_h5py
First stab at cleaning the code
2 parents 337ea4d + 92e4d41 commit 0f314b6

14 files changed

Lines changed: 936 additions & 797 deletions

.github/workflows/ci.yml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+name: Python package
+
+on:
+  push:
+    branches:
+      - 'master'
+  pull_request:
+
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup uv
+        uses: astral-sh/setup-uv@v5
+
+      - name: Set up Python 3.13
+        run: uv python install 3.13
+
+      - name: Install dependencies
+        run: uv sync --dev --all-extras
+
+      - name: Lint check
+        run: uv run ruff check --output-format=github
+
+
+  test:
+    needs: lint
+    runs-on: ${{ matrix.os }}
+    concurrency:
+      group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.os }}-${{ matrix.python-version }}
+      cancel-in-progress: true
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
+        os: [ubuntu-latest, windows-latest, macos-latest]
+        exclude:
+          - os: macos-latest
+            python-version: "3.9"
+          - os: macos-latest
+            python-version: "3.10"
+
+    steps:
+      - uses: actions/checkout@v4
+      - name: Setup uv
+        uses: astral-sh/setup-uv@v5
+
+      - name: Set up Python ${{ matrix.python-version }}
+        run: uv python install ${{ matrix.python-version }}
+
+      - name: Install dependencies and test
+        run: |
+          uv sync --dev --all-extras
+          uv run pytest
+
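
As a rough check on how many `test` jobs the workflow above schedules, the following sketch expands the OS and Python-version matrix and drops the two excluded macOS combinations. It only mimics the documented behaviour of `strategy.matrix` with `exclude`; the helper name `expand_matrix` is made up for illustration and is not part of this repository or of GitHub Actions.

```python
from itertools import product

# Values copied from the workflow's strategy.matrix section.
python_versions = ["3.9", "3.10", "3.11", "3.12", "3.13"]
operating_systems = ["ubuntu-latest", "windows-latest", "macos-latest"]
excluded = {("macos-latest", "3.9"), ("macos-latest", "3.10")}


def expand_matrix(oses, versions, exclude):
    """Return every (os, python-version) pair that is not excluded."""
    return [
        (os_name, version)
        for os_name, version in product(oses, versions)
        if (os_name, version) not in exclude
    ]


jobs = expand_matrix(operating_systems, python_versions, excluded)
print(len(jobs))  # 13 (3 x 5 combinations minus 2 exclusions)
```

Because `fail-fast` is `false`, each of these combinations runs to completion even if another one fails.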

CHANGES.txt

Lines changed: 1 addition & 0 deletions
@@ -6,4 +6,5 @@ v0.3.3, 2017-03-07 -- Python 3 compatible
 v0.3.4, 2018-04-05 -- Fix incorrect version being saved
 v0.3.4.1, 2018-10-19 -- python3 compatible next for generator
 v0.3.5, 2019-12-21 -- add validator and clean-up repo structure
+v0.4.0, 2025-12-21 -- Replaces PyTables with H5Py, cleans repo structure, adds CI

MANIFEST.in

Lines changed: 0 additions & 3 deletions
This file was deleted.

README.md

Lines changed: 50 additions & 44 deletions
@@ -1,6 +1,6 @@
 # OMX Python API Documentation

-The Python OMX API borrows heavily from PyTables. An OMX file extends the equivalent PyTables File object, so anything you can do in PyTables you can do with OMX as well. This API attempts to be very Pythonic, including dictionary-style lookup of matrix names.
+The Python OMX API is built on top of h5py. An OMX file extends the equivalent h5py File object, so anything you can do in h5py you can do with OMX as well. This API attempts to be very Pythonic, including dictionary-style lookup of matrix names.

 * [Pre-requisites](#pre-requisites)
 * [Installation](#installation)
@@ -12,16 +12,22 @@ The Python OMX API borrows heavily from PyTables. An OMX file extends the equiva

 # Pre-requisites

-Python 2.6+, PyTables 3.1+, and NumPy. Python 3 is now supported too.
+Python 3.9+, h5py 2.10+, and NumPy.

-On Windows, the easiest way to get these is from [Anaconda](https://www.continuum.io/downloads#windows) or from Chris Gohlke's python binaries [page](http://www.lfd.uci.edu/~gohlke/pythonlibs/). On Linux, your distribution already has these available.
+Binaries for all these dependencies are readily available from PyPI and can be installed via pip.

 # Installation

 The easiest way to get OMX on Python is to use pip. Get the latest package (called OpenMatrix) from the [Python Package Index](https://pypi.python.org/pypi)

 `pip install openmatrix`

+Using uv is also possible and much faster:
+
+`pip install uv`
+`uv pip install openmatrix`
+
+
 This command will fetch openmatrix from the PyPi repository and download/install it for you. The package name "omx" was already taken on pip for a lame xml library that no one uses. Thus our little project goes by "openmatrix" on pip instead of "omx". This means your import statements should look like,

 `import openmatrix as omx`
@@ -33,31 +39,30 @@ and NOT:
 # Quick-Start Sample Code

 ```python
-from __future__ import print_function
 import openmatrix as omx
 import numpy as np

 # Create some data
-ones = np.ones((100,100))
-twos = 2.0*ones
+ones = np.ones((100, 100))
+twos = 2.0 * ones

 # Create an OMX file (will overwrite existing file!)
 print('Creating myfile.omx')
-myfile = omx.open_file('myfile.omx','w') # use 'a' to append/edit an existing file
+myfile = omx.open_file('myfile.omx', 'w') # use 'a' to append/edit an existing file

 # Write to the file.
 myfile['m1'] = ones
 myfile['m2'] = twos
-myfile['m3'] = ones + twos # numpy array math is fast
+myfile['m3'] = ones + twos # numpy array math is fast
 myfile.close()

 # Open an OMX file for reading only
 print('Reading myfile.omx')
 myfile = omx.open_file('myfile.omx')

-print ('Shape:', myfile.shape()) # (100,100)
-print ('Number of tables:', len(myfile)) # 3
-print ('Table names:', myfile.list_matrices()) # ['m1','m2',',m3']
+print('Shape:', myfile.shape()) # (100,100)
+print('Number of tables:', len(myfile)) # 3
+print('Table names:', myfile.list_matrices()) # ['m1','m2',',m3']

 # Work with data. Pass a string to select matrix by name:
 # -------------------------------------------------------
@@ -76,8 +81,8 @@ my_very_special_zone_value = m2[10][25]

 # FANCY: Use attributes to find matrices
 # --------------------------------------
-myfile.close() # was opened read-only, so let's reopen.
-myfile = omx.open_file('myfile.omx','a') # append mode: read/write existing file
+myfile.close() # was opened read-only, so let's reopen.
+myfile = omx.open_file('myfile.omx', 'a') # append mode: read/write existing file

 myfile['m1'].attrs.timeperiod = 'am'
 myfile['m1'].attrs.mode = 'hwy'
@@ -87,13 +92,13 @@ myfile['m2'].attrs.timeperiod = 'md'
 myfile['m3'].attrs.timeperiod = 'am'
 myfile['m3'].attrs.mode = 'trn'

-print('attributes:', myfile.list_all_attributes()) # ['mode','timeperiod']
+print('attributes:', myfile.list_all_attributes()) # ['mode','timeperiod']

 # Use a DICT to select matrices via attributes:

-all_am_trips = myfile[ {'timeperiod':'am'} ] # [m1,m3]
-all_hwy_trips = myfile[ {'mode':'hwy'} ] # [m1]
-all_am_trn_trips = myfile[ {'mode':'trn','timeperiod':'am'} ] # [m3]
+all_am_trips = myfile[{'timeperiod': 'am'}] # [m1,m3]
+all_hwy_trips = myfile[{'mode': 'hwy'}] # [m1]
+all_am_trn_trips = myfile[{'mode': 'trn', 'timeperiod': 'am'}] # [m3]

 print('sum of some tables:', np.sum(all_am_trips))

@@ -102,23 +107,23 @@ print('sum of some tables:', np.sum(all_am_trips))
 # (any mapping would work, such as a mapping with large gaps between zone
 # numbers. For this simple case we'll just assume TAZ numbers are 1-100.)

-taz_equivs = np.arange(1,101) # 1-100 inclusive
+taz_equivs = np.arange(1, 101) # 1-100 inclusive

 myfile.create_mapping('taz', taz_equivs)
-print('mappings:', myfile.list_mappings()) # ['taz']
+print('mappings:', myfile.list_mappings()) # ['taz']

-tazs = myfile.mapping('taz') # Returns a dict: {1:0, 2:1, 3:2, ..., 100:99}
+tazs = myfile.mapping('taz') # Returns a dict: {1:0, 2:1, 3:2, ..., 100:99}
 m3 = myfile['m3']
-print('cell value:', m3[tazs[100]][tazs[100]]) # 3.0 (taz (100,100) is cell [99][99])
+print('cell value:', m3[tazs[100]][tazs[100]]) # 3.0 (taz (100,100) is cell [99][99])

 myfile.close()
 ```
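
The quick-start comments say that `myfile.mapping('taz')` returns a dict from zone number to zero-based offset (`{1:0, 2:1, ..., 100:99}`). The sketch below reproduces that lookup in plain Python so the idea can be checked without an OMX file. It is not openmatrix's implementation; `build_offset_lookup` is a hypothetical helper name, and the zone lists are sample data.

```python
# Stand-in for the zone numbers passed to create_mapping('taz', ...).
taz_equivs = list(range(1, 101))  # TAZ numbers 1-100 inclusive


def build_offset_lookup(zone_numbers):
    """Map each zone number to its zero-based row/column offset."""
    return {zone: offset for offset, zone in enumerate(zone_numbers)}


tazs = build_offset_lookup(taz_equivs)
print(tazs[1], tazs[100])  # 0 99

# Zone numbering with gaps works the same way.
sparse = build_offset_lookup([10, 20, 35])
print(sparse[35])  # 2
```

With such a lookup, the cell for zones (100, 100) sits at offset `[99][99]`, matching the quick-start's `m3[tazs[100]][tazs[100]]`.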

 # Testing
-Testing is done with [nose](https://nose.readthedocs.io/en/latest/). Run the tests via:
+Testing is done with [pytest](https://docs.pytest.org/). Run the tests via:

 ```
-openmatrix\test> nosetests -v
+pytest
 ```

 # OMX File Validator
@@ -132,7 +137,7 @@ omx-validate my_file.omx

 ### File Objects

-OMX File objects extend Pytables.File, so all Pytables functions work normally. We've also added some useful stuff to make things even easier.
+OMX File objects extend h5py.File, so all h5py functions work normally. We've also added some useful stuff to make things even easier.

 ### Writing Data

@@ -148,25 +153,24 @@ will call createMatrix() for you and populate it with the specified array.

 ### Accessing Data

-You can access matrix objects by name, using dictionary lookup e.g. `myfile['hwydist']` or using PyTables path notation, e.g. `myfile.root.hwydist`
+You can access matrix objects by name, using dictionary lookup e.g. `myfile['hwydist']`.

 ### Matrix objects

-OMX matrices extend numpy arrays. An OMX matrix object extends a Pytables/HDF5 "node" which means all HDF5 methods and properties behave normally. Generally these datasets are going to be numpy CArray objects of arbitrary shape.
+OMX matrices are h5py Dataset objects. An OMX matrix object extends an h5py Dataset which means all h5py methods and properties behave normally.
 You can access a matrix object by name using:

 * dictionary syntax, e.g. `myfile['hwydist']`
-* or by Pytables path syntax, e.g. `myfile.root.hwydist`

 Once you have a matrix object, you can perform normal numpy math on it or you can access rows and columns pythonically:

 ```python
 myfile['biketime'][0][0] = 0.60 * myfile['bikedist'][0][0]
-total_trips = numpy.sum(myfile.root.trips)`
+total_trips = np.sum(myfile.root.trips)`
 ```
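
Since the updated text describes OMX matrices as h5py Dataset objects, it may help to see how a plain h5py dataset behaves for the operations mentioned here. The sketch below uses only h5py and NumPy, not the openmatrix API; the file name and dataset names are placeholders, and the gzip level 1 plus shuffle settings are assumed to correspond to the "zlib compression level 1, and shuffle=True" default the README describes. Two h5py behaviours are worth noting: a single indexing step such as `dset[0, 0] = value` writes to the file, while chained indexing such as `dset[1][1] = value` only modifies a temporary NumPy copy of row 1, and attributes are accessed dictionary-style through `.attrs`.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver with no backing store, nothing hits disk).
with h5py.File('sketch.h5', 'w', driver='core', backing_store=False) as f:
    # gzip level 1 with shuffle: assumed equivalent of the README's default filters.
    dist = f.create_dataset(
        'bikedist',
        data=np.full((4, 4), 2.0),
        compression='gzip',
        compression_opts=1,
        shuffle=True,
    )
    # Created from a shape and dtype only; HDF5 fills it with zeros.
    time = f.create_dataset('biketime', shape=(4, 4), dtype='float64')

    # A single indexing step writes through to the dataset.
    time[0, 0] = 0.60 * dist[0, 0]

    # Chained indexing reads row 1 into a NumPy array first, so this
    # assignment changes only that temporary copy, not the dataset.
    time[1][1] = 99.0

    # Attributes use dictionary-style access in h5py.
    dist.attrs['mode'] = 'bike'

    # Copy results out before the file is closed.
    first_cell = float(time[0, 0])
    untouched_cell = float(time[1, 1])
    total = float(np.sum(dist[...]))
    mode = dist.attrs['mode']
    compression = dist.compression

print(first_cell, untouched_cell, total, mode, compression)
```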

 ### Properties
-Every Matrix has its own dictionary of key/value pair attributes (properties) which can be accessed using the standard Pytables .attrs field. Add as many attributes as you like; attributes can be string, ints, floats, and lists:
+Every Matrix has its own dictionary of key/value pair attributes (properties) which can be accessed using the standard h5py .attrs field. Add as many attributes as you like; attributes can be string, ints, floats, and lists:

 ```python
 print mymatrix.attrs
@@ -202,7 +206,7 @@ OMX module version string. Currently '0.3.5' as of this writing. This is the Py
 ### `__omx_version__`
 OMX file format version. Currently '0.2'. This is the OMX file format specification that omx-python adheres to.

-### `open_file`(filename, mode='r', title='', root_uep='/', filters=Filters(complevel=1, complib='zlib', shuffle=True, bitshuffle=False, fletcher32=False, least_significant_digit=None), shape=None, **kwargs)
+### `open_file`(filename, mode='r', title='', filters=None, shape=None, **kwargs)
 Open or create a new OMX file. New files will be created with default
 zlib compression enabled.

@@ -218,7 +222,7 @@ OMX file format version. Currently '0.2'. This is the OMX file format specificat
 title : string
 Short description of this file, used when creating the file. Default is ''.
 Ignored in read-only mode.
-filters : tables.Filters
+filters : dict
 HDF5 default filter options for compression, shuffling, etc. Default for
 OMX standard file format is: zlib compression level 1, and shuffle=True.
 Only specify this if you want something other than the recommended standard
@@ -258,46 +262,48 @@ OMX file format version. Currently '0.2'. This is the OMX file format specificat

 Returns:
 --------
-mapping : tables.array
+mapping : h5py.Dataset
 Returns the created mapping.

 Raises:
 LookupError : if the mapping exists and overwrite=False

-### `create_matrix`(self, name, atom=None, shape=None, title='', filters=None, chunkshape=None, byteorder=None, createparents=False, obj=None, attrs=None)
-Create an OMX Matrix (CArray) at the root level. User must pass in either
-an existing numpy matrix, or a shape and an atom type.
+### `create_matrix`(self, name, shape=None, title='', filters=None, chunks=True, obj=None, dtype=None, attrs=None)
+Create an OMX Matrix (Dataset) at the root level. User must pass in either
+an existing numpy matrix, or a shape and a dtype.

 Parameters
 ----------
 name : string
 The name of this matrix. Stored in HDF5 as the leaf name.
 title : string
 Short description of this matrix. Default is ''.
-obj : numpy.CArray
+obj : numpy.ndarray
 Existing numpy array from which to create this OMX matrix. If obj is passed in,
-then shape and atom can be left blank. If obj is not passed in, then a shape and
-atom must be specified instead. Default is None.
+then shape and dtype can be left blank. If obj is not passed in, then a shape and
+dtype must be specified instead. Default is None.
 shape : numpy.array
 Optional shape of the matrix. Shape is an int32 numpy array of format (rows,columns).
-If shape is not specified, an existing numpy CArray must be passed in instead,
+If shape is not specified, an existing numpy array must be passed in instead,
 as the 'obj' parameter. Default is None.
-atom : atom_type
-Optional atom type of the data. Can be int32, float32, etc. Default is None.
+dtype : dtype
+Optional data type of the data. Can be 'int32', 'float32', etc. Default is None.
 If None specified, then obj parameter must be passed in instead.
-filters : tables.Filters
+filters : dict
 Set of HDF5 filters (compression, etc) used for creating the matrix.
 Default is None. See HDF5 documentation for details. Note: while the default here
 is None, the default set of filters set at the OMX parent file level is
 zlib compression level 1. Those settings usually trickle down to the table level.
+chunks : tuple or bool
+Chunk shape, or True to enable auto-chunking. Default is True.
 attrs : dict
 Dictionary of attribute names and values to be attached to this matrix.
 Default is None.

 Returns
 -------
-matrix : tables.carray
-HDF5 CArray matrix
+matrix : h5py.Dataset
+HDF5 Dataset matrix

 ### `delete_mapping`(self, title)
 Remove a mapping.
