Commit 54fc8dc

committed
cleanup
1 parent 38e83fb commit 54fc8dc

2 files changed

Lines changed: 351 additions & 7 deletions

File tree

README.md

Lines changed: 351 additions & 7 deletions
@@ -1,13 +1,357 @@
1-
# OMX: Open Matrix
1+
# OMX Python API Documentation
22

3-
# Introduction
3+
The Python OMX API borrows heavily from PyTables. An OMX file extends the equivalent PyTables File object, so anything you can do in PyTables you can do with OMX as well. This API attempts to be very Pythonic, including dictionary-style lookup of matrix names.
44

5-
An OMX matrix file is a structured collection of two-dimensional array objects and associated metadata. OMX is built on top of the well-established HDF5 scientific data storage standard. An OMX file has a specific layout that is intended to ensure that complete and consistent information about the matrix data is stored and that the data can be retrieved correctly and efficiently. We hope for the modeling industry to adopt the OMX standard, and we will periodically review the specification to make revisions as necessary.
5+
* [Pre-requisites](#pre-requisites)
6+
* [Installation](#installation)
7+
* [Quick Start Code](#quick-start-sample-code)
8+
* [Usage Notes](#usage-notes)
9+
* [API Reference](#api-reference)
610

7-
# More Information
11+
# Pre-requisites
812

9-
The OMX project is now entirely on GitHub. Check out our linked wiki (https://github.com/osPlanning/omx/wiki) for more information, including API user guides, how to import and export OMX matrices from EMME, VISUM, TransCAD, and Cube, where to get the OMX viewer, and other background information on how OMX was created. Note that our previous site (https://sites.google.com/site/openmodeldata/) and listserv (https://groups.google.com/forum/?fromgroups#!forum/openmodeldata-discuss) are no longer being maintained.
13+
Python 2.6+, PyTables 3.1+, and NumPy. Python 3 is now supported too.
1014

11-
# License
15+
On Windows, the easiest way to get these is from [Anaconda](https://www.continuum.io/downloads#windows) or from Chris Gohlke's Python binaries [page](http://www.lfd.uci.edu/~gohlke/pythonlibs/). On Linux, most distributions provide these packages through their package manager.
1216

13-
All code written in the OMX project, including all API implementations, is licensed under the Apache License, version 2.0. All code (c) by its respective authors. See LICENSE.TXT for the full Apache 2.0 license text.
17+
# Installation
18+
19+
The easiest way to get OMX on Python is to use pip. Get the latest package (called OpenMatrix) from the [Python Package Index](https://pypi.python.org/pypi):
20+
21+
`pip install openmatrix`
22+
23+
This command fetches openmatrix from PyPI, then downloads and installs it for you. The package name "omx" was already taken on PyPI by an unrelated XML library, so this project goes by "openmatrix" there instead of "omx". This means your import statements should look like,
24+
25+
`import openmatrix as omx`
26+
27+
and NOT:
28+
29+
`import omx`
30+
31+
# Quick-Start Sample Code
32+
33+
```python
34+
from __future__ import print_function
35+
import openmatrix as omx
36+
import numpy as np
37+
38+
# Create some data
39+
ones = np.ones((100,100))
40+
twos = 2.0*ones
41+
42+
# Create an OMX file (will overwrite existing file!)
43+
print('Creating myfile.omx')
44+
myfile = omx.open_file('myfile.omx','w') # use 'a' to append/edit an existing file
45+
46+
# Write to the file.
47+
myfile['m1'] = ones
48+
myfile['m2'] = twos
49+
myfile['m3'] = ones + twos # numpy array math is fast
50+
myfile.close()
51+
52+
# Open an OMX file for reading only
53+
print('Reading myfile.omx')
54+
myfile = omx.open_file('myfile.omx')
55+
56+
print ('Shape:', myfile.shape()) # (100,100)
57+
print ('Number of tables:', len(myfile)) # 3
58+
print ('Table names:', myfile.list_matrices()) # ['m1','m2','m3']
59+
60+
# Work with data. Pass a string to select matrix by name:
61+
# -------------------------------------------------------
62+
m1 = myfile['m1']
63+
m2 = myfile['m2']
64+
m3 = myfile['m3']
65+
66+
# halves = m1 * 0.5 # CRASH! Don't modify an OMX object directly.
67+
# # Create a new numpy array, and then edit it.
68+
halves = np.array(m1) * 0.5
69+
70+
first_row = m2[0]
71+
first_row[:] = 0.5 * first_row[:]
72+
73+
my_very_special_zone_value = m2[10][25]
74+
75+
# FANCY: Use attributes to find matrices
76+
# --------------------------------------
77+
myfile.close() # was opened read-only, so let's reopen.
78+
myfile = omx.open_file('myfile.omx','a') # append mode: read/write existing file
79+
80+
myfile['m1'].attrs.timeperiod = 'am'
81+
myfile['m1'].attrs.mode = 'hwy'
82+
83+
myfile['m2'].attrs.timeperiod = 'md'
84+
85+
myfile['m3'].attrs.timeperiod = 'am'
86+
myfile['m3'].attrs.mode = 'trn'
87+
88+
print('attributes:', myfile.list_all_attributes()) # ['mode','timeperiod']
89+
90+
# Use a DICT to select matrices via attributes:
91+
92+
all_am_trips = myfile[ {'timeperiod':'am'} ] # [m1,m3]
93+
all_hwy_trips = myfile[ {'mode':'hwy'} ] # [m1]
94+
all_am_trn_trips = myfile[ {'mode':'trn','timeperiod':'am'} ] # [m3]
95+
96+
print('sum of some tables:', np.sum(all_am_trips))
97+
98+
# SUPER FANCY: Create a mapping to use TAZ numbers instead of matrix offsets
99+
# --------------------------------------------------------------------------
100+
# (any mapping would work, such as a mapping with large gaps between zone
101+
# numbers. For this simple case we'll just assume TAZ numbers are 1-100.)
102+
103+
taz_equivs = np.arange(1,101) # 1-100 inclusive
104+
105+
myfile.create_mapping('taz', taz_equivs)
106+
print('mappings:', myfile.list_mappings()) # ['taz']
107+
108+
tazs = myfile.mapping('taz') # Returns a dict: {1:0, 2:1, 3:2, ..., 100:99}
109+
m3 = myfile['m3']
110+
print('cell value:', m3[tazs[100]][tazs[100]]) # 3.0 (taz (100,100) is cell [99][99])
111+
112+
myfile.close()
113+
```
114+
115+
# Usage Notes
116+
117+
### File Objects
118+
119+
OMX File objects extend the PyTables `File` class, so all PyTables functions work normally. We've also added some useful stuff to make things even easier.
120+
121+
### Writing Data
122+
123+
Writing data to an OMX file is simple: You must provide a name, and you must provide either an existing numpy (or python) array, or a shape and an "atom". You can optionally provide a descriptive title, a list of tags, and other implementation minutiae.
124+
125+
The easiest way to do all that is to use python dictionary nomenclature:
126+
127+
```python
128+
myfile['matrixname'] = mynumpyobject
129+
```
130+
131+
will call `create_matrix()` for you and populate the new matrix with the specified array.
132+
133+
### Accessing Data
134+
135+
You can access matrix objects by name, using dictionary lookup e.g. `myfile['hwydist']` or using PyTables path notation, e.g. `myfile.root.hwydist`
136+
137+
### Matrix objects
138+
139+
An OMX matrix object extends a PyTables/HDF5 "node", which means all HDF5 methods and properties behave normally. Generally these datasets are PyTables CArray objects of arbitrary shape.
140+
You can access a matrix object by name using:
141+
142+
* dictionary syntax, e.g. `myfile['hwydist']`
143+
* or by Pytables path syntax, e.g. `myfile.root.hwydist`
144+
145+
Once you have a matrix object, you can perform normal numpy math on it or you can access rows and columns pythonically:
146+
147+
```python
148+
myfile['biketime'][0][0] = 0.60 * myfile['bikedist'][0][0]
149+
total_trips = numpy.sum(myfile.root.trips)
150+
```
151+
152+
### Properties
153+
Every Matrix has its own dictionary of key/value pair attributes (properties), which can be accessed using the standard PyTables `.attrs` field. Add as many attributes as you like; attributes can be strings, ints, floats, and lists:
154+
155+
```python
156+
print(mymatrix.attrs)
157+
print(mymatrix.attrs.myfield)
158+
print(mymatrix.attrs['myfield'])
159+
```
160+
161+
### Tags
162+
163+
If you create tags for your objects, you can also look up matrices by those tags. You can assign tags to any matrix using the 'tags' property attribute. Tags are a list of strings, e.g. ['skims','am','hwy']. To retrieve the list of matrices that matches a given set of tags, pass in a tuple of tags when using dictionary-style lookups:
164+
165+
```python
166+
list_all_hwy_skims = mybigfile[ ('skims','hwy') ]
167+
```
168+
169+
This will always return a list (which can be empty). A matrix will only be included in the returned list if ALL tags specified match exactly. Tags are case-sensitive.
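
The matching rule described above can be sketched in plain Python. This is only an illustration of the "ALL tags, exact match, case-sensitive" behaviour, not the library's own code; the `tags_by_matrix` dictionary and both helper functions are hypothetical names made up for this example.

```python
# Hypothetical matrix names and their 'tags' attribute values.
tags_by_matrix = {
    'hwydist': ['skims', 'am', 'hwy'],
    'hwytime': ['skims', 'md', 'hwy'],
    'trntime': ['skims', 'am', 'trn'],
    'trips': ['demand', 'HWY'],
}


def matches_all_tags(matrix_tags, wanted_tags):
    """True only if every wanted tag appears, spelled exactly, in matrix_tags."""
    return all(tag in matrix_tags for tag in wanted_tags)


def find_by_tags(wanted_tags):
    """Return the sorted names of matrices carrying ALL of the wanted tags."""
    return sorted(name for name, tags in tags_by_matrix.items()
                  if matches_all_tags(tags, wanted_tags))


hwy_skims = find_by_tags(('skims', 'hwy'))   # both tags must be present
only_hwy = find_by_tags(('hwy',))            # 'HWY' does not match 'hwy'
no_match = find_by_tags(('skims', 'pm'))     # nothing matches: empty list

print(hwy_skims)
print(only_hwy)
print(no_match)
```

Note that `trips` is never returned for `'hwy'` because its tag is spelled `'HWY'`, and that a lookup with no matches still yields a list.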
170+
171+
### Mappings
172+
173+
A mapping allows rows and columns to be accessed using an integer value other than a zero-based offset. For instance zone numbers often start at "1" not "0", and there can be significant gaps between zone numbers; they're rarely fully sequential. An OMX file can contain multiple mappings.
174+
175+
* Use the dictionary from `mapping()` to translate from a key value to a matrix lookup offset, e.g. `taznumber[1] -> matrix[0]`
176+
* Use the list from `map_entries()` to translate the other way, i.e. from an offset to an index value, e.g. `matrix[0] -> 1` (where 1 is the TAZ number).
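
To make the two directions concrete, here is a small sketch in plain Python that mimics the shapes of the values described above: a dict from key to offset, and a list from offset to key. It does not open an OMX file; the zone numbers and the stand-in `matrix` are invented for illustration.

```python
# Hypothetical zone numbers with gaps between them.
taz_numbers = [1, 2, 5, 10, 250]

# Same shape as the dict described for mapping(): zone number -> zero-based offset.
taz_to_offset = {taz: offset for offset, taz in enumerate(taz_numbers)}

# Same shape as the list described for map_entries(): offset -> zone number.
offset_to_taz = list(taz_numbers)

# Stand-in 5x5 matrix as nested lists; cell [r][c] holds 10*r + c.
matrix = [[10 * row + col for col in range(5)] for row in range(5)]

# Look up the cell for origin zone 5 and destination zone 250.
origin, destination = 5, 250
value = matrix[taz_to_offset[origin]][taz_to_offset[destination]]

print(value)             # cell [2][4] of the stand-in matrix
print(offset_to_taz[2])  # zone number stored at offset 2
```

The dict handles "zone number to offset" lookups, while the list answers "which zone lives at this row or column".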
177+
178+
179+
# API Reference
180+
181+
## Global Properties
182+
183+
### `__version__`
184+
OMX module version string. Currently '0.3.3' as of this writing. This is the Python API version.
185+
186+
### `__omx_version__`
187+
OMX file format version. Currently '0.2'. This is the OMX file format specification that omx-python adheres to.
188+
189+
### `open_file`(filename, mode='r', title='', root_uep='/', filters=Filters(complevel=1, complib='zlib', shuffle=True, bitshuffle=False, fletcher32=False, least_significant_digit=None), shape=None, **kwargs)
190+
Open or create a new OMX file. New files will be created with default
191+
zlib compression enabled.
192+
193+
Parameters
194+
----------
195+
filename : string
196+
Name or path and name of file
197+
mode : string
198+
'r' for read-only;
199+
'w' to write (erases existing file);
200+
'a' to read/write an existing file (will create it if it doesn't exist).
201+
Ignored in read-only mode.
202+
title : string
203+
Short description of this file, used when creating the file. Default is ''.
204+
Ignored in read-only mode.
205+
filters : tables.Filters
206+
HDF5 default filter options for compression, shuffling, etc. Default for
207+
OMX standard file format is: zlib compression level 1, and shuffle=True.
208+
Only specify this if you want something other than the recommended standard
209+
HDF5 zip compression.
210+
'None' will create enormous uncompressed files.
211+
Only 'zlib' compression is guaranteed to be available on all HDF5 implementations.
212+
See HDF5 docs for more detail.
213+
shape: numpy.array
214+
Shape of matrices in this file. Default is None. Specify a valid shape
215+
(e.g. (1000,1200)) to enforce shape-checking for all added objects.
216+
If shape is not specified, the first added matrix will not be shape-checked
217+
and all subsequently added matrices must match the shape of the first matrix.
218+
All tables in an OMX file must have the same shape.
219+
220+
Returns
221+
-------
222+
f : openmatrix.File
223+
The file object for reading and writing.
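
The shape rule described for the `shape` parameter (the first matrix fixes the shape when none is given, and every later matrix must match) can be sketched without any OMX dependency. The `ShapeCheckedStore` class below is a hypothetical stand-in written for this example, and its `ShapeError` merely mirrors the exception name listed at the end of this document; neither is the library's implementation.

```python
class ShapeError(Exception):
    """Stand-in for the ShapeError named in the Exceptions list."""


class ShapeCheckedStore(object):
    """Hypothetical container that enforces one shape for all matrices."""

    def __init__(self, shape=None):
        self.shape = shape      # None means "take the shape of the first matrix"
        self.matrices = {}

    def add(self, name, rows):
        new_shape = (len(rows), len(rows[0]))
        if self.shape is None:
            # First matrix added without a preset shape: it defines the shape.
            self.shape = new_shape
        elif new_shape != self.shape:
            raise ShapeError('%s has shape %s, expected %s'
                             % (name, new_shape, self.shape))
        self.matrices[name] = rows


store = ShapeCheckedStore()                 # no shape given up front
store.add('m1', [[1, 2], [3, 4]])           # sets the shape to (2, 2)
store.add('m2', [[5, 6], [7, 8]])           # matches, so it is accepted

try:
    store.add('bad', [[1, 2, 3]])           # (1, 3) does not match (2, 2)
    rejected = False
except ShapeError:
    rejected = True

print(store.shape, sorted(store.matrices), rejected)
```

Passing a shape to the constructor instead would make even the first `add` call subject to the check, which is the behaviour the `shape` parameter is described as enabling.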
224+
225+
## File Objects
226+
227+
### `create_mapping`(self, title, entries, overwrite=False)
228+
Create an equivalency index, which maps a raw data dimension to
229+
another integer value. Once created, mappings can be referenced by
230+
offset or by key.
231+
232+
Parameters:
233+
-----------
234+
title : string
235+
Name of this mapping
236+
entries : list
237+
List of n equivalencies for the mapping. n must match one data
238+
dimension of the matrix.
239+
overwrite : boolean
240+
True to allow overwriting an existing mapping, False will raise
241+
a LookupError if the mapping already exists. Default is False.
242+
243+
Returns:
244+
--------
245+
mapping : tables.array
246+
Returns the created mapping.
247+
248+
Raises:
249+
LookupError : if the mapping exists and overwrite=False
250+
251+
### `create_matrix`(self, name, atom=None, shape=None, title='', filters=None, chunkshape=None, byteorder=None, createparents=False, obj=None, attrs=None)
252+
Create an OMX Matrix (CArray) at the root level. User must pass in either
253+
an existing numpy matrix, or a shape and an atom type.
254+
255+
Parameters
256+
----------
257+
name : string
258+
The name of this matrix. Stored in HDF5 as the leaf name.
259+
title : string
260+
Short description of this matrix. Default is ''.
261+
obj : numpy.ndarray
262+
Existing numpy array from which to create this OMX matrix. If obj is passed in,
263+
then shape and atom can be left blank. If obj is not passed in, then a shape and
264+
atom must be specified instead. Default is None.
265+
shape : numpy.array
266+
Optional shape of the matrix. Shape is an int32 numpy array of format (rows,columns).
267+
If shape is not specified, an existing numpy array must be passed in instead,
268+
as the 'obj' parameter. Default is None.
269+
atom : atom_type
270+
Optional atom type of the data. Can be int32, float32, etc. Default is None.
271+
If None specified, then obj parameter must be passed in instead.
272+
filters : tables.Filters
273+
Set of HDF5 filters (compression, etc) used for creating the matrix.
274+
Default is None. See HDF5 documentation for details. Note: while the default here
275+
is None, the default set of filters set at the OMX parent file level is
276+
zlib compression level 1. Those settings usually trickle down to the table level.
277+
attrs : dict
278+
Dictionary of attribute names and values to be attached to this matrix.
279+
Default is None.
280+
281+
Returns
282+
-------
283+
matrix : tables.CArray
284+
HDF5 CArray matrix
285+
286+
### `delete_mapping`(self, title)
287+
Remove a mapping.
288+
289+
Raises:
290+
-------
291+
LookupError : if the specified mapping does not exist.
292+
293+
### `list_all_attributes`(self)
294+
Return set of all attributes used for any Matrix in this File
295+
296+
Returns
297+
-------
298+
all_attributes : set
299+
The combined set of all attribute names that exist on any matrix in this file.
300+
301+
### `list_mappings`(self)
302+
List all mappings in this file
303+
304+
Returns:
305+
--------
306+
mappings : list
307+
List of the names of all mappings in the OMX file. Mappings
308+
are stored internally in the 'lookup' subset of the HDF5 file
309+
structure. Returns empty list if there are no mappings.
310+
311+
### `list_matrices`(self)
312+
List the matrix names in this File
313+
314+
Returns
315+
-------
316+
matrices : list
317+
List of all matrix names stored in this OMX file.
318+
319+
### `map_entries`(self, title)
320+
Return a list of entries for the specified mapping.
321+
Throws LookupError if the specified mapping does not exist.
322+
323+
### `mapping`(self, title)
324+
Return dict containing key:value pairs for specified mapping. Keys
325+
represent the map item and value represents the array offset.
326+
327+
Parameters:
328+
-----------
329+
title : string
330+
Name of the mapping to be returned
331+
332+
Returns:
333+
--------
334+
mapping : dict
335+
Dictionary where each key is the map item, and the value
336+
represents the array offset.
337+
338+
Raises:
339+
-------
340+
LookupError : if the specified mapping does not exist.
341+
342+
### `shape`(self)
343+
Get the one and only shape of all matrices in this File
344+
345+
Returns
346+
-------
347+
shape : tuple
348+
Tuple of (rows,columns) for this matrix and file.
349+
350+
### `version`(self)
351+
Return the OMX file format version of this OMX file, embedded in the OMX_VERSION file attribute.
352+
Returns None if the OMX_VERSION attribute is not set.
353+
354+
355+
## Exceptions
356+
* LookupError
357+
* ShapeError

example.omx

-2.29 MB
Binary file not shown.
