|
1 | | -# OMX: Open Matrix |
| 1 | +# OMX Python API Documentation |
2 | 2 |
|
3 | | -# Introduction |
| 3 | +The Python OMX API borrows heavily from PyTables. An OMX file extends the equivalent PyTables File object, so anything you can do in PyTables you can do with OMX as well. This API attempts to be very Pythonic, including dictionary-style lookup of matrix names. |
4 | 4 |
|
5 | | -An OMX matrix file is a structured collection of two-dimensional array objects and associated metadata. OMX is built on top of the well-established HDF5 scientific data storage standard. An OMX file has a specific layout that is intended to ensure that complete and consistent information about the matrix data is stored and that the data can be retrieved correctly and efficiently. We hope for the modeling industry to adopt the OMX standard, and we will periodically review the specification to make revisions as necessary. |
| 5 | +* [Pre-requisites](#pre-requisites) |
| 6 | +* [Installation](#installation) |
| 7 | +* [Quick Start Code](#quick-start-sample-code) |
| 8 | +* [Usage Notes](#usage-notes) |
| 9 | +* [API Reference](#api-reference) |
6 | 10 |
|
7 | | -# More Information |
| 11 | +# Pre-requisites |
8 | 12 |
|
9 | | -The OMX project is now entirely on GitHub. Check out our linked wiki (https://github.com/osPlanning/omx/wiki) for more information, including API user guides, how to import and export OMX matrices from EMME, VISUM, TransCAD, and Cube, where to get the OMX viewer, and other background information on how OMX was created. Note that our previous site (https://sites.google.com/site/openmodeldata/) and listserv (https://groups.google.com/forum/?fromgroups#!forum/openmodeldata-discuss) are no longer being maintained. |
| 13 | +Python 2.6+, PyTables 3.1+, and NumPy. Python 3 is now supported too. |
10 | 14 |
|
11 | | -# License |
| 15 | +On Windows, the easiest way to get these is from [Anaconda](https://www.continuum.io/downloads#windows) or from Chris Gohlke's python binaries [page](http://www.lfd.uci.edu/~gohlke/pythonlibs/). On Linux, your distribution already has these available. |
12 | 16 |
|
13 | | -All code written in the OMX project, including all API implementations, is licensed under the Apache License, version 2.0. All code (c) by its respective authors. See LICENSE.TXT for the full Apache 2.0 license text. |
| 17 | +# Installation |
| 18 | + |
| 19 | +The easiest way to get OMX on Python is to use pip. Get the latest package (called OpenMatrix) from the [Python Package Index](https://pypi.python.org/pypi): |
| 20 | + |
| 21 | + `pip install openmatrix` |
| 22 | + |
| 23 | +This command fetches openmatrix from the PyPI repository and installs it for you. The package name "omx" was already taken on PyPI by an unrelated XML library, so our little project goes by "openmatrix" instead of "omx". This means your import statements should look like: |
| 24 | + |
| 25 | + `import openmatrix as omx` |
| 26 | + |
| 27 | +and NOT: |
| 28 | + |
| 29 | + `import omx` |
| 30 | + |
| 31 | +# Quick-Start Sample Code |
| 32 | + |
| 33 | +```python |
| 34 | +from __future__ import print_function |
| 35 | +import openmatrix as omx |
| 36 | +import numpy as np |
| 37 | + |
| 38 | +# Create some data |
| 39 | +ones = np.ones((100,100)) |
| 40 | +twos = 2.0*ones |
| 41 | + |
| 42 | +# Create an OMX file (will overwrite existing file!) |
| 43 | +print('Creating myfile.omx') |
| 44 | +myfile = omx.open_file('myfile.omx','w') # use 'a' to append/edit an existing file |
| 45 | + |
| 46 | +# Write to the file. |
| 47 | +myfile['m1'] = ones |
| 48 | +myfile['m2'] = twos |
| 49 | +myfile['m3'] = ones + twos # numpy array math is fast |
| 50 | +myfile.close() |
| 51 | + |
| 52 | +# Open an OMX file for reading only |
| 53 | +print('Reading myfile.omx') |
| 54 | +myfile = omx.open_file('myfile.omx') |
| 55 | + |
| 56 | +print ('Shape:', myfile.shape()) # (100,100) |
| 57 | +print ('Number of tables:', len(myfile)) # 3 |
| 58 | +print ('Table names:', myfile.list_matrices()) # ['m1','m2','m3'] |
| 59 | + |
| 60 | +# Work with data. Pass a string to select matrix by name: |
| 61 | +# ------------------------------------------------------- |
| 62 | +m1 = myfile['m1'] |
| 63 | +m2 = myfile['m2'] |
| 64 | +m3 = myfile['m3'] |
| 65 | + |
| 66 | +# halves = m1 * 0.5 # CRASH! Don't modify an OMX object directly. |
| 67 | +# # Create a new numpy array, and then edit it. |
| 68 | +halves = np.array(m1) * 0.5 |
| 69 | + |
| 70 | +first_row = m2[0] |
| 71 | +first_row[:] = 0.5 * first_row[:] |
| 72 | + |
| 73 | +my_very_special_zone_value = m2[10][25] |
| 74 | + |
| 75 | +# FANCY: Use attributes to find matrices |
| 76 | +# -------------------------------------- |
| 77 | +myfile.close() # was opened read-only, so let's reopen. |
| 78 | +myfile = omx.open_file('myfile.omx','a') # append mode: read/write existing file |
| 79 | + |
| 80 | +myfile['m1'].attrs.timeperiod = 'am' |
| 81 | +myfile['m1'].attrs.mode = 'hwy' |
| 82 | + |
| 83 | +myfile['m2'].attrs.timeperiod = 'md' |
| 84 | + |
| 85 | +myfile['m3'].attrs.timeperiod = 'am' |
| 86 | +myfile['m3'].attrs.mode = 'trn' |
| 87 | + |
| 88 | +print('attributes:', myfile.list_all_attributes()) # ['mode','timeperiod'] |
| 89 | + |
| 90 | +# Use a DICT to select matrices via attributes: |
| 91 | + |
| 92 | +all_am_trips = myfile[ {'timeperiod':'am'} ] # [m1,m3] |
| 93 | +all_hwy_trips = myfile[ {'mode':'hwy'} ] # [m1] |
| 94 | +all_am_trn_trips = myfile[ {'mode':'trn','timeperiod':'am'} ] # [m3] |
| 95 | + |
| 96 | +print('sum of some tables:', np.sum(all_am_trips)) |
| 97 | + |
| 98 | +# SUPER FANCY: Create a mapping to use TAZ numbers instead of matrix offsets |
| 99 | +# -------------------------------------------------------------------------- |
| 100 | +# (any mapping would work, such as a mapping with large gaps between zone |
| 101 | +# numbers. For this simple case we'll just assume TAZ numbers are 1-100.) |
| 102 | + |
| 103 | +taz_equivs = np.arange(1,101) # 1-100 inclusive |
| 104 | + |
| 105 | +myfile.create_mapping('taz', taz_equivs) |
| 106 | +print('mappings:', myfile.list_mappings()) # ['taz'] |
| 107 | + |
| 108 | +tazs = myfile.mapping('taz') # Returns a dict: {1:0, 2:1, 3:2, ..., 100:99} |
| 109 | +m3 = myfile['m3'] |
| 110 | +print('cell value:', m3[tazs[100]][tazs[100]]) # 3.0 (taz (100,100) is cell [99][99]) |
| 111 | + |
| 112 | +myfile.close() |
| 113 | +``` |
| 114 | + |
| 115 | +# Usage Notes |
| 116 | + |
| 117 | +### File Objects |
| 118 | + |
| 119 | +OMX File objects extend the PyTables `File` class, so all PyTables functions work normally. We've also added some convenience methods, listed in the API Reference, to make things even easier. |
| 120 | + |
| 121 | +### Writing Data |
| 122 | + |
| 123 | +Writing data to an OMX file is simple: You must provide a name, and you must provide either an existing numpy (or python) array, or a shape and an "atom". You can optionally provide a descriptive title, a list of tags, and other implementation minutiae. |
| 124 | + |
| 125 | +The easiest way to do all that is to use python dictionary nomenclature: |
| 126 | + |
| 127 | +```python |
| 128 | +myfile['matrixname'] = mynumpyobject |
| 129 | +``` |
| 130 | + |
| 131 | +This calls `create_matrix()` for you and populates the new matrix with the specified array. |
| 132 | + |
| 133 | +### Accessing Data |
| 134 | + |
| 135 | +You can access matrix objects by name, using dictionary lookup e.g. `myfile['hwydist']` or using PyTables path notation, e.g. `myfile.root.hwydist` |
| 136 | + |
| 137 | +### Matrix objects |
| 138 | + |
| 139 | +OMX matrices extend numpy arrays. An OMX matrix object extends a PyTables/HDF5 "node", which means all HDF5 methods and properties behave normally. Generally these datasets are going to be PyTables CArray objects of arbitrary shape. |
| 140 | +You can access a matrix object by name using: |
| 141 | + |
| 142 | +* dictionary syntax, e.g. `myfile['hwydist']` |
| 143 | +* or by Pytables path syntax, e.g. `myfile.root.hwydist` |
| 144 | + |
| 145 | +Once you have a matrix object, you can perform normal numpy math on it or you can access rows and columns pythonically: |
| 146 | + |
| 147 | +```python |
| 148 | +myfile['biketime'][0][0] = 0.60 * myfile['bikedist'][0][0] |
| 149 | +total_trips = numpy.sum(myfile.root.trips) |
| 150 | +``` |
| 151 | + |
| 152 | +### Properties |
| 153 | +Every Matrix has its own dictionary of key/value pair attributes (properties), which can be accessed using the standard PyTables `.attrs` field. Add as many attributes as you like; attributes can be strings, ints, floats, and lists: |
| 154 | + |
| 155 | +```python |
| 156 | +print(mymatrix.attrs) |
| 157 | +print(mymatrix.attrs.myfield) |
| 158 | +print(mymatrix.attrs['myfield']) |
| 159 | +``` |
| 160 | + |
| 161 | +### Tags |
| 162 | + |
| 163 | +If you create tags for your objects, you can also look up matrices by those tags. You can assign tags to any matrix using the 'tags' property attribute. Tags are a list of strings, e.g. ['skims','am','hwy']. To retrieve the list of matrices that matches a given set of tags, pass in a tuple of tags when using dictionary-style lookups: |
| 164 | + |
| 165 | +```python |
| 166 | +list_all_hwy_skims = mybigfile[ ('skims','hwy') ] |
| 167 | +``` |
| 168 | + |
| 169 | +This will always return a list (which can be empty). A matrix will only be included in the returned list if ALL tags specified match exactly. Tags are case-sensitive. |
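
The matching rule described above can be illustrated without an OMX file at all. The following is a minimal sketch in plain Python, not the library's implementation: the matrix names, the tag lists, and the helper `matches_all` are all hypothetical and exist only to show the "ALL tags must match exactly, case-sensitive" behaviour.

```python
# Hypothetical tag lists for three matrices (names and tags are made up).
tags_by_matrix = {
    'hwy_am_time': ['skims', 'am', 'hwy'],
    'trn_am_time': ['skims', 'am', 'trn'],
    'hwy_md_trips': ['trips', 'md', 'hwy'],
}


def matches_all(matrix_tags, wanted):
    """True only if every wanted tag appears, with exact case, in matrix_tags."""
    return all(tag in matrix_tags for tag in wanted)


# A matrix is selected only when it carries both 'skims' and 'hwy'.
wanted = ('skims', 'hwy')
selected = [name for name, tags in tags_by_matrix.items()
            if matches_all(tags, wanted)]
print(selected)

# Case matters: 'HWY' is not the same tag as 'hwy', so nothing is selected.
none_found = [name for name, tags in tags_by_matrix.items()
              if matches_all(tags, ('skims', 'HWY'))]
print(none_found)
```

As with the real lookup, the result is always a list, and it is empty when no matrix carries every requested tag.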
| 170 | + |
| 171 | +### Mappings |
| 172 | + |
| 173 | +A mapping allows rows and columns to be accessed using an integer value other than a zero-based offset. For instance zone numbers often start at "1" not "0", and there can be significant gaps between zone numbers; they're rarely fully sequential. An OMX file can contain multiple mappings. |
| 174 | + |
| 175 | +* Use the dictionary from `mapping()` to translate from a key value to a matrix lookup offset, e.g. `taznumber[1] -> matrix[0]` |
| 176 | +* Use the list from `map_entries()` to translate the other way; i.e. from an offset to an index value, e.g. `matrix[0] -> 1` (where 1 is the TAZ number). |
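
The two directions of translation can be sketched in plain Python. This is an illustration of the idea only, under the assumption that a mapping is stored as an ordered list of zone numbers whose positions are the matrix offsets; the zone numbers and the variable names `offset_of` and `taz_at` are hypothetical and are not part of the OMX API.

```python
# Hypothetical zone numbers with gaps; list position is the matrix offset.
taz_numbers = [1, 2, 5, 10, 11]

# Key -> offset: the kind of dictionary the text describes for mapping().
offset_of = {taz: offset for offset, taz in enumerate(taz_numbers)}

# Offset -> key: the entries list itself works as the reverse lookup.
taz_at = list(taz_numbers)

print(offset_of[10])  # matrix offset that holds TAZ 10
print(taz_at[0])      # TAZ number stored at matrix offset 0
```

Keeping both structures around avoids a linear search when translating in either direction.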
| 177 | + |
| 178 | + |
| 179 | +# API Reference |
| 180 | + |
| 181 | +## Global Properties |
| 182 | + |
| 183 | +### `__version__` |
| 184 | +OMX module version string ('0.3.3' as of this writing). This is the Python API version. |
| 185 | + |
| 186 | +### `__omx_version__` |
| 187 | +OMX file format version. Currently '0.2'. This is the OMX file format specification that omx-python adheres to. |
| 188 | + |
| 189 | +### `open_file`(filename, mode='r', title='', root_uep='/', filters=Filters(complevel=1, complib='zlib', shuffle=True, bitshuffle=False, fletcher32=False, least_significant_digit=None), shape=None, **kwargs) |
| 190 | + Open or create a new OMX file. New files will be created with default |
| 191 | + zlib compression enabled. |
| 192 | + |
| 193 | + Parameters |
| 194 | + ---------- |
| 195 | + filename : string |
| 196 | + Name or path and name of file |
| 197 | + mode : string |
| 198 | + 'r' for read-only; |
| 199 | + 'w' to write (erases existing file); |
| 200 | + 'a' to read/write an existing file (will create it if it doesn't exist). |
| 201 | + Ignored in read-only mode. |
| 202 | + title : string |
| 203 | + Short description of this file, used when creating the file. Default is ''. |
| 204 | + Ignored in read-only mode. |
| 205 | + filters : tables.Filters |
| 206 | + HDF5 default filter options for compression, shuffling, etc. Default for |
| 207 | + OMX standard file format is: zlib compression level 1, and shuffle=True. |
| 208 | + Only specify this if you want something other than the recommended standard |
| 209 | + HDF5 zip compression. |
| 210 | + 'None' will create enormous uncompressed files. |
| 211 | + Only 'zlib' compression is guaranteed to be available on all HDF5 implementations. |
| 212 | + See HDF5 docs for more detail. |
| 213 | + shape: numpy.array |
| 214 | + Shape of matrices in this file. Default is None. Specify a valid shape |
| 215 | + (e.g. (1000,1200)) to enforce shape-checking for all added objects. |
| 216 | + If shape is not specified, the first added matrix will not be shape-checked |
| 217 | + and all subsequently added matrices must match the shape of the first matrix. |
| 218 | + All tables in an OMX file must have the same shape. |
| 219 | + |
| 220 | + Returns |
| 221 | + ------- |
| 222 | + f : openmatrix.File |
| 223 | + The file object for reading and writing. |
| 224 | + |
| 225 | +## File Objects |
| 226 | + |
| 227 | +### `create_mapping`(self, title, entries, overwrite=False) |
| 228 | + Create an equivalency index, which maps a raw data dimension to |
| 229 | + another integer value. Once created, mappings can be referenced by |
| 230 | + offset or by key. |
| 231 | + |
| 232 | + Parameters: |
| 233 | + ----------- |
| 234 | + title : string |
| 235 | + Name of this mapping |
| 236 | + entries : list |
| 237 | + List of n equivalencies for the mapping. n must match one data |
| 238 | + dimension of the matrix. |
| 239 | + overwrite : boolean |
| 240 | + True to allow overwriting an existing mapping, False will raise |
| 241 | + a LookupError if the mapping already exists. Default is False. |
| 242 | + |
| 243 | + Returns: |
| 244 | + -------- |
| 245 | + mapping : tables.array |
| 246 | + Returns the created mapping. |
| 247 | + |
| 248 | + Raises: |
| 249 | + LookupError : if the mapping exists and overwrite=False |
| 250 | + |
| 251 | +### `create_matrix`(self, name, atom=None, shape=None, title='', filters=None, chunkshape=None, byteorder=None, createparents=False, obj=None, attrs=None) |
| 252 | + Create an OMX Matrix (CArray) at the root level. User must pass in either |
| 253 | + an existing numpy matrix, or a shape and an atom type. |
| 254 | + |
| 255 | + Parameters |
| 256 | + ---------- |
| 257 | + name : string |
| 258 | + The name of this matrix. Stored in HDF5 as the leaf name. |
| 259 | + title : string |
| 260 | + Short description of this matrix. Default is ''. |
| 261 | + obj : numpy.ndarray |
| 262 | + Existing numpy array from which to create this OMX matrix. If obj is passed in, |
| 263 | + then shape and atom can be left blank. If obj is not passed in, then a shape and |
| 264 | + atom must be specified instead. Default is None. |
| 265 | + shape : numpy.array |
| 266 | + Optional shape of the matrix. Shape is an int32 numpy array of format (rows,columns). |
| 267 | + If shape is not specified, an existing numpy array must be passed in instead, |
| 268 | + as the 'obj' parameter. Default is None. |
| 269 | + atom : atom_type |
| 270 | + Optional atom type of the data. Can be int32, float32, etc. Default is None. |
| 271 | + If None specified, then obj parameter must be passed in instead. |
| 272 | + filters : tables.Filters |
| 273 | + Set of HDF5 filters (compression, etc) used for creating the matrix. |
| 274 | + Default is None. See HDF5 documentation for details. Note: while the default here |
| 275 | + is None, the default set of filters set at the OMX parent file level is |
| 276 | + zlib compression level 1. Those settings usually trickle down to the table level. |
| 277 | + attrs : dict |
| 278 | + Dictionary of attribute names and values to be attached to this matrix. |
| 279 | + Default is None. |
| 280 | + |
| 281 | + Returns |
| 282 | + ------- |
| 283 | + matrix : tables.carray |
| 284 | + HDF5 CArray matrix |
| 285 | + |
| 286 | +### `delete_mapping`(self, title) |
| 287 | + Remove a mapping. |
| 288 | + |
| 289 | + Raises: |
| 290 | + ------- |
| 291 | + LookupError : if the specified mapping does not exist. |
| 292 | + |
| 293 | +### `list_all_attributes`(self) |
| 294 | + Return set of all attributes used for any Matrix in this File |
| 295 | + |
| 296 | + Returns |
| 297 | + ------- |
| 298 | + all_attributes : set |
| 299 | + The combined set of all attribute names that exist on any matrix in this file. |
| 300 | + |
| 301 | +### `list_mappings`(self) |
| 302 | + List all mappings in this file |
| 303 | + |
| 304 | + Returns: |
| 305 | + -------- |
| 306 | + mappings : list |
| 307 | + List of the names of all mappings in the OMX file. Mappings |
| 308 | + are stored internally in the 'lookup' subset of the HDF5 file |
| 309 | + structure. Returns empty list if there are no mappings. |
| 310 | + |
| 311 | +### `list_matrices`(self) |
| 312 | + List the matrix names in this File |
| 313 | + |
| 314 | + Returns |
| 315 | + ------- |
| 316 | + matrices : list |
| 317 | + List of all matrix names stored in this OMX file. |
| 318 | + |
| 319 | +### `map_entries`(self, title) |
| 320 | + Return a list of entries for the specified mapping. |
| 321 | + Throws LookupError if the specified mapping does not exist. |
| 322 | + |
| 323 | +### `mapping`(self, title) |
| 324 | + Return dict containing key:value pairs for specified mapping. Keys |
| 325 | + represent the map item and value represents the array offset. |
| 326 | + |
| 327 | + Parameters: |
| 328 | + ----------- |
| 329 | + title : string |
| 330 | + Name of the mapping to be returned |
| 331 | + |
| 332 | + Returns: |
| 333 | + -------- |
| 334 | + mapping : dict |
| 335 | + Dictionary where each key is the map item, and the value |
| 336 | + represents the array offset. |
| 337 | + |
| 338 | + Raises: |
| 339 | + ------- |
| 340 | + LookupError : if the specified mapping does not exist. |
| 341 | + |
| 342 | +### `shape`(self) |
| 343 | + Get the one and only shape of all matrices in this File |
| 344 | + |
| 345 | + Returns |
| 346 | + ------- |
| 347 | + shape : tuple |
| 348 | + Tuple of (rows,columns) for this matrix and file. |
| 349 | + |
| 350 | +### `version`(self) |
| 351 | + Return the OMX file format of this OMX file, embedded in the OMX_VERSION file attribute. |
| 352 | + Returns None if the OMX_VERSION attribute is not set. |
| 353 | + |
| 354 | + |
| 355 | +## Exceptions |
| 356 | +* LookupError |
| 357 | +* ShapeError |