Skip to content

Commit b9494ae

Browse files
d-w-moorealanking
authored andcommitted
[#712] irods query limit as part of irods.client_configuration
Note that as a configuration setting, it may optionally be loaded from an environment variable.
1 parent c65912e commit b9494ae

7 files changed

Lines changed: 124 additions & 16 deletions

File tree

README.md

Lines changed: 40 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -543,7 +543,19 @@ loading of the settings from the settings file, assuming it exists.
543543
(Failure to find the file at the indicated path will be logged as a
544544
warning.)
545545

546-
Settings can also be saved and loaded manually using the `save()` and
546+
If the process of loading configuration settings generates an unhandled exception,
547+
the `irods` module will consult the value of the environment variable
548+
`PYTHON_IRODSCLIENT_CONFIGURATION_LOAD_ERRORS_FATAL` and either (if `True`) incur
549+
a fatal error or (if `False`) simply log it as a warning. By default this
550+
environment variable is taken as `False` if not set, although that is likely to
551+
change to `True` in the future.
552+
This decision indicates a desire by the client library's designers to flag such
553+
errors as fatal by aborting the process if possible, rather than aquiescing silently
554+
to an unintended consequence. It is recommended that all client library users
555+
set this environment variable to `True` if their applications load settings
556+
from the environment or configuration files.
557+
558+
Settings can be saved and loaded manually using the `save()` and
547559
`load()` functions in the `irods.client_configuration`
548560
module. Each of these functions accepts an optional `file`
549561
parameter which, if set to a non-empty string, will override the
@@ -609,6 +621,12 @@ certain environment variables:
609621
- Default Value: `False`
610622
- Environment Variable Override: `PYTHON_IRODSCLIENT_CONFIG__LEGACY_AUTH__PAM__FORCE_USE_OF_DEDICATED_PAM_API`
611623

624+
- Setting: The maximum number of rows a Query iteration will retrieve. Can be overridden using Query's `limit` method. This is synonymous with the variable `irods.query.IRODS_QUERY_LIMIT`, and a read (or write) of this configuration setting will read (or affect) that variable.
625+
- Dotted Name: `genquery1.irods_query_limit`
626+
- Type: `int`
627+
- Default Value: 500 (The library's traditional maximum batch row count, also stored as `irods.query._IRODS_QUERY_LIMIT`)
628+
- Environment Variable Override: `PYTHON_IRODSCLIENT_CONFIG__GENQUERY1__IRODS_QUERY_LIMIT`
629+
612630
- Setting: Default choice of XML parser for all new threads.
613631
- Dotted Name: `connections.xml_parser_default`
614632
- Type: `str`
@@ -1176,8 +1194,8 @@ Also, core.py rules may only be run directly by a rodsadmin, currently.
11761194
[See this issue for discussion](https://github.com/irods/irods_rule_engine_plugin_python/issues/105).
11771195

11781196

1179-
General Queries
1180-
---------------
1197+
GenQuery1 Queries
1198+
-----------------
11811199

11821200
A session object's `query` method accepts a list of columns and models to be included in any returned rows.
11831201

@@ -1285,6 +1303,25 @@ parameter to the query:
12851303
+---------------------+
12861304
```
12871305

1306+
If the user desires a smaller paging size, this can be accomplished by changing
1307+
the `genquery1.irods_query_limit` configuration setting. This directly changes
1308+
the `irods.query.IRODS_QUERY_LIMIT` value and therefore affects Query objects'
1309+
`all` and `get_batches` methods.
1310+
1311+
Note, however, that expressions such as `list(Query(...))` and `(row for row in
1312+
Query)` are not affected by this setting.
1313+
1314+
The setting may be given any positive integer value. Attempting to set it to
1315+
zero or a negative number will not affect the value of `IRODS_QUERY_LIMIT` but will
1316+
raise a `ConfigurationValueError` if done in the course of a running iRODS
1317+
client application. Importantly, if the invalid setting is attempted by a loading
1318+
configuration, the otherwise fatal error could be absorbed into a warning-level log action,
1319+
depending on the value of the `PYTHON_IRODSCLIENT_CONFIGURATION_LOAD_ERRORS_FATAL`
1320+
environment variable. See: [Python iRODS Client Settings File](#python-irods-client-settings-file).
1321+
1322+
A copy of the original value of `IRODS_QUERY_LIMIT` (before overriding values are loaded from the
1323+
configuration) is stored within `irods.query.cached_values.IRODS_QUERY_LIMIT`.
1324+
12881325
Specific Queries
12891326
----------------
12901327

irods/client_configuration/__init__.py

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@
1212
# Duplicate here for convenience
1313
from .. import DEFAULT_CONFIG_PATH
1414

15-
logger = logging.Logger(__name__)
15+
logger = logging.getLogger(__name__)
1616

1717

1818
class iRODSConfiguration:
@@ -57,9 +57,31 @@ def xml_parser_default(self, str_value):
5757

5858
return set_default_XML_by_name(str_value)
5959

60-
6160
connections = ConnectionsProperties()
6261

62+
class ConfigurationError(BaseException): pass
63+
class ConfigurationValueError(ValueError,ConfigurationError): pass
64+
65+
class Genquery1_Properties(iRODSConfiguration, metaclass=iRODSConfigAliasMetaclass):
66+
67+
@property
68+
def irods_query_limit(self):
69+
import irods.query
70+
return irods.query.IRODS_QUERY_LIMIT
71+
72+
@irods_query_limit.setter
73+
def irods_query_limit(self, target_value):
74+
import irods.query
75+
requested = int(target_value)
76+
77+
if requested <= 0:
78+
raise ConfigurationValueError(f'Error setting IRODS_QUERY_LIMIT to [{requested}]. Use positive values only.')
79+
80+
irods.query.IRODS_QUERY_LIMIT = requested
81+
82+
genquery1 = Genquery1_Properties()
83+
84+
6385
# #############################################################################
6486
#
6587
# Classes for building client configuration categories

irods/helpers/__init__.py

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,6 @@
1515
"get_data_object",
1616
]
1717

18-
1918
class StopTestsException(Exception):
2019

2120
def __init__(self, *args, **kwargs):
@@ -144,3 +143,16 @@ def get_collection(sess, logical_path):
144143
return sess.collections.get(logical_path)
145144
except ex.CollectionDoesNotExist:
146145
return None
146+
147+
148+
# Utility class and factory function for storing the original value of variables within the given namespace.
149+
def create_value_cache(namespace:dict):
150+
class CachedValues:
151+
__namespace = namespace
152+
153+
@classmethod
154+
def make_entry(cls, name):
155+
cached_value = cls.__namespace[name]
156+
setattr(cls,name,property(lambda self: cached_value))
157+
158+
return CachedValues()

irods/manager/access_manager.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -73,7 +73,7 @@ def get(self, target, report_raw_acls=True, **kw):
7373
results = (
7474
self.sess.query(user_type.name, user_type.zone, access_type.name)
7575
.filter(*conditions)
76-
.all()
76+
._all()
7777
)
7878

7979
def get_usertype(row):

irods/manager/metadata_manager.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -83,7 +83,7 @@ def get(self, model_cls, path):
8383
columns = (model.id, model.name, model.value, model.units)
8484
if self.use_timestamps:
8585
columns += (model.create_time, model.modify_time)
86-
results = self.sess.query(*columns).filter(*conditions).all()
86+
results = self.sess.query(*columns).filter(*conditions)._all()
8787

8888
def meta_opts(row):
8989
opts = {"avu_id": row[model.id]}

irods/query.py

Lines changed: 16 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
import os
2-
31
from collections import OrderedDict
42

53
from irods import MAX_SQL_ROWS
@@ -36,7 +34,8 @@
3634
"SELECT_COUNT": 6,
3735
}
3836

39-
IRODS_QUERY_LIMIT = os.getenv("IRODS_QUERY_LIMIT", 500)
37+
IRODS_QUERY_LIMIT = 500
38+
4039
class Query:
4140

4241
def __init__(self, sess, *args, **kwargs):
@@ -198,7 +197,7 @@ def _kw_message(self):
198197
return StringStringMap(dct)
199198

200199
def _message(self):
201-
max_rows = IRODS_QUERY_LIMIT if self._limit == -1 else self._limit
200+
max_rows = IRODS_QUERY_LIMIT if self._limit < 0 else self._limit
202201
args = {
203202
"maxRows": max_rows,
204203
"continueInx": self._continue_index,
@@ -233,6 +232,13 @@ def close(self):
233232
"""
234233
self.limit(0).execute()
235234

235+
def _all(self):
236+
"""Internally used version of all(). Unlike the public version, the returned iterator when
237+
executed to completion is unaffected by IRODS_QUERY_LIMIT. This is in order to accurately
238+
reflect the enumeration of sub-objects, e.g. AVU's associated with an object.
239+
"""
240+
return self.get_results()
241+
236242
def all(self):
237243
result_set = self.execute()
238244
if result_set.continue_index > 0:
@@ -382,3 +388,9 @@ def get_results(self):
382388
for result_set in self.get_batches():
383389
for result in result_set:
384390
yield result
391+
392+
393+
# Record a copy of the original value of IRODS_QUERY_LIMIT for possible future access.
394+
import irods.helpers as helpers
395+
cached_values = helpers.create_value_cache(globals())
396+
cached_values.make_entry('IRODS_QUERY_LIMIT')

irods/test/query_test.py

Lines changed: 29 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,14 +26,15 @@
2626
CAT_UNKNOWN_SPECIFIC_QUERY,
2727
CAT_INVALID_ARGUMENT,
2828
)
29-
from irods.query import SpecificQuery
29+
from irods import MAX_SQL_ROWS
30+
import irods.client_configuration as config
3031
from irods.column import Like, NotLike, Between, In
32+
import irods.keywords as kw
3133
from irods.meta import iRODSMeta
34+
from irods.query import SpecificQuery
3235
from irods.rule import Rule
33-
from irods import MAX_SQL_ROWS
34-
from irods.test.helpers import irods_shared_reg_resc_vault
3536
import irods.test.helpers as helpers
36-
import irods.keywords as kw
37+
from irods.test.helpers import irods_shared_reg_resc_vault
3738

3839
IRODS_STATEMENT_TABLE_SIZE = 50
3940

@@ -1048,6 +1049,30 @@ def test_negating_columns_in_genquery1_results__issue_755(self):
10481049
row = list(q.all())[0]
10491050
self.assertEqual(columns_to_negate & row.keys(), intersect)
10501051

1052+
def test_set_query_limit__issue_712(self):
1053+
# Test requires that 0 < num_limited_results < num_total_objects.
1054+
num_limited_results = 4
1055+
num_total_objects = num_limited_results + 6
1056+
data_objs = []
1057+
try:
1058+
# Create a number of data objects.
1059+
for name in range(num_total_objects):
1060+
data_objs.append(self.sess.data_objects.create(f'{self.coll_path}/issue_712_obj{name}'))
1061+
1062+
# Test a query limit via configuration.
1063+
with config.loadlines(
1064+
entries=[dict(setting="genquery1.irods_query_limit", value=num_limited_results)]
1065+
):
1066+
limited_results = list(self.sess.query(DataObject.id).filter(Like(DataObject.name, '%issue_712_obj%')).all())
1067+
self.assertEqual(num_limited_results, len(limited_results))
1068+
1069+
# Test the query limit is no longer in effect.
1070+
non_limited_results = list(self.sess.query(DataObject.id).filter(Like(DataObject.name, '%issue_712_obj%')).all())
1071+
self.assertEqual(num_total_objects, len(non_limited_results))
1072+
finally:
1073+
for d in data_objs:
1074+
d.unlink(force=True)
1075+
10511076

10521077
class TestSpecificQuery(unittest.TestCase):
10531078

0 commit comments

Comments
 (0)