Commit a8973df

OutlyingWest and Elias Werner (elwer) authored
Refactor kernel tests: migrate from YAML-based to notebook-based test cases (#46)
* Migrated kernel tests from YAML-driven definitions to Jupyter notebooks (`.ipynb`).

  Summary of changes:
  - All YAML test inputs replaced with `.ipynb` notebooks.
  - Each notebook-based test can now be run independently of the others.
  - Replaced `check_stream_output()` with `check_from_notebook()` to parse and execute notebook cells.
  - Added environment variable setup (`SCOREP_ENABLE_TRACING`, `SCOREP_TOTAL_MEMORY`, etc.) for consistent test behavior.
* fix capital letters in env variables, smaller clarifications
* Update README.md
* Update README.md
* Update README.md

---------

Co-authored-by: OutlyingWest <alexeybuv7@amail.com>
Co-authored-by: Elias Werner <elias.werner@tu-dresden.de>
Co-authored-by: Elias Werner <eliwerner3@googlemail.com>
1 parent 4e44e26 commit a8973df

25 files changed

Lines changed: 1238 additions & 1252 deletions

README.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -46,7 +46,7 @@ For binding to Score-P, the kernel uses the [Score-P Python bindings](https://gi
4646
To install the kernel and required dependencies:
4747

4848
```
49-
pip install scorep_jupyter
49+
pip install scorep-jupyter
5050
python -m scorep_jupyter.install
5151
```
5252

@@ -67,7 +67,7 @@ From the Score-P Python bindings:
6767
> Please make sure that `scorep-config` is in your `PATH` variable.
6868
> For Ubuntu LTS systems there is a non-official ppa of Score-P available: https://launchpad.net/~andreasgocht/+archive/ubuntu/scorep .
6969
70-
To use the coarse-grained performance measurements, simply install the JUmPER extension via:
70+
To use the coarse-grained performance measurements, simply install the jumper ipython extension via:
7171

7272
```
7373
pip install jumper_extension
@@ -110,17 +110,17 @@ MARSHALLER=[dill,cloudpickle]
110110
MODE=[disk,memory]
111111
```
112112

113-
When using persistence in `disk` mode, user can also define directory to which serializer output will be saved with `SCOREP_KERNEL_PERSISTENCE_DIR` environment variable.
113+
When using persistence in `disk` mode, user can also define directory to which serializer output will be saved with `SCOREP_JUPYTER_PERSISTENCE_DIR` environment variable.
114114
```
115-
%env SCOREP_KERNEL_PERSISTENCE_DIR=path/to/dir
115+
%env SCOREP_JUPYTER_PERSISTENCE_DIR=path/to/dir
116116
```
117-
To see the detailed report for marshalling steps - `JUMPER_MARSHALLING_DETAILED_REPORT` environment variable can be set.
117+
To see the detailed report for marshalling steps - `SCOREP_JUPYTER_MARSHALLING_DETAILED_REPORT` environment variable can be set.
118118
```
119-
%env JUMPER_MARSHALLING_DETAILED_REPORT=1
119+
%env SCOREP_JUPYTER_MARSHALLING_DETAILED_REPORT=1
120120
```
121-
You can disable visual animations shown during long-running tasks by setting the `JUMPER_DISABLE_PROCESSING_ANIMATIONS` environment variable.
121+
You can disable visual animations shown during long-running tasks by setting the `SCOREP_JUPYTER_DISABLE_PROCESSING_ANIMATIONS` environment variable.
122122
```
123-
%env JUMPER_DISABLE_PROCESSING_ANIMATIONS=1
123+
%env SCOREP_JUPYTER_DISABLE_PROCESSING_ANIMATIONS=1
124124
```
125125

126126
`%%execute_with_scorep`
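The README hunk above renames three user-facing environment variables. As a minimal sketch of how an existing setup could be carried over, the snippet below copies values from the old names to the new ones. `RENAMED_ENV_VARS` and `migrate_env` are hypothetical names introduced here for illustration; they are not part of scorep_jupyter.

```python
# Old -> new variable names, taken from the README changes in this commit.
RENAMED_ENV_VARS = {
    "SCOREP_KERNEL_PERSISTENCE_DIR": "SCOREP_JUPYTER_PERSISTENCE_DIR",
    "JUMPER_MARSHALLING_DETAILED_REPORT": "SCOREP_JUPYTER_MARSHALLING_DETAILED_REPORT",
    "JUMPER_DISABLE_PROCESSING_ANIMATIONS": "SCOREP_JUPYTER_DISABLE_PROCESSING_ANIMATIONS",
}


def migrate_env(env):
    """Hypothetical helper: copy old-name values to the new names, in place.

    A new name that is already set is left untouched.
    Returns the list of new names that were filled in.
    """
    migrated = []
    for old, new in RENAMED_ENV_VARS.items():
        if old in env and new not in env:
            env[new] = env[old]
            migrated.append(new)
    return migrated
```

Calling `migrate_env(os.environ)` before the kernel reads its settings would apply the mapping to the current process; the old names are deliberately not deleted so that older tooling keeps working.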
@@ -204,7 +204,7 @@ Similar yields for cloudpickle. Use the `%%marshalling_settings` magic command t
204204
When dealing with big data structures, there might be a big runtime overhead at the beginning and the end of a Score-P cell. This is due to additional data saving and loading processes for persistency in the background. However this does not affect the actual user code and the Score-P measurements.
205205

206206
## Logging Configuration
207-
To adjust logging and obtain more detailed output about the behavior of the JUmPER kernel, refer to the `src/logging_config.py` file.
207+
To adjust logging and obtain more detailed output about the behavior of the scorep_jupyter kernel, refer to the `src/logging_config.py` file.
208208

209209
This file contains configuration options for controlling the verbosity, format, and destination of log messages. You can customize it to suit your debugging needs.
210210

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ dependencies = [
2424
"jupyter-client",
2525
"astunparse",
2626
"dill",
27+
"nbformat",
2728
"scorep"
2829
]
2930
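The commit message says the tests now parse and execute notebook cells via `check_from_notebook()`, and this hunk adds `nbformat` as a dependency. The implementation of `check_from_notebook()` is not shown here, so the following is only a rough sketch of the underlying idea: pull each code cell and its recorded stdout out of a notebook. It assumes the nbformat 4 JSON layout and uses the standard-library `json` module instead of `nbformat`; `code_cells_with_stdout` is a hypothetical name.

```python
import json


def code_cells_with_stdout(notebook_json):
    """Yield (source, expected_stdout) for each code cell of a notebook.

    Assumes nbformat 4 JSON, where "source" and "text" are either a
    string or a list of strings; "".join handles both cases.
    """
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are not executed
        source = "".join(cell.get("source", []))
        stdout = "".join(
            "".join(output.get("text", []))
            for output in cell.get("outputs", [])
            if output.get("output_type") == "stream"
            and output.get("name") == "stdout"
        )
        yield source, stdout
```

A test harness could then send each `source` to the kernel and compare the stream output it receives with `stdout`.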

src/parallel_marshall/parallel_marshall.py

Lines changed: 6 additions & 4 deletions
@@ -10,16 +10,18 @@
1010
# mode is automatically determined by the file object that is passed for
1111
# dumping
1212
mode = ""
13-
backend = str(os.environ.get("scorep_jupyter_PARALLEL_MARSHALL_BACKEND", "dill"))
14-
if os.environ.get("scorep_jupyter_PARALLEL_MARSHALL_NWORKERS"):
13+
backend = str(
14+
os.environ.get("SCOREP_JUPYTER_PARALLEL_MARSHALL_BACKEND", "dill")
15+
)
16+
if os.environ.get("SCOREP_JUPYTER_PARALLEL_MARSHALL_NWORKERS"):
1517
workers = min(
16-
int(os.environ.get("scorep_jupyter_PARALLEL_MARSHALL_NWORKERS")),
18+
int(os.environ.get("SCOREP_JUPYTER_PARALLEL_MARSHALL_NWORKERS")),
1719
multiprocessing.cpu_count(),
1820
multiprocessing.cpu_count(),
1921
)
2022
else:
2123
workers = multiprocessing.cpu_count()
22-
debug = int(os.environ.get("scorep_jupyter_PARALLEL_MARSHALL_DEBUG", 20))
24+
debug = int(os.environ.get("SCOREP_JUPYTER_PARALLEL_MARSHALL_DEBUG", 20))
2325

2426
logger = logging.getLogger(__name__)
2527
logging.basicConfig(filename="parallel_marshall.log", level=logging.INFO)
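The worker-count logic in this hunk can be read as "use the requested number of workers, but never more than the machine has CPUs". A minimal sketch of that rule as a standalone function, assuming the environment is passed in as a mapping (`resolve_workers` is a hypothetical name, not a function in `parallel_marshall.py`):

```python
import multiprocessing


def resolve_workers(env):
    """Return the worker count: the requested value capped at the CPU count.

    Falls back to the CPU count when the variable is unset or empty.
    """
    requested = env.get("SCOREP_JUPYTER_PARALLEL_MARSHALL_NWORKERS")
    cpus = multiprocessing.cpu_count()
    if requested:
        return min(int(requested), cpus)
    return cpus
```

As in the module itself, a non-integer value raises `ValueError` from `int()`; the sketch does not add validation on top of that.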

src/scorep_jupyter/install.py

Lines changed: 7 additions & 1 deletion
@@ -6,7 +6,13 @@
66
from scorep_jupyter.logo import logo_image
77

88
kernel_spec = {
9-
"argv": [sys.executable, "-m", "scorep_jupyter.kernel", "-f", "{connection_file}"],
9+
"argv": [
10+
sys.executable,
11+
"-m",
12+
"scorep_jupyter.kernel",
13+
"-f",
14+
"{connection_file}",
15+
],
1016
"name": "scorep_jupyter",
1117
"display_name": "Score-P_Python",
1218
"language": "python",

src/scorep_jupyter/kernel.py

Lines changed: 29 additions & 23 deletions
@@ -1,5 +1,4 @@
11
import datetime
2-
import json
32
import os
43
import re
54
import selectors
@@ -16,7 +15,10 @@
1615
from scorep_jupyter.userpersistence import PersHelper, scorep_script_name
1716
from scorep_jupyter.userpersistence import magics_cleanup, create_busy_spinner
1817
import importlib
19-
from scorep_jupyter.kernel_messages import KernelErrorCode, KERNEL_ERROR_MESSAGES
18+
from scorep_jupyter.kernel_messages import (
19+
KernelErrorCode,
20+
KERNEL_ERROR_MESSAGES,
21+
)
2022

2123
# import scorep_jupyter.multinode_monitor.slurm_monitor as slurm_monitor
2224

@@ -75,7 +77,7 @@ def __init__(self, **kwargs):
7577

7678
self.scorep_binding_args = []
7779

78-
os.environ["SCOREP_KERNEL_PERSISTENCE_DIR"] = "./"
80+
os.environ["SCOREP_JUPYTER_PERSISTENCE_DIR"] = "./"
7981
self.pershelper = PersHelper("dill", "memory")
8082

8183
self.mode = KernelMode.DEFAULT
@@ -99,7 +101,7 @@ def __init__(self, **kwargs):
99101
self.scorep_python_available_ = False
100102

101103
logging.config.dictConfig(LOGGING)
102-
self.log = logging.getLogger('kernel')
104+
self.log = logging.getLogger("kernel")
103105

104106
def cell_output(self, string, stream="stdout"):
105107
"""
@@ -143,9 +145,7 @@ def marshaller_settings(self, code):
143145
code_parts = code.split("\n", 1)
144146
content = code_parts[1] if len(code_parts) > 1 else ""
145147

146-
marshaller_match = re.search(
147-
r"MARSHALLER=([\w-]+)", content
148-
)
148+
marshaller_match = re.search(r"MARSHALLER=([\w-]+)", content)
149149
mode_match = re.search(r"MODE=([\w-]+)", content)
150150
marshaller = (
151151
marshaller_match.group(1) if marshaller_match else None
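For readers unfamiliar with the `%%marshalling_settings` cell format, the parsing step in this hunk can be sketched in isolation. The two regular expressions are copied from the diff; the wrapper function `parse_marshalling_settings` is a hypothetical name used only for this example.

```python
import re


def parse_marshalling_settings(cell):
    """Return (marshaller, mode) from a %%marshalling_settings cell.

    The first line (the magic itself) is skipped; a missing setting
    is reported as None.
    """
    parts = cell.split("\n", 1)
    content = parts[1] if len(parts) > 1 else ""
    marshaller_match = re.search(r"MARSHALLER=([\w-]+)", content)
    mode_match = re.search(r"MODE=([\w-]+)", content)
    marshaller = marshaller_match.group(1) if marshaller_match else None
    mode = mode_match.group(1) if mode_match else None
    return marshaller, mode
```

For a cell containing `MARSHALLER=cloudpickle` and `MODE=disk` below the magic line, the function returns `("cloudpickle", "disk")`.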
@@ -322,7 +322,8 @@ def start_writefile(self, code):
322322
dedent(
323323
f"""
324324
# This is the automatic conversion of
325-
# Jupyter Notebook -> Python script by scorep_jupyter kernel.
325+
# Jupyter Notebook -> Python script by scorep_jupyter
326+
# kernel.
326327
# Code corresponding to the cells not marked for
327328
# Score-P instrumentation is framed by
328329
# "with scorep.instrumenter.disable()
@@ -568,17 +569,22 @@ async def scorep_execute(
568569

569570
stdout_lock = threading.Lock()
570571
process_busy_spinner = create_busy_spinner(stdout_lock)
571-
process_busy_spinner.start('Process is running...')
572+
process_busy_spinner.start("Process is running...")
572573

573-
multicellmode_timestamps = []
574+
# Due to splitting into scorep-kernel and ipython extension,
575+
# multicell mode is not supported for coarse-grained measurements
576+
# anymore (in the extension) and we do not show the single cells in
577+
# the ipython extension visualizations after executing them with scorep
578+
# however, since we are using scorep anyway, the ipython extension is
579+
# not useful, since we can count hardware counters anyway
580+
# multicellmode_timestamps = []
574581

575582
try:
576-
multicellmode_timestamps = self.read_scorep_process_pipe(
577-
proc, stdout_lock
578-
)
579-
process_busy_spinner.stop('Done.')
583+
# multicellmode_timestamps =
584+
self.read_scorep_process_pipe(proc, stdout_lock)
585+
process_busy_spinner.stop("Done.")
580586
except KeyboardInterrupt:
581-
process_busy_spinner.stop('Kernel interrupted.')
587+
process_busy_spinner.stop("Kernel interrupted.")
582588

583589
# In disk mode, subprocess already terminated
584590
# after dumping persistence to file
@@ -703,12 +709,14 @@ def read_scorep_process_pipe(
703709
sel.unregister(key.fileobj)
704710
continue
705711

706-
decoded_line = line.decode(sys.getdefaultencoding(), errors='ignore')
712+
decoded_line = line.decode(
713+
sys.getdefaultencoding(), errors="ignore"
714+
)
707715

708716
if key.fileobj is proc.stderr:
709717
with stdout_lock:
710-
self.log.warning(f'{decoded_line.strip()}')
711-
elif 'MCM_TS' in decoded_line:
718+
self.log.warning(f"{decoded_line.strip()}")
719+
elif "MCM_TS" in decoded_line:
712720
multicellmode_timestamps.append(decoded_line)
713721
else:
714722
with stdout_lock:
@@ -866,12 +874,10 @@ def log_error(self, code: KernelErrorCode, **kwargs):
866874
mode = self.pershelper.mode
867875
marshaller = self.pershelper.marshaller
868876

869-
template = KERNEL_ERROR_MESSAGES.get(code, "Unknown error. Mode: {mode}, Marshaller: {marshaller}")
870-
message = template.format(
871-
mode=mode,
872-
marshaller=marshaller,
873-
**kwargs
877+
template = KERNEL_ERROR_MESSAGES.get(
878+
code, "Unknown error. Mode: {mode}, Marshaller: {marshaller}"
874879
)
880+
message = template.format(mode=mode, marshaller=marshaller, **kwargs)
875881

876882
self.log.error(message)
877883
self.cell_output("KernelError: " + message, "stderr")
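The lookup-then-format pattern used by `log_error` can be tried out in isolation. The sketch below is an assumption-laden miniature: `ErrorCode`, `MESSAGES` and `render_error` are hypothetical stand-ins for `KernelErrorCode`, `KERNEL_ERROR_MESSAGES` and the formatting part of `log_error`, and `NOT_IN_TABLE` exists only to exercise the fallback template.

```python
from enum import Enum, auto


class ErrorCode(Enum):
    PERSISTENCE_DUMP_FAIL = auto()
    NOT_IN_TABLE = auto()  # hypothetical code without a registered message


MESSAGES = {
    ErrorCode.PERSISTENCE_DUMP_FAIL: (
        "[mode: {mode}] Failed to serialize notebook persistence "
        "({direction}, marshaller: {marshaller})."
    ),
}


def render_error(code, mode, marshaller, **kwargs):
    """Pick the template for `code` and fill in its placeholders."""
    template = MESSAGES.get(
        code, "Unknown error. Mode: {mode}, Marshaller: {marshaller}"
    )
    # str.format ignores unused keyword arguments, so extra context is harmless.
    return template.format(mode=mode, marshaller=marshaller, **kwargs)
```

One consequence of this design is that a template with an extra placeholder such as `{direction}` raises `KeyError` if the caller does not pass that keyword, so callers and templates have to stay in sync.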

src/scorep_jupyter/kernel_messages.py

Lines changed: 6 additions & 3 deletions
@@ -24,12 +24,15 @@ class KernelErrorCode(Enum):
2424
"Failed to set up persistence communication files/pipes "
2525
),
2626
KernelErrorCode.PERSISTENCE_DUMP_FAIL: (
27-
"[mode: {mode}] Failed to serialize notebook persistence ({direction}, marshaller: {marshaller})."
27+
"[mode: {mode}] Failed to serialize notebook persistence "
28+
"({direction}, marshaller: {marshaller})."
2829
),
2930
KernelErrorCode.PERSISTENCE_LOAD_FAIL: (
30-
"[mode: {mode}] Failed to load persistence ({direction}, marshaller: {marshaller})."
31+
"[mode: {mode}] Failed to load persistence "
32+
"({direction}, marshaller: {marshaller})."
3133
),
3234
KernelErrorCode.SCOREP_SUBPROCESS_FAIL: (
33-
"[mode: {mode}] Subprocess terminated unexpectedly. Persistence not recorded (marshaller: {marshaller})."
35+
"[mode: {mode}] Subprocess terminated unexpectedly. "
36+
"Persistence not recorded (marshaller: {marshaller})."
3437
),
3538
}

src/scorep_jupyter/logging_config.py

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,9 @@ def filter(self, record):
6464
"filters": {
6565
"jupyter_filter": {"()": JupyterLogFilter},
6666
"ignore_error_filter": {"()": IgnoreErrorFilter},
67-
"scorep_jupyter_kernel_only_filter": {"()": scorep_jupyterKernelOnlyFilter},
67+
"scorep_jupyter_kernel_only_filter": {
68+
"()": scorep_jupyterKernelOnlyFilter
69+
},
6870
},
6971
"root": {
7072
"handlers": [],

src/scorep_jupyter/userpersistence.py

Lines changed: 35 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,6 @@
1111
import uuid
1212
import importlib
1313

14-
1514
scorep_script_name = "scorep_script.py"
1615

1716

@@ -24,7 +23,7 @@ def __init__(self, marshaller="dill", mode="memory"):
2423
self.subprocess_definitions = ""
2524
self.subprocess_variables = []
2625
self.base_path = Path(
27-
os.environ["SCOREP_KERNEL_PERSISTENCE_DIR"]
26+
os.environ["SCOREP_JUPYTER_PERSISTENCE_DIR"]
2827
) / Path("./kernel_persistence/")
2928
self.paths = {
3029
"jupyter": {"os_environ": "", "sys_path": "", "var": ""},
@@ -195,7 +194,8 @@ def jupyter_update(self, code):
195194
jupyter_update = (
196195
"import sys\n"
197196
"import os\n"
198-
"from scorep_jupyter.userpersistence import load_runtime, load_variables\n"
197+
"from scorep_jupyter.userpersistence import load_runtime, "
198+
"load_variables\n"
199199
f"load_runtime(os.environ, sys.path,"
200200
f"'{self.paths['subprocess']['os_environ']}',"
201201
f"'{self.paths['subprocess']['sys_path']}',{self.marshaller})\n"
@@ -231,7 +231,7 @@ def parse(self, code, mode):
231231

232232
def set_dump_report_level(self):
233233
self.is_dump_detailed_report = int(
234-
os.getenv("scorep_jupyter_MARSHALLING_DETAILED_REPORT", "0")
234+
os.getenv("SCOREP_JUPYTER_MARSHALLING_DETAILED_REPORT", "0")
235235
)
236236

237237

@@ -241,7 +241,9 @@ def dump_runtime(
241241
# Don't dump environment variables set by Score-P bindings.
242242
# Will force it to re-initialize instead of calling reset_preload()
243243
filtered_os_environ_ = {
244-
k: v for k, v in os_environ_.items() if not k.startswith("SCOREP_")
244+
k: v
245+
for k, v in os_environ_.items()
246+
if not k.startswith("SCOREP_") or "SCOREP_JUPYTER" in k
245247
}
246248

247249
with os.fdopen(
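The changed filter condition matters because the kernel's own settings now start with `SCOREP_` as well. Presumably the extra clause is there so that `SCOREP_JUPYTER_*` variables survive the dump while variables set by the Score-P bindings are still dropped. A minimal sketch of the same predicate as a standalone function (`filter_environ` is a hypothetical name):

```python
def filter_environ(environ):
    """Drop Score-P variables but keep the kernel's SCOREP_JUPYTER_* settings."""
    return {
        key: value
        for key, value in environ.items()
        if not key.startswith("SCOREP_") or "SCOREP_JUPYTER" in key
    }
```

Given `PATH`, `SCOREP_ENABLE_TRACING` and `SCOREP_JUPYTER_PERSISTENCE_DIR`, only `SCOREP_ENABLE_TRACING` is removed.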
@@ -271,8 +273,7 @@ def dump_variables(variables_names, globals_, var_dump_, marshaller):
271273
if non_persistent_class in globals().keys():
272274
user_variables[el].__class__ = globals()[non_persistent_class]
273275

274-
with (os.fdopen(os.open(var_dump_, os.O_WRONLY | os.O_CREAT), "wb")
275-
as file):
276+
with os.fdopen(os.open(var_dump_, os.O_WRONLY | os.O_CREAT), "wb") as file:
276277
marshaller.dump(user_variables, file)
277278

278279

@@ -369,36 +370,41 @@ def magics_cleanup(code):
369370
"""
370371
lines = code.splitlines(True)
371372
scorep_env = []
372-
373+
373374
# Cell magics that should skip entire cell content for persistence
374375
non_persistent_cell_magics = ["%%bash"] # Non-Python content
375376
# Cell magics that should keep Python content but skip magic line
376377
python_cell_magics = ["%%prun", "%%capture"]
377378
whitelist_prefixes_line = ["%prun", "%time"]
378-
379+
379380
# Check if this is a cell magic
380381
if lines and lines[0].strip().startswith("%%"):
381382
first_line = lines[0].strip()
382-
if any(first_line.startswith(prefix) for prefix in non_persistent_cell_magics):
383+
if any(
384+
first_line.startswith(prefix)
385+
for prefix in non_persistent_cell_magics
386+
):
383387
# For non-Python cell magics like %%bash
384388
# Skip the entire cell content for persistence
385389
return scorep_env, ""
386-
elif any(first_line.startswith(prefix) for prefix in python_cell_magics):
390+
elif any(
391+
first_line.startswith(prefix) for prefix in python_cell_magics
392+
):
387393
# For Python cell magics like %%prun, %%capture
388394
# Skip only the magic line, keep the Python content for persistence
389395
filtered_lines = lines[1:] # Skip first line (the magic)
390396
return scorep_env, "".join(filtered_lines)
391-
397+
392398
# Process line by line for non-cell magics or non-whitelisted cell magics
393399
filtered_lines = []
394-
400+
395401
for line in lines:
396402
stripped_line = line.strip()
397-
403+
398404
# Keep empty lines and comments
399405
if not stripped_line or stripped_line.startswith("#"):
400406
filtered_lines.append(line)
401-
407+
402408
# Handle %env specially
403409
elif stripped_line.startswith("%env"):
404410
env_var = stripped_line.split(" ", 1)[1]
@@ -410,22 +416,27 @@ def magics_cleanup(code):
410416
filtered_lines.append(f'os.environ["{key}"]="{val}"\n')
411417
else:
412418
key = env_var
413-
filtered_lines.append(f"print(\"env: {key}=os.environ['{key}']\")\n")
414-
419+
filtered_lines.append(
420+
f"print(\"env: {key}=os.environ['{key}']\")\n"
421+
)
422+
415423
# Handle whitelisted line magics - keep the command part
416-
elif any(stripped_line.startswith(prefix) for prefix in whitelist_prefixes_line):
424+
elif any(
425+
stripped_line.startswith(prefix)
426+
for prefix in whitelist_prefixes_line
427+
):
417428
parts = line.split(" ", 1)
418429
if len(parts) > 1:
419430
filtered_lines.append(parts[1])
420-
431+
421432
# Remove all other magic commands and shell commands
422433
elif stripped_line.startswith("%") or stripped_line.startswith("!"):
423434
continue
424-
435+
425436
# Keep regular Python code
426437
else:
427438
filtered_lines.append(line)
428-
439+
429440
nomagic_code = "".join(filtered_lines)
430441
return scorep_env, nomagic_code
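The line-by-line part of `magics_cleanup` can be illustrated with a reduced sketch. It is not the function from the diff: it leaves out the cell-magic branch and the `scorep_env` bookkeeping, handles only the `%env KEY=VALUE` form, and uses the hypothetical name `strip_line_magics`.

```python
def strip_line_magics(code, whitelist=("%prun", "%time")):
    """Return `code` with IPython line magics and shell commands removed.

    - comments and empty lines are kept
    - "%env KEY=VALUE" becomes an os.environ assignment
    - whitelisted magics keep the statement that follows them
    - every other line starting with "%" or "!" is dropped
    """
    kept = []
    for line in code.splitlines(True):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)
        elif stripped.startswith("%env"):
            parts = stripped.split(None, 1)
            if len(parts) == 2 and "=" in parts[1]:
                key, val = parts[1].split("=", 1)
                kept.append(f'os.environ["{key.strip()}"]="{val.strip()}"\n')
        elif any(stripped.startswith(prefix) for prefix in whitelist):
            parts = line.split(" ", 1)
            if len(parts) > 1:
                kept.append(parts[1])
        elif stripped.startswith(("%", "!")):
            continue
        else:
            kept.append(line)
    return "".join(kept)
```

The order of the branches matters: `%env` and the whitelisted prefixes must be checked before the generic `%` rule, otherwise they would be dropped with all other magics.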
431442

@@ -488,7 +499,9 @@ def stop(self, done_message="Done."):
488499

489500

490501
def create_busy_spinner(lock=None):
491-
is_enabled = os.getenv("scorep_jupyter_DISABLE_PROCESSING_ANIMATIONS") != "1"
502+
is_enabled = (
503+
os.getenv("SCOREP_JUPYTER_DISABLE_PROCESSING_ANIMATIONS") != "1"
504+
)
492505
if is_enabled:
493506
return BusySpinner(lock)
494507
else:
