|
| 1 | +# coding: utf-8 |
| 2 | + |
| 3 | +# # jobflow |
| 4 | + |
| 5 | +# ## Define workflow with jobflow |
| 6 | +# |
| 7 | +# This tutorial demonstrates how to use the Python Workflow Definition (PWD) with `jobflow` and how to load the resulting workflow with `aiida` and `pyiron`. |
| 8 | +# |
| 9 | +# [`jobflow`](https://joss.theoj.org/papers/10.21105/joss.05995) was developed to simplify the development of high-throughput workflows. It uses a decorator-based approach to define the “Job”s that can be connected to form complex workflows (“Flow”s). `jobflow` is the workflow language of the workflow library [`atomate2`](https://chemrxiv.org/engage/chemrxiv/article-details/678e76a16dde43c9085c75e9), designed to replace [atomate](https://www.sciencedirect.com/science/article/pii/S0927025617303919), which was central to the development of the [Materials Project](https://pubs.aip.org/aip/apm/article/1/1/011002/119685/Commentary-The-Materials-Project-A-materials) database. |
| 10 | + |
| 11 | +# First, we import the `job` decorator and the `Flow` class from jobflow, as well as the necessary modules from the Python Workflow Definition and the example arithmetic workflow. |
| 12 | + |
| 13 | +# In[1]: |
| 14 | + |
| 15 | + |
| 16 | +from jobflow import job, Flow |
| 17 | + |
| 18 | + |
| 19 | +# In[2]: |
| 20 | + |
| 21 | + |
| 22 | +from python_workflow_definition.jobflow import write_workflow_json |
| 23 | + |
| 24 | + |
| 25 | +# In[3]: |
| 26 | + |
| 27 | + |
| 28 | +from workflow import ( |
| 29 | + get_sum as _get_sum, |
| 30 | + get_prod_and_div as _get_prod_and_div, |
| 31 | + get_square as _get_square, |
| 32 | +) |
| 33 | + |
| 34 | + |
| 35 | +# Using the `job` decorator, the imported functions from the arithmetic workflow are transformed into jobflow “Job”s. These “Job”s delay the execution of Python functions and can be chained into workflows (“Flow”s). A “Job” can return serializable outputs (e.g., a number, a dictionary, or a Pydantic model) or a so-called “Response” object, which enables the execution of dynamic workflows where the number of nodes is not known prior to the workflow’s execution. |
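The idea of delaying execution can be illustrated without jobflow. The following is a minimal sketch in plain Python, not jobflow's implementation: `Delayed`, `delay`, `add`, and `square` are hypothetical names introduced only for this illustration. Calling a decorated function records the call instead of running it, and the recorded calls are evaluated only when the final node is resolved.

```python
# Illustrative sketch only: `Delayed` and `delay` are hypothetical names,
# they are NOT part of jobflow and do not mirror its internals.
from functools import wraps


class Delayed:
    """Record a function call without running it."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def resolve(self):
        """Resolve upstream calls first, then run this call."""
        args = [a.resolve() if isinstance(a, Delayed) else a for a in self.args]
        kwargs = {
            k: v.resolve() if isinstance(v, Delayed) else v
            for k, v in self.kwargs.items()
        }
        return self.func(*args, **kwargs)


def delay(func):
    """Turn a function into one that returns a Delayed node when called."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        return Delayed(func, args, kwargs)

    return wrapper


@delay
def add(x, y):
    return x + y


@delay
def square(x):
    return x**2


node = square(x=add(x=1, y=2))  # nothing has been computed yet
value = node.resolve()  # runs add first, then square
```

jobflow additionally tracks unique job identifiers and stores outputs in a database, which this sketch deliberately leaves out.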
| 36 | + |
| 37 | +# In[4]: |
| 38 | + |
| 39 | + |
| 40 | +workflow_json_filename = "jobflow_simple.json" |
| 41 | + |
| 42 | + |
| 43 | +# In[5]: |
| 44 | + |
| 45 | + |
| 46 | +get_sum = job(_get_sum) |
| 47 | +# Note: the outputs could also be transferred to an additional data store: get_prod_and_div = job(_get_prod_and_div, data=["prod", "div"]) |
| 48 | +# When converting from the general workflow definition to the jobflow definition, this is done automatically to avoid overflowing the database. |
| 49 | +get_prod_and_div = job(_get_prod_and_div) |
| 50 | +get_square = job(_get_square) |
| 51 | + |
| 52 | + |
| 53 | +# In[6]: |
| 54 | + |
| 55 | + |
| 56 | +prod_and_div = get_prod_and_div(x=1, y=2) |
| 57 | + |
| 58 | + |
| 59 | +# In[7]: |
| 60 | + |
| 61 | + |
| 62 | +tmp_sum = get_sum(x=prod_and_div.output.prod, y=prod_and_div.output.div) |
| 63 | + |
| 64 | + |
| 65 | +# In[8]: |
| 66 | + |
| 67 | + |
| 68 | +result = get_square(x=tmp_sum.output) |
| 69 | + |
| 70 | + |
| 71 | +# In[9]: |
| 72 | + |
| 73 | + |
| 74 | +flow = Flow([prod_and_div, tmp_sum, result]) |
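For orientation, the three connected jobs correspond to the plain-Python evaluation sketched below. The tutorial imports the arithmetic functions from `workflow` without showing their source, so the function bodies here are assumptions inferred from the names and from the `prod` and `div` outputs used above. The `ref_` names are hypothetical stand-ins chosen so that nothing in the tutorial is overwritten.

```python
# Assumed stand-ins for the functions imported from `workflow`;
# their real bodies are not shown in this tutorial.
def ref_prod_and_div(x, y):
    # Assumption: returns a dict with the keys used as `.output.prod` / `.output.div`.
    return {"prod": x * y, "div": x / y}


def ref_sum(x, y):
    return x + y


def ref_square(x):
    return x**2


# Same wiring as the Flow: product and quotient -> sum -> square.
ref_out = ref_prod_and_div(x=1, y=2)
ref_total = ref_sum(x=ref_out["prod"], y=ref_out["div"])
ref_result = ref_square(x=ref_total)
print(ref_result)
```

Under these assumed definitions, the inputs `x=1` and `y=2` give a product of 2 and a quotient of 0.5, so the final value is the square of 2.5.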
| 75 | + |
| 76 | + |
| 77 | +# As jobflow itself is only a workflow language, the workflows are typically executed on high-performance computers with a workflow manager such as [Fireworks](https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.3505) or [jobflow-remote](https://github.com/Matgenix/jobflow-remote). For smaller and test workflows, simple linear, non-parallel execution of the workflow graph can be performed with jobflow itself. All outputs of individual jobs are saved in a database. For high-throughput applications, a MongoDB database is typically used. For testing and smaller workflows, an in-memory database can be used instead. |
| 78 | + |
| 79 | +# In[10]: |
| 80 | + |
| 81 | + |
| 82 | +write_workflow_json(flow=flow, file_name=workflow_json_filename) |
| 83 | + |
| 84 | + |
| 85 | +# In[11]: |
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | + |
| 90 | +# Finally, you can write the workflow data into a JSON file to be imported later. |
| 91 | + |
| 92 | +# ## Load Workflow with aiida |
| 93 | +# |
| 94 | +# In this part, we will demonstrate how to import the `jobflow` workflow into `aiida` via the PWD. |
| 95 | + |
| 96 | +# In[12]: |
| 97 | + |
| 98 | + |
| 99 | +from aiida import load_profile |
| 100 | + |
| 101 | +load_profile() |
| 102 | + |
| 103 | + |
| 104 | +# In[13]: |
| 105 | + |
| 106 | + |
| 107 | +from python_workflow_definition.aiida import load_workflow_json |
| 108 | + |
| 109 | + |
| 110 | +# We import the necessary modules from `aiida` and the PWD, as well as the workflow JSON file. |
| 111 | + |
| 112 | +# In[14]: |
| 113 | + |
| 114 | +wg = load_workflow_json(file_name=workflow_json_filename) |
| 115 | + |
| 116 | +wg |
| 117 | + |
| 118 | + |
| 119 | +# Finally, we are now able to run the workflow with `aiida`. |
| 120 | + |
| 121 | +# In[15]: |
| 122 | + |
| 123 | + |
| 124 | +wg.run() |
| 125 | + |
| 126 | + |
| 127 | +# ## Load Workflow with pyiron_base |
| 128 | +# |
| 129 | +# In this part, we will demonstrate how to import the `jobflow` workflow into `pyiron` via the PWD. |
| 130 | + |
| 131 | +# In[16]: |
| 132 | + |
| 133 | + |
| 134 | +from python_workflow_definition.pyiron_base import load_workflow_json |
| 135 | + |
| 136 | + |
| 137 | +# In[17]: |
| 138 | + |
| 139 | + |
| 140 | +delayed_object_lst = load_workflow_json(file_name=workflow_json_filename) |
| 141 | +delayed_object_lst[-1].draw() |
| 142 | + |
| 143 | + |
| 144 | +# In[18]: |
| 145 | + |
| 146 | + |
| 147 | +delayed_object_lst[-1].pull() |
| 148 | + |
| 149 | + |
| 150 | +# Here, the procedure is the same as before: import the necessary `pyiron_base` module from the PWD, load the workflow JSON file, and run the workflow with pyiron. |
| 151 | + |
| 152 | +# In[ ]: |
| 153 | + |
| 154 | + |
| 155 | + |
| 156 | + |