Commit 8a7cf94

testing added, readmes updated

Committed by DocMinus
1 parent 8ad2a8f commit 8a7cf94

5 files changed

Lines changed: 122 additions & 32 deletions

README.md

Lines changed: 23 additions & 17 deletions
@@ -3,36 +3,42 @@
 
 # Reaction Transform descriptors
 Python code to calculate reaction transform descriptors as described in [CHEMRXIV](https://chemrxiv.org/engage/chemrxiv/article-details/649888d41dcbb92a5e8e3475), by [@DocMinus](https://github.com/docminus) and [@DrAlatriste](https://github.com/DrAlatriste). <br>
-Not a full fledged package, some scripting know-how necessary to use or incorporate in own code might be necessary.
 
 ## Installation
 See _environment_ folder.
+Updated the installation with a setup file to enable the tools to be part of one's Python environment. Testing has also been added.
 
-## Usage
-Run the provided script by providing a file with tab/semicolon separated data (also comma or space, though not recommended):<br>
-`python 2AB_reaction_TDs.py path/inputfilename`<br>
+## Example Usage
+Run the example script by providing a file with tab/semicolon separated data (also comma or space, though not recommended):
+```shell
+python AB2C_reaction_TDs_example.py inputfilename
+```
 <br>
-You can get help by calling the script using -h: `python 2AB_reaction_TDs.py -h` <br>
+You can get help by calling the script using -h: `python AB2C_reaction_TDs_example.py -h` <br>
 <br>
-This particular script uses fileformat<br>
-_ID reactant1 reactant2 product_<br>
+This particular script expects the columns of the input file in the order<br>
+
+_ID reactant1 reactant2 product_ <br>
 <br>
-The script will provide a simple cleaning of the structures; "extreme" broken structures might not get fixed with the provided method.<br>
+Simple cleaning of structures is included; "extreme" broken structures might not get fixed with the provided method.
 <br>
-Two small test-sets are provided with made up reactions, one of them containing a "faulty" structure to demonstrate correct filtration in the end result. Alternatively, run the _test.py_ script (see below)<br>
+Two small test sets are provided with made-up reactions, one of them containing a "faulty" structure to demonstrate correct filtering in the output. <br>Execute via: `python AB2C_reaction_TDs_example.py ./datasets/testreactions.tsv`<br>
 
 ## Syntax
 If you only want to use the TD function, your script requires the following minimum lines with the smiles as string tuples (even if only a single reaction):
+```python
+from td_tools.rxntools import transform_descriptors
+
+output_table = transform_descriptors(['smiles_reactant1'],['smiles_reactant2'],['product'])
 ```
-from td_tools.rxntools import transform_descriptors
-
-output_table = transform_descriptors(['smiles_reactant1'],['smiles_reactant2'],['product'])
+A cleaning function as well as a file reader function is included for larger datasets:
+```python
+from td_tools.rxntools import clean_smiles_multi, read_rct2pd
 ```
-A cleaning function as well as a file reader function is included for larger datasets.<br>
-Provided scripts include examples on how to concatenate the structures versus the TDs.<br>
-<br>
-For quick testing and timing use `Python test.py`.<br>
-Not a pytest package, but it nevertheless does the trick for quick demonstrating/testing.<br>
+The provided script includes examples of how to concatenate the structures with the TDs.<br>
+
+## Testing
+Python testing has been added in place of the previous test.py; see the README.md under /tests.<br>
 <br>
 
 ### Acknowledgments
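
The input layout described in this README (one reaction per row, columns _ID reactant1 reactant2 product_, tab or semicolon separated) can be illustrated with a small standard-library sketch. This is not the repository's `read_rct2pd`; the function name `read_reactions`, the assumption that the file has no header row, and the sample rows are all hypothetical and only meant to show the expected column order.

```python
import csv
import io


def read_reactions(text, delimiters="\t;"):
    """Parse rows of ID, reactant1, reactant2, product from delimited text.

    Hypothetical helper for illustration only; assumes no header row.
    """
    if not text.strip():
        return []
    first_line = text.splitlines()[0]
    # Pick whichever candidate delimiter occurs most often in the first line.
    delimiter = max(delimiters, key=first_line.count)
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(fields) != 4:
            continue  # skip rows that do not match the four-column layout
        rxn_id, reactant1, reactant2, product = (f.strip() for f in fields)
        rows.append(
            {
                "ID": rxn_id,
                "reactant1": reactant1,
                "reactant2": reactant2,
                "product": product,
            }
        )
    return rows


# Made-up sample data in both recommended separators.
sample_tsv = "rxn1\tCCCN\tCCCCO\tCCCOCCCC\nrxn2\tCCO\tCCBr\tCCOCC\n"
sample_semicolon = "rxn1;CCCN;CCCCO;CCCOCCCC\n"

parsed_tsv = read_reactions(sample_tsv)
parsed_semicolon = read_reactions(sample_semicolon)
```

The three SMILES columns of each parsed row are what would then be handed, as lists, to the cleaning and descriptor functions shown in the Syntax section.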

environment/README.md

Lines changed: 25 additions & 15 deletions
@@ -2,23 +2,33 @@
 
 ## Requirements
 Python >= 3.9 is required to use the [modern style](https://peps.python.org/pep-0585/) of type annotations.<br>
-Recommended: 3.11 (due to increased performance over versions <=3.10)<br>
-Modules required are sort of standard for chemistry scripting, rdkit, pandas & numpy, the latter two are nowadays part of a standard conda install.
+Recommended: 3.11 (due to increased performance over earlier versions)<br>
+Modules required are sort of standard for chemistry scripting, rdkit, pandas & numpy, the latter two are nowadays part of a standard conda install.
 
 
-## Installation with Anaconda/Miniconda
-If you nevertheless want a separate environment:<br>
-Run the two commands from the root directory.
+## Installation
+1. Anaconda/Miniconda
+    If you nevertheless want a separate environment:<br>
+    Run the two commands from the root directory.
 
-```shell
-conda env create -f ./environment/conda.yaml
-conda activate rxn_tds
-```
+    ```shell
+    conda env create -f ./environment/conda.yaml
+    conda activate rxn_tds
+    ```
 
-## Installation with Pip
-If you already have an environment you want to add this into, then:<br>
-Run the command from the root directory
+1b. (alternatively) Venv
+Note that venv would also work if you prefer that.
 
-```shell
-python -m pip install -r ./environment/requirements.txt
-```
+2. Pip
+Now install the requirements with pip into this new environment or into any that you already have.<br>
+Run the commands from the root directory
+
+```shell
+pip install -r ./environment/requirements.txt
+pip install .
+```
+
+The latter installs the rxn_tools into the environment. The example script would work without it, but testing requires it.
+
+## Running Tests
+`pytest` is available for testing. See the README.md in /tests.

tests/README.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+## Running Tests
+`pytest` is available for testing. Follow these steps:
+1. Ensure you have installed the project dependencies, as described in the Installation section.
+2. Install the testing dependencies:
+```bash
+pip install pytest
+```
+followed by
+```bash
+pytest
+```
+
+This command will discover and run all the test cases in the `tests/` directory.

tests/__init.py__

Whitespace-only changes.

tests/test_all.py

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+#!/usr/bin/env python
+# coding: utf-8
+""" test script, creating artificial data and testing the TDs calculation for reactions
+only tests the combination and final outcome, not the individual functions
+2024-02-22; DocMinus
+"""
+
+import pandas as pd
+import pytest
+
+from td_tools.rxntools import clean_smiles_multi, transform_descriptors
+
+
+def test_clean_smiles_multi_and_transform_descriptors():
+    dataset_size = 4  # number of compounds
+    # we define some faulty/missing compounds, then the output table should have 3 rows fewer than the input table
+    reactant1 = ["CCCN" for _ in range(dataset_size - 1)]
+    reactant1.append("cc")  # incorrect structure
+    reactant1.append("CCO")
+    reactant1.append("CCCl")
+    reactant1.append("CCCl")
+    total_dataset_size = len(reactant1)
+
+    reactant2 = ["CCCCO" for _ in range(dataset_size)]
+    reactant2.append("CC")
+    reactant2.append("CCO")
+    reactant2.append("CCBr")
+
+    product = ["ClCC1=C(B)C(P)=CC(Br)=C1O" for _ in range(dataset_size)]
+    product.append("cc")  # incorrect structure
+    product.append("")  # missing structure
+    product.append("CCI")
+
+    """ a total of 3 faulty rows; counted from the bottom of the created table, these are the 2nd, 3rd and 4th last rows."""
+
+    g0 = clean_smiles_multi(reactant1)
+    g1 = clean_smiles_multi(reactant2)
+    g2 = clean_smiles_multi(product)
+
+    TD_numbers = transform_descriptors(g0, g1, g2)
+    print(f"{TD_numbers.shape = }, {TD_numbers.shape[1] = }")
+
+    final_table = pd.DataFrame({"Compound 1": g0, "Compound 2": g1, "Product": g2})
+    final_table = final_table[~((final_table.iloc[:, :3] == "").any(axis=1))]
+    final_table = pd.concat([final_table, TD_numbers], axis=1, join="inner")
+
+    # check if the final table is as expected (3 rows fewer than the input table)
+    assert (
+        final_table.shape[0] == total_dataset_size - 3
+    ), "Number of rows in the final table is not as expected."
+
+    # Check that the 2nd, 3rd, and 4th last rows have been removed
+    removed_indices = [
+        total_dataset_size - 2,
+        total_dataset_size - 3,
+        total_dataset_size - 4,
+    ]
+    for index in removed_indices:
+        assert (
+            index not in final_table.index
+        ), f"Row {index} should have been removed but is still in the final table."
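
The row bookkeeping in this test can be checked by hand with a standard-library sketch. It rebuilds the same three input lists and works out which row indices contain a faulty or missing structure. The `fake_clean` function is a hypothetical stand-in, not the `td_tools` implementation: it assumes that cleaning turns an unparsable or missing SMILES into an empty string, which is what the `== ""` filter in the test suggests.

```python
# Hypothetical stand-in for clean_smiles_multi (assumption: invalid or missing
# SMILES come back as empty strings). Not the td_tools implementation.
INVALID = {"cc", ""}


def fake_clean(smiles_list):
    return ["" if smiles in INVALID else smiles for smiles in smiles_list]


dataset_size = 4

# Same inputs as in the test above.
reactant1 = ["CCCN"] * (dataset_size - 1) + ["cc", "CCO", "CCCl", "CCCl"]
reactant2 = ["CCCCO"] * dataset_size + ["CC", "CCO", "CCBr"]
product = ["ClCC1=C(B)C(P)=CC(Br)=C1O"] * dataset_size + ["cc", "", "CCI"]

# One tuple per reaction row: (reactant1, reactant2, product) after cleaning.
rows = list(zip(fake_clean(reactant1), fake_clean(reactant2), fake_clean(product)))

# A row is faulty if any of its three structures is empty after cleaning.
faulty = [i for i, row in enumerate(rows) if "" in row]
kept = [i for i in range(len(rows)) if i not in faulty]

print(f"total rows: {len(rows)}, faulty: {faulty}, kept: {kept}")
```

Under that assumption the faulty indices are 3, 4 and 5 out of 7 rows, which matches the `total_dataset_size - 4`, `- 3` and `- 2` entries in `removed_indices`.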
