padelpy2 is a Python wrapper for the PaDEL-Descriptor software, enabling fast and flexible calculation of molecular descriptors and fingerprints directly from Python.
📚 Documentation: For full API usage details, see the padelpy2 documentation page.
- Compute 2D and 3D molecular descriptors and fingerprints using PaDEL-Descriptor
- Simple, object-oriented API for descriptor/fingerprint selection and calculation
- Native support for RDKit molecules
- Highly configurable (multithreading, 3D conversion, custom descriptor sets, etc.)
- Returns results as pandas DataFrames for easy downstream analysis
pip install padelpy2
If you do not already have RDKit installed, you can install the PyPI build (version 2022.9.5) with:
pip install padelpy2[rdkit]
Note: The PyPI build of RDKit (rdkit-pypi==2022.9.5) is limited and may not support all features or platforms. For best compatibility and performance, it is strongly recommended to install RDKit via conda or your preferred package manager.
git clone https://github.com/cognitive-chemistry-labs/padelpy2
cd padelpy2
pip install .
Requirements:
- Python 3.9–3.13
  *If installing RDKit from pip, only Python 3.9–3.11 are supported. For Python 3.12+, use conda or another supported method.
- RDKit (install via conda or pip; see note above)
- pandas
- Java Runtime Environment (JRE) 6 or higher must be installed and available on your system PATH. PaDEL-Descriptor is a Java application and will not run without Java. You can download Java from Oracle.
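The requirements above can be checked before running any calculation. The following is a minimal sketch using only the Python standard library; the helper names `rdkit_pip_supported` and `find_java` are hypothetical and are not part of padelpy2.

```python
import shutil
import sys


def rdkit_pip_supported(version=None):
    """Return True if the Python version falls in the 3.9-3.11 range
    that the note above gives for the PyPI build of RDKit."""
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and 9 <= minor <= 11


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


java_path = find_java()
if java_path is None:
    print("Java not found on PATH; PaDEL-Descriptor will not run.")
else:
    print(f"Java found at: {java_path}")

if not rdkit_pip_supported():
    print("This Python is outside 3.9-3.11; install RDKit via conda instead of pip.")
```

This only confirms that a `java` executable is discoverable; it does not verify the Java version.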
from rdkit import Chem
from padelpy2 import Calculator, descriptors
# Example molecules (SMILES)
smiles = [
    "CN=C=O",
    "CC(=O)NCCC1=CNc2c1cc(OC)cc2",
    "OCCc1c(C)[n+](cs1)Cc2cnc(C)nc2N",
]
mols = [Chem.AddHs(Chem.MolFromSmiles(smi)) for smi in smiles]
# Calculate all available descriptors
calc = Calculator(descriptors)
results = calc(mols)
print(results)
# Calculate only 2D descriptors
from padelpy2 import Calculator, descriptors_2d
calc = Calculator(descriptors_2d)
results = calc(mols)
# Calculate only 3D descriptors
from padelpy2 import Calculator, descriptors_3d
calc = Calculator(descriptors_3d)
results = calc(mols)
# Calculate all fingerprints
from padelpy2 import Calculator, fingerprints
calc = Calculator(fingerprints)
results = calc(mols)
# Calculate specific descriptors
from padelpy2.descriptors import MolecularWeight, XLogP
calc = Calculator([MolecularWeight, XLogP])
results = calc(mols)
# Calculate a specific fingerprint
from padelpy2.fingerprints import MACCSFingerprinter
calc = Calculator([MACCSFingerprinter])
results = calc(mols)
Calculator is the main interface for descriptor/fingerprint calculation.
Calculator(descriptors: Iterable[Descriptor or Fingerprint], config: PaDELConfig = None)
- descriptors: List of descriptor/fingerprint objects (see below)
- config: Optional configuration (threads, 3D conversion, etc.)
results = calc(mols)
- mols: List of RDKit Mol objects
- Returns: pandas DataFrame
Predefined descriptor and fingerprint sets:
- descriptors: All available descriptors (2D and 3D)
- descriptors_2d: Only 2D descriptors
- descriptors_3d: Only 3D descriptors
- fingerprints: All available fingerprints
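Because the calculator returns a pandas DataFrame, a common next step is to tidy the table before modelling. The sketch below is one possible cleanup, not something padelpy2 provides: the function name `clean_descriptor_table` is hypothetical, and the small DataFrame is a hand-made stand-in with made-up values rather than real output of `calc(mols)`.

```python
import pandas as pd


def clean_descriptor_table(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce every column to numeric, then drop empty and constant columns."""
    # Non-numeric entries (e.g. empty strings) become NaN
    numeric = df.apply(pd.to_numeric, errors="coerce")
    # Remove columns where no molecule has a value
    numeric = numeric.dropna(axis=1, how="all")
    # Keep only columns that take more than one distinct value
    varying = numeric.nunique(dropna=True) > 1
    return numeric.loc[:, varying]


# Hypothetical stand-in for a descriptor table (values are illustrative only)
example = pd.DataFrame(
    {
        "MW": [57.02, 232.12, 265.11],
        "XLogP": ["0.8", "", "1.2"],
        "nAcid": [0, 0, 0],
        "Empty": [None, None, None],
    }
)

cleaned = clean_descriptor_table(example)
print(cleaned.columns.tolist())
```

Dropping constant columns is a design choice that suits many modelling workflows; keep them if a descriptor being constant is itself informative for your dataset.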
from padelpy2 import PaDELConfig
config = PaDELConfig(threads=4, convert3d=True)
calc = Calculator(descriptors, config=config)
For advanced use cases, you can call the low-level PaDEL-Descriptor wrapper directly. This allows you to execute the underlying Java tool with custom arguments and file-based workflows.
from padelpy2.wrapper import padeldescriptor
# Calculate 2D descriptors for a directory of structure files (e.g., SDF or MOL)
output_csv = padeldescriptor(
    d_2d=True,
    mol_dir="/path/to/structures/",  # directory or file with molecules
    d_file="/path/to/output.csv",    # output CSV file
    threads=4,                       # number of threads
    headless=True,                   # run in headless mode (no GUI)
)
print(f"Results written to: {output_csv}")
Arguments:
- mol_dir: Path to a directory or file containing molecular structures (SDF, MOL, etc.)
- d_file: Output file for descriptors (CSV)
- d_2d, d_3d, fingerprints: Enable calculation of 2D descriptors, 3D descriptors, or fingerprints
- threads: Number of threads to use
- config, descriptortypes: Optional config or descriptor type files
- convert3d, removesalt, retainorder, etc.: Advanced options (see docstring in padelpy2/wrapper.py)
- use_tempfile: If True and d_file is not set, a temporary file is used for output
Returns the path to the output file, or raises an error if the calculation fails.
See the function docstring in padelpy2/wrapper.py for a full list of options and details.
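Since the wrapper returns the path of a CSV file, the results can be loaded without any extra dependencies. The following is a minimal sketch using the standard library; it assumes the output has a header row whose first column is named `Name` (the usual PaDEL-Descriptor layout, but verify against your own file). The helper names `read_descriptor_csv` and `to_float` are hypothetical, and the sample file is written by the example itself rather than produced by PaDEL.

```python
import csv
import math
import os
import tempfile


def to_float(text):
    """Convert a CSV cell to float; blank or non-numeric cells become NaN."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def read_descriptor_csv(path):
    """Map each molecule name to a dict of {descriptor name: float value}."""
    table = {}
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            name = row.pop("Name")  # assumed identifier column
            table[name] = {key: to_float(value) for key, value in row.items()}
    return table


# Illustrative stand-in for a descriptor output file (made-up values)
sample = "Name,MW,XLogP\nmol_1,57.02,\nmol_2,232.12,1.5\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write(sample)

parsed = read_descriptor_csv(tmp.name)
os.remove(tmp.name)
print(parsed["mol_2"])
```

In practice you would pass the path returned by `padeldescriptor(...)` in place of the temporary file.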
For more examples, see the examples/ directory.