Q-SPOC is a comprehensive Python toolkit for computing quantum and physicochemical molecular descriptors using semiempirical quantum chemical calculations. It provides a streamlined workflow for generating a wide range of descriptors that are essential for quantitative structure-activity relationship (QSAR) studies, machine learning applications, and computational chemistry research.
Q-SPOC offers the following capabilities:
- CDFT Descriptors: Vertical ionization potential (VIP), vertical electron affinity (VEA), Fukui functions, and dual descriptors
- ESP Descriptors: Electrostatic potential maps, molecular polarity index (MPI), charge balance, surface area analysis
- Shape Descriptors: Molecular volume, sphericity, and surface area
- Orbital Descriptors: HOMO/LUMO energies, energy gaps, orbital density indices
- Internal Coordinates: Bond lengths, angles, and dihedral angles
- Atom Descriptors: CM5 atomic charges
- Bond Descriptors: Wiberg bond orders and bond dipoles
- Steric Parameters: Steric hindrance and van der Waals radius-based descriptors
- xtb (GFN-xTB): Fast semiempirical quantum chemistry calculations
- Gaussian16: High-accuracy DFT and ab initio calculations
- Multiwfn: Comprehensive wavefunction analysis and descriptor calculation
- Batch processing of multiple molecules
- Automatic structure generation and optimization
- Caching mechanisms to avoid redundant calculations
- Flexible input formats (SMILES lists, CSV, XLSX)
Q-SPOC requires the following external tools to be installed:
xtb is a fast semiempirical quantum chemistry tool. While it can be installed via Conda, we recommend downloading the latest official binary.
Step 1: Download the latest xtb release
# Get the latest version tag
latest_tag=$(curl -sI https://github.com/grimme-lab/xtb/releases/latest | grep -i location | awk -F/ '{print $NF}' | tr -d '\r')
echo "Latest xtb version: $latest_tag"
# Download the Linux binary
version=${latest_tag#v}
curl -LO --progress-bar "https://github.com/grimme-lab/xtb/releases/download/${latest_tag}/xtb-${version}-linux-x86_64.tar.xz"
# Extract the archive
tar -xf "xtb-${version}-linux-x86_64.tar.xz"
# Move to your desired installation folder
mv xtb-dist <destination_folder> # e.g., mv xtb-dist bin
# Remove the downloaded archive
/bin/rm -rf "xtb-${version}-linux-x86_64.tar.xz"
Step 2: Configure environment variables
Append a line that sources the xtb environment script to your shell configuration file (.bashrc, .zshrc, etc.). printf is used instead of echo because Bash's echo does not expand \n by default.
For Bash users:
printf '\nsource %s\n' "<xtb_install_folder>/share/xtb/config_env.bash" >> ~/.bashrc
For Zsh users:
printf '\nsource %s\n' "<xtb_install_folder>/share/xtb/config_env.bash" >> ~/.zshrc
Replace <xtb_install_folder> with the actual path (e.g., ~/bin/xtb-dist).
Step 3: Verify installation
xtb --version
Download Multiwfn from http://sobereva.com/multiwfn/ and follow the installation instructions. Ensure the Multiwfn executable is in your PATH.
For higher-accuracy calculations, install Gaussian16 and ensure the g16 command is available in your PATH.
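Before running Q-SPOC, it can help to confirm that the external commands are actually reachable from your PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_tools` is hypothetical and not part of Q-SPOC, and the command names are assumptions (the Multiwfn binary in particular may be named differently on your system).

```python
import shutil


def find_tools(commands):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {cmd: shutil.which(cmd) for cmd in commands}


if __name__ == "__main__":
    # Assumed command names; adjust them to match your installation.
    for cmd, path in find_tools(["xtb", "Multiwfn", "g16"]).items():
        print(f"{cmd}: {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found can still be used if you pass its full path to Q-SPOC explicitly, for example through `multiwfn_path` or `exe_path_dict`.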
Install Q-SPOC using pip:
pip install qspoc
Or install from source:
git clone https://github.com/deepsynthesis/qspoc.git
cd qspoc
pip install -e .
Basic usage from Python:
from qspoc import QSPOCDesc
# Initialize the descriptor calculator
desc = QSPOCDesc(
multiwfn_path="/path/to/Multiwfn",
save_dir="./results",
exe_path_dict={"xtb": "xtb"} # Path to xtb executable
)
# Load molecules from SMILES
desc.load_data(
smiles_list=["CCO", "c1ccccc1", "CC(=O)O"],
name_list=["ethanol", "benzene", "acetic_acid"]
)
# Generate initial 3D structures
desc.get_init_sturct()
# Geometric optimization using xtb
desc.geometric_opt(method="xtb")
# Single-point energy calculation
desc.singlepoint_energy(method="xtb")
# Calculate descriptors
descriptors_df = desc.descriptor_calc(
include="all", # or specify: ['cdft', 'esp', 'shape', 'orbit', 'internal', 'atom', 'bond', 'steric']
save=True
)
print(descriptors_df)
Install xTB and Multiwfn into ~/.qspoc/bin and write tool availability to ~/.qspoc/config.json:
qspoc install
qspoc check
Install only one tool, or reinstall an existing tool:
qspoc install --tool xtb
qspoc install --tool multiwfn --force
Compute descriptors from an input file with xTB:
qspoc desc molecules.csv \
--multiwfn-path /path/to/Multiwfn \
--xtb-path xtb \
--save-dir ./results \
--precision xtb \
--include all
For a Gaussian16 workflow, provide both xTB and Gaussian16 commands:
qspoc desc molecules.csv \
--multiwfn-path /path/to/Multiwfn \
--xtb-path xtb \
--g16-path g16 \
--precision g16 \
--g16-nprocs 32 \
--g16-memory 16GB
Prepare a CSV or XLSX file with the following structure:
smiles,tag,charge,multiplicity
CCO,ethanol,0,1
c1ccccc1,benzene,0,1
# Load from CSV file
desc.load_data_from_file(
file_path="molecules.csv",
smiles_tag="smiles",
tag_tag="tag"
)
# Continue with the workflow as shown above
- Vertical ionization potential (VIP)
- Vertical electron affinity (VEA)
- Condensed Fukui functions (Fukui+, Fukui-)
- Condensed dual descriptors
- Minimum and maximum electrostatic potential
- Molecular polarity index (MPI)
- Charge balance
- Positive and negative surface areas
- Average local ionization energy (ALIE)
- Local electron affinity (LEA)
- Molecular volume
- Molecular sphericity
- Total surface area
- HOMO and LUMO energies
- HOMO-LUMO energy gap
- Orbital density indices (ODI)
- Bond lengths
- Bond angles
- Dihedral angles
- CM5 atomic charges
- Wiberg bond orders
- Bond dipole moments
- Steric hindrance indices
- Van der Waals radius-based parameters
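As a worked illustration of the CDFT entries at the top of this list: the vertical ionization potential and electron affinity are conventionally obtained from three single-point energies at the neutral geometry (N-1, N, and N+1 electrons), and condensed Fukui functions from the corresponding atomic charges. The sketch below uses these textbook finite-difference definitions. The function names and all input numbers are hypothetical placeholders rather than computed results, and Q-SPOC's own implementation, units, and charge scheme may differ.

```python
HARTREE_TO_EV = 27.211386  # Hartree to electronvolt conversion factor


def vertical_ip_ea(e_neutral, e_cation, e_anion):
    """Return (VIP, VEA) in eV from energies in Hartree at the neutral geometry.

    VIP = E(N-1) - E(N)
    VEA = E(N)   - E(N+1)
    """
    vip = (e_cation - e_neutral) * HARTREE_TO_EV
    vea = (e_neutral - e_anion) * HARTREE_TO_EV
    return vip, vea


def condensed_fukui(q_neutral, q_cation, q_anion):
    """Return per-atom (f+, f-, dual descriptor) from atomic charges.

    f+_A = q_A(N)   - q_A(N+1)   (susceptibility to nucleophilic attack)
    f-_A = q_A(N-1) - q_A(N)     (susceptibility to electrophilic attack)
    dual = f+_A - f-_A
    """
    f_plus = [qn - qa for qn, qa in zip(q_neutral, q_anion)]
    f_minus = [qc - qn for qc, qn in zip(q_cation, q_neutral)]
    dual = [fp - fm for fp, fm in zip(f_plus, f_minus)]
    return f_plus, f_minus, dual


if __name__ == "__main__":
    # Placeholder energies (Hartree) for a fictitious molecule.
    vip, vea = vertical_ip_ea(e_neutral=-10.0, e_cation=-9.6, e_anion=-10.05)
    print(f"VIP = {vip:.3f} eV, VEA = {vea:.3f} eV")

    # Placeholder charges for a fictitious two-atom system.
    f_plus, f_minus, dual = condensed_fukui(
        q_neutral=[0.10, -0.10],
        q_cation=[0.70, 0.30],
        q_anion=[-0.30, -0.70],
    )
    print("f+  :", f_plus)
    print("f-  :", f_minus)
    print("dual:", dual)
```

With these definitions, each of f+ and f- sums to one over all atoms, which is a quick sanity check on any set of charges you feed in.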
# Geometric optimization with custom settings
desc.geometric_opt(
method="xtb",
calc_command="--gfn 2 --tight", # xtb-specific options
recalc=False
)
# Single-point energy with Gaussian16
desc.singlepoint_energy(
method="g16",
calc_command="# B3LYP/6-31G(d) SP",
recalc=False
)
# Calculate only specific descriptor categories
descriptors_df = desc.descriptor_calc(
include=["cdft", "esp", "orbit"],
save=True
)
Q-SPOC supports two descriptor versions:
- "all": Calculate all available descriptors
- "stem": Calculate a streamlined set of essential descriptors for faster computation
desc = QSPOCDesc(
multiwfn_path="/path/to/Multiwfn",
save_dir="./results",
exe_path_dict={"xtb": "xtb"},
version="stem" # Use streamlined descriptors
)
Q-SPOC also provides a command-line interface:
qspoc --help
- Python >= 3.8
- numpy >= 1.20.0
- pandas >= 1.3.0
- rdkit >= 2023.3.1
- rich >= 13.0.0
- click >= 8.0.0
- tqdm >= 4.60.0
- matplotlib >= 3.10.6
- epam.indigo >= 1.36.1
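To compare an existing environment against the minimum versions listed above, a small standard-library check can be enough. This is a rough sketch, not a Q-SPOC utility: `parse_version` and `check_requirements` are hypothetical helpers, the version parsing only looks at leading numeric components, and the dictionary keys are the names as written in the list, which may not match the name a package is published under (such packages are reported as missing).

```python
import sys
from importlib import metadata

# Minimum versions copied from the list above.
MINIMUMS = {
    "numpy": "1.20.0",
    "pandas": "1.3.0",
    "rdkit": "2023.3.1",
    "rich": "13.0.0",
    "click": "8.0.0",
    "tqdm": "4.60.0",
    "matplotlib": "3.10.6",
    "epam.indigo": "1.36.1",
}


def parse_version(text, width=3):
    """Turn '3.10.6' into (3, 10, 6); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad with zeros so that '1.20' compares equal to '1.20.0'.
    while len(parts) < width:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums):
    """Return {name: status string} for each required distribution."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = f"{installed} (ok)"
        else:
            report[name] = f"{installed} (too old, need >= {minimum})"
    return report


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 8)
    print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")
    for name, status in check_requirements(MINIMUMS).items():
        print(f"{name}: {status}")
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "1.4.0" sorts after "1.36.1" alphabetically.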
qspoc/
├── src/qspoc/
│   ├── descriptor.py          # Main QSPOCDesc class
│   ├── converter.py           # File format converters
│   ├── rdkit_processor.py     # RDKit integration
│   ├── utils.py               # Utility functions
│   ├── multiwfn/              # Multiwfn interface
│   │   ├── interface.py       # Multiwfn interface
│   │   └── descriptors/       # Descriptor modules
│   │       ├── cdft.py        # CDFT descriptors
│   │       ├── esp.py         # ESP descriptors
│   │       ├── shape.py       # Shape descriptors
│   │       ├── orbit.py       # Orbital descriptors
│   │       ├── internal.py    # Internal coordinates
│   │       ├── atom.py        # Atom descriptors
│   │       ├── bond.py        # Bond descriptors
│   │       └── steric.py      # Steric parameters
│   └── qm/                    # Quantum chemistry interfaces
│       ├── interface.py       # QM interface
│       ├── xtb.py             # xtb wrapper
│       └── gaussian16.py      # Gaussian16 wrapper
├── tests/                     # Test suite
└── README.md
Contributions are welcome! Please feel free to submit issues or pull requests on our GitHub repository.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use Q-SPOC in your research, please cite:
Q-SPOC: Quantum Structural and Physic-organic Descriptor
Version 0.5.1
https://github.com/deepsynthesis/qspoc
- xtb - Semiempirical quantum chemistry package
- Multiwfn - Wavefunction analysis program
- RDKit - Cheminformatics library
- Gaussian - Quantum chemistry software
- Author: Zhenzhi Tan
- Email: zhenzhi-tan@outlook.com
- Issues: https://github.com/deepsynthesis/qspoc/issues
Future planned features:
- Multi-conformation embedding
- ORCA integration
- Additional descriptor types
- Machine learning model integration
- Web interface
For more detailed documentation, please visit: https://qspoc.readthedocs.io