Q-SPOC: Quantum Structural and Physic-organic Descriptor

Q-SPOC is a Python toolkit for computing quantum and physicochemical molecular descriptors from semiempirical (xtb) and, optionally, DFT (Gaussian16) calculations. It provides a streamlined workflow that takes SMILES input through structure generation, optimization, and wavefunction analysis, and produces descriptors for quantitative structure-activity relationship (QSAR) studies, machine learning applications, and computational chemistry research.

Features

Q-SPOC offers the following capabilities:

Descriptor Types

  • CDFT Descriptors: Vertical ionization potential (VIP), vertical electron affinity (VEA), Fukui functions, and dual descriptors
  • ESP Descriptors: Electrostatic potential maps, molecular polarity index (MPI), charge balance, surface area analysis
  • Shape Descriptors: Molecular volume, sphericity, and surface area
  • Orbital Descriptors: HOMO/LUMO energies, energy gaps, orbital density indices
  • Internal Coordinates: Bond lengths, angles, and dihedral angles
  • Atom Descriptors: CM5 atomic charges
  • Bond Descriptors: Wiberg bond orders and bond dipoles
  • Steric Parameters: Steric hindrance and van der Waals radius-based descriptors

Quantum Chemistry Integration

  • xtb (GFN-xTB): Fast semiempirical quantum chemistry calculations
  • Gaussian16: High-accuracy DFT and ab initio calculations
  • Multiwfn: Comprehensive wavefunction analysis and descriptor calculation

Workflow Management

  • Batch processing of multiple molecules
  • Automatic structure generation and optimization
  • Caching mechanisms to avoid redundant calculations
  • Flexible input formats (SMILES lists, CSV, XLSX)

Installation

Prerequisites

Q-SPOC relies on the following external tools. xtb and Multiwfn are required; Gaussian16 is optional:

1. Install xtb (Required)

xtb is a fast semiempirical quantum chemistry tool. While it can be installed via Conda, we recommend downloading the latest official binary.

Step 1: Download the latest xtb release

# Get the latest version tag
latest_tag=$(curl -sI https://github.com/grimme-lab/xtb/releases/latest | grep -i location | awk -F/ '{print $NF}' | tr -d '\r')
echo "Latest xtb version: $latest_tag"

# Download the Linux binary
version=${latest_tag#v}
curl -LO --progress-bar "https://github.com/grimme-lab/xtb/releases/download/${latest_tag}/xtb-${version}-linux-x86_64.tar.xz"

# Extract the archive
tar -xf "xtb-${version}-linux-x86_64.tar.xz"

# Move to your desired installation folder
mv xtb-dist <destination_folder>  # e.g., mv xtb-dist bin
/bin/rm -rf "xtb-${version}-linux-x86_64.tar.xz"

Step 2: Configure environment variables

Add the following to your shell configuration file (.bashrc, .zshrc, etc.):

For Bash users:

printf '\nsource <xtb_install_folder>/share/xtb/config_env.bash\n' >> ~/.bashrc

For Zsh users:

printf '\nsource <xtb_install_folder>/share/xtb/config_env.bash\n' >> ~/.zshrc

Replace <xtb_install_folder> with the actual path (e.g., ~/bin/xtb-dist).

Step 3: Verify installation

xtb --version

2. Install Multiwfn (Required)

Download Multiwfn from http://sobereva.com/multiwfn/ and follow the installation instructions. Ensure the Multiwfn executable is in your PATH.

3. Gaussian16 (Optional)

For higher-accuracy calculations, install Gaussian16 and ensure the g16 command is available in your PATH.
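
Before installing Q-SPOC, it can help to confirm that the external tools are actually visible on your PATH. The snippet below is a minimal sketch that uses only the Python standard library; it is not part of Q-SPOC. The command names `xtb` and `g16` come from this README, while `Multiwfn` is an assumed executable name that may differ on your system.

```python
import shutil


def find_tools(commands):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    # "Multiwfn" is an assumed executable name; adjust it to match your install.
    for name, path in find_tools(["xtb", "Multiwfn", "g16"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A missing `g16` entry is not a problem unless you plan to run the Gaussian16 workflow.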

Install Q-SPOC

Install Q-SPOC using pip:

pip install qspoc

Or install from source:

git clone https://github.com/deepsynthesis/qspoc.git
cd qspoc
pip install -e .

Quick Start

Basic Usage

from qspoc import QSPOCDesc

# Initialize the descriptor calculator
desc = QSPOCDesc(
    multiwfn_path="/path/to/Multiwfn",
    save_dir="./results",
    exe_path_dict={"xtb": "xtb"}  # Path to xtb executable
)

# Load molecules from SMILES
desc.load_data(
    smiles_list=["CCO", "c1ccccc1", "CC(=O)O"],
    name_list=["ethanol", "benzene", "acetic_acid"]
)

# Generate initial 3D structures
desc.get_init_sturct()

# Geometric optimization using xtb
desc.geometric_opt(method="xtb")

# Single-point energy calculation
desc.singlepoint_energy(method="xtb")

# Calculate descriptors
descriptors_df = desc.descriptor_calc(
    include="all",  # or specify: ['cdft', 'esp', 'shape', 'orbit', 'internal', 'atom', 'bond', 'steric']
    save=True
)

print(descriptors_df)

Command Line Usage

Install xTB and Multiwfn into ~/.qspoc/bin and write tool availability to ~/.qspoc/config.json:

qspoc install
qspoc check

Install only one tool, or reinstall an existing tool:

qspoc install --tool xtb
qspoc install --tool multiwfn --force

Compute descriptors for the molecules in a CSV file using xTB:

qspoc desc molecules.csv \
  --multiwfn-path /path/to/Multiwfn \
  --xtb-path xtb \
  --save-dir ./results \
  --precision xtb \
  --include all

For a Gaussian16 workflow, provide both xTB and Gaussian16 commands:

qspoc desc molecules.csv \
  --multiwfn-path /path/to/Multiwfn \
  --xtb-path xtb \
  --g16-path g16 \
  --precision g16 \
  --g16-nprocs 32 \
  --g16-memory 16GB
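
If you drive the CLI from a Python script (for example, to loop over several input files), assembling the argument list programmatically avoids shell-quoting mistakes. The helper below is a hypothetical sketch, not part of Q-SPOC; it only uses the `qspoc desc` flags shown above and assumes they behave as described there.

```python
import shlex


def build_desc_command(input_file, multiwfn_path, xtb_path="xtb",
                       precision="xtb", save_dir=None, include=None,
                       g16_path=None, g16_nprocs=None, g16_memory=None):
    """Assemble the argument list for `qspoc desc` from the flags in this README."""
    cmd = [
        "qspoc", "desc", str(input_file),
        "--multiwfn-path", str(multiwfn_path),
        "--xtb-path", str(xtb_path),
        "--precision", precision,
    ]
    optional = {
        "--save-dir": save_dir,
        "--include": include,
        "--g16-path": g16_path,
        "--g16-nprocs": g16_nprocs,
        "--g16-memory": g16_memory,
    }
    # Only emit flags the caller actually set.
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_desc_command(
    "molecules.csv", "/path/to/Multiwfn",
    precision="g16", g16_path="g16", g16_nprocs=32, g16_memory="16GB",
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list directly to `subprocess.run` (rather than a single string) keeps paths with spaces intact.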

Loading from File

Prepare a CSV or XLSX file with the following structure:

smiles,tag,charge,multiplicity
CCO,ethanol,0,1
c1ccccc1,benzene,0,1

# Load from CSV file
desc.load_data_from_file(
    file_path="molecules.csv",
    smiles_tag="smiles",
    tag_tag="tag"
)

# Continue with the workflow as shown above
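
When the molecule list is built in code, the input file can be generated with the standard `csv` module. This is a minimal sketch that reproduces the column layout shown above; the helper name is hypothetical, and the defaults (charge 0, multiplicity 1, i.e. a neutral closed-shell molecule) are assumptions of this example rather than documented Q-SPOC behavior.

```python
import csv
import io


def molecules_to_csv(molecules):
    """Return CSV text with the columns smiles, tag, charge, multiplicity."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["smiles", "tag", "charge", "multiplicity"],
        lineterminator="\n",
    )
    writer.writeheader()
    for mol in molecules:
        writer.writerow({
            "smiles": mol["smiles"],
            "tag": mol["tag"],
            # Assumed defaults: neutral, closed-shell singlet.
            "charge": mol.get("charge", 0),
            "multiplicity": mol.get("multiplicity", 1),
        })
    return buffer.getvalue()


text = molecules_to_csv([
    {"smiles": "CCO", "tag": "ethanol"},
    {"smiles": "c1ccccc1", "tag": "benzene"},
])
print(text)
# To save it: open("molecules.csv", "w", newline="").write(text)
```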

Descriptor Categories

CDFT Descriptors

  • Vertical ionization potential (VIP)
  • Vertical electron affinity (VEA)
  • Condensed Fukui functions (Fukui+, Fukui-)
  • Condensed dual descriptors
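
For readers new to conceptual DFT, the quantities above are commonly obtained by finite differences between the neutral (N), cationic (N-1), and anionic (N+1) states at the neutral geometry. The sketch below shows those textbook relations with placeholder numbers; it is an illustration only, and whether Q-SPOC evaluates them in exactly this way internally is an assumption.

```python
def vertical_ip_ea(e_neutral, e_cation, e_anion):
    """VIP = E(N-1) - E(N) and VEA = E(N) - E(N+1), all at the neutral geometry."""
    vip = e_cation - e_neutral
    vea = e_neutral - e_anion
    return vip, vea


def condensed_fukui(q_neutral, q_cation, q_anion):
    """Condensed Fukui functions and dual descriptor from per-atom charges.

    f+ = q(N) - q(N+1), f- = q(N-1) - q(N), dual = f+ - f-.
    """
    f_plus = [qn - qa for qn, qa in zip(q_neutral, q_anion)]
    f_minus = [qc - qn for qc, qn in zip(q_cation, q_neutral)]
    dual = [fp - fm for fp, fm in zip(f_plus, f_minus)]
    return f_plus, f_minus, dual


# Placeholder energies and charges for a two-atom toy system (not real data).
vip, vea = vertical_ip_ea(e_neutral=-10.0, e_cation=-9.6, e_anion=-10.1)
f_plus, f_minus, dual = condensed_fukui(
    q_neutral=[0.1, -0.1], q_cation=[0.6, 0.4], q_anion=[-0.5, -0.5]
)
print(vip, vea, f_plus, f_minus, dual)
```

Atoms with a positive dual descriptor are usually read as electrophilic sites, and atoms with a negative value as nucleophilic sites.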

ESP Descriptors

  • Minimum and maximum electrostatic potential
  • Molecular polarity index (MPI)
  • Charge balance
  • Positive and negative surface areas
  • Average local ionization energy (ALIE)
  • Local electron affinity (LEA)

Shape Descriptors

  • Molecular volume
  • Molecular sphericity
  • Total surface area
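
Sphericity relates the volume and surface area listed above. A common definition (Wadell sphericity) is the surface area of a sphere with the same volume divided by the actual surface area, which equals 1 for a perfect sphere and decreases for elongated or flat shapes. The sketch below assumes that definition; the exact formula Q-SPOC reports is not stated in this README.

```python
import math


def sphericity(volume, surface_area):
    """Wadell sphericity: pi^(1/3) * (6 V)^(2/3) / A (dimensionless)."""
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface_area must be positive")
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area


# Sanity checks on ideal shapes: a unit-radius sphere and a unit cube.
sphere = sphericity(4.0 / 3.0 * math.pi, 4.0 * math.pi)
cube = sphericity(1.0, 6.0)
print(sphere, cube)
```

Volume and area must use consistent units (for example Å³ and Å²) for the ratio to be meaningful.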

Orbital Descriptors

  • HOMO and LUMO energies
  • HOMO-LUMO energy gap
  • Orbital density indices (ODI)

Internal Coordinates

  • Bond lengths
  • Bond angles
  • Dihedral angles
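
Internal coordinates follow directly from Cartesian atom positions. The standard-library sketch below shows the usual vector formulas for a bond length, a bond angle, and a dihedral angle; the function names are hypothetical and this is not Q-SPOC's own implementation. Coordinates are assumed to be in ångström, and angles are returned in degrees.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def bond_length(a, b):
    """Distance between atoms a and b."""
    return _norm(_sub(a, b))


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_angle = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def dihedral(a, b, c, d):
    """Signed dihedral a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four atoms in a planar trans (anti) arrangement.
atoms = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0)]
print(bond_length(atoms[0], atoms[1]), bond_angle(*atoms[:3]), dihedral(*atoms))
```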

Atom Descriptors

  • CM5 atomic charges

Bond Descriptors

  • Wiberg bond orders
  • Bond dipole moments

Steric Parameters

  • Steric hindrance indices
  • Van der Waals radius-based parameters

Advanced Usage

Custom Calculation Settings

# Geometric optimization with custom settings
desc.geometric_opt(
    method="xtb",
    calc_command="--gfn 2 --tight",  # xtb-specific options
    recalc=False
)

# Single-point energy with Gaussian16
desc.singlepoint_energy(
    method="g16",
    calc_command="# B3LYP/6-31G(d) SP",
    recalc=False
)

Selective Descriptor Calculation

# Calculate only specific descriptor categories
descriptors_df = desc.descriptor_calc(
    include=["cdft", "esp", "orbit"],
    save=True
)

Descriptor Versions

Q-SPOC supports two descriptor versions:

  • "all": Calculate all available descriptors
  • "stem": Calculate a streamlined set of essential descriptors for faster computation

desc = QSPOCDesc(
    multiwfn_path="/path/to/Multiwfn",
    save_dir="./results",
    exe_path_dict={"xtb": "xtb"},
    version="stem"  # Use streamlined descriptors
)

Command Line Interface

To list all available commands and options, run:

qspoc --help

Dependencies

  • Python >= 3.8
  • numpy >= 1.20.0
  • pandas >= 1.3.0
  • rdkit >= 2023.3.1
  • rich >= 13.0.0
  • click >= 8.0.0
  • tqdm >= 4.60.0
  • matplotlib >= 3.10.6
  • epam.indigo >= 1.36.1

Project Structure

qspoc/
├── src/qspoc/
│   ├── descriptor.py          # Main QSPOCDesc class
│   ├── converter.py           # File format converters
│   ├── rdkit_processor.py     # RDKit integration
│   ├── utils.py               # Utility functions
│   ├── multiwfn/              # Multiwfn interface
│   │   ├── interface.py       # Multiwfn interface
│   │   └── descriptors/       # Descriptor modules
│   │       ├── cdft.py        # CDFT descriptors
│   │       ├── esp.py         # ESP descriptors
│   │       ├── shape.py       # Shape descriptors
│   │       ├── orbit.py       # Orbital descriptors
│   │       ├── internal.py    # Internal coordinates
│   │       ├── atom.py        # Atom descriptors
│   │       ├── bond.py        # Bond descriptors
│   │       └── steric.py      # Steric parameters
│   └── qm/                    # Quantum chemistry interfaces
│       ├── interface.py       # QM interface
│       ├── xtb.py             # xtb wrapper
│       └── gaussian16.py      # Gaussian16 wrapper
├── tests/                     # Test suite
└── README.md

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests on our GitHub repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Q-SPOC in your research, please cite:

Q-SPOC: Quantum Structural and Physic-organic Descriptor
Version 0.5.1
https://github.com/deepsynthesis/qspoc

Acknowledgments

  • xtb - Semiempirical quantum chemistry package
  • Multiwfn - Wavefunction analysis program
  • RDKit - Cheminformatics library
  • Gaussian - Quantum chemistry software

Contact

Roadmap

Future planned features:

  • Multi-conformation embedding
  • ORCA integration
  • Additional descriptor types
  • Machine learning model integration
  • Web interface

For more detailed documentation, please visit: https://qspoc.readthedocs.io
