GitHub - patonlab/molDscript: molDscript: A General-Purpose Workflow for Quantum Chemistry-Based Molecular Parameterization

MolDscript is a Python workflow that converts Density Functional Theory (DFT) and related quantum chemistry outputs into descriptor tables ready for machine learning or benchmarking. It wraps cclib, RDKit, DBSTEP, and pandas to align Gaussian, ORCA, and xTB calculations and write consistent molecule-, bond-, and atom-level CSV files.

Highlights

Parse optimization, single-point, NBO, NMR, charge, FMO, and Fukui calculations without manual file editing.
Match conformer ensembles, apply SMARTS-based substructure filters, and compute DBSTEP buried volumes on demand.
Generate ensembles (Boltzmann weighted, min/mnax within population windows, lowest-energy snapshots) in a single run.
Emit descriptor CSVs alongside module logs (MOLDSCRIPT_*.dat) for traceability.

Installation

git clone https://github.com/patonlab/molDscript.git
cd molDscript
pip install -e .

Open Babel (optional) can be installed from conda-forge:

conda install -c conda-forge openbabel

Quick Start

python -m moldscript \
  --opt calculations/opt --fmo calculations/fmo 
  --suffix_fmo fmo_suffix --suffix_nbo nbo_suffix
  --nbo calculations/nbo

Prefer storing options in a key:value text file? Use --varfile inputs.txt; command-line flags override values loaded from the file.

Core Inputs & Flags

--opt PATH (required) - baseline optimization files and conformer metadata.
--spc PATH - single-point energies that replace optimization SCF energies.
--nbo, --nmr, --charges, --fmo PATH - add module-specific descriptors; pair with --suffix_* to specify filename tokens specific to calculation type (required for proper comformer matching).
--fukui_neutral, --fukui_reduced, --fukui_oxidized PATH - supply all three charge states for vertical IE/EA and condensed Fukui functions. Again, pair with --suffix_* for proper conformer matching.
--substructure SMARTS - limit atom/bond descriptors to a SMARTS match; combine with --volume or --vall and optional --radius list for DBSTEP buried volumes.
--boltz, --min_max, --lowe - compute Boltzmann-weighted averages, min/max/range tables (using --cut), and lowest-energy snapshots. Adjust --temp (K) as needed.
--output PREFIX - prepend every generated filename; append a slash to target a directory. Use --no_mol, --no_atom, --no_bond, or --no_bond_filter to tailor CSV output.

Output Artefacts

molecule_level.csv, bond_level.csv, atom_level.csv - aligned descriptors per calculation, bond pair, or atom.
ensemble_*.csv, boltzmann_weights.csv - created when --boltz is enabled.
min_max_range_*.csv, lowest_energy_*.csv - created when --min_max or --lowe are requested.
MOLDSCRIPT_*.dat - per-module logs capturing provenance and CPU-time summaries.

Documentation

The Read the Docs site (coming soon) will provide the full user guide: https://moldscript.readthedocs.io

Dependencies

Key Python dependencies include pandas, cclib (latest GitHub version for the most up-to-date package compatability), dbstep, rdkit, networkx, numpy, and periodictable.

Supported Quantum Packages

Gaussian
ORCA
xTB (optimizations)

Testing

Run pytest -v from the project root to execute the test suite.

Acknowledgements

This work was carried out in the Paton Laboratory at Colorado State University, supported by the NSF Center for Computer-Assisted Synthesis (grant CHE-1925607).

Contributors include Shree Sowndarya, Jake King, and Robert Paton.

Name		Name	Last commit message	Last commit date
Latest commit History 289 Commits
.circleci		.circleci
_build/html		_build/html
docs		docs
moldscript		moldscript
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
SUPPORT.md		SUPPORT.md
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Highlights

Installation

Quick Start

Core Inputs & Flags

Output Artefacts

Documentation

Dependencies

Supported Quantum Packages

Testing

Acknowledgements

About

Uh oh!

Releases 1

Packages

Contributors 3

Uh oh!

Languages

License

patonlab/molDscript

Folders and files

Latest commit

History

Repository files navigation

Highlights

Installation

Quick Start

Core Inputs & Flags

Output Artefacts

Documentation

Dependencies

Supported Quantum Packages

Testing

Acknowledgements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 3

Uh oh!

Languages

Packages