* Restore missing mdf_client.py from design-renaissance branch
This file was part of PR #469 but was not included in the merge,
causing ModuleNotFoundError when importing foundry.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Fix DOI search to return correct dataset
The forge DOI search can return multiple results where only one
actually has the matching DOI. Previously, get_metadata_by_doi()
blindly returned the first result, which often didn't have the
requested DOI.
Now it iterates through results to find the one with the exact
DOI match, fixing test_dataframe_search_by_doi and
test_dataframe_download_by_doi tests.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
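A minimal sketch of the exact-match selection described above, not the actual get_metadata_by_doi() code. The helper names and the assumption that each record keeps its DOI under dc.identifier.identifier are illustrative; adjust the accessor to the real record layout.

```python
from typing import Any, Iterable, Optional


def record_doi(record: dict[str, Any]) -> Optional[str]:
    """Return the DOI stored in a record, or None if absent.

    Assumption: the DOI lives at record["dc"]["identifier"]["identifier"].
    """
    return record.get("dc", {}).get("identifier", {}).get("identifier")


def pick_exact_doi(
    results: Iterable[dict[str, Any]], doi: str
) -> Optional[dict[str, Any]]:
    """Return the first record whose DOI equals `doi`, else None.

    DOIs are case-insensitive, so both sides are lowercased before comparing.
    """
    wanted = doi.strip().lower()
    for record in results:
        found = record_doi(record)
        if found is not None and found.strip().lower() == wanted:
            return record
    return None
```

Returning None instead of falling back to the first result lets the caller decide how to report a DOI that matched nothing.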
* Move torch/tensorflow to optional extras to fix CI disk space
The combined size of torch, tensorflow, and NVIDIA CUDA dependencies
exceeded GitHub Actions runner disk space (~4GB+). These ML frameworks
are now available as optional extras via pip install .[torch] or
pip install .[tensorflow].
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Fix flake8 linting errors
- Remove unused imports (sys, rprint, Optional, pandas, numpy)
- Fix unused exception variable
- Remove f-string without placeholders
- Split long line in MCP server description
- Add noqa comment for intentional re-export
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Replace mdf_forge with internal MDFClient in tests
Update test imports to use foundry.mdf_client.MDFClient instead of
mdf_forge.Forge, which is no longer a required dependency.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Add optional extras and document installation
Move heavy ML dependencies to optional extras to reduce default
install size:
- pip install foundry-ml[torch]
- pip install foundry-ml[tensorflow]
- pip install foundry-ml[huggingface]
- pip install foundry-ml[excel]
- pip install foundry-ml[examples]
- pip install foundry-ml[dev]
Update README with extras install instructions and NumPy 2.0
compatibility note.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
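With the ML frameworks moved to extras, code that needs them may want to fail with a pointer to the right install command. The helper below is a hypothetical sketch of that pattern (require_extra is not a name taken from the foundry codebase); it relies only on the standard library.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or raise an ImportError naming the extra.

    `extra` is the extras name to suggest, e.g. "torch" or "tensorflow".
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not installed; "
            f"try: pip install 'foundry-ml[{extra}]'"
        ) from exc
```

A caller would write torch = require_extra("torch", "torch") at the point of use rather than importing at module top level, so a default install stays importable.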
* Fix DOI search and improve MDFClient query handling
MDFClient improvements:
- Add Globus Search index ID constants (MDF_INDEX_ID, MDF_TEST_INDEX_ID)
- Add match_source_names() method with automatic version suffix stripping
- Add _has_field_filters property for elegant advanced mode detection
- Use advanced=True automatically for DOI and source_name searches
(required for exact field matching in Globus Search)
- Add try/finally to ensure query state is always reset after search
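The bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the MDFClient implementation: QuerySketch, strip_version, and the backend callable are hypothetical names, and the "_v1.2" suffix format is inferred from the source name foundry_g4mp2_solvation_v1.2 shown in the notebook output.

```python
import re
from typing import Any, Callable

# Assumption: version suffixes look like "_v1" or "_v1.2" at the end of a name.
_VERSION_SUFFIX = re.compile(r"_v\d+(?:\.\d+)*$")


def strip_version(source_name: str) -> str:
    """Drop a trailing version suffix, if any, from a source name."""
    return _VERSION_SUFFIX.sub("", source_name)


class QuerySketch:
    """Toy query builder that always resets its state after a search."""

    def __init__(self) -> None:
        self._terms: list[str] = []
        self._field_filters: list[tuple[str, str]] = []

    def match_term(self, term: str) -> "QuerySketch":
        self._terms.append(term)
        return self

    def match_field(self, field: str, value: str) -> "QuerySketch":
        self._field_filters.append((field, value))
        return self

    @property
    def _has_field_filters(self) -> bool:
        # Field filters need exact matching, so they switch on advanced mode.
        return bool(self._field_filters)

    def search(self, backend: Callable[..., Any]) -> Any:
        """Run the query through `backend`, clearing state even on failure."""
        try:
            return backend(
                self._terms,
                self._field_filters,
                advanced=self._has_field_filters,
            )
        finally:
            self._terms = []
            self._field_filters = []
```

The finally clause is what keeps a failed search from leaking filters into the next query on the same client object.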
Foundry search fix:
- Pass free-text query to Globus Search for server-side filtering
instead of fetching 10 results and filtering client-side
- This fixes searches like f.search("Computational Band Gaps") that
were failing when the target dataset wasn't in the first 10 results
Test additions:
- Add test_load_mp_band_gaps_dataset to verify DOI-based dataset loading
Re-rendered example notebooks with updated outputs.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
"<h2>DFT Estimates of Solvation Energy in Multiple Solvents</h2>Ward, Logan; Dandu, Naveen; Blaiszik, Ben; Narayanan, Badri; Assary, Rajeev S.; Redfern, Paul C.; Foster, Ian; Curtiss, Larry A.<p>DOI: 10.18126/jos5-wj65</p><h3>Dataset</h3><table><tr><th>short_name</th><td>g4mp2_solvation</td></tr><tr><th>data_type</th><td>tabular</td></tr><tr><th>task_type</th><td><ul><li>supervised</li></ul></td></tr><tr><th>domain</th><td><ul><li>materials science</li><li>chemistry</li></ul></td></tr><tr><th>n_items</th><td>130258.0</td></tr><tr><th>splits</th><td><ul><li><table><tr><th>type</th><td>train</td></tr><tr><th>path</th><td>g4mp2_data.json</td></tr><tr><th>label</th><td>train</td></tr></table></li></ul></td></tr><tr><th>keys</th><td><table><tr><th>key</th><th>type</th><th>filter</th><th>description</th><th>units</th><th>classes</th></tr><tr><td><ul><li>smiles_0</li></ul></td><td>input</td><td></td><td>Input SMILES string</td><td></td><td></td></tr><tr><td><ul><li>smiles_1</li></ul></td><td>input</td><td></td><td>SMILES string after relaxation</td><td></td><td></td></tr><tr><td><ul><li>inchi_0</li></ul></td><td>input</td><td></td><td>InChi after generating coordinates with CORINA</td><td></td><td></td></tr><tr><td><ul><li>inchi_1</li></ul></td><td>input</td><td></td><td>InChi after relaxation</td><td></td><td></td></tr><tr><td><ul><li>xyz</li></ul></td><td>input</td><td></td><td>InChi after relaxation</td><td>XYZ coordinates after relaxation</td><td></td></tr><tr><td><ul><li>atomic_charges</li></ul></td><td>input</td><td></td><td>Atomic charges on each atom, as predicted from B3LYP</td><td></td><td></td></tr><tr><td><ul><li>A</li></ul></td><td>input</td><td></td><td>Rotational constant, A</td><td>GHz</td><td></td></tr><tr><td><ul><li>B</li></ul></td><td>input</td><td></td><td>Rotational constant, B</td><td>GHz</td><td></td></tr><tr><td><ul><li>C</li></ul></td><td>input</td><td></td><td>Rotational constant, 
C</td><td>GHz</td><td></td></tr><tr><td><ul><li>inchi_1</li></ul></td><td>input</td><td></td><td>InChi after relaxation</td><td></td><td></td></tr><tr><td><ul><li>n_electrons</li></ul></td><td>input</td><td></td><td>Number of electrons</td><td></td><td></td></tr><tr><td><ul><li>n_heavy_atoms</li></ul></td><td>input</td><td></td><td>Number of non-hydrogen atoms</td><td></td><td></td></tr><tr><td><ul><li>n_atom</li></ul></td><td>input</td><td></td><td>Number of atoms in molecule</td><td></td><td></td></tr><tr><td><ul><li>mu</li></ul></td><td>input</td><td></td><td>Dipole moment</td><td>D</td><td></td></tr><tr><td><ul><li>alpha</li></ul></td><td>input</td><td></td><td>Isotropic polarizability</td><td>a_0^3</td><td></td></tr><tr><td><ul><li>R2</li></ul></td><td>input</td><td></td><td>Electronic spatial extant</td><td>a_0^2</td><td></td></tr><tr><td><ul><li>cv</li></ul></td><td>input</td><td></td><td>Heat capacity at 298.15K</td><td>cal/mol-K</td><td></td></tr><tr><td><ul><li>g4mp2_hf298</li></ul></td><td>target</td><td></td><td>G4MP2 Standard Enthalpy of Formation, 298K</td><td>kcal/mol</td><td></td></tr><tr><td><ul><li>bandgap</li></ul></td><td>input</td><td></td><td>B3LYP Band gap energy</td><td>Ha</td><td></td></tr><tr><td><ul><li>homo</li></ul></td><td>input</td><td></td><td>B3LYP Energy of HOMO</td><td>Ha</td><td></td></tr><tr><td><ul><li>lumo</li></ul></td><td>input</td><td></td><td>B3LYP Energy of LUMO</td><td>Ha</td><td></td></tr><tr><td><ul><li>zpe</li></ul></td><td>input</td><td></td><td>B3LYP Zero point vibrational energy</td><td>Ha</td><td></td></tr><tr><td><ul><li>u0</li></ul></td><td>input</td><td></td><td>B3LYP Internal energy at 0K</td><td>Ha</td><td></td></tr><tr><td><ul><li>u</li></ul></td><td>input</td><td></td><td>B3LYP Internal energy at 298.15K</td><td>Ha</td><td></td></tr><tr><td><ul><li>h</li></ul></td><td>input</td><td></td><td>B3LYP Enthalpy at 
298.15K</td><td>Ha</td><td></td></tr><tr><td><ul><li>u0_atom</li></ul></td><td>input</td><td></td><td>B3LYP atomization energy at 0K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g</li></ul></td><td>input</td><td></td><td>B3LYP Free energy at 298.15K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g4mp2_0k</li></ul></td><td>target</td><td></td><td>G4MP2 Internal energy at 0K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g4mp2_energy</li></ul></td><td>target</td><td></td><td>G4MP2 Internal energy at 298.15K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g4mp2_enthalpy</li></ul></td><td>target</td><td></td><td>G4MP2 Enthalpy at 298.15K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g4mp2_free</li></ul></td><td>target</td><td></td><td>G4MP2 Free eergy at 0K</td><td>Ha</td><td></td></tr><tr><td><ul><li>g4mp2_atom</li></ul></td><td>target</td><td></td><td>G4MP2 atomization energy at 0K</td><td>Ha</td><td></td></tr><tr><td><ul><li>sol_acetone</li></ul></td><td>target</td><td></td><td>Solvation energy, acetone</td><td>kcal/mol</td><td></td></tr><tr><td><ul><li>sol_acn</li></ul></td><td>target</td><td></td><td>Solvation energy, acetonitrile</td><td>kcal/mol</td><td></td></tr><tr><td><ul><li>sol_dmso</li></ul></td><td>target</td><td></td><td>Solvation energy, dimethyl sulfoxide</td><td>kcal/mol</td><td></td></tr><tr><td><ul><li>sol_ethanol</li></ul></td><td>target</td><td></td><td>Solvation energy, ethanol</td><td>kcal/mol</td><td></td></tr><tr><td><ul><li>sol_water</li></ul></td><td>target</td><td></td><td>Solvation energy, water</td><td>kcal/mol</td><td></td></tr></table></td></tr></table>"
  ],
  "text/plain": [
- "<foundry.foundry_dataset.FoundryDataset at 0x1342b8230>"
+ "<foundry.foundry_dataset.FoundryDataset at 0x140201070>"
  ]
  },
  "execution_count": 4,
@@ -214,10 +180,70 @@
  },
  {
  "cell_type": "code",
- "execution_count": null,
+ "execution_count": 5,
  "metadata": {},
- "outputs": [],
- "source": "# Get the schema - what columns/fields are in this dataset?\nschema = dataset.get_schema()\n\nprint(f\"Dataset: {schema['name']}\")\nprint(f\"Data Type: {schema['data_type']}\")\nprint(f\"\\nSplits: {[s['name'] for s in schema['splits']]}\")\nprint(f\"\\nFields:\")\nfor field in schema['fields']:\n print(f\" - {field['name']} ({field['role']}): {field['description'] or 'No description'}\")"
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Dataset: foundry_g4mp2_solvation_v1.2\n",
+ "Data Type: tabular\n",
+ "\n",
+ "Splits: ['train']\n",
+ "\n",
+ "Fields:\n",
+ " - smiles_0 (input): Input SMILES string\n",
+ " - smiles_1 (input): SMILES string after relaxation\n",
+ " - inchi_0 (input): InChi after generating coordinates with CORINA\n",
+ " - inchi_1 (input): InChi after relaxation\n",
+ " - xyz (input): InChi after relaxation\n",
+ " - atomic_charges (input): Atomic charges on each atom, as predicted from B3LYP\n",
+ " - A (input): Rotational constant, A\n",
+ " - B (input): Rotational constant, B\n",
+ " - C (input): Rotational constant, C\n",
+ " - inchi_1 (input): InChi after relaxation\n",
+ " - n_electrons (input): Number of electrons\n",
+ " - n_heavy_atoms (input): Number of non-hydrogen atoms\n",
+ " - n_atom (input): Number of atoms in molecule\n",
+ " - mu (input): Dipole moment\n",
+ " - alpha (input): Isotropic polarizability\n",
+ " - R2 (input): Electronic spatial extant\n",
+ " - cv (input): Heat capacity at 298.15K\n",
+ " - g4mp2_hf298 (target): G4MP2 Standard Enthalpy of Formation, 298K\n",
+ " - bandgap (input): B3LYP Band gap energy\n",
+ " - homo (input): B3LYP Energy of HOMO\n",
+ " - lumo (input): B3LYP Energy of LUMO\n",
+ " - zpe (input): B3LYP Zero point vibrational energy\n",
+ " - u0 (input): B3LYP Internal energy at 0K\n",
+ " - u (input): B3LYP Internal energy at 298.15K\n",
+ " - h (input): B3LYP Enthalpy at 298.15K\n",
+ " - u0_atom (input): B3LYP atomization energy at 0K\n",
+ " - g (input): B3LYP Free energy at 298.15K\n",
+ " - g4mp2_0k (target): G4MP2 Internal energy at 0K\n",
+ " - g4mp2_energy (target): G4MP2 Internal energy at 298.15K\n",
+ " - g4mp2_enthalpy (target): G4MP2 Enthalpy at 298.15K\n",
+ " - g4mp2_free (target): G4MP2 Free eergy at 0K\n",
+ " - g4mp2_atom (target): G4MP2 atomization energy at 0K\n",
"author = {Ward, Logan and Dandu, Naveen and Blaiszik, Ben and Narayanan, Badri and Assary, Rajeev S. and Redfern, Paul C. and Foster, Ian and Curtiss, Larry A.}\n",
+ "title = {DFT Estimates of Solvation Energy in Multiple Solvents}\n",