feat: Add functions for getting information from Structure#187
feat: Add functions for getting information from Structure#187cadenmyers13 wants to merge 21 commits intodiffpy:mainfrom
Structure#187Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #187 +/- ##
==========================================
+ Coverage 99.20% 99.21% +0.01%
==========================================
Files 15 15
Lines 2507 2554 +47
==========================================
+ Hits 2487 2534 +47
Misses 20 20
🚀 New features to boost your workflow:
|
|
@sbillinge ready for review. ase to diffpy.structure adapter. |
|
Let's chat about this. It seems that adding ase as a dependency is a big step and we need to understand better the use-cases. Let's start with use-cases, then we can work on design and tests. Then we can code. Especially when we have AI to write the code, this process is the most important because the code-writing is the lowest cost now and the biggest risk is to introduce bloat and entropy because the AI doesn't care too much about that it seems like. |
|
@sbillinge Here's the UCs i thought of and is what guided the code on this PR. Only option 2 is included here. Option 1 would be on a different PR. Both 1a and 1b would be things that would need to be implemented in
Any other use cases you could think of? Also, we might be able to get away with this without adding it as a dependency. The only use for it is an |
|
The main question is serialization. I think that if it is possible to serialize ASE objects we can read them without an ASE dependency. We can also have code to write them. Your UCs are good for some kind of high throughout pipeline where ASE is generating large numbers of objects that are being passed to Diffpy and serialization would be a serious performance bottleneck. However I do see the use and convenience. It occured to me that we could make an ASE pack so we don't pick up a new dependency in the core. |
|
I think that thinking of actual UCs could be good. The one highest on my list would be to take an MD simulation and Compute the PDF. Another would be to support something like Soham's cluster mining. We should maybe check how he did it as he may already have some code. I would guess that the code is on gitlab |
|
For these UCs do you know if it is Atoms objects or Some other kind of ASE objects that is passed? |
ASE.Atom objects to diffpy.structure.Structure objects and additional useful methodsStructure
|
@sbillinge I moved the discussion from here to the associated issue. Ive changed this PR to only include the function for getting structural information |
sbillinge
left a comment
There was a problem hiding this comment.
this looks good. Please see comments.
| # End of class TestStructure | ||
|
|
||
|
|
||
| def test_get_chemical_symbols(datafile): |
There was a problem hiding this comment.
do we need the case where the symbols are in there as ions? In that case I guess the desired behavior is that this returns the list without cleaning the ionic charge, right?
Then there may be a different function that cleans them.
In the future, we may want to handle ions differently than non ions, so (and i am not sure what the actual behavior is because I don't see a test for it) it would be better if this getter returned them uncleaned and the cleaning step was separate for when we want to clean (i.e., atm). Does it make sense?
There was a problem hiding this comment.
@sbillinge Yes makes sense. all atoms get passed through the atom_bare_symbol function which strips all the symbols and only returns species. The challenge with supporting ions is sometimes the atom passed through is a bare symbol, ion, or something else. For example, the 12-C input for this test,
@pytest.mark.parametrize(
"symbol, expected",
[
("Cl-", "Cl"),
("Ca2+", "Ca"),
("12-C", "C"),
],
)
def test_atom_bare_symbol(symbol, expected):
actual = atom_bare_symbol(symbol)
assert actual == expected
I think its better to put this on a separate PR/issue. I'll make that issue
tests/test_structure.py
Outdated
|
|
||
| @pytest.mark.parametrize( | ||
| "return_array", | ||
| [ # case: user wants ADPs as an array |
tests/test_structure.py
Outdated
| else: | ||
| expected_displacement = { | ||
| # Iodine | ||
| "I_8_11": np.float64(0.025), |
There was a problem hiding this comment.
Your test seems to be requiring np.float64 types. Is this behavior you want? It seems oddly specific. Remember, tests are supposed to test desired ``behavior". Is it desired behavior that we want it to return float64?
There was a problem hiding this comment.
Yeah probably not desired to have np.float64. I just incorporated this because this was in the output. Ive removed it.
tests/test_structure.py
Outdated
| "return_array", | ||
| [ # case: user wants isotropic displacement parameters as an array | ||
| # expected: a 1D array of shape (num_atoms,) with the Uiso values | ||
| True, |
There was a problem hiding this comment.
this is a slightly odd and non standard use of parametrize. We normally give ([inputs], [outputs]) which is more extensible and more readable by a human, and then the test itself is very short and has little or no logic in it.
There was a problem hiding this comment.
Okay I updated it. I still included the logic because the assert statements vary slightly as == isnt supported for numpy arrays
tests/test_structure.py
Outdated
| assert np.allclose(actual_isotropic_displacement, expected_isotropic_displacement) | ||
| else: | ||
| expected_isotropic_displacement = { | ||
| "Pb_1_Uiso": np.float64(0.0225566), |
|
@sbillinge ready for review. responded to comments. |
See news file for all the methods that are added.