Conversation
5a053c7 to
a65e1e8
Compare
|
@joehart2001 @ElliottKasoar do you think this belongs in a "defects" section rather than bulk crystals? |
Yes i totally agree. We have some other PRs that could maybe fit in a defects category, like #337 creates an interstitial category, so we could merge you two. but its something we can move around easily anyway |
|
The RMSD metric is still subject to change, since it is sensitive to the accuracy of the bulk structure as well as the defect structure (which is what we are interested in). |
|
This looks very nice @ThomasWarford! |
9c5df6e to
919f477
Compare
afb4d1c to
ad52779
Compare
|
I had a look through this code @ThomasWarford. It's very nice! One comment:
I would suggest changing "across different supercell sizes" to "across different crystal structures". 'Different supercell sizes' sounds like this number will differ for a defect in e.g. a 2x2x2 MgO supercell and a 4x4x4 supercell, which is not the case. Also just to check, is it not (V/N)^{1/3} ? i.e. average free length per atom (https://github.com/materialsproject/pymatgen/blob/v2025.6.14/src/pymatgen/analysis/structure_matcher.py#L534) Fyi, if the StructureMatcher scans are very slow to run, the |
|
@kavanase thanks for taking a look, I've made the changes you suggested to the docs. I'll profile this later and see how long the structure matching is taking (I suspect it's quite a small amount of time compared to the relaxations, but maybe not on GPU!). @joehart2001 @ElliottKasoar how would you suggest adding StructureMatcher_scan_stol from doped, and what's your attitude to dependencies? It looks like doped.utils.efficiency doesn't depend much on the rest of doped so the code could be copied with proper attribution (MIT license) |
|
Ok cool! The structure matching stuff was just if it's annoyingly slow |
|
It seems about ~1/4 of the time is spent structure matching, when running on GPU |
In general, the fewer dependencies the better, but if this is more than a small helper and the rest of doped is likely to be useful elsewhere, adding it could be justified. @ElliottKasoar thoughts? |
|
@joehart2001 perhaps it's not worth it for a 25% speedup. If a future PR uses |
|
Looks very nice! One thing that was initially surprising to me looking at that table (before remembering the origins) -- the errors are larger for PBE than PBEsol, despite the models mostly all being PBE models (outsite the R2SCAN one). But this is more due to their compositions (PBE -> nitrides, PBEsol -> oxides here) right? I see this is shown in the lower violin plot ("Oxides, PBEsol"), but I think it would be best if that could be included in the table headers, just to minimise confusion about this. I know this is more text to add so could make it messy, but if needed I think using "Oxides/Nitrides" in the headers would be better than "PBEsol/PBE" as it's the former that is the bigger difference here, whereas the slightly different functional choices are not so big factors? |
Generally I prefer adding a dependency over copying code, otherwise we miss out on any bug fixes, performance improvements, etc., but if it were say a single function we needed and licensing was fine, copying things over could be considered (I'd still reference it, even if it were not required). Adding dependencies of course comes with potential conflicts etc. to worry about, but we can also make things optional and declare conflicting dependencies if we need to as well, so I also wouldn't overthink it. (More generally, we need to formalise and potentially revisit some of our decisions with respect to test-specific dependencies, but that's not a job for this!) |
|
@kavanase thanks for pointing this out, I changed it in the latest commit. Materials project is pretty oxide-heavy, which probably explains it. |



Pre-review checklist for PR author
PR author must check the checkboxes below when creating the PR.
Summary
Based on this paper: Identifying split vacancy defects with machine-learned foundation models and electrostatics
The metrics:
Linked issue
Closes #335
Progress
Testing
New decorators/callbacks