Conversation
Include schema.rules.files.deriv in the regex chain so derivative filenames (with entities like space, desc, res, den) are recognized as valid BIDS. Closes #62
|
One question to discuss: Should derivative filenames only be accepted under bids/derivatives/? This is currently not the case under this PR. |
These tested general BIDS validation (wrong extensions, missing entities) rather than anything derivative-specific.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #70 +/- ##
==========================================
+ Coverage 90.89% 90.98% +0.08%
==========================================
Files 13 14 +1
Lines 846 854 +8
Branches 124 124
==========================================
+ Hits 769 777 +8
Misses 47 47
Partials 30 30 🚀 New features to boost your workflow:
|
|
I just want to check what your goal here is. Is this intended to flesh out the If it's the former, then you're right that we should distinguish between raw and derivative datasets, instead of declaring derivative files always valid. I might suggest giving If it's the latter, then we need to do some design. Here's the current validation loop: python-validator/src/bids_validator/__main__.py Lines 55 to 71 in 878d57d
python-validator/src/bids_validator/context.py Lines 378 to 482 in 878d57d We could rename it Broadly speaking we need something like: for rule in file_rules:
if full_match(context, rule):
break
if partial_match(context, rule):
partials.append(rule)
else: # Break is never called
suggestion = generate_suggestion(partials) # May be empty
error("UNKNOWN_FILE", suggestion)We don't currently have a rule data structure, but we can create dataclasses to match the types of rules: @attrs.define
class PathRule:
selectors: list[str]
level: Literal["optional", "required"]
path: str
@attrs.define
class StemRule:
selectors: list[str]
level: Literal["optional", "required"]
stem: str
datatypes: list[str]
extensions: list[str]
@attrs.define
class SuffixRule:
selectors: list[str]
level: Literal["optional", "required"]
suffixes: list[str]
entities: dict[str, Literal["optional", "required"]]
datatypes: list[str]
extensions: list[str]Possibly our def full_match(context: Context, rule: PathRule | StemRule | SuffixRule) -> bool:
match rule:
case PathRule(path=path):
return context.path == path
case StemRule(stem=stem, datatypes=dtypes, extensionss=exts):
return fnmatch(context.file.name, stem) and ...
case SuffixRule(suffixes=suffixes, ...):
return context.suffix in suffixes Anyway, that got a bit long. What's the target? |
I believe it was this, that is to help validate generated (or existing) filenames from PyBIDS and other tools. |
Summary
schema.rules.files.derivto the regex chain inBIDSValidator._init_regexes()so derivative filenames are recognized as valid BIDSspace,desc,res,den,label,atlas,seg,hemi,scale) are now parsed correctly byparse()and accepted byis_bids()Before
After
Closes #62
Test plan
tests/test_derivatives.pywith 12 parametrized tests (valid filenames, entity parsing, invalid filenames)