Fix CWE related weakness parsing and add regression tests#756
Fix CWE related weakness parsing and add regression tests#756akshat4703 wants to merge 2 commits intoOWASP:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes CWE importer robustness around parsing Related_Weaknesses so multiple related CWEs are handled across different XML shapes, and adds regression tests to prevent None propagation during import.
Changes:
- Normalize
Related_Weaknesspayloads into a list and iterate safely over all entries. - Ensure
parse_related_weakness()always returns aStandard(the updated/original CWE node). - Add regression tests covering list-shaped related weaknesses and empty input behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| application/utils/external_project_parsers/parsers/cwe.py | Updates related-weakness parsing to handle list/dict shapes and always return the CWE node. |
| application/tests/cwe_parser_test.py | Adds regression coverage for list-shaped related weakness parsing and empty input behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if isinstance(cwe_entries, Dict): | ||
| cwe_entries = [cwe_entries] |
There was a problem hiding this comment.
typing.Dict should not be used with isinstance() (it raises TypeError on Python 3.11/3.12). Use the runtime type (dict) or collections.abc.Mapping for this check, otherwise this code path can crash during parsing (and the new tests will hit it).
| if isinstance(cwe_entries, Dict): | ||
| cwe_entries = [cwe_entries] | ||
|
|
||
| if isinstance(cwe_entries, list): | ||
| for cwe_entry in cwe_entries: | ||
| if isinstance(cwe_entry, Dict): |
There was a problem hiding this comment.
Same issue here: isinstance(cwe_entry, Dict) uses typing.Dict and can raise TypeError at runtime. Switch to dict (or Mapping) for the instance check so related weaknesses parsing doesn’t crash.
| if isinstance(cwe_entries, Dict): | |
| cwe_entries = [cwe_entries] | |
| if isinstance(cwe_entries, list): | |
| for cwe_entry in cwe_entries: | |
| if isinstance(cwe_entry, Dict): | |
| if isinstance(cwe_entries, dict): | |
| cwe_entries = [cwe_entries] | |
| if isinstance(cwe_entries, list): | |
| for cwe_entry in cwe_entries: | |
| if isinstance(cwe_entry, dict): |
There was a problem hiding this comment.
Implemented and verified.
This PR specifically fixes parse_related_weakness() in application/utils/external_project_parsers/parsers/cwe.py to:
- normalize
Related_Weaknessdict/list shapes - iterate all related CWE entries safely
- always return a
Standardnode (noNonepropagation)
Added regression tests in application/tests/cwe_parser_test.py:
test_parse_related_weakness_handles_listtest_parse_related_weakness_returns_original_on_empty_input
Validation run:
python -m unittest application.tests.cwe_parser_test -v
Result: Ran 3 tests ... OK
Summary
Fixes a CWE parser bug where
parse_related_weakness()could returnNoneand fail to process multiple related weaknesses depending on XML shape.Problem
parse_related_weakness()only handled one specific structure (dict) and did not always return aStandardobject.This could lead to:
Nonepropagating in import flowChanges
parse_related_weakness()to normalizeRelated_Weaknessinto a list when needed.Tests
Added regression tests in
cwe_parser_test.py:Related_WeaknessinputImpact
Improves importer correctness and prevents data integrity issues in CWE-to-CRE relationship building.