Manuscript update#18
Merged
Merged
Conversation
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
There was a problem hiding this comment.
Pull request overview
This PR transitions ContextSV toward a standalone C++ CLI–first distribution (with Conda/Docker packaging) and updates tests/docs accordingly, removing the prior SWIG/Python module and several legacy Python utilities.
Changes:
- Replace the SWIG/Python-wrapped interface with a direct
ContextSVCLI entrypoint and updated versioning. - Add/refresh packaging and distribution assets (Conda recipe/build script, Dockerfile, Makefile
install), plus CI workflow updates. - Update tests to execute the built binary via subprocess and modernize the CNV plotting script.
Reviewed changes
Copilot reviewed 33 out of 34 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_general.py | Reworks tests to run ./build/contextsv and validate basic CLI behavior/output artifacts. |
| tests/data/ref.fa | Test data asset referenced for local/CI runs. |
| src/utils.cpp | Removes unused utility functions and related includes. |
| src/swig_wrapper.i | Removes SWIG wrapper definition. |
| src/swig_interface.cpp | Removes SWIG interface implementation. |
| src/sv_object.cpp | Adjusts SV collection/merge behavior and reduces debug/log noise. |
| src/sv_caller.cpp | Substantial SV calling/merging refinements, threading behavior tweaks, VCF output filtering/formatting updates. |
| src/main.cpp | Switches to direct ContextSV invocation, updates banner/version/help output, and adds success message. |
| src/input_data.cpp | Simplifies inputs (removes region/sample-size/min-CNV APIs), adds BAM/BAM-index warnings. |
| src/fasta_query.cpp | Renames reference loading method (setFilepath → read) and adjusts messages. |
| src/cnv_caller.cpp | Performance and behavior changes in coverage computation, SNP querying, reader caching, and CNV decision thresholds. |
| setup.py | Removes Python packaging entrypoint (SWIG-based). |
| README.md | Updates installation and usage docs for Conda/Docker and the CLI workflow. |
| python/utils.py | Removes legacy Python utility helpers. |
| python/train_model.py | Removes legacy model training script. |
| python/sv_merger.py | Removes legacy Python SV merging tool. |
| python/score_vcf.py | Removes legacy VCF scoring script. |
| python/predict.py | Removes legacy prediction script. |
| python/plot_venn.py | Removes legacy plotting helper. |
| python/plot_distributions.py | Removes legacy distribution plotting script. |
| python/mendelian_inheritance.py | Removes legacy Mendelian inheritance script. |
| python/mendelian_error.py | Removes legacy Mendelian error script. |
| python/extract_features.py | Removes legacy feature extraction script. |
| python/environment_merge.yml | Removes legacy conda env for merge tooling. |
| python/cnv_plots.py | Removes legacy TSV-based CNV plotting script. |
| python/cnv_plots_json.py | Rewrites JSON-based CNV plotting CLI with multi-format output options and validation. |
| python/cluster_params.py | Removes legacy clustering parameter exploration script. |
| Makefile | Updates build flags, adds install target for contextsv + contextsv-cnv-plot. |
| include/version.h | Replaces git-describe string version with semantic version macros. |
| include/utils.h | Removes declarations for deleted utility helpers. |
| include/swig_interface.h | Removes SWIG interface header. |
| include/sv_caller.h | Minor cleanup of commented-out overload. |
| include/input_data.h | Removes sample/min-CNV/region APIs; retains chromosome setter/getter. |
| include/fasta_query.h | Renames reference loading API to read(). |
| include/cnv_caller.h | Removes TSV save API declaration. |
| Doxyfile | Updates exclusion patterns now that SWIG components are removed. |
| Dockerfile | Adds a Conda-based Docker build and smoke tests for CLI + plotting command. |
| conda/meta.yaml | Updates recipe to package contextsv CLI + plotting dependency set. |
| conda/build.sh | Adds conda build script that compiles and installs CLI + plotting entrypoint. |
| .gitignore | Updates ignored artifacts consistent with removing SWIG/Python build outputs. |
| .github/workflows/build-tests.yml | Updates CI to fetch sample data, build, smoke-test, and run pytest. |
| main.py | Removes legacy Python entrypoint that invoked the SWIG-wrapped module. |
Comments suppressed due to low confidence (4)
tests/test_general.py:22
TEST_DATA_DIRis set based onos.getcwd()(lines 14–20) but then immediately overwritten (line 21), making the GitHub Actions/local path selection dead code. This can also mask path bugs because onlytests/datanext to this file is ever used. Consider removing the conditional block or only settingTEST_DATA_DIRonce based on a single, deterministic rule.
tests/test_general.py:36- This test module creates/removes
chr3_pfb.txtat import time. That introduces global side effects (and will fail early iftests/dataisn’t present) even fortest_run_help/test_run_version. Move this setup into a fixture or intotest_run_basic()and ensure the directory exists before writing.
src/main.cpp:160 printUsage()advertises-hfor both--hmmand--help, which is ambiguous, andparseArguments()also checks-hfor both meanings. This makes-hbehavior depend on whether another argument follows, and can prevent-hfrom showing help. Use distinct short flags (e.g., keep-hfor help and use-Hfor HMM), and update bothprintUsage()andparseArguments()consistently.
<< " -t, --threads <thread_count> Number of threads\n"
<< " -h, --hmm <hmm_file> HMM file\n"
<< " -e, --eth <eth_file> ETH file\n"
<< " -p, --pfb <pfb_file> PFB file\n"
<< " --assembly-gaps <gaps_file> Assembly gaps file\n"
<< " --save-cnv Save CNV data\n"
<< " --debug Debug mode with verbose logging\n"
<< " --version Print version and exit\n"
<< " -h, --help Print usage and exit\n";
src/sv_object.cpp:53
initial_sizeis assigned but never used inmergeSVs(). With-Wall -Wextrathis will produce an-Wunused-variablewarning during compilation. Either remove it or reintroduce the logging/metrics that consumes it.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -65,21 +65,12 @@ void runContextSV(const std::unordered_map<std::string, std::string>& args) | |||
| input_data.setRefGenome(args.at("ref-file")); | |||
| input_data.setSNPFilepath(args.at("snps-file")); | |||
| input_data.setOutputDir(args.at("output-dir")); | |||
| --snp snps.vcf \ | ||
| --eth nfe \ | ||
| --pfb gnomadv4_filepaths.txt \ | ||
| --assembly-gaps hg38-gaps.bed \ # optional: assembly gaps file |
Comment on lines
1
to
16
| {% set version = environ.get('GIT_DESCRIBE_TAG').lstrip('v') %} | ||
| {% set revision = environ.get('GIT_FULL_HASH') %} | ||
|
|
||
| package: | ||
| name: contextsv | ||
| version: {{ version }} | ||
|
|
||
| source: | ||
| path: ../ | ||
| git_lfs: false | ||
|
|
||
| channels: | ||
| - wglab | ||
| - conda-forge | ||
| - bioconda | ||
| - defaults |
Comment on lines
84
to
96
| result = subprocess.run( | ||
| ["./build/contextsv", | ||
| "--bam", BAM_FILE, | ||
| "--ref", REF_FILE, | ||
| "--snp", SNPS_FILE, | ||
| "--outdir", out_dir, | ||
| "--hmm", HMM_FILE, | ||
| "--eth", "nfe", | ||
| "--pfb", PFB_FILE, | ||
| "--sample-size", "20", | ||
| "--min-cnv", "2000", | ||
| "--eps", "0.1", | ||
| "--min-pts-pct", "0.1", | ||
| "--assembly-gaps", GAP_FILE, | ||
| "--chr", "chr3", | ||
| "--save-cnv", |
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Manuscript revisions