Command Line Interface

ChloroScan uses Snk to create a command-line utility. Here are the available commands:

  • run - Run the ChloroScan pipeline.

  • config - Show the workflow configuration.

  • env - Access the workflow conda environments.

  • script - Access the workflow scripts.

  • info - Show information about the workflow.

  • profile - Access the workflow profiles.

  • github - Launch the ChloroScan GitHub page.

  • docs - Launch the ChloroScan documentation.

To see the top-level help, run:

chloroscan --help

To find out more about a specific command, add --help after the command name, e.g. chloroscan run --help.

run

The main command to run the ChloroScan pipeline is:

chloroscan run

The run command passes all unrecognized arguments to Snakemake, so any Snakemake option can be given directly to run, e.g. chloroscan run --use-singularity. To see all the options available in Snakemake, use the --help-snakemake (-hs) flag.

You can use the --config flag to specify a configuration file to override the existing workflow configuration. This is the same as using the --configfile flag in Snakemake.

chloroscan run --config config.yaml

You can use the --profile flag to specify a profile to use for configuring Snakemake. You can specify the profile by name. The profile must be located in the profiles directory of the workflow.

chloroscan run --profile slurm

You can use the --dag flag to save the directed acyclic graph to a file. The output file must end in .pdf, .png, or .svg.

chloroscan run --dag workflow.pdf

You can use the --dry flag to display what would be done without executing anything (this is the same as using the --dry-run flag in Snakemake).

The --lock flag locks the working directory. By default the working directory is not locked, i.e. --nolock is passed to Snakemake.

The --cores flag sets the number of cores to use. If it is not set (the default is None), all cores are used.

The --no-conda flag is used to disable the use of conda environments.

The --keep-snakemake flag is used to keep the .snakemake folder after the pipeline completes. This is useful for debugging purposes. By default, the .snakemake folder is removed after the pipeline completes.
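The flags above can be combined. The following is a minimal sketch, not part of ChloroScan itself: the function name preview_then_run and the CHLOROSCAN variable are hypothetical helpers, and config.yaml is assumed to be a configuration file you have already prepared. It previews the run with --dry and only starts the real run if the preview succeeds.

```shell
#!/bin/sh
# CHLOROSCAN is a hypothetical override hook; it defaults to the real CLI.
CHLOROSCAN="${CHLOROSCAN:-chloroscan}"
# Assumed name of an existing configuration file.
CONFIG="config.yaml"

preview_then_run() {
    # Step 1: dry run, display what would be done without executing anything.
    "$CHLOROSCAN" run --config "$CONFIG" --dry || return 1
    # Step 2: real run with the requested number of cores (4 if not given),
    # keeping the .snakemake folder afterwards for debugging.
    "$CHLOROSCAN" run --config "$CONFIG" --cores "${1:-4}" --keep-snakemake
}
```

Calling preview_then_run 8 would run the preview and then the workflow on 8 cores.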

The chloroscan run help message is split into two sections. The first lists the options of the run command itself. The second lists the workflow configuration options, which are generated from the Snakemake config file.

chloroscan run --help

That will output something like:

Usage: chloroscan run [OPTIONS]

Run the workflow.
All unrecognized arguments are passed onto Snakemake.

╭─ Options ────────────────────────────────────────────────────────────────────────────────╮
│ --config                   FILE     Path to snakemake config file. Overrides existing    │
│                                     workflow configuration.                              │
│                                     [default: None]                                      │
│ --resource        -r       PATH     Additional resources to copy from workflow directory │
│                                     at run time.                                         │
│ --profile         -p       TEXT     Name of profile to use for configuring Snakemake.    │
│                                     [default: None]                                      │
│ --dry             -n                Do not execute anything, and display what would be   │
│                                     done.                                                │
│ --lock            -l                Lock the working directory.                          │
│ --dag             -d       PATH     Save directed acyclic graph to file. Must end in     │
│                                     .pdf, .png or .svg                                   │
│                                     [default: None]                                      │
│ --cores           -c       INTEGER  Set the number of cores to use. If None will use all │
│                                     cores.                                               │
│                                     [default: None]                                      │
│ --no-conda                          Do not use conda environments.                       │
│ --keep-resources                    Keep resources after pipeline completes.             │
│ --keep-snakemake                    Keep .snakemake folder after pipeline completes.     │
│ --verbose         -v                Run workflow in verbose mode.                        │
│ --help-snakemake  -hs               Print the snakemake help and exit.                   │
│ --help            -h                Show this message and exit.                          │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Workflow Configuration ─────────────────────────────────────────────────────────────────╮
│ --Inputs-assembly        -a                            PATH     Path to fasta format     │
│                                                                 assembly of contigs from │
│                                                                 all sorts of organisms.  │
│                                                                 [default: None]          │
│ --Inputs-depth-txt       -d                            PATH     Path to a tab-separated  │
│                                                                 text storing abundance   │
│                                                                 of each contig in the    │
│                                                                 sample.                  │
│                                                                 [default: None]          │
│ --Inputs-alignment       -l                            PATH     Path to the folder       │
│                                                                 containing alignment     │
│                                                                 files of the contigs.    │
│                                                                 [default: None]          │
│ --Inputs-batch-name      -b                            TEXT     Name of the batch.       │
│                                                                 [default: None]          │
│ --outputdir              -o                            PATH     Path to the output       │
│                                                                 directory of the         │
│                                                                 workflow.                │
│                                                                 [default: None]          │
│ --tmpdir                 -t                            PATH     Path to the temporary    │
│                                                                 directory of the         │
│                                                                 workflow.                │
│                                                                 [default: tmp]           │
│ --binning-universal-le…                                INTEGER  Length cutoff for        │
│                                                                 universal binning.       │
│                                                                 [default: 1500]          │
│ --binning-snakemake-env                                TEXT     Customized snakemake     │
│                                                                 environment for binny to │
│                                                                 run.                     │
│                                                                 [default: None]          │
│ --binning-mantis-env                                   TEXT     Customized Mantis        │
│                                                                 virtual environment to   │
│                                                                 have mantis_pfa          │
│                                                                 installed, annotating    │
│                                                                 genes.                   │
│                                                                 [default: None]          │
│ --binning-outputdir      -o                            PATH     Path to the output       │
│                                                                 directory of the         │
│                                                                 binning.                 │
│                                                                 [default: binny_output]  │
│ --binning-clustering-e…                                TEXT     Range of epsilon values  │
│                                                                 for HDBSCAN clustering.  │
│                                                                 [default: 0.250,0.000]   │
│ --binning-clustering-h…                                TEXT     Range of min_samples     │
│                                                                 values for HDBSCAN       │
│                                                                 clustering, larger value │
│                                                                 means larger MAGs.       │
│                                                                 [default: 1,4,7,10]      │
│ --binning-bin-quality-…                                FLOAT    Starting completeness    │
│                                                                 for bin quality.         │
│                                                                 [default: 92.5]          │
│ --binning-bin-quality-…                                FLOAT    Minimum completeness for │
│                                                                 bin quality.             │
│                                                                 [default: 72.5]          │
│ --binning-bin-quality-…                                FLOAT    Purity for bin quality.  │
│                                                                 [default: 95]            │
│ --corgi-min-length                                     INTEGER  Minimum length of        │
│                                                                 contigs to be processed  │
│                                                                 by CORGI.                │
│                                                                 [default: 500]           │
│ --corgi-save-filter          --no-corgi-save-filter             Save the filtered        │
│                                                                 contigs by CORGI (Note:  │
│                                                                 may take long time).     │
│                                                                 [default:                │
│                                                                 no-corgi-save-filter]    │
│ --corgi-batch-size                                     INTEGER  Batch size for CORGI to  │
│                                                                 process contigs.         │
│                                                                 [default: 1]             │
│ --corgi-pthreshold                                     FLOAT    P-value threshold for    │
│                                                                 CORGI to determine if    │
│                                                                 the contigs category is  │
│                                                                 authentically plastidial │
│                                                                 or something else.       │
│                                                                 [default: 0.9]           │
│ --cat-database           -d                            PATH     Path to the database of  │
│                                                                 chloroplast genomes.     │
│                                                                 [default:                │
│                                                                 /home/yuhtong/scratch/a… │
│ --cat-taxonomy           -t                            PATH     Path to the taxonomy of  │
│                                                                 the database.            │
│                                                                 [default:                │
│                                                                 /home/yuhtong/scratch/a… │
│ --krona-env                                            TEXT     Path to the Krona        │
│                                                                 environment.             │
│                                                                 [default: kronatools]    │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
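The workflow configuration options in the second section can be given on the command line instead of a config file. The sketch below wraps a few of them in a hypothetical helper (run_batch and the CHLOROSCAN variable are illustrative names, not part of ChloroScan). Which inputs are required together is not stated in the help text, so the chosen set is an assumption.

```shell
#!/bin/sh
# CHLOROSCAN is a hypothetical override hook; it defaults to the real CLI.
CHLOROSCAN="${CHLOROSCAN:-chloroscan}"

run_batch() {
    if [ "$#" -lt 4 ]; then
        echo "usage: run_batch ASSEMBLY DEPTH_TXT BATCH_NAME OUTDIR [EXTRA...]" >&2
        return 2
    fi
    assembly="$1"; depth="$2"; batch="$3"; outdir="$4"
    shift 4
    # Remaining arguments (e.g. --dry or --cores 8) are appended unchanged.
    "$CHLOROSCAN" run \
        --Inputs-assembly "$assembly" \
        --Inputs-depth-txt "$depth" \
        --Inputs-batch-name "$batch" \
        --outputdir "$outdir" \
        "$@"
}
```

For example, run_batch contigs.fasta depth.txt batch1 results --dry would preview a run on the (hypothetical) files contigs.fasta and depth.txt.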

config

The config subcommand will display the workflow configuration file contents. You can use the --pretty (-p) flag to display the configuration in a more readable format.

chloroscan config

You can redirect the output to a file to save the configuration.

chloroscan config > config.yaml

You can then edit the configuration file and use it to run the workflow.

chloroscan run --config config.yaml
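Because the redirect overwrites the target file, re-running it after editing would discard your changes. A minimal sketch of a guard against that follows; save_config and the CHLOROSCAN variable are hypothetical helper names, not ChloroScan commands.

```shell
#!/bin/sh
# CHLOROSCAN is a hypothetical override hook; it defaults to the real CLI.
CHLOROSCAN="${CHLOROSCAN:-chloroscan}"

save_config() {
    target="${1:-config.yaml}"
    # Refuse to overwrite a file that may already contain manual edits.
    if [ -e "$target" ]; then
        echo "$target already exists; not overwriting" >&2
        return 1
    fi
    "$CHLOROSCAN" config > "$target"
}
```

After save_config my_config.yaml, edit the file and pass it to chloroscan run --config my_config.yaml.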

env

The env subcommand lets you access and manage the conda environments used by the workflow. The commands below cover listing, activating, creating, removing, and showing environments, as well as running commands inside them.

env list

List the environments in the workflow.

chloroscan env list [OPTIONS]

env activate

This command activates the specified conda environment within the workflow.

chloroscan env activate [OPTIONS] ENV_NAME

ENV_NAME: Name of the environment to activate.

env create

This command creates all the conda environments specified in the envs directory. Individual conda environments can be created with chloroscan env create ENV_NAME.

Snakemake workflows that use many conda environments can take a long time to install because each environment is created sequentially. Running chloroscan env create --workers NUMBER_OF_WORKERS creates the conda environments in parallel, up to the number of workers requested (defaults to 1).

chloroscan env create --workers 4  # create up to 4 conda envs at a time

Warning

Some conda envs may not support parallel creation. If you encounter an error, try reducing the number of workers.
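One way to act on this warning is to try parallel creation first and fall back to a single worker if it fails. This is only a sketch under that assumption; create_envs and the CHLOROSCAN variable are hypothetical helper names.

```shell
#!/bin/sh
# CHLOROSCAN is a hypothetical override hook; it defaults to the real CLI.
CHLOROSCAN="${CHLOROSCAN:-chloroscan}"

create_envs() {
    workers="${1:-4}"
    # First attempt: create environments in parallel.
    "$CHLOROSCAN" env create --workers "$workers" && return 0
    # Nothing left to reduce if we were already using a single worker.
    [ "$workers" -gt 1 ] || return 1
    echo "Creation with $workers workers failed; retrying with 1 worker" >&2
    "$CHLOROSCAN" env create --workers 1
}
```

Calling create_envs 4 attempts four workers and retries sequentially only if that attempt exits with an error.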

env remove

This command deletes all the conda environments in the workflow. You can also delete individual environments by specifying the environment name. Use the --force option to skip the confirmation prompt.

chloroscan env remove [OPTIONS] [ENV_NAME...]

ENV_NAME: Name of the environment to remove.

env run

The env run command runs a command within one of the workflow environments.

chloroscan env run --env ENV_NAME CMD...

CMD...: The command and its arguments to execute within the specified environment.

Make sure to replace ENV_NAME with the actual name of the desired environment, and CMD with the command you want to run.

For example:

chloroscan env run -e my_environment "python script.py --input input_file.txt --output output_file.txt"

This command runs the python script.py --input input_file.txt --output output_file.txt command within the my_environment environment in the workflow. Adjust the command and environment name according to your specific use case.

env show

This command displays the contents of the environments configuration file used in the workflow.

chloroscan env show [OPTIONS]

info

The info subcommand provides JSON formatted information about the workflow.

chloroscan info
{
    "name": "chloroscan",
    "version": "0.1.1",
    "snakefile": "~/chloroscan/workflow/Snakefile",
    "conda_prefix_dir": "~/chloroscan/.conda",
    "singularity_prefix_dir": "~/chloroscan/.singularity",
    "workflow_dir_path": "~/chloroscan"
}
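Since the output is JSON, individual fields can be picked out in scripts. The sketch below uses a hypothetical helper, info_field, built on sed. It assumes the one-key-per-line layout shown above and is not a general JSON parser; a dedicated JSON tool is the safer choice for anything beyond quick lookups.

```shell
#!/bin/sh
# Print the value of a top-level string field read from stdin.
# Relies on each "key": "value" pair sitting on its own line.
info_field() {
    sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\"\(.*\)\",\{0,1\}[[:space:]]*\$/\1/p"
}
```

For example, chloroscan info | info_field version prints only the version string.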

profile

The profile subcommand provides several commands to manage the workflow profiles. Profiles define different configurations for the workflow, e.g. how the workflow runs on an HPC system. You can read more about profiles in the Snakemake documentation.

profile list

List the profiles in the workflow.

chloroscan profile list

profile show

The show command will display the contents of a profile.

chloroscan profile show slurm

You can save a profile by redirecting the output to a file.

chloroscan profile show slurm > profile/config.yaml
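The redirect above fails if the profile directory does not exist yet. A minimal sketch that creates the directory first is shown here; export_profile and the CHLOROSCAN variable are hypothetical helper names, not ChloroScan commands.

```shell
#!/bin/sh
# CHLOROSCAN is a hypothetical override hook; it defaults to the real CLI.
CHLOROSCAN="${CHLOROSCAN:-chloroscan}"

export_profile() {
    name="$1"
    dest="${2:-profile}"
    # Make sure the destination directory exists before redirecting into it.
    mkdir -p "$dest" || return 1
    "$CHLOROSCAN" profile show "$name" > "$dest/config.yaml"
}
```

Calling export_profile slurm writes the slurm profile to profile/config.yaml, creating the profile directory if needed.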

Load a profile

You can load a profile by using the --profile option in the run command.

chloroscan run --profile profile/config.yaml

profile edit

The edit command will open the profile in the default editor. Changes saved will modify the default profile settings for the installation.

chloroscan profile edit slurm

script

The script commands allow you to interact with the scripts that come with ChloroScan.

script list

The list command will list all scripts in the workflow.

chloroscan script list

script show

The show command will display the contents of a script.

chloroscan script show hello.py

script run

The run command will run a script. Use the --env option to specify the environment to run the script in.

chloroscan script run --env summary merge_contig_depth.py --help

github

chloroscan github

Launches the ChloroScan GitHub page at https://github.com/andyargueasae/chloroscan

docs

chloroscan docs

Launches the ChloroScan documentation at https://andyargueasae.github.io/chloroscan/