snakemake rule calls a shell script but exits after first command
I have a shell script that works well when I run it from the command line, but it fails when I call it from a rule within snakemake.
The script runs a for loop over a file of identifiers, uses each identifier to grep the matching sequences from a fastq file, runs a multiple sequence alignment on them, and makes a consensus.
Here is the script. I placed some echo statements in it, and for some reason it doesn't call the commands: it stops at the grep statement.
I have tried adding set +o pipefail; in the rule, but that doesn't work either.
#!/bin/bash
function Usage()--umi-list -f
# Check argument count
[[ "$#" -lt 2 ]] && Usage
# parse arguments
while [[ "$#" -gt 1 ]];do
    case "$1" in
        -r|--read2)
            READ2="$2"
            shift
            ;;
        -l|--umi-list)
            UMI="$2"
            shift
            ;;
        -f|--outfile)
            OUTFILE="$2"
            shift
            ;;
        *)
            Usage
            ;;
    esac
    shift
done
# Set defaults
# Check arguments
[[ -f "$READ2" ]] || (echo "Cannot find input file $READ2, exiting..." >&2; exit 1)
[[ -f "$UMI" ]] || (echo "Cannot find input file $UMI, exiting..." >&2; exit 1)
#Create output directory
OUTDIR=$(dirname "$OUTFILE")
[[ -d "$OUTDIR" ]] || (set -x; mkdir -p "$OUTDIR")
# Make temporary directories
TEMP_DIR="$OUTDIR/temp"
[[ -d "$TEMP_DIR" ]] || (set -x; mkdir -p "$TEMP_DIR")
#RUN consensus script
for f in $( more "$UMI" | cut -f1);do
    NAME=$(echo $f)
    grep "$NAME" "$READ2" | cut -f1 -d ' ' | sed 's/@M/M/' > "$TEMP_DIR/$NAME.name"
    echo subsetting reads
    seqtk subseq "$READ2" "$TEMP_DIR/$NAME.name" | seqtk seq -A > "$TEMP_DIR/$NAME.fasta"
    ~/software/muscle3.8.31_i86linux64 -msf -in "$TEMP_DIR/$NAME.fasta" -out "$TEMP_DIR/$NAME.muscle.fasta"
    echo make consensus
    ~/software/EMBOSS-6.6.0/emboss/cons -sequence "$TEMP_DIR/$NAME.muscle.fasta" -outseq "$TEMP_DIR/$NAME.cons.fasta"
    sed -i 's/n//g' "$TEMP_DIR/$NAME.cons.fasta"
    sed -i "s/EMBOSS_001/$NAME.cons/" "$TEMP_DIR/$NAME.cons.fasta"
done
cat "$TEMP_DIR/*.cons.fasta" > "$OUTFILE"
Snakemake rule:
rule make_consensus:
    input:
        r2=get_extracted,
        lst="prefix/sample/reads/cell_barcode_umi.count"
    output:
        fasta="prefix/sample/reads/fasta/sample.R2.consensus.fa"
    shell:
        "sh ./scripts/make_consensus.sh -r {input.r2} -l {input.lst} -f {output.fasta}"
Edit: Snakemake error messages (I changed some of the paths to a neutral file path):
RuleException:
CalledProcessError in line 29 of ~/user/scripts/consensus.smk:
Command ' set -euo pipefail; sh ./scripts/make_consensus.sh -r ~/user/file.extracted.fastq -l ~/user/cell_barcode_umi.count -f ~/user/file.consensus.fa ' returned non-zero exit status 1.
File "~/user/scripts/consensus.smk", line 29, in __rule_make_consensus
File "~/user/miniconda3/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
If there are better ways to do this than using a shell for loop, please let me know!
Thanks!
Edit:
Script run standalone, first grep:
grep AGGCCGTTCT_TGTGGATG R_extracted/wgs_5_OL_debug.R2.extracted.fastq | cut -f1 -d ' ' | sed 's/@M/M/' > ./fasta/temp/AGGCCGTTCT_TGTGGATG.name
Script run through snakemake, first 2 grep statements:
grep :::::::::::::: R_extracted/wgs_5_OL_debug.R2.extracted.fastq | cut -f1 -d ' ' | sed 's/@M/M/' > ./fasta/temp/::::::::::::::.name
I'm now trying to figure out where those :::: in snakemake are coming from. All ideas welcome.
shell for-loop snakemake
What are the error messages?
– JeeYem
Nov 11 at 1:26
It might be useful to include an example snakemake invocation.
– merv
Nov 11 at 2:46
What is the error that you get when using set +o pipefail; as part of your rule's shell command?
– JeeYem
Nov 11 at 20:50
edited Nov 13 at 22:48
asked Nov 10 at 21:47
Mack123456
113
1 Answer
"It stops at the grep statement"
My guess is that the grep command in make_consensus.sh doesn't match anything. grep returns exit code 1 in such cases, and the non-zero exit status propagates to snakemake (see also "Handling SIGPIPE error in snakemake").
Loosely related: there is an inconsistency between the shebang of make_consensus.sh, which says the script should be executed with bash (#!/bin/bash), and the actual execution using sh (sh ./scripts/make_consensus.sh). In practice this makes no difference where sh is linked to bash, but on systems where sh is a different shell such as dash, bash-only syntax like [[ ... ]] will fail.
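The grep exit status mentioned above can be checked in isolation. The following is a minimal sketch, not the original script: the temporary file and its contents are made up for the demonstration, and the `|| true` guard is only one possible way to tolerate an empty match.

```shell
# Hypothetical fastq-like file used only for this demonstration.
fastq=$(mktemp)
printf '@M1_AGGCCGTTCT_TGTGGATG 2:N:0\nACGT\n+\nFFFF\n' > "$fastq"

# A pattern that is present: grep exits with status 0.
match_status=0
grep -q 'AGGCCGTTCT_TGTGGATG' "$fastq" || match_status=$?

# A pattern that matches nothing: grep exits with status 1.
nomatch_status=0
grep -q '::::::::::::::' "$fastq" || nomatch_status=$?

# Guarding grep with "|| true" keeps an empty match from turning
# into a failing pipeline; the output file is simply left empty.
guarded_status=0
{ grep '::::::::::::::' "$fastq" || true; } | cut -f1 -d ' ' > "$fastq.name" || guarded_status=$?

echo "match=$match_status nomatch=$nomatch_status guarded=$guarded_status"
rm -f "$fastq" "$fastq.name"
```

Whether an empty match should be tolerated or treated as a real error depends on the pipeline; the guard only makes that choice explicit.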
I'll look deeper into it. I'll add an edit with echo grep ... for snakemake and for when the script is run as a standalone script. They return different things for the first 2 elements in the for loop.
– Mack123456
Nov 13 at 22:45
answered Nov 12 at 9:07
dariober
8891121
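On the `::::::::::::::` identifiers reported in the question's edit: they look like the banner that `more` prints around a file name, which it may emit when it is not attached to a terminal. That is an assumption about this setup, not something confirmed in the thread. A sketch that sidesteps the question by reading the first column without a pager is shown below; the file names and contents are hypothetical stand-ins. It also keeps the glob outside the quotes when concatenating, since a quoted `*` is passed to cat literally.

```shell
# Hypothetical stand-ins for the UMI list and the temp directory.
work_dir=$(mktemp -d)
umi_file="$work_dir/umi.count"
printf 'AGGCCGTTCT_TGTGGATG\t12\nTTTTCCCCAA_GGGGAAAA\t7\n' > "$umi_file"

tab=$(printf '\t')
count=0
# Read the first tab-separated column directly; no pager is involved,
# so nothing but the file's own lines can reach the loop.
while IFS="$tab" read -r name _rest; do
    [ -n "$name" ] || continue
    # Placeholder for the per-identifier work (grep, seqtk, muscle, cons).
    printf '>%s.cons\nACGT\n' "$name" > "$work_dir/$name.cons.fasta"
    count=$((count + 1))
done < "$umi_file"

# The glob sits outside the quotes so the shell expands it.
cat "$work_dir"/*.cons.fasta > "$work_dir/all.fa"
headers=$(grep -c '^>' "$work_dir/all.fa")

echo "identifiers=$count consensus_records=$headers"
rm -rf "$work_dir"
```

Reading with `while read` also avoids word splitting on the command substitution used by the original for loop.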