GNU parallel: deleting line from joblog breaks parallel updating it










If you run GNU parallel with --joblog path/to/logfile and then delete a line from said logfile while parallel is running, GNU parallel is no longer able to append future completed jobs to it.



Execute this MWE:



#!/usr/bin/bash

parallel -j1 -n0 --joblog log sleep 1 ::: $(seq 10) &

sleep 5 && sed -i '$ d' log


If you run tail -f log before executing this, you can see that parallel keeps writing to the file. However, if you cat log after 10 seconds, you will see that the file now on disk contains nothing past the third entry or so.



What's the reason behind this? Is there a way to delete something from the file and have GNU parallel be able to still write to it?



Some background as to why this happened:



Using GNU parallel, I started a few jobs on remote machines with --sshloginfile. I then needed to pkill a few jobs on one of the machines because a colleague needed to use it (and I subsequently removed the machine from the sshloginfile so that parallel wouldn't reuse it for new runs). If you pkill those processes started on the remote machine, they get an Exitval of 0 (it looks like they finished without issues; you can't tell that they were killed). I wanted to remove them immediately from the joblog so that when I restart parallel --resume later, parallel can have a look at the joblog and determine what's missing.
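
Roughly, the workflow I had in mind looks like this (host file, joblog name, and command are just placeholders):

# first run; every finished job gets a line appended to runs.log
parallel --sshloginfile machines.txt --joblog runs.log mycmd {} ::: inputs/*

# later, after editing machines.txt: --resume skips every job that is already
# recorded in runs.log, regardless of its Exitval
parallel --resume --sshloginfile machines.txt --joblog runs.log mycmd {} ::: inputs/*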



Turns out, this was a bad idea, as now my joblog is useless.










gnu-parallel






asked Nov 15 '18 at 19:25









mSSM







  • AFAIK, it's always a "bad idea"™ to allow multiple programs to write to the same file!

    – Mark Setchell
    Nov 15 '18 at 19:29












  • Yeah, you are of course right. I just expected parallel not to hold on to the log file, but to simply append once a new job finished.

    – mSSM
    Nov 15 '18 at 22:20












1 Answer

While @MarkSetchell is absolutely right in his comment, the root problem here is that man sed is lying:



-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)


sed -i does not edit files in place.



What it actually does is create a temporary file in the same directory, copy the input file into it while doing the editing, and finally rename the temporary file to the input file's name. Roughly like this:



sed '$ d' log > sedXxO11P
mv sedXxO11P log


It is clear that the original log and sedXxO11P have different inodes - let us call them ino1 and ino2. GNU Parallel has ino1 open and really does not know about the existence of ino2. GNU Parallel will happily append to ino1 completely unaware that when it closes the file, the file will vanish because it has already been unlinked.
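
You can watch the inode change yourself; here is a minimal check (GNU ls -i prints the inode number):

ls -i log          # note the inode number of log
sed -i '$ d' log
ls -i log          # a different number: same name, but a new file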



So you need to change the content of the file without changing the inode:



#!/usr/bin/bash 

seq 10 | parallel -j1 -n0 --joblog log sleep 1 &

sleep 5

# Obvious race condition here:
# Anything appended to log before sed is done is lost.
# This can be avoided by suspending parallel while running this
tmp=$RANDOM$$
cp log $tmp
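# How the next line works: stdin keeps $tmp readable even after rm unlinks it,
# and '>log' truncates the original inode in place, so sed rewrites the very
# file that parallel still has open.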
(rm $tmp; sed '$ d' >log) < $tmp

wait
cat log


This works right now. But do not expect this to be a supported feature - ever.
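
If you prefer, here is a shorter sketch of the same keep-the-inode idea (the same race caveat applies; the name of the copy is arbitrary):

cp log log.copy
sed '$ d' log.copy > log    # '>' truncates the existing inode instead of replacing it
rm log.copy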






answered Nov 21 '18 at 19:47









Ole Tange












  • Hi, thanks for providing this “fix”. You are probably right that this is way too insecure to be used. Maybe as an alternative question: if I kill a job on a remote host, how do I go about resuming it? If I understand the --resume and --resume-failed flags correctly, there is no way parallel would know to resume my jobs. If I recall correctly, Exitval was reported as 0 (= success). Together with the folder pointed to by --results being populated, it looks to parallel as if all went well.

    – mSSM
    Nov 21 '18 at 20:14











  • It looks like when I pkill a process started by parallel, it should report an exitval of 143, but it doesn't. Any idea why?

    – mSSM
    Nov 21 '18 at 20:50











  • @mSSM I cannot reproduce that. Are you using current release or an ancient, buggy version?

    – Ole Tange
    Nov 21 '18 at 21:12











  • Ask a new question with an MCVE.

    – Ole Tange
    Nov 21 '18 at 21:17











  • I will crosscheck and submit. Thank you.

    – mSSM
    Nov 21 '18 at 22:05
















