bash: how to compute average of different columns?










I am writing a script for automatically computing average runtime.



First I run time ./foo.py 100 times and append the output to the file time.txt (this part works):

$ for i in `seq 100`; do { time ./foo.py; } 2>> time.txt; done


The output looks as follows:



time ./foo.py
real 0m0,030s
user 0m0,030s
sys 0m0,000s
[...]


Runtimes from different scripts are in the same file. Each entry starts with time ./foo.py, followed by 100 "triplets" of real, user and sys.



Now, if possible, I would love to have the script automatically compute the average runtime for each tested file by using all 100 "triplets", and neatly returning only one "mean triplet".



I have thought about using awk to calculate the mean, like this:

awk '{ total += $2 } END { print total/NR }' time.txt


But the command would need to be adapted to fit my needs: only the part after the , (e.g. ,030s) should be used for the computation, and the trailing s would also need to be dropped.



Since I do not know how to achieve this objective, I thought to ask the community.



Any help is greatly appreciated.










      bash awk time mean














      edited Nov 13 '18 at 3:46 by mjuarez

      asked Nov 13 '18 at 3:13 by OingoBoingo






















          1 Answer
          It's easier if you tell time to output the timing info in POSIX format (time -p):

          awk '/^real/ { totalReal += $2 } /^user/ { totalUser += $2 } /^sys/ { totalSys += $2 } END { print "realAvg " totalReal/(NR/4) "\n" "userAvg " totalUser/(NR/4) "\n" "sysAvg " totalSys/(NR/4) }' time.txt


          Prints output as follows:



          realAvg 12.62
          userAvg 27
          sysAvg 3.8


          Explanation:



          • awk goes through each line in the file; if the line starts with real, its second field is added to the totalReal variable, and likewise for user and sys. This keeps a running total for each of the three "types".

          • At the end, it prints the three running totals, each divided by the number of lines divided by 4. This is because each "set" of 4 lines should count as 1 instance, and awk's NR just counts the number of lines, so the division is only right if every run contributes exactly 4 lines to time.txt.
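If time.txt is already in the default bash format from the question (for example real 0m0,030s), the same running-total idea can be adapted to convert each value to seconds first. The following is only a minimal sketch under some assumptions: the helper name avg_times is made up for illustration, values look like 0m0,030s, 1m2.5s or plain 0.03, and the label is the first whitespace-separated field. It counts the lines per label instead of using NR/4, so header or blank lines should not skew the result.

```shell
# avg_times FILE: print the mean real/user/sys time in seconds.
# Hypothetical helper (not from the original answer). Assumes lines such as
# "real 0m0,030s" (bash default) or "real 0.03" (time -p).
avg_times() {
  # LC_ALL=C keeps the decimal point as "." for both parsing and printing.
  LC_ALL=C awk '
    # Convert "0m0,030s", "1m2.5s" or plain "0.03" to seconds.
    function to_seconds(t,    parts, n) {
      sub(/s$/, "", t)           # drop the trailing unit
      gsub(/,/, ".", t)          # decimal comma -> decimal point
      n = split(t, parts, "m")   # separate minutes from seconds, if present
      return (n == 2) ? parts[1] * 60 + parts[2] : t + 0
    }
    # Accumulate one running total and one counter per label.
    $1 == "real" || $1 == "user" || $1 == "sys" {
      sum[$1] += to_seconds($2)
      count[$1]++
    }
    END {
      n = split("real user sys", keys, " ")
      for (i = 1; i <= n; i++) {
        k = keys[i]
        # Skip labels that never appeared to avoid dividing by zero.
        if (count[k] > 0) printf "%sAvg %.3f\n", k, sum[k] / count[k]
      }
    }
  ' "$1"
}
```

It would be called as avg_times time.txt and prints one realAvg, userAvg and sysAvg line. Note that it averages everything in the file, so timings from different scripts would still need to be kept in separate files, or split before calling it.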





          answered Nov 13 '18 at 3:46 by mjuarez












          • Thank you very much! The idea with POSIX and keeping track of instances is really great. I'll try that out as soon as I have got the time. If I get it to work, I'll accept your answer. Thanks again!

            – OingoBoingo
            Nov 13 '18 at 15:34
















