Calculating how many rows sum to zero: problems importing data?










I have a dataframe bwsp that contains abundance data of many species at two locations that looks something like this:



        Location sp1 sp2 sp3 sp4
sample1    SiteA   0  12   0   0
sample2    SiteA   0   3   0   0
sample3    SiteA   1   0   0   0
sample4    SiteB   0   0   6   0
sample5    SiteB   2   1   1   0
sample6    SiteB   0   1   0  80
sample7    SiteB   2   1   1   0
sample8    SiteB   0   0   0   0


I calculate the total abundance of all species in each sample using:



bwsp$N <- rowSums(bwsp)



I now want to calculate how many samples (= rows) have zero abundance (i.e., N = 0) at each location. I started with:



library(tidyverse)
sum(bwsp$N == "0")


and found no rows summed to zero. But I know this is wrong! (I handled the samples, and I know that there are several that were "empty".) So I checked it with:



> summary(bwsp$N)


I was really surprised to see that the minimum N was 1.0. I double-checked the other summary statistics in Excel and they don't quite match either.



Are these just rounding errors? What am I doing wrong?



NB: I just checked this with the dummy data that I provided above and it worked just fine. This makes me think that I'm doing something wrong with the way I'm getting the data into R, i.e. bwsp <- read.csv("dummybwsp.csv", row.names = 1).











  • 3
    You need sum(rowSums(bwsp$N) == 0)
    – G5W, Nov 12 '18 at 0:35

  • 1
    Run rowSums(bwsp[-1]) first and see if the results match. Also, you are not using library(tidyverse) in the question example. Maybe you are using it in your code but question examples should be minimal.
    – Rui Barradas, Nov 12 '18 at 11:34
















Tags: r, import, summary






asked Nov 12 '18 at 0:31 by ayesha (418)







2 Answers

1














Replace



bwsp$N <- rowSums(bwsp)


with



bwsp$N <- rowSums(bwsp[-1])


to exclude the first column (Location), since rowSums() requires numeric data.
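
For illustration, here is a minimal sketch of that fix applied to the dummy data from the question, extended to count the empty samples at each location. The data frame is built inline rather than read from a CSV file, and selecting the species columns with `sapply(bwsp, is.numeric)` instead of `bwsp[-1]` is a variation assumed here, not part of the answer above. The names `species`, `n_empty` and `empty_by_site` are made up for the example.

```r
# Dummy data from the question, built inline instead of read from a CSV file
bwsp <- data.frame(
  Location = c("SiteA", "SiteA", "SiteA", "SiteB", "SiteB", "SiteB", "SiteB", "SiteB"),
  sp1 = c(0, 0, 1, 0, 2, 0, 2, 0),
  sp2 = c(12, 3, 0, 0, 1, 1, 1, 0),
  sp3 = c(0, 0, 0, 6, 1, 0, 1, 0),
  sp4 = c(0, 0, 0, 0, 0, 80, 0, 0),
  row.names = paste0("sample", 1:8)
)

# Keep only the numeric (species) columns, so Location is left out of the sum
species <- bwsp[sapply(bwsp, is.numeric)]
bwsp$N <- rowSums(species)

# Total number of samples with zero abundance
n_empty <- sum(bwsp$N == 0)

# Number of samples with zero abundance at each location
empty_by_site <- tapply(bwsp$N == 0, bwsp$Location, sum)

print(n_empty)
print(empty_by_site)
```

Comparing against the number `0` rather than the string `"0"` avoids relying on implicit type coercion, and `tapply()` answers the "at each location" part of the question without needing the tidyverse.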






    0














    Once I pared down the question, I was able to look back at my original script and see my error. In my other working, I had calculated some diversity indices first using:



    bwsp$shann <- diversity(bwsp)
    bwsp$simp <- diversity(bwsp, "simpson")


    Of course, these index columns became part of bwsp, so the later rowSums(bwsp) call included them; they add to one, and hence added one to every row's total. There was no issue with the code I posted, but there was an issue with me not thinking carefully about the way I was manipulating the data.



    I was able to repair this issue by specifying the columns of data used in the calculations:



    bwsp$shann <- diversity(bwsp[,1:64])
    bwsp$simp <- diversity(bwsp[,1:64], "simpson")
    bwsp$N <- rowSums(bwsp[,1:64])


    Phew! This was a good reminder to really think about my data!
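
A rough sketch of the same pitfall and one way to guard against it, using only base R. The small table and the `richness` column are hypothetical stand-ins (the latter for the diversity index columns), and saving the species column names up front instead of writing `1:64` is an assumed variation on the fix above, not what the original script does.

```r
# Small stand-in for the abundance table (species columns only)
bwsp <- data.frame(
  sp1 = c(0, 1, 2, 0),
  sp2 = c(12, 0, 1, 0),
  sp3 = c(0, 0, 1, 0),
  row.names = paste0("sample", 1:4)
)

# Record the species columns once, before any derived column is added
species_cols <- names(bwsp)

# Hypothetical derived column standing in for a diversity index:
# the number of species present in each sample
bwsp$richness <- rowSums(bwsp[species_cols] > 0)

# Pitfall: the derived column is now part of the data frame and gets summed too
N_wrong <- rowSums(bwsp)

# Fix: always index by the saved species column names
bwsp$N <- rowSums(bwsp[species_cols])

print(N_wrong)
print(bwsp$N)
```

Indexing by saved names keeps working if more derived columns are appended later, whereas a positional range such as `1:64` has to be kept in sync with the data by hand.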






Answer 1: answered Nov 12 '18 at 0:41 by Dames (864)























Answer 2: answered Nov 15 '18 at 2:10 by ayesha (418)


























