Kotlin: Why is Sequence more performant in this example?









up vote
2
down vote

favorite












Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.



I read a blog post about this topic and there you can find this code snippets:



List implementation:



val list = generateSequence(1) it + 1 
.take(50_000_000)
.toList()

measure
list
.filter it % 3 == 0
.average()

// 8644 ms


Sequence implementation:



val sequence = generateSequence(1) it + 1 
.take(50_000_000)

measure
sequence
.filter it % 3 == 0
.average()

// 822 ms


The point here is that the Sequence implementation is about 10x faster.



However, I do not really understand WHY that is. I know that with a Sequence, you do "lazy evaluation", but I cannot find any reason why that helps reducing the processing in this example.



However, here I know why a Sequence is generally faster:



val result = sequenceOf("a", "b", "c")
.map
println("map: $it")
it.toUpperCase()

.any
println("any: $it")
it.startsWith("B")



Because with a Sequence you process the data "vertically", when the first element starts with "B", you don't have to map for the rest of the elements. It makes sense here.



So, why is it also faster in the first example?










share|improve this question

























    up vote
    2
    down vote

    favorite












    Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.



    I read a blog post about this topic and there you can find this code snippets:



    List implementation:



    val list = generateSequence(1) it + 1 
    .take(50_000_000)
    .toList()

    measure
    list
    .filter it % 3 == 0
    .average()

    // 8644 ms


    Sequence implementation:



    val sequence = generateSequence(1) it + 1 
    .take(50_000_000)

    measure
    sequence
    .filter it % 3 == 0
    .average()

    // 822 ms


    The point here is that the Sequence implementation is about 10x faster.



    However, I do not really understand WHY that is. I know that with a Sequence, you do "lazy evaluation", but I cannot find any reason why that helps reducing the processing in this example.



    However, here I know why a Sequence is generally faster:



    val result = sequenceOf("a", "b", "c")
    .map
    println("map: $it")
    it.toUpperCase()

    .any
    println("any: $it")
    it.startsWith("B")



    Because with a Sequence you process the data "vertically", when the first element starts with "B", you don't have to map for the rest of the elements. It makes sense here.



    So, why is it also faster in the first example?










    share|improve this question























      up vote
      2
      down vote

      favorite









      up vote
      2
      down vote

      favorite











      Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.



      I read a blog post about this topic and there you can find this code snippets:



      List implementation:



      val list = generateSequence(1) it + 1 
      .take(50_000_000)
      .toList()

      measure
      list
      .filter it % 3 == 0
      .average()

      // 8644 ms


      Sequence implementation:



      val sequence = generateSequence(1) it + 1 
      .take(50_000_000)

      measure
      sequence
      .filter it % 3 == 0
      .average()

      // 822 ms


      The point here is that the Sequence implementation is about 10x faster.



      However, I do not really understand WHY that is. I know that with a Sequence, you do "lazy evaluation", but I cannot find any reason why that helps reducing the processing in this example.



      However, here I know why a Sequence is generally faster:



      val result = sequenceOf("a", "b", "c")
      .map
      println("map: $it")
      it.toUpperCase()

      .any
      println("any: $it")
      it.startsWith("B")



      Because with a Sequence you process the data "vertically", when the first element starts with "B", you don't have to map for the rest of the elements. It makes sense here.



      So, why is it also faster in the first example?










      share|improve this question













      Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.



      I read a blog post about this topic and there you can find this code snippets:



      List implementation:



      val list = generateSequence(1) it + 1 
      .take(50_000_000)
      .toList()

      measure
      list
      .filter it % 3 == 0
      .average()

      // 8644 ms


      Sequence implementation:



      val sequence = generateSequence(1) it + 1 
      .take(50_000_000)

      measure
      sequence
      .filter it % 3 == 0
      .average()

      // 822 ms


      The point here is that the Sequence implementation is about 10x faster.



      However, I do not really understand WHY that is. I know that with a Sequence, you do "lazy evaluation", but I cannot find any reason why that helps reducing the processing in this example.



      However, here I know why a Sequence is generally faster:



      val result = sequenceOf("a", "b", "c")
      .map
      println("map: $it")
      it.toUpperCase()

      .any
      println("any: $it")
      it.startsWith("B")



      Because with a Sequence you process the data "vertically", when the first element starts with "B", you don't have to map for the rest of the elements. It makes sense here.



      So, why is it also faster in the first example?







      kotlin






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked 2 days ago









      Aliquis

      362212




      362212






















          2 Answers
          2






          active

          oldest

          votes

















          up vote
          5
          down vote



          accepted










          Let's look at what those two implementations are actually doing:




          1. The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.



            (In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  Each reference will be 8 or 16 bytes, and each Int will probably take 16, giving over 1GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time.)



            Then it has to read all the values back from the list, filter them, and create another list in memory.



            Finally, it has to read all those values back in again, to calculate the average.




          2. The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.



            (That's pretty much how you'd do it if you were implementing it ‘by hand’.)



          You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!



          Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).



          In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!






          share|improve this answer






















          • Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
            – Alexey Romanov
            2 days ago










          • @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
            – gidds
            2 days ago

















          up vote
          2
          down vote













          Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.



          Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.



          Initial code produces a list containing 50_000_000 objects:



           val list = generateSequence(1) it + 1 
          .take(50_000_000)
          .toList()


          then iterates through it and creates another list containing a subset of its elements:



          .filter it % 3 == 0 


          ... and then proceeds with calculating the average:



          .average()



          Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:



          val sequence = generateSequence(1) it + 1 
          .take(50_000_000)


          adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):



          .filter it % 3 == 0 


          and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:



          .average()



          Some relevant reading:



          Kotlin: Beware of Java Stream API Habits



          Kotlin Collections API Performance Antipatterns






          share|improve this answer




















          • Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
            – Aliquis
            2 days ago










          • @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
            – Grzegorz Piwowarek
            2 days ago











          Your Answer






          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "1"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53237853%2fkotlin-why-is-sequence-more-performant-in-this-example%23new-answer', 'question_page');

          );

          Post as a guest






























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          5
          down vote



          accepted










          Let's look at what those two implementations are actually doing:




          1. The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.



            (In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  Each reference will be 8 or 16 bytes, and each Int will probably take 16, giving over 1GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time.)



            Then it has to read all the values back from the list, filter them, and create another list in memory.



            Finally, it has to read all those values back in again, to calculate the average.




          2. The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.



            (That's pretty much how you'd do it if you were implementing it ‘by hand’.)



          You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!



          Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).



          In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!






          share|improve this answer






















          • Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
            – Alexey Romanov
            2 days ago










          • @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
            – gidds
            2 days ago














          up vote
          5
          down vote



          accepted










          Let's look at what those two implementations are actually doing:




          1. The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.



            (In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  Each reference will be 8 or 16 bytes, and each Int will probably take 16, giving over 1GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time.)



            Then it has to read all the values back from the list, filter them, and create another list in memory.



            Finally, it has to read all those values back in again, to calculate the average.




          2. The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.



            (That's pretty much how you'd do it if you were implementing it ‘by hand’.)



          You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!



          Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).



          In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!






          share|improve this answer






















          • Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
            – Alexey Romanov
            2 days ago










          • @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
            – gidds
            2 days ago












          up vote
          5
          down vote



          accepted







          up vote
          5
          down vote



          accepted






          Let's look at what those two implementations are actually doing:




          1. The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.



            (In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  Each reference will be 8 or 16 bytes, and each Int will probably take 16, giving over 1GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time.)



            Then it has to read all the values back from the list, filter them, and create another list in memory.



            Finally, it has to read all those values back in again, to calculate the average.




          2. The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.



            (That's pretty much how you'd do it if you were implementing it ‘by hand’.)



          You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!



          Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).



          In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!






          share|improve this answer














          Let's look at what those two implementations are actually doing:




          1. The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.



            (In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  Each reference will be 8 or 16 bytes, and each Int will probably take 16, giving over 1GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time.)



            Then it has to read all the values back from the list, filter them, and create another list in memory.



            Finally, it has to read all those values back in again, to calculate the average.




          2. The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.



            (That's pretty much how you'd do it if you were implementing it ‘by hand’.)



          You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!



          Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).



          In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered 2 days ago









          gidds

          3363




          3363











          • Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
            – Alexey Romanov
            2 days ago










          • @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
            – gidds
            2 days ago
















          • Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
            – Alexey Romanov
            2 days ago










          • @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
            – gidds
            2 days ago















          Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
          – Alexey Romanov
          2 days ago




          Good answer overall, but I'd add two things: 1. Sequences are also much slower (if usable at all) if you need to do anything other than process each element in order. 2. The initial list is going to be much larger than 200MB because integers need to be boxed, see stackoverflow.com/questions/8419860/….
          – Alexey Romanov
          2 days ago












          @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
          – gidds
          2 days ago




          @AlexeyRomanov: good points, thanks! I should have spotted that the ints will be boxed. I'll update my answer.
          – gidds
          2 days ago












          up vote
          2
          down vote













          Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.



          Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.



          Initial code produces a list containing 50_000_000 objects:



           val list = generateSequence(1) it + 1 
          .take(50_000_000)
          .toList()


          then iterates through it and creates another list containing a subset of its elements:



          .filter it % 3 == 0 


          ... and then proceeds with calculating the average:



          .average()



          Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:



          val sequence = generateSequence(1) it + 1 
          .take(50_000_000)


          adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):



          .filter it % 3 == 0 


          and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:



          .average()



          Some relevant reading:



          Kotlin: Beware of Java Stream API Habits



          Kotlin Collections API Performance Antipatterns






          share|improve this answer




















          • Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
            – Aliquis
            2 days ago










          • @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
            – Grzegorz Piwowarek
            2 days ago















          up vote
          2
          down vote













          Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.



          Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.



          Initial code produces a list containing 50_000_000 objects:



           val list = generateSequence(1) it + 1 
          .take(50_000_000)
          .toList()


          then iterates through it and creates another list containing a subset of its elements:



          .filter it % 3 == 0 


          ... and then proceeds with calculating the average:



          .average()



          Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:



          val sequence = generateSequence(1) it + 1 
          .take(50_000_000)


          adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):



          .filter it % 3 == 0 


          and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:



          .average()



          Some relevant reading:



          Kotlin: Beware of Java Stream API Habits



          Kotlin Collections API Performance Antipatterns






          share|improve this answer




















          • Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
            – Aliquis
            2 days ago










          • @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
            – Grzegorz Piwowarek
            2 days ago













          up vote
          2
          down vote










          up vote
          2
          down vote









          Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.



          Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.



          Initial code produces a list containing 50_000_000 objects:



           val list = generateSequence(1) it + 1 
          .take(50_000_000)
          .toList()


          then iterates through it and creates another list containing a subset of its elements:



          .filter it % 3 == 0 


          ... and then proceeds with calculating the average:



          .average()



          Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:



          val sequence = generateSequence(1) it + 1 
          .take(50_000_000)


          adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):



          .filter it % 3 == 0 


          and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:



          .average()



          Some relevant reading:



          Kotlin: Beware of Java Stream API Habits



          Kotlin Collections API Performance Antipatterns






          share|improve this answer












          Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.



          Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.



          Initial code produces a list containing 50_000_000 objects:



           val list = generateSequence(1) it + 1 
          .take(50_000_000)
          .toList()


          then iterates through it and creates another list containing a subset of its elements:



          .filter it % 3 == 0 


          ... and then proceeds with calculating the average:



          .average()



          Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:



          val sequence = generateSequence(1) it + 1 
          .take(50_000_000)


          adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):



          .filter it % 3 == 0 


          and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:



          .average()



          Some relevant reading:



          Kotlin: Beware of Java Stream API Habits



          Kotlin Collections API Performance Antipatterns







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered 2 days ago









          Grzegorz Piwowarek

          7,11042560




          7,11042560











          • Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
            – Aliquis
            2 days ago










          • @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
            – Grzegorz Piwowarek
            2 days ago

















          • Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
            – Aliquis
            2 days ago










          • @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
            – Grzegorz Piwowarek
            2 days ago
















          Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
          – Aliquis
          2 days ago




          Well, compared to the second example in the OP, about the same number of operations are needed for filtering and then calculating the average when using a sequence or normal collection. So, the reason why it's more performant is because no intermediate collections must be created, right?
          – Aliquis
          2 days ago












          @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
          – Grzegorz Piwowarek
          2 days ago





          @Aliquis it's not the same number of operations - in the first example, you need to iterate 50_000_000 times to create a filtered list and then proceed with calculating the average (which involves more iteration). In the second one, you just go straight for the average (you iterate the sequence once with filtering applied) without creating any redundant objects + possibly avoid autoboxing of ints
          – Grzegorz Piwowarek
          2 days ago


















           

          draft saved


          draft discarded















































           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53237853%2fkotlin-why-is-sequence-more-performant-in-this-example%23new-answer', 'question_page');

          );

          Post as a guest














































































          這個網誌中的熱門文章

          How to read a connectionString WITH PROVIDER in .NET Core?

          In R, how to develop a multiplot heatmap.2 figure showing key labels successfully

          Museum of Modern and Contemporary Art of Trento and Rovereto