Filter the rows in a list of tuples using numpy










I am looking for a faster way to filter a list of tuples, using NumPy and avoiding explicit loops.



A = [(27157, 4),
(24814, 0),
(1047, 2),
(18265, 2),
(2857, 4),
(23854, 2),
(36881, 0)]


Now I need to split it based on the second element of each tuple.
Tuples whose second element is 4 should go into list B; all others should go into list C.



That is:



B = [(27157, 4),(2857, 4)]
C = [(24814, 0),(1047, 2),(18265, 2),(23854, 2),(36881, 0)]









  • If it really is a list (not already an array), a list operation probably will be fastest. There's a significant overhead when creating an array from a list.
    – hpaulj
    Nov 12 '18 at 20:04










  • Yes, currently it's a list of tuples, which I process with a for loop, but I am looking for a quicker method, so I thought of using numpy.
    – Gurpreet.S
    Nov 12 '18 at 22:00










  • The kind of thing you are trying to do isn't particularly fast, even if you start with an array. But do your own timing tests.
    – hpaulj
    Nov 12 '18 at 22:17
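
As a rough way to follow the suggestion in the comments and time things yourself, here is a minimal sketch using the standard `timeit` module. The helper names `split_with_lists` and `split_with_numpy` are made up for this illustration, and the NumPy version deliberately includes the list-to-array conversion, since that is the overhead the first comment warns about. Results will depend on your machine and on how large the list really is, so no timings are quoted here.

```python
import timeit

import numpy as np

A = [(27157, 4), (24814, 0), (1047, 2), (18265, 2),
     (2857, 4), (23854, 2), (36881, 0)]


def split_with_lists(rows, key=4):
    """Plain-Python split using two list comprehensions."""
    matched = [t for t in rows if t[1] == key]
    rest = [t for t in rows if t[1] != key]
    return matched, rest


def split_with_numpy(rows, key=4):
    """NumPy split; the conversion from list to array is part of the cost."""
    arr = np.array(rows)
    mask = arr[:, 1] == key
    return arr[mask], arr[~mask]


# Time both approaches on the same input (hypothetical repeat count).
list_time = timeit.timeit(lambda: split_with_lists(A), number=1000)
numpy_time = timeit.timeit(lambda: split_with_numpy(A), number=1000)
print(f"list comprehension: {list_time:.6f} s per 1000 runs")
print(f"numpy (incl. conversion): {numpy_time:.6f} s per 1000 runs")
```

To get a meaningful comparison, rerun this with a list of the size you actually work with rather than the seven-row sample.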















python arrays numpy indexing






edited Nov 12 '18 at 18:13









jpp

91.9k2052102




asked Nov 12 '18 at 17:37









Gurpreet.S

496





2 Answers

With NumPy, you can use Boolean indexing to return arrays:



mask = A[:, 1] == 4
B = A[mask]
C = A[~mask]


This requires your input to be a NumPy array rather than a plain list, so convert A before building the mask:



import numpy as np

A = np.array([(27157, 4),
              (24814, 0),
              (1047, 2),
              (18265, 2),
              (2857, 4),
              (23854, 2),
              (36881, 0)])
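
Putting the two snippets in execution order, a minimal end-to-end sketch might look like the following. It assumes you start from the list in the question and want lists of tuples back at the end; the name `arr` and the final conversion step are additions for illustration, not part of the original answer.

```python
import numpy as np

A = [(27157, 4), (24814, 0), (1047, 2), (18265, 2),
     (2857, 4), (23854, 2), (36881, 0)]

# Convert once; the result is a 2-D integer array of shape (7, 2).
arr = np.asarray(A)

# Boolean mask: True for rows whose second column equals 4.
mask = arr[:, 1] == 4

# Index with the mask and its negation, then convert back to tuples.
B = [tuple(row) for row in arr[mask].tolist()]
C = [tuple(row) for row in arr[~mask].tolist()]

print(B)
print(C)
```

If you can keep working with arrays, skip the conversion back to tuples; it is only there to match the list format shown in the question.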






    To be fast, you must convert your list of tuples into a more efficient data structure. If you want to keep tuple-like rows, you can use a structured array:



    import numpy as np

    dt = np.dtype([('val', int), ('key', int)])
    B = np.array(A, dtype=dt)  # build the structured array directly from the list of tuples

    B[B['key'] == 4]  # --> array([(27157, 4), ( 2857, 4)],...
    B[B['key'] != 4]  # --> array([(24814, 0), ( 1047, 2), (18265, 2), (23854, 2), (36881, 0)],...
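
For reference, here is a self-contained sketch of the structured-array idea that can be run as is. It builds the array with `np.array(A, dtype=dt)` and uses fixed-width `np.int64` fields, which is an assumption made here for portability. The name `S` is chosen only to avoid reusing `B` for both the array and the result.

```python
import numpy as np

A = [(27157, 4), (24814, 0), (1047, 2), (18265, 2),
     (2857, 4), (23854, 2), (36881, 0)]

# Two named integer fields; each list element becomes one record.
dt = np.dtype([("val", np.int64), ("key", np.int64)])
S = np.array(A, dtype=dt)

# Filter on the 'key' field; tolist() on a structured array yields tuples.
B = S[S["key"] == 4].tolist()
C = S[S["key"] != 4].tolist()

print(B)
print(C)
```

A structured array keeps each row as a record with named fields, so the filtered results convert back to tuples without extra work.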





Boolean indexing answer: answered Nov 12 '18 at 17:38 by jpp (91.9k reputation)




Structured array answer: answered Nov 12 '18 at 18:11 by B. M. (12.9k reputation)



