Is it possible to search a txt-file by words from list and return the line above?

I have a txt-file with sentences and am able to find words from a list within it. I would like to collect the line above each 'found line' in a separate list. I tried it with the code below, but it only returns [].



Here is my code:



fname_in = "test.txt"
lv_pos = []
search_list = ['word1', 'word2']

with open (fname_in, 'r') as f:
    file_l1 = [line.split('\n') for line in f.readlines()]
    counter = 0

    for word in search_list:
        if word in file_l1:
            l_pos.append(file_l1[counter - 1])

    counter += 1

print(l_pos)


The text file looks something like this:



Bla bla bla
I want this line1.
I found this line with word1.
Bla bla bla
I want this line2.
I found this line with word2.


The result I want is this:



l_pos = ['I want this line1.','I want this line2.']

    file_l1 = f.readlines() should be enough. Then instead of if word ... you need an inner for loop to iterate over all items (lines) of file_l1 to check if word is contained. Use enumerate to find out line number.
    – Michael Butscher
    Nov 11 at 0:26

Tags: python, list

edited Nov 11 at 0:24 by martineau
asked Nov 11 at 0:20 by Mady

3 Answers

In the second line of your example you wrote lv_pos instead of l_pos. Inside the with statement you could fix it like this, I think:

fname_in = "test.txt"
l_pos = []
search_list = ['word1', 'word2']

with open(fname_in, 'r') as f:
    file_l1 = f.readlines()

    # Start at index 1: the first line has no line above it.
    for line in range(1, len(file_l1)):
        for word in search_list:
            # Substring test: splitting on " " would leave tokens such as
            # "word1.\n", which never equal "word1".
            if word in file_l1[line]:
                l_pos.append(file_l1[line - 1].rstrip('\n'))

print(l_pos)

I'm not thrilled about this solution, but I think it would fix your code with minimal modification.

Treat the file as a collection of pairs of lines and lines-before:

# lines is assumed to hold the file's lines without trailing newlines,
# e.g. lines = f.read().splitlines()
[prev for prev, this in zip(lines, lines[1:])
 if 'word1' in this or 'word2' in this]
# ['I want this line1.', 'I want this line2.']

This approach can be extended to cover any number of words:

words = 'word1', 'word2'
[prev for prev, this in zip(lines, lines[1:])
 if any(word in this for word in words)]
# ['I want this line1.', 'I want this line2.']

Finally, if you care about proper words rather than substring occurrences (as in "thisisnotword1"), you should properly tokenize the lines with, say, nltk.word_tokenize():

from nltk import word_tokenize  # needs NLTK's tokenizer data to be installed
words = {'word1', 'word2'}  # a set, so that & (intersection) works
[prev for prev, this in zip(lines, lines[1:])
 if words & set(word_tokenize(this))]
# ['I want this line1.', 'I want this line2.']
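
If installing NLTK is not an option, whole-word matching can probably also be approximated with the standard library's re module. This is only a sketch and not part of the answer above: the sample lines list and the name pattern are assumptions made for illustration, and \b treats letters, digits and underscores as word characters, which may or may not match what you consider a word.

```python
import re

# Hypothetical sample data standing in for the file's lines.
lines = [
    "Bla bla bla",
    "I want this line1.",
    "I found this line with word1.",
    "Bla bla bla",
    "I want this line2.",
    "This has thisisnotword2 only.",
]
words = ['word1', 'word2']

# \b anchors the alternatives at word boundaries; re.escape guards
# against search terms that contain regex metacharacters.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b')

# Pair every line with its predecessor and keep the predecessor on a match.
result = [prev for prev, this in zip(lines, lines[1:]) if pattern.search(this)]
print(result)  # ['I want this line1.']
```

The last sample line is skipped because "word2" only occurs inside a longer token there, which is the case a plain substring test would get wrong.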

First of all, you have some typos in your code: in some places you wrote l_pos and in others, lv_pos.

The other problem is that file_l1 is a list of lists, so the test if word in file_l1: isn't doing what you think. It compares word against whole sublists rather than against the text of each line. You need to check each word against each line.
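
To see the list-of-lists effect concretely, here is a small sketch. It uses a hypothetical in-memory list (raw_lines) in place of the file, so the names are illustrative rather than taken from the question's script:

```python
# Hypothetical stand-in for what f.readlines() would return.
raw_lines = [
    "I want this line1.\n",
    "I found this line with word1.\n",
]

# Same construction as in the question: each element becomes a list.
file_l1 = [line.split('\n') for line in raw_lines]
print(file_l1[1])          # ['I found this line with word1.', '']

# `in` on the outer list compares the string against whole sublists,
# so it can never be True here.
print('word1' in file_l1)  # False

# Testing against each line's text is what actually finds the word.
hits = [i for i, parts in enumerate(file_l1) if 'word1' in parts[0]]
print(hits)                # [1]
```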



Here's some working code based on your own:

fname_in = "simple_test.txt"
l_pos = []
search_list = ['word1', 'word2']

with open(fname_in) as f:
    lines = f.read().splitlines()

for i, line in enumerate(lines):
    for word in search_list:
        if word in line:
            # Caution: for i == 0, lines[i - 1] is lines[-1], the last line.
            l_pos.append(lines[i - 1])

print(l_pos)  # -> ['I want this line1.', 'I want this line2.']

Update

Here's another way to do it that reads the file one line at a time instead of loading it all into memory at once:

from collections import deque

fname_in = "simple_test.txt"
l_pos = []
search_list = ['word1', 'word2']

with open(fname_in) as file:
    lines = (line.rstrip('\n') for line in file)  # Generator expression.

    try:  # Create and initialize a sliding window.
        # Wrap the first line in a list; a bare string would be split
        # into single characters.
        sw = deque([next(lines)], maxlen=2)
    except StopIteration:  # Empty file.
        pass

    for line in lines:
        sw.append(line)
        for word in search_list:
            if word in sw[1]:
                l_pos.append(sw[0])

print(l_pos)  # -> ['I want this line1.', 'I want this line2.']

    Mady: That's good to hear. Thank me by reading "What should I do when someone answers my question?".
    – martineau
    Nov 11 at 14:42

    Mady: What should happen when one of the words is found in the very first line of the file (when there is no previous line)?
    – martineau
    Nov 11 at 15:12
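
One possible way to handle the first-line case raised in the comment above is simply to skip a match that has no line before it. That policy is an assumption (the thread does not settle it), and the lines_above helper and the temporary demo file are hypothetical names used only for illustration:

```python
import os
import tempfile


def lines_above(path, words):
    """Return the line preceding each line that contains any of `words`.

    A match on the very first line has no predecessor and is skipped;
    that policy is an assumption, not something the thread settled.
    """
    found = []
    prev = None  # No previous line exists before the first one.
    with open(path) as f:
        for raw in f:
            line = raw.rstrip('\n')
            if prev is not None and any(word in line for word in words):
                found.append(prev)
            prev = line
    return found


# Demo on a throwaway file whose first line already contains a search word.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Starts with word1.\nI want this line.\nHere is word2.\n")
    demo_path = tmp.name

result = lines_above(demo_path, ['word1', 'word2'])
os.remove(demo_path)
print(result)  # ['I want this line.']
```

Using any() also means a line that contains several of the search words contributes its predecessor only once.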

Answer 1 was posted by Charles Landau (answered Nov 11 at 0:33).

Answer 2 was posted by DYZ (answered Nov 11 at 2:18, edited Nov 11 at 2:26).

Answer 3 was posted by martineau (answered Nov 11 at 0:44, edited Nov 11 at 18:38).
