Is it possible to search a txt-file by words from list and return the line above?
I have a text file with sentences and am able to find words from a list within it. I would like to put the line above each 'found' line into a separate list. I tried it with the code below, but this only returns [].
Here is my code:

    fname_in = "test.txt"
    lv_pos = []
    search_list = ['word1', 'word2']
    with open (fname_in, 'r') as f:
        file_l1 = [line.split('\n') for line in f.readlines()]
        counter = 0
        for word in search_list:
            if word in file_l1:
                l_pos.append(file_l1[counter - 1])
            counter += 1
    print(l_pos)

The text file looks something like this:

    Bla bla bla
    I want this line1.
    I found this line with word1.
    Bla bla bla
    I want this line2.
    I found this line with word2.

The result I want is this:

    l_pos = ['I want this line1.','I want this line2.']

python list
edited Nov 11 at 0:24 by martineau
asked Nov 11 at 0:20 by Mady
file_l1 = f.readlines() should be enough. Then, instead of the if word ... test, you need an inner for loop to iterate over all items (lines) of file_l1 to check whether word is contained. Use enumerate to find out the line number.
– Michael Butscher, Nov 11 at 0:26
3 Answers
In the second line of your example you wrote lv_pos instead of l_pos. Inside the with statement you could fix it like this, I think:

    fname_in = "test.txt"
    l_pos = []
    search_list = ['word1', 'word2']
    with open(fname_in, 'r') as f:
        file_l1 = f.read().splitlines()  # lines without the trailing newline
        for line in range(len(file_l1)):
            for word in search_list:
                # Substring test, so "word1." (with its period) still matches.
                if word in file_l1[line]:
                    # Note: for line == 0 the index is -1, i.e. the last line.
                    l_pos.append(file_l1[line - 1])
    print(l_pos)

I'm not thrilled about this solution, but I think it would fix your code with minimal modification.
answered Nov 11 at 0:33 by Charles Landau
Treat the file as a collection of pairs of lines and lines-before (here lines is assumed to be the list of the file's lines without newlines, for example from f.read().splitlines()):

    [prev for prev, this in zip(lines, lines[1:])
     if 'word1' in this or 'word2' in this]
    #['I want this line1.', 'I want this line2.']

This approach can be extended to cover any number of words:

    words = 'word1', 'word2'
    [prev for prev, this in zip(lines, lines[1:])
     if any(word in this for word in words)]
    #['I want this line1.', 'I want this line2.']

Finally, if you care about proper words rather than occurrences inside other words (as in "thisisnotword1"), you should properly tokenize the lines with, say, nltk.word_tokenize():

    from nltk import word_tokenize  # needs NLTK and its tokenizer data
    words = {'word1', 'word2'}  # a set, so that & (intersection) works
    [prev for prev, this in zip(lines, lines[1:])
     if words & set(word_tokenize(this))]
    #['I want this line1.', 'I want this line2.']
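If installing NLTK is not an option, whole-word matching can also be approximated with the standard library. The following is only a sketch of a different technique (regular-expression word boundaries, not the tokenizer used above); the helper name contains_whole_word and the sample lines are made up for illustration, and it assumes every search word starts and ends with a letter, digit, or underscore, since that is what \b relies on.

```python
import re


def contains_whole_word(line, words):
    """Return True if any of `words` occurs in `line` as a whole word."""
    # re.escape guards against regex metacharacters in the search words.
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'
    return re.search(pattern, line) is not None


# Hypothetical sample data standing in for the file's lines.
lines = [
    "I want this line1.",
    "I found this line with word1.",
    "Bla bla bla",
    "thisisnotword1",
]
words = ('word1', 'word2')

result = [prev for prev, this in zip(lines, lines[1:])
          if contains_whole_word(this, words)]
print(result)  # ['I want this line1.']
```

The last sample line contains "word1" only inside a longer word, so it does not count as a match and "Bla bla bla" is not collected.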
edited Nov 11 at 2:26
answered Nov 11 at 2:18 by DYZ
First of all, you have some typos in your code: in some places you wrote l_pos and in others, lv_pos.
The other problem is that I don't think you realize file_l1 is a list of lists, so the test if word in file_l1: isn't doing what you think. You need to check each word against each of these sublists.
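To make the list-of-lists point concrete, here is a small sketch that uses two in-memory strings as a stand-in for the file (raw_lines is a made-up name, not part of the original code):

```python
# Hypothetical in-memory stand-in for the lines read from the file.
raw_lines = ["I want this line1.\n", "I found this line with word1.\n"]

# Splitting each line on '\n' yields one small list per line.
file_l1 = [line.split('\n') for line in raw_lines]
print(file_l1[1])          # ['I found this line with word1.', '']

# Membership on the outer list compares the string with whole sublists.
print('word1' in file_l1)  # False

# A substring test against each line's text does find the word.
hits = [i for i, parts in enumerate(file_l1) if 'word1' in parts[0]]
print(hits)                # [1]
```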
Here's some working code based on your own:

    fname_in = "simple_test.txt"
    l_pos = []
    search_list = ['word1', 'word2']
    with open(fname_in) as f:
        lines = f.read().splitlines()
    for i, line in enumerate(lines):
        for word in search_list:
            if word in line:
                l_pos.append(lines[i - 1])  # Note: for i == 0 this is the last line.
    print(l_pos)  # -> ['I want this line1.', 'I want this line2.']
Update
Here's another way to do it that doesn't read the entire file into memory at once, so it needs less memory:

    from collections import deque
    fname_in = "simple_test.txt"
    l_pos = []
    search_list = ['word1', 'word2']
    with open(fname_in) as file:
        lines = (line.rstrip('\n') for line in file)  # Generator expression.
        try:  # Create and initialize a sliding window.
            sw = deque([next(lines)], maxlen=2)  # One-item list holding the first line.
        except StopIteration:  # Empty file.
            pass
        for line in lines:
            sw.append(line)
            for word in search_list:
                if word in sw[1]:
                    l_pos.append(sw[0])
    print(l_pos)  # -> ['I want this line1.', 'I want this line2.']
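One detail of the sliding window is worth checking: deque takes an iterable as its first argument, so a bare string is stored character by character, while a one-item list stores the whole line. A minimal sketch with in-memory sample strings (not the actual file):

```python
from collections import deque

# A string is an iterable of characters, so the characters become the items.
print(deque("Bla bla bla", maxlen=2))  # deque(['l', 'a'], maxlen=2)

# Wrapping the line in a list stores the whole line as a single item.
sw = deque(["Bla bla bla"], maxlen=2)
sw.append("I want this line1.")
sw.append("I found this line with word1.")  # the oldest item is dropped
print(list(sw))  # ['I want this line1.', 'I found this line with word1.']
```

With maxlen=2 the window always holds the previous line at index 0 and the current line at index 1 once two lines have been seen.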
edited Nov 11 at 18:38
answered Nov 11 at 0:44 by martineau
Mady: That's good to hear. Thank me by reading "What should I do when someone answers my question?".
– martineau, Nov 11 at 14:42
Mady: What should happen when one of the words is found in the very first line of the file (when there is no previous line)?
– martineau, Nov 11 at 15:12
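The thread leaves the first-line question open. One possible policy, shown here purely as an assumption, is to skip a match on the first line because there is nothing above it to return. The function name lines_above and the sample data are hypothetical; the function accepts any iterable of lines, so an open file object should work as well as a list.

```python
def lines_above(lines, words):
    """Return the line preceding each line that contains any of `words`.

    Assumption: a match on the very first line is skipped, because
    there is no line above it to return.
    """
    result = []
    prev = None  # no previous line seen yet
    for raw in lines:
        line = raw.rstrip('\n')
        if prev is not None and any(word in line for word in words):
            result.append(prev)
        prev = line
    return result


# Hypothetical sample where the first line already contains a search word.
sample = [
    "word1 appears on the first line.\n",
    "I want this line1.\n",
    "I found this line with word1.\n",
]
print(lines_above(sample, ['word1', 'word2']))  # ['I want this line1.']
```

Because only the previous line is remembered, this also avoids reading the whole file into memory; using any() means a line containing several search words contributes its predecessor only once.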