Sequence-to-sequence classification with variable sequence lengths in Keras
I would like to train an LSTM or GRU network in TensorFlow/Keras to continuously recognize whether a user is walking or not based on input from motion sensors (accelerometer and gyroscope). I have 50 input sequences with lengths varying from 581 to 5629 time steps and 6 features and 50 corresponding output sequences of boolean values. My problem is that I don't know how to feed the training data to the fit() method.
I know approximately what I need to do: I'd like to train with 5 batches of 10 sequences each, and for each batch I have to pad all but the longest sequence so all 10 sequences have the same length, and apply masking. I just don't know how to build the data structures. I know that I can make one big 3D tensor of size (50,5629,6) and that works, but it's painfully slow, so I'd really like to make the sequence length of each batch as small as possible.
Here's the problem in code:
import tensorflow as tf
import numpy as np
# Load data from file
x_list, y_list = loadSequences("train.csv")
# x_list is now a list of arrays (n,6) of float64, where n is the timesteps
# and 6 is the number of features, sorted by increasing sequence lengths.
# y_list is a list of arrays (n,1) of Boolean.
x_train = # WHAT DO I WRITE HERE?
y_train = # AND HERE?
model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=100)
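For reference, the "one big 3D tensor of size (50,5629,6)" baseline mentioned above can be built with Keras' pad_sequences. This is only a sketch of that slow baseline (not the per-batch padding being asked about), assuming x_list and y_list are shaped as in the code comments above:

# Pad every sequence to the global maximum length (5629 steps)
x_train = tf.keras.preprocessing.sequence.pad_sequences(
    x_list, padding='post', dtype='float64')   # shape (50, 5629, 6)
y_train = tf.keras.preprocessing.sequence.pad_sequences(
    y_list, padding='post', dtype='int32')     # shape (50, 5629, 1)
# With Masking(mask_value=0.), the zero-padded timesteps are then ignored.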
python keras lstm
You can try making embeddings of fixed size (e.g. 500) and feeding those into your network
– Novak
Nov 14 '18 at 15:57
@Novak I only know embeddings as a sort of feature vector for words in NLP. Can you clarify how that would be useful for feeding sequences of variable lengths into model.fit()? Notice that my problem is not NLP-related.
– jerha202
Nov 15 '18 at 10:30
Embeddings are mostly used in NLP, but they are used for other things as well. There are functions that can create embeddings from an arbitrary sequence, or you can use something like dimensionality reduction. Either way, the goal is to get a shorter sequence that preserves the object's relationships to other objects and whose mapping to the original object is 1-to-1 (every object has a different embedding).
– Novak
Nov 15 '18 at 11:26
In Keras, you can use an Embedding layer to embed the input data: you feed the padded data to the Embedding layer, set its output dimension to 500, and you get 500-long embeddings. You train it, and for prediction you can use the Embedding layer's weights to embed the input sequences before feeding the data to the model.
– Novak
Nov 15 '18 at 11:28
My problem is a sequence-to-sequence classification problem, meaning that I need to predict a class for each time step. For example, a (1000, 6) sequence (1000 timesteps, 6 features) of floats should predict a (1000,1) sequence of booleans. Moreover, prediction should work in real time, so it should predict a new boolean for each new incoming 6-feature vector. So I don't think embedding an arbitrary-length sequence into a fixed-size embedding would work here. I'm sorry if my problem description wasn't clear!
– jerha202
Nov 15 '18 at 12:05
2 Answers
You can do something like this: use a generator function (take a look at the Keras documentation for the fit_generator method).
import json
import numpy as np

def data_generater(batch_size):
    print("reading data")
    # assuming data is in json format, so feel free to change accordingly
    training_file = open('data_location', 'r')
    training_set = json.loads(training_file.read())
    training_file.close()
    batch_i = 0   # Counter inside the current batch vector
    batch_x = []  # The current batch's x data
    batch_y = []  # The current batch's y data
    while True:
        for obj in training_set:
            # append your input sequences one by one
            # ('sequence' and 'val' are placeholder keys for the assumed json layout)
            batch_x.append(np.array(obj['sequence']))
            if obj['val'] == True:
                batch_y.append([1])
            elif obj['val'] == False:
                batch_y.append([0])
            batch_i += 1
            if batch_i == batch_size:
                # Ready to yield the batch:
                # pad input to the max length in the batch
                batch_x = pad_txt_data(batch_x)
                yield batch_x, np.array(batch_y)
                batch_x = []
                batch_y = []
                batch_i = 0
def pad_txt_data(arr):
    # expecting arr to be a list of (m, 6) arrays, e.g. 10 of them per batch
    padded_arr = []
    preferred_len = len(max(arr, key=len))
    # Pad all sequences to the longest length in the batch (arr) with zeros
    for seq in arr:
        padding = np.zeros((preferred_len - len(seq), seq.shape[1]))
        padded_arr.append(np.concatenate([seq, padding], axis=0))
    return np.array(padded_arr)
And in the model:
from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Masking(mask_value=0., input_shape=(None, 6)))
model.add(keras.layers.LSTM(32))
# A single sigmoid unit for the binary label (softmax over one unit would always output 1)
model.add(keras.layers.Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit_generator(data_generater(10), steps_per_epoch=5, epochs=10)
batch_size, steps_per_epoch and epochs can be different.
Generally, steps_per_epoch = number of sequences / batch_size.
Note: from reading your description, your task appears to be a binary classification problem, not a sequence-to-sequence problem. A good example of sequence to sequence is language translation. Just google around and you will find what I mean.
And if you really want to see a difference in training times, I suggest using a GPU (if available) and CuDNNLSTM.
Thank you, that works! With regard to what to call the problem, I know what you mean but I was afraid "binary classification" would lead readers to think entire sequences should be classified, rather than outputting a class for each time step. With regard to using CuDNNLSTM, it seems this layer doesn't support masking, am I right?
– jerha202
Nov 23 '18 at 16:40
Unfortunately, yes.
– Manoj
Nov 24 '18 at 17:12
Sorry, I meant that CuDNNLSTM does not support masking.
– Manoj
Nov 24 '18 at 17:27
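For anyone who still wants the CuDNN speedup despite that limitation: the trade-off is to drop the Masking layer and accept that zero-padded timesteps are trained on like real data. A rough sketch of that variant (my assumption, not from the original answer; it relies on tf.keras.layers.CuDNNLSTM from TF 1.x and requires a GPU):

model = tf.keras.models.Sequential([
    # No Masking layer: CuDNNLSTM does not consume masks, so padded steps count
    tf.keras.layers.CuDNNLSTM(32, return_sequences=True, input_shape=(None, 6)),
    tf.keras.layers.Dense(2, activation='softmax')
])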
In case it helps someone, here's how I ended up implementing a solution:
import tensorflow as tf
import numpy as np

# Load data from file
x_list, y_list = loadSequences("train.csv")
# x_list is now a list of arrays (m,n) of float64, where m is the timesteps
# and n is the number of features.
# y_list is a list of arrays (m,1) of Boolean.
assert len(x_list) == len(y_list)
num_sequences = len(x_list)
num_features = len(x_list[0][0])
batch_size = 10
batches_per_epoch = 5
assert batch_size * batches_per_epoch == num_sequences

def train_generator():
    # Sort by length so the number of timesteps in each batch is minimized
    x_list.sort(key=len)
    y_list.sort(key=len)
    # Generate batches
    while True:
        for b in range(batches_per_epoch):
            longest_index = (b + 1) * batch_size - 1
            timesteps = len(x_list[longest_index])
            x_train = np.zeros((batch_size, timesteps, num_features))
            y_train = np.zeros((batch_size, timesteps, 1))
            for i in range(batch_size):
                li = b * batch_size + i
                x_train[i, 0:len(x_list[li]), :] = x_list[li]
                y_train[i, 0:len(y_list[li]), 0] = y_list[li]
            yield x_train, y_train

model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(mask_value=0., input_shape=(None, num_features)),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit_generator(train_generator(), steps_per_epoch=batches_per_epoch, epochs=100)
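For completeness, a minimal sketch of how the trained model can then be used on one new recording to get a class per timestep (new_sequence here is a hypothetical (n, 6) float array, not part of the original post):

probs = model.predict(new_sequence[np.newaxis, ...])  # shape (1, n, 2)
is_walking = probs[0, :, 1] > 0.5                     # one boolean per timestep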