Sequence-to-sequence classification with variable sequence lengths in Keras

I would like to train an LSTM or GRU network in TensorFlow/Keras to continuously recognize whether a user is walking or not, based on input from motion sensors (accelerometer and gyroscope). I have 50 input sequences, each with 6 features and a length varying from 581 to 5629 time steps, and 50 corresponding output sequences of boolean values. My problem is that I don't know how to feed the training data to the fit() method.



I know approximately what I need to do: I'd like to train with 5 batches of 10 sequences each, and for each batch I have to pad all but the longest sequence so that all 10 sequences have the same length, and apply masking. I just don't know how to build the data structures. I know that I can make one big 3D tensor of size (50, 5629, 6) and that works, but it's painfully slow, so I'd really like to keep the sequence length of each batch as small as possible.
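
For reference, the "one big 3D tensor" approach I mention above can be built with Keras's pad_sequences. This is only a sketch of that slow baseline; the post-padding with zeros (so the Masking layer skips the padded steps) and the float64 dtype are assumptions on my part, and x_list/y_list are the lists loaded in the code below.

import tensorflow as tf

# Sketch of the slow "one big 3D tensor" baseline: pad every sequence with
# zeros at the end up to the longest length (5629), so Masking can skip them.
x_train = tf.keras.preprocessing.sequence.pad_sequences(
    x_list, padding='post', dtype='float64')   # shape (50, 5629, 6)
y_train = tf.keras.preprocessing.sequence.pad_sequences(
    y_list, padding='post', dtype='float64')   # shape (50, 5629, 1)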



Here's the problem in code:



import tensorflow as tf
import numpy as np

# Load data from file
x_list, y_list = loadSequences("train.csv")

# x_list is now a list of arrays (n,6) of float64, where n is the timesteps
# and 6 is the number of features, sorted by increasing sequence lengths.
# y_list is a list of arrays (n,1) of Boolean.

x_train = # WHAT DO I WRITE HERE?
y_train = # AND HERE?

model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=100)









python keras lstm

asked Nov 14 '18 at 13:54, edited Nov 14 '18 at 15:54 – jerha202
  • You can try and make some embeddings of fixed size (ex. 500) and feed that into your network

    – Novak
    Nov 14 '18 at 15:57











  • @Novak I only know embeddings as a sort of feature vector for words in NLP. Can you clarify how that would be useful for feeding sequences of variable lengths into model.fit()? Notice that my problem is not NLP-related.

    – jerha202
    Nov 15 '18 at 10:30












  • Embeddings are mostly used in NLP, but they are used for other things as well. There are functions that can create embeddings from an arbitrary sequence, or you can use something like dimensionality reduction. Either way, the goal is to get a shorter sequence that preserves an object's attributes relative to other objects and whose mapping to the original object is one-to-one (every object has a different embedding).

    – Novak
    Nov 15 '18 at 11:26











  • In Keras, you can use the Embedding layer to embed the input data. You feed the padded data to the embedding layer and set its output dimension to 500, so you get 500-long embeddings. Then you train it, and for prediction you can use the embedding layer's weights to embed the input sequences before feeding the data to the model.

    – Novak
    Nov 15 '18 at 11:28











  • My problem is a sequence-to-sequence classification problem, meaning that I need to predict a class for each time step. For example, a (1000, 6) sequence (1000 timesteps, 6 features) of floats should predict a (1000,1) sequence of booleans. Moreover, prediction should work in real time, so it should predict a new boolean for each new incoming 6-feature vector. So I don't think embedding an arbitrary-length sequence into a fixed-size embedding would work here. I'm sorry if my problem description wasn't clear!

    – jerha202
    Nov 15 '18 at 12:05















2 Answers

You can do something like this:

Use a generator function; take a look at the Keras documentation for the fit_generator method.



import json
import numpy as np

def data_generater(batch_size):
    print("reading data")
    training_file = open('data_location', 'r')

    # assuming data is in json format, so feel free to change accordingly
    training_set = json.loads(training_file.read())
    training_file.close()

    batch_i = 0   # Counter inside the current batch vector
    batch_x = []  # The current batch's x data
    batch_y = []  # The current batch's y data

    while True:

        for obj in training_set:
            # append your input sequences one by one
            # (here I assume each record keeps its (n, 6) sequence under 'seq')
            batch_x.append(np.array(obj['seq']))
            if obj['val'] == True:
                batch_y.append([1])
            elif obj['val'] == False:
                batch_y.append([0])
            batch_i += 1

            if batch_i == batch_size:
                # Ready to yield the batch:
                # pad input to max length in the batch
                batch_x = pad_txt_data(batch_x)
                yield batch_x, np.array(batch_y)
                batch_x = []
                batch_y = []
                batch_i = 0

def pad_txt_data(arr):
    # expecting arr to be in the shape of (10, m, 6)
    padded_arr = []
    preferred_len = len(max(arr, key=len))

    # Now pad all sequences with zeros to the preferred (longest) length in the batch
    for seq in arr:
        padded = np.zeros((preferred_len, seq.shape[1]))
        padded[:len(seq)] = seq
        padded_arr.append(padded)

    return np.array(padded_arr)


And in the model:



from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Masking(mask_value=0., input_shape=(None, 6)))
model.add(keras.layers.LSTM(32))
# A single output unit needs a sigmoid activation and binary cross-entropy;
# a softmax over one unit would always output 1.
model.add(keras.layers.Dense(1, activation="sigmoid"))
model.compile(optimizer="Adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit_generator(data_generater(10), steps_per_epoch=5, epochs=10)


batch_size, steps_per_epoch, and epochs can be different. Generally,

steps_per_epoch = (number of sequences / batch_size)

(here: 50 sequences with a batch size of 10 gives 5 steps per epoch).



Note: From reading your description, your task appears to be a binary classification problem rather than a sequence-to-sequence problem. A good example of sequence to sequence is language translation. Just google around and you will find what I mean.



And if you really want to see a difference in training times, I suggest using a GPU (if available) and CuDNNLSTM.






– Manoj
answered Nov 14 '18 at 16:38, edited Nov 18 '18 at 0:57
  • Thank you, that works! With regard to what to call the problem, I know what you mean but I was afraid "binary classification" would lead readers to think entire sequences should be classified, rather than outputting a class for each time step. With regard to using CuDNNLSTM, it seems this layer doesn't support masking, am I right?

    – jerha202
    Nov 23 '18 at 16:40











  • Unfortunately, yes.

    – Manoj
    Nov 24 '18 at 17:12











  • Sorry, I meant that CuDNNLSTM does not support masking.

    – Manoj
    Nov 24 '18 at 17:27


















In case it helps someone, here's how I ended up implementing a solution:



import tensorflow as tf
import numpy as np

# Load data from file
x_list, y_list = loadSequences("train.csv")

# x_list is now a list of arrays (m,n) of float64, where m is the timesteps
# and n is the number of features.
# y_list is a list of arrays (m,1) of Boolean.
assert len(x_list) == len(y_list)
num_sequences = len(x_list)
num_features = len(x_list[0][0])
batch_size = 10
batches_per_epoch = 5
assert batch_size * batches_per_epoch == num_sequences

def train_generator():
    # Sort by length so the number of timesteps in each batch is minimized
    x_list.sort(key=len)
    y_list.sort(key=len)
    # Generate batches
    while True:
        for b in range(batches_per_epoch):
            longest_index = (b + 1) * batch_size - 1
            timesteps = len(x_list[longest_index])
            x_train = np.zeros((batch_size, timesteps, num_features))
            y_train = np.zeros((batch_size, timesteps, 1))
            for i in range(batch_size):
                li = b * batch_size + i
                x_train[i, 0:len(x_list[li]), :] = x_list[li]
                y_train[i, 0:len(y_list[li]), 0] = y_list[li]
            yield x_train, y_train

model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(mask_value=0., input_shape=(None, num_features)),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit_generator(train_generator(), steps_per_epoch=batches_per_epoch, epochs=100)
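
As noted in the comments, prediction should eventually run in real time, producing a new boolean for each incoming 6-feature vector. Below is a minimal sketch of one way to do that with the weights trained above; the stateful copy, the streaming_model and predict_step names, and the batch size of 1 are assumptions for illustration, not part of the solution itself.

# Hypothetical sketch: step-by-step prediction with a stateful copy of the model.
# The Masking layer holds no weights, so the trained weights map onto this model directly.
streaming_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True, stateful=True,
                         batch_input_shape=(1, 1, num_features)),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
streaming_model.set_weights(model.get_weights())

def predict_step(sample):
    # sample: one 6-feature vector of shape (num_features,)
    probs = streaming_model.predict(sample.reshape(1, 1, num_features))
    return bool(np.argmax(probs[0, 0]))  # class 1 corresponds to the Boolean True label

# Call streaming_model.reset_states() whenever a new recording starts.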





– jerha202
answered Nov 23 '18 at 17:00