Poor exit code when managing multiple sessions
I'm having a weird issue when training models using tf.Graph and tf.Session. The implementation is somewhat odd, so bear with me while I explain the application structure.
The issue has been finally (and somewhat embarrassingly) resolved by updating all packages.
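Since the fix turned out to be a package update, it may help to record which versions are installed before and after updating. The following is a minimal sketch, not part of the original setup: the helper name `installed_versions` and the package list are assumptions for illustration, and `importlib.metadata` requires Python 3.8 or newer (newer than the 3.6.5 mentioned below).

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# The package list is illustrative; adjust it to the environment at hand.
print(installed_versions(["tensorflow", "keras", "numpy", "h5py"]))
```

Comparing the printed dictionary before and after the update shows which package actually changed.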
Application
The application is a service for handling multiple neural networks: training them and making predictions with them. For this reason a single graph wasn't quite enough. So when creating a new model, I first initialise both a Graph and a Session like so:
def __init__(self):
    self.graph = tf.Graph()
    with self.graph.as_default():
        self.session = tf.Session()
These are then used both in the training process and when loading a model from disk.
def fit(self, x_train, y_train, n=200, batch=256):
    with self.graph.as_default():
        with self.session.as_default():
            self.model.fit(x_train, y_train, epochs=n, batch_size=batch, verbose=0)
This is where the problem occurs (I've managed to comment everything out one by one, and the fit method is where it's at), but for further context, here is the (stripped down) creation method as well. It uses Keras.
def create(self):
    with self.graph.as_default():
        with self.session.as_default():
            self.model = Sequential()
            self.model.add(Dense(64, input_dim=shape[0], activation='relu',
                                 kernel_regularizer=reg.l1_l2(0.1, 0.2)))
            self.model.add(Dropout(0.5))
            self.model.add(Dense(1, activation='sigmoid', kernel_regularizer=reg.l1_l2(0.1, 0.2)))
            self.model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
Problem
When initialising the network and fitting it with data, the process exits with a bad code: 0xC0000005. This doesn't give much information on the problem itself, and the bad exit code only appears when the process exits. Even a print statement placed after the routines executes successfully. This has led me to suspect that the problem is not in the implementation but somewhere else.
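For reference, 0xC0000005 is the Windows status code STATUS_ACCESS_VIOLATION, meaning native code touched memory it was not allowed to access. Some tools print the same value as the signed decimal -1073741819. A small sketch for normalising and naming the code follows; the helper `describe_exit_code` and the lookup table are hypothetical names for illustration, and the table lists only this one code.

```python
# Hypothetical lookup table; only the code from this question is listed.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def describe_exit_code(code):
    """Return a readable description of a process exit code."""
    # Mask to 32 bits so signed values such as -1073741819 match the table.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned)
    if name is None:
        return "exit code {} (0x{:08X})".format(code, unsigned)
    return "{} (0x{:08X})".format(name, unsigned)


print(describe_exit_code(-1073741819))
print(describe_exit_code(0xC0000005))
```

Both calls describe the same status, which is consistent with a crash inside a native extension rather than a Python exception.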
Environment
I'm using Python 3.6.5 in PyCharm, but the problem has occurred even when executing from a command line. As I said, multiple models are juggled around, but one training is enough to crash.
What could possibly be at fault here? I realise it's not such a reproducible problem, but any pointers towards even debugging would be greatly appreciated.
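One general debugging aid for crashes that leave no Python traceback is the standard-library `faulthandler` module, which dumps the Python stack when the interpreter receives a fatal signal. The sketch below runs a snippet in a child interpreter with `-X faulthandler` and captures its exit code and stderr. The helper name `run_with_faulthandler` is an assumption, and the snippet passed in is a placeholder rather than the real training code.

```python
import subprocess
import sys


def run_with_faulthandler(code):
    """Run a Python snippet in a child interpreter with faulthandler enabled.

    Returns the child's exit code and everything it wrote to stderr, which is
    where faulthandler writes its stack dump on a fatal error.
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode, result.stderr


# Placeholder snippet; the real create/fit calls would go here instead.
returncode, stderr = run_with_faulthandler("print('training would run here')")
print("child exit code:", returncode)
print("child stderr:", stderr)
```

Running the training in a child process also keeps the parent alive if the child dies, so the exit code and any stack dump can be logged.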
Adventures
I tried modifying the fit function according to this answer, but with no luck. Here's the modified version:
from keras import backend as K
import gc

def fit(self, x_train, y_train, n=20, batch=256):
    K.set_session(self.session)
    with self.graph.as_default():
        with self.session.as_default():
            self.model.fit(x_train, y_train, epochs=n*10, batch_size=batch, verbose=0)
    K.clear_session()
    gc.collect()
Next I tried to create a new session for each computation (tf.Session(graph=self.graph)). It worked when using gc.collect(), but after training the model, I could not make predictions with a new session:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value dense_1/bias
Update
Currently (Nov 8th) I'm releasing all possible resources before creating and when loading a model. This has had the effect that I can create the model once, but the second time around (I do two training passes to evaluate the model independently) the program crashes like before. Let's try a new question, this is getting out of hand. Q v.2
python tensorflow keras
You posted your training routine, but where's the code you use for passing inputs through your trained model in order to generate outputs, i.e. predictions? Are you saving the model as a .pkl file?
– Dascienz
Oct 26 '18 at 13:38
@Dascienz As I said, only a call to first create and then fit (ok, maybe I wasn't clear) is enough to give the error. The predict method is the same nested with-statement and model.predict. I'm saving the model as its JSON structure and an .h5 weights file, but again, that doesn't affect this situation.
– Felix
Oct 26 '18 at 14:05
Can you post a screenshot of the full error message being thrown on your interface?
– Dascienz
Oct 26 '18 at 14:08
Is the indentation error in your create function just a typo, by the way? Your two with statements...
– Dascienz
Oct 26 '18 at 14:10
@Dascienz Yes, it is a typo :D But I'm sorry, I'm not able to get my hands on the code for a while now, at least for the weekend. But there really was no thrown error as I said, the exit code of the process was just the error code above.
– Felix
Oct 26 '18 at 18:44
asked Oct 26 '18 at 6:14 by Felix
edited Nov 15 '18 at 14:26