Understanding memory allocation in numpy: Is “temporary” memory being allocated when storing the result of an operation into variable[:, :]?
Let's assume two large multidimensional numpy arrays a and b. I want to perform an element-wise operation, e.g. adding them element by element:
c = a + b
In the above case, new memory is allocated for the result of a + b. A reference to this memory is then stored in c.
Now, let's assume that memory for c has already been allocated. Setting the number of dimensions to two for the sake of a simple example, I can do the following:
c[:, :] = a + b
I cannot find any documentation on how exactly the above is implemented. I can imagine two ways:
- First, memory is allocated for performing the operation a + b. The result is stored in this "temporary" memory, and the data is then copied into c[:, :].
- There is no allocation of temporary memory. The result of a + b goes directly into c[:, :].
I played around with some code and - I could be absolutely wrong here - performance-wise it feels like the first option is more likely. Am I right? If so, how could I avoid the allocation of "temporary" memory and store the result directly in the memory that is already available in c? I'd guess that I have to be more explicit, use functions like numpy.add, and pass them a reference to the target memory.
python python-3.x numpy memory-management
It's the first one because of how Python evaluates left-to-right... while numpy can do some fancy stuff, it can't override the mechanics of how Python evaluates statements...
– Jon Clements♦
Nov 12 '18 at 19:22
@JonClements Yes, I guessed so ... I would not know how to implement __add__ in another way.
– s-m-e
Nov 12 '18 at 19:23
asked Nov 12 '18 at 19:19
s-m-e
1 Answer
The operation you're looking for is
numpy.add(a, b, out=c)
With c[:, :] = a + b, the evaluation of a + b has no information that the result will be assigned to c[:, :]. It must allocate a new array to hold the result of a + b, which the slice assignment then copies into c.
(Recent versions of NumPy do try to perform some C-level stack inspection to aggressively optimize temporaries beyond what the Python execution model would normally allow, but those optimizations don't handle this case. You can see the code in temp_elide.c, including some notes about what platforms it works on and why Python stack inspection isn't enough.)
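As a minimal sketch of the difference, the following compares the peak traced memory of both spellings. The array shape, the helper peak_bytes, and the use of tracemalloc are illustrative choices of mine, not part of the answer above. It assumes NumPy 1.13 or later, which reports array data allocations to tracemalloc:

```python
import tracemalloc

import numpy as np

# Illustrative sizes; any large, non-overlapping arrays should behave alike.
shape = (1000, 1000)
a = np.ones(shape)
b = np.full(shape, 2.0)
c = np.empty(shape)


def peak_bytes(func):
    """Return the peak number of bytes traced while func() runs (hypothetical helper)."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def with_temporary():
    # a + b is evaluated first and allocates a new array,
    # which the slice assignment then copies into c.
    c[:, :] = a + b


def without_temporary():
    # The ufunc writes its result into the existing buffer of c.
    np.add(a, b, out=c)


peak_temp = peak_bytes(with_temporary)
peak_out = peak_bytes(without_temporary)

print(f"c[:, :] = a + b      peak: {peak_temp / 1e6:.1f} MB")
print(f"np.add(a, b, out=c)  peak: {peak_out / 1e6:.1f} MB")
```

With these sizes one array occupies about 8 MB, so the first figure should be at least that large, while the second should stay far below it. Exact numbers will vary by platform and NumPy version.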
Yup... the object on the RHS has no idea what it's being bound to (if it's being bound at all), so the only way to do it is like you say here using numpy.add(a, b, out=c) where it's explicitly given some space it can work in without having to make the presumption it has to build its own array for the result. (I tend to think of it like using C's memcpy kind of thing vs... a malloc operation)
– Jon Clements♦
Nov 12 '18 at 19:26
Thanks for the quick reply, this makes sense. I was simply wondering whether (or not) there is some black Python magic that I was not aware of :)
– s-m-e
Nov 12 '18 at 19:28
Expressions like np.add(A[:,1:],A[:,:-1],out=A[:,1:]) suggest, though, that even with an out, numpy creates a temporary buffer.
– hpaulj
Nov 12 '18 at 19:50
@hpaulj: NumPy only makes copies for such an operation if it thinks a copy is necessary to handle overlapping input and output. For something like add(a, b, out=c) with no overlap, it won't make a copy. (The safety copies were introduced in NumPy 1.13.0.)
– user2357112
Nov 12 '18 at 19:56
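To illustrate the overlap case from the last two comments, here is a small sketch. The arrays and their values are made up for illustration. np.shares_memory reports whether two arrays overlap in memory; assuming NumPy 1.13.0 or later, the overlapping call should produce the same result as computing the sum from untouched inputs first:

```python
import numpy as np

A = np.arange(12, dtype=np.int64).reshape(3, 4)

# Reference result, computed into a fresh array before A is modified.
expected = A[:, 1:] + A[:, :-1]

# The second input and the output are views on the same buffer and overlap.
overlaps = np.shares_memory(A[:, :-1], A[:, 1:])

# With overlapping input and output, NumPy >= 1.13.0 is expected to make
# a safety copy so that already-written elements are not read back.
np.add(A[:, 1:], A[:, :-1], out=A[:, 1:])

# Three separate arrays: no overlap between the inputs and the output.
a = np.ones((3, 4))
b = np.ones((3, 4))
c = np.empty((3, 4))
no_overlap = not (np.shares_memory(a, c) or np.shares_memory(b, c))
np.add(a, b, out=c)

print("overlapping views:", overlaps)
print("matches reference:", np.array_equal(A[:, 1:], expected))
print("separate arrays do not overlap:", no_overlap)
```

The check with np.shares_memory is only there to show which situation applies; np.add itself decides internally whether a copy is needed.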
edited Nov 12 '18 at 19:30
answered Nov 12 '18 at 19:24
user2357112