How do I write a futures::Stream to disk without storing it entirely in memory first?

There's an example of downloading a file with Rusoto S3 here:
How to save a file downloaded from S3 with Rusoto to my hard drive?

The problem is that it appears to download the whole file into memory and then write it to disk, because it uses the write_all method, which takes a slice of bytes rather than a stream. How can I use the StreamingBody, which implements futures::Stream, to stream the file to disk?

asked Nov 11 at 2:46 by Nicholas Bishop
edited Nov 11 at 4:26 by Shepmaster

for x in stream { file.write_all(&x) } something like that...
– Stargateur
Nov 11 at 4:04

That would require StreamingBody to be an iterator, which it is not.
– Nicholas Bishop
Nov 11 at 4:09

Tags: stream, rust, future

1 Answer (accepted)

Since StreamingBody implements Stream<Item = Vec<u8>, Error = Error>, we can construct an MCVE that represents that:

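extern crate futures; // 0.1.25

use futures::{prelude::*, stream};

type Error = Box<dyn std::error::Error>;

fn streaming_body() -> impl Stream<Item = Vec<u8>, Error = Error> {
    // Dummy chunks standing in for a real response body.
    const DUMMY_DATA: &[&[u8]] = &[b"0123", b"4567", b"89AB", b"CDEF"];
    let iter_of_owned_bytes = DUMMY_DATA.iter().map(|&b| b.to_owned());
    stream::iter_ok(iter_of_owned_bytes)
}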

We can then get a "streaming body" somehow and use Stream::for_each to process each element in the Stream. Here, we just call write_all with some provided output location:

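use std::{fs::File, io::Write};

fn save_to_disk(mut file: impl Write) -> impl Future<Item = (), Error = Error> {
    // Write each chunk as it arrives; convert io::Error into the boxed Error.
    streaming_body().for_each(move |chunk| file.write_all(&chunk).map_err(Into::into))
}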

We can then write a little testing main:

fn main() {
    let mut file = Vec::new();

    {
        let fut = save_to_disk(&mut file);
        fut.wait().expect("Could not drive future");
    }

    assert_eq!(file, b"0123456789ABCDEF");
}
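
Because save_to_disk accepts any impl Write, the Vec<u8> used in the test can be swapped for a real file. The following is a minimal std-only sketch of that idea and is not part of the original answer: a plain iterator of chunks stands in for the stream, and the names write_chunks and write_chunks_to_path are hypothetical.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Writes each chunk to `sink` as it arrives, so only one chunk is held at a time.
/// `chunks` is a hypothetical stand-in for the data a stream would yield.
fn write_chunks<W: Write>(
    chunks: impl IntoIterator<Item = Vec<u8>>,
    mut sink: W,
) -> io::Result<u64> {
    let mut written = 0u64;
    for chunk in chunks {
        sink.write_all(&chunk)?;
        written += chunk.len() as u64;
    }
    // Flush so buffered bytes reach the underlying writer before returning.
    sink.flush()?;
    Ok(written)
}

/// Same routine, targeting a file on disk through a buffered writer.
fn write_chunks_to_path(
    chunks: impl IntoIterator<Item = Vec<u8>>,
    path: &Path,
) -> io::Result<u64> {
    let file = File::create(path)?;
    write_chunks(chunks, BufWriter::new(file))
}
```

Wrapping the File in a BufWriter is a design choice that avoids one system call per small chunk. The explicit flush surfaces write errors that would otherwise be silently dropped when the buffer goes out of scope.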

Important notes about the quality of this naïve implementation:

The call to write_all may block, which you should not do in an asynchronous program. It would be better to hand that blocking work off to a thread pool.

The usage of Future::wait forces the thread to block until the future is done, which is great for tests but may not be correct for your real use case.

See also:

What is the best approach to encapsulate blocking I/O in future-rs?

How do I synchronously return a value calculated in an asynchronous Future in stable Rust?
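
One way to picture the first note above is to move the blocking write_all calls onto a separate thread and feed it chunks through a bounded channel. This is only a std-only sketch under stated assumptions: a single dedicated thread stands in for a thread pool, spawn_writer is a hypothetical name, and the blocking send of std's sync_channel would itself need an async-aware replacement inside a real event loop.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Spawns one dedicated writer thread (a stand-in for a real thread pool).
/// Chunks sent on the returned channel are written in order; joining the
/// handle yields the sink back together with the number of bytes written.
fn spawn_writer<W>(
    mut sink: W,
    capacity: usize,
) -> (
    mpsc::SyncSender<Vec<u8>>,
    thread::JoinHandle<io::Result<(W, u64)>>,
)
where
    W: Write + Send + 'static,
{
    // A bounded channel applies backpressure: senders wait once `capacity`
    // chunks are queued, so the amount of buffered data stays bounded.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(capacity);
    let handle = thread::spawn(move || -> io::Result<(W, u64)> {
        let mut written = 0u64;
        // The loop ends once every sender has been dropped.
        for chunk in rx {
            sink.write_all(&chunk)?;
            written += chunk.len() as u64;
        }
        sink.flush()?;
        Ok((sink, written))
    });
    (tx, handle)
}
```

The producer sends each chunk as it arrives and drops the sender when the stream ends; joining the handle then reports any I/O error from the writer thread.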

answered Nov 11 at 20:11 by Shepmaster
edited Nov 12 at 14:12

One question about this. Where you call streaming_body().for_each(...), is that more or less equivalent to doing for chunk in streaming_body().wait() { ... }, other than one using a closure and the other using an iterator?
– Nicholas Bishop
Nov 12 at 2:45

@NicholasBishop There is a relevant difference between the two. The for loop you suggest blocks the current thread, so the thread can't do any other work until the complete stream is resolved. The stream combinator for_each(), on the other hand, yields control to the event loop whenever it would block. (Of course the test code in this answer does not use an event loop, and also blocks until the future is resolved. However, the whole point of asynchronous code is not to unnecessarily block the current thread, so you wouldn't do this in real code.)
– Sven Marnach
Nov 12 at 10:12

For my use case I do actually want to block the thread -- just because the API provided is async doesn't mean that the calling code is.
– Nicholas Bishop
Nov 12 at 15:11

@NicholasBishop Then both versions are fine. I'd probably use Stream::wait() in that case, since working with stream combinators can be cumbersome.
– Sven Marnach
Nov 12 at 15:27

Stream::wait is fine for now, but it's being removed in the futures rework (just like Future::wait). There will be a direct replacement for Future::wait, but I don't know of one for Stream::wait.
– Shepmaster
Nov 12 at 16:29
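
For the blocking case discussed in these comments, the loop has roughly the following shape. In futures 0.1, Stream::wait() returns an iterator whose items are Result<Item, Error>; the sketch below assumes that shape but takes a plain iterator instead, so it needs only std. drain_blocking and BoxError are hypothetical names.

```rust
use std::error::Error;
use std::io::Write;

type BoxError = Box<dyn Error>;

/// Drains a blocking iterator of fallible chunks into `sink`.
/// A plain iterator stands in for the one returned by `stream.wait()`.
fn drain_blocking<I, W>(chunks: I, mut sink: W) -> Result<(), BoxError>
where
    I: IntoIterator<Item = Result<Vec<u8>, BoxError>>,
    W: Write,
{
    for chunk in chunks {
        // Stop at the first stream error; earlier chunks are already written.
        let chunk = chunk?;
        sink.write_all(&chunk)?;
    }
    Ok(())
}
```

Each chunk is dropped at the end of its loop iteration, so memory use is bounded by the largest single chunk rather than by the size of the whole body.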
