How does dieharder know when processing binary numbers from a file how “long” these numbers are?

I want to examine the output of a random number generator. Each random number is 4 bytes long. I have collected 50,000 numbers, each stored in a separate file (so I have 50,000 files of 4 bytes each).
I'd like dieharder (a testing and benchmarking tool for random number generators) to test these numbers.

dieharder supports several ways of feeding it random data. For ASCII input it reads a header in which one can define "numbit". According to the man page, the invocation looks like this:

dieharder -g 202 -f testrands.txt -a

where testrands.txt should begin with a header such as the following, with the numbers listed after it, one per line:

#==================================================================
# generator mt19937_1999 seed = 1274511046
#==================================================================
type: d
count: 100000
numbit: 32
3129711816
85411969
2545911541
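
For reference, the ASCII layout quoted above could be produced from 4-byte samples with something like the following. This is only a minimal sketch: the helper names, the `label` text and the little-endian byte order are my assumptions, not anything taken from dieharder, and I have not confirmed whether the `#` banner lines are required or merely cosmetic.

```python
def u32_from_bytes(chunk, byteorder="little"):
    """Decode one 4-byte sample as an unsigned 32-bit integer.

    The byte order is an assumption; use whatever the generator actually emits.
    """
    if len(chunk) != 4:
        raise ValueError(f"expected 4 bytes, got {len(chunk)}")
    return int.from_bytes(chunk, byteorder)


def format_ascii_input(values, label="my_generator", numbit=32):
    """Return text in the layout quoted above: banner, header fields, one value per line.

    `label` is a hypothetical placeholder for the generator description.
    """
    banner = "#" + "=" * 66
    header = [
        banner,
        f"# generator {label}",
        banner,
        "type: d",
        f"count: {len(values)}",
        f"numbit: {numbit}",
    ]
    return "\n".join(header + [str(v) for v in values]) + "\n"


if __name__ == "__main__":
    # Round-trip the three sample values from the header example.
    raw = [v.to_bytes(4, "little") for v in (3129711816, 85411969, 2545911541)]
    values = [u32_from_bytes(chunk) for chunk in raw]
    print(format_ascii_input(values), end="")
```

The `count:` field is derived from the list length so that the header cannot disagree with the data that follows it.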

dieharder also supports raw binary input:

dieharder -g 201 -f testrands.bin -a

My question is: how would dieharder know in that case that my original numbers were 4 bytes long? It would only see a stream of 50,000 * 4 = 200,000 bytes.
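
To make the setup concrete, this is roughly how the 50,000 single-sample files could be joined into one file such as testrands.bin. It is a sketch under assumptions: the function name and directory layout are hypothetical, and it assumes the file names sort in the order in which the numbers were generated (for example, zero-padded counters).

```python
import tempfile
from pathlib import Path


def concatenate_samples(sample_dir, out_path, sample_size=4):
    """Append every file in sample_dir to out_path and return how many were written.

    Files are taken in sorted-name order (an assumption about how they were named).
    Keep out_path outside sample_dir so it is not picked up as a sample.
    """
    files = sorted(p for p in Path(sample_dir).iterdir() if p.is_file())
    with open(out_path, "wb") as out:
        for path in files:
            data = path.read_bytes()
            if len(data) != sample_size:
                raise ValueError(f"{path} holds {len(data)} bytes, expected {sample_size}")
            out.write(data)
    return len(files)


if __name__ == "__main__":
    # Small self-contained demo with three fake samples instead of 50,000.
    with tempfile.TemporaryDirectory() as tmp:
        samples = Path(tmp) / "samples"
        samples.mkdir()
        for i, value in enumerate((3129711816, 85411969, 2545911541)):
            (samples / f"{i:05d}.bin").write_bytes(value.to_bytes(4, "little"))
        out_file = Path(tmp) / "testrands.bin"
        count = concatenate_samples(samples, out_file)
        print(count, "samples,", out_file.stat().st_size, "bytes")
```

The resulting file carries no header and no separators, which is exactly why the question of sample length arises.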

  • Please give appropriate links to dieharder (not everyone knows what you are talking about) as well as a Minimal, Complete, and Verifiable example which illustrates the problem.

    – John Coleman
    Nov 15 '18 at 15:20

  • Wouldn't the simplest solution be to convert your data to ASCII?

    – 500 - Internal Server Error
    Nov 15 '18 at 15:28

  • I edited the question. Is it clear now? @500-InternalServerError: I know, but that is not the question, I'd like to understand the raw binary processing.

    – dudekowsky
    Nov 15 '18 at 16:35

  • For binary data, length is irrelevant. Random bytes, random words, random longs, should all be indistinguishable from random bits.

    – Lee Daniel Crocker
    Nov 15 '18 at 18:06

  • @LeeDanielCrocker That's what I wonder. Is it really like that? Does that mean I can always "concatenate" random data of arbitrary length and test it for randomness? I wondered about that also with tools like this one: csrc.nist.gov/projects/random-bit-generation/…

    – dudekowsky
    Nov 15 '18 at 19:17
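
The point made in the comments, that a concatenated stream carries no record boundaries, can be illustrated with a short sketch. It does not model dieharder itself; the helper names are hypothetical and the little-endian byte order is an assumption. My recollection of the man page is that raw input is consumed in 32-bit increments, but that should be checked against the man page rather than taken from here.

```python
def join_words(values, width=4, byteorder="little"):
    """Concatenate unsigned integers into one headerless byte stream."""
    return b"".join(v.to_bytes(width, byteorder) for v in values)


def split_words(stream, width, byteorder="little"):
    """Cut a byte stream into unsigned integers of `width` bytes each.

    The stream itself does not say which width or byte order is "right";
    the reader has to choose both.
    """
    if len(stream) % width:
        raise ValueError("stream length is not a multiple of the word width")
    return [
        int.from_bytes(stream[i:i + width], byteorder)
        for i in range(0, len(stream), width)
    ]


if __name__ == "__main__":
    originals = [3129711816, 85411969, 2545911541]
    stream = join_words(originals)

    # Same bytes, three different readings.
    print("as 32-bit words:", split_words(stream, 4))
    print("as 16-bit words:", split_words(stream, 2))
    print("as single bytes:", split_words(stream, 1))

    # The original numbers come back only with the width and byte order used to write them.
    print("big-endian read:", split_words(stream, 4, "big"))
```

If the generator is good, every one of these readings should look random; the original 4-byte values are recovered only when the reader uses the same width and byte order as the writer.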

Tags: random

asked Nov 15 '18 at 15:13, edited Nov 15 '18 at 16:33
dudekowsky