Finding a bit pattern in a binary file using Python and memory map

I am processing a binary file that is not byte-aligned at the start. Shortly into the file there is a 24-bit pattern, 0xfaf330, which is a sync marker; the data that follows it is byte-aligned relative to the marker. I am using Python's mmap on the file and would like to use a memoryview to process the rest of the file once the marker is found. So, how do I find the 24-bit pattern and then use mmap and memoryview from that point forward?

edited Nov 15 '18 at 12:52

asked Nov 15 '18 at 12:49

GAF

7624

Is there a reason why you mmap the file and don't just open and stream it?

– MisterMiyagi
Nov 15 '18 at 12:51

The file is very large and memory mapping helps to manage it.

– GAF
Nov 15 '18 at 12:55

Using open will only buffer a portion of the file at any time. Do you need random access? Your description sounds ideal for stream processing.

– MisterMiyagi
Nov 15 '18 at 12:57

Subsequently, memoryview helps to process the remaining byte aligned data in chunks based on the file format specification.

– GAF
Nov 15 '18 at 12:57

The data read is subject to Python's regular garbage collection. Unless you hang on to it, it is reclaimed.

– MisterMiyagi
Nov 15 '18 at 13:22

python-3.x binaryfiles


2 Answers

If you do not need random access, you can use open to stream the file. Using file.read, you can get consecutive bytes from the file. If your file were byte-aligned, you could directly search through it:

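in_stream = open('/dev/urandom', 'rb')
# discard individual bytes until the next three bytes are the marker
# (peek() may return fewer bytes than requested near the end of its buffer,
# so a marker that straddles a buffer boundary can be missed)
while in_stream.peek(3)[:3] != b'\xfa\xf3\x30':
    in_stream.read(1)
# in_stream is now positioned at the first byte of the marker
print(in_stream.tell())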

By default, open uses a small read buffer but never loads the entire file. You can stream through the file using further in_stream.read calls.

Alternatively, you can use the result of in_stream.tell() to jump to the correct position in an mmap'ed file.
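A minimal sketch of that hand-off, assuming the marker is byte-aligned so its position can be expressed as a plain byte index. The temporary file, its contents, and the variable names are made up for illustration; `mmap.find` stands in for the position you would otherwise get from `in_stream.tell()`.

```python
import mmap
import os
import tempfile

MARKER = b'\xfa\xf3\x30'

# Throwaway demo file: 5 junk bytes, the marker, then some payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00\x01\x02\x03\x04' + MARKER + b'payload')
    path = tmp.name

with open(path, 'rb') as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
    position = mapped.find(MARKER)  # byte-aligned search only
    start = position + len(MARKER)
    # Slicing a memoryview does not copy; data is only copied by bytes().
    with memoryview(mapped) as whole, whole[start:] as view:
        payload = bytes(view[:7])

os.remove(path)
print(position, payload)
```

The views are released before the mapping is closed, because closing an mmap while a memoryview still exports its buffer raises a BufferError.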

Searching non-aligned bits

To handle data that is not byte-aligned, you must sift through the bytes manually: bit-shifting lets you inspect arbitrary sub-ranges of bits. Note that Python only supports bit-shifting on int, not on bytes.

>>> pattern = 0xfaf330
>>> bin((pattern << 4) + 0b1011)  # pattern shifted by 4 plus garbage
'0b1111101011110011001100001011'

You can use this to scan a window of bytes:

def find_bits(pattern: int, window: int, n: int):
    """Find an n-byte bit pattern in an (n+1)-byte window and return the offset"""
    for offset in range(8):
        window_slice = (window >> offset) & (2 ** (n*8) - 1)
        if pattern == window_slice:
            return offset
    raise IndexError('pattern not in window')

You can again use this to scan the file stream:

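in_stream = open('/dev/urandom', 'rb')
# slide a 4-byte window over the stream, one byte at a time
while True:
    window = in_stream.peek(4)[:4]
    if not window:
        raise EOFError('marker not found')
    try:
        offset = find_bits(0xfaf330, int.from_bytes(window, 'big'), 3)
    except IndexError:
        in_stream.read(1)
    else:
        break
# in_stream is now positioned at the first byte of the 4-byte window that
# contains the marker; offset counts the bits between the marker's end and
# the window's end
print('byte-offset:', in_stream.tell(), 'bit-offset:', offset)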

Alternatively, you can use binary representation to literally find the pattern in the window. Note that you have to mind padding of zero bits, so it is about the same work.
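A rough sketch of that string-based variant, for comparison. The name `find_bits_str` is made up here; it returns the same kind of right-shift offset (0 to 7) as the shifting version. Both numbers are formatted with explicit zero padding so that leading zero bits are not lost, which is the padding issue mentioned above.

```python
def find_bits_str(pattern: int, window: int, n: int) -> int:
    """Locate an n-byte pattern in an (n+1)-byte window using bit strings."""
    window_bits = format(window, '0{}b'.format((n + 1) * 8))
    pattern_bits = format(pattern, '0{}b'.format(n * 8))
    # Search from index 1 so the result maps onto shifts 0..7, and use rfind
    # so the smallest shift wins, as in the loop-based version.
    index = window_bits.rfind(pattern_bits, 1)
    if index < 0:
        raise IndexError('pattern not in window')
    # Bits to the right of the match = 8 - index for an (n+1)-byte window.
    return 8 - index


print(find_bits_str(0xfaf330, (0xfaf330 << 4) | 0b1011, 3))
```

For the example window from earlier (pattern shifted left by 4 plus four garbage bits), the match starts at string index 4, which corresponds to a shift of 4.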

Reading non-aligned bits

Once you have the bit-offset, you can read-and-align data from the file. Basically, read one byte more than you need, then shift as needed:

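def align_read(file, num_bytes: int, bit_offset: int):
    """Read num_bytes of data using the bit offset returned by find_bits"""
    # No shortcut for bit_offset == 0: in find_bits' convention the data then
    # occupies the last num_bytes bytes of the window, not the first.
    # peek() may return fewer bytes than requested if its buffer runs short.
    window = file.peek(num_bytes + 1)[:num_bytes + 1]
    file.read(num_bytes)
    data = (int.from_bytes(window, 'big') >> bit_offset) & (2 ** (num_bytes*8) - 1)
    return data.to_bytes(num_bytes, 'big')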
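To check that searching and reading fit together, here is a small self-contained round trip on an in-memory stream. The test data (5 junk bits, the marker, the payload b'hello', 3 padding bits) is invented for the demonstration, and the two helpers are restated in compact form. This variant of `align_read` has no special case for offset 0, so that every offset follows the same window convention as `find_bits`.

```python
import io

MARKER = 0xfaf330


def find_bits(pattern: int, window: int, n: int) -> int:
    """Return the right-shift (0..7) at which the n-byte pattern sits in the window."""
    for offset in range(8):
        if (window >> offset) & (2 ** (n * 8) - 1) == pattern:
            return offset
    raise IndexError('pattern not in window')


def align_read(file, num_bytes: int, bit_offset: int) -> bytes:
    """Read num_bytes of data that end bit_offset bits before the window's end."""
    window = file.peek(num_bytes + 1)[:num_bytes + 1]
    file.read(num_bytes)
    data = (int.from_bytes(window, 'big') >> bit_offset) & (2 ** (num_bytes * 8) - 1)
    return data.to_bytes(num_bytes, 'big')


# Test stream: 5 junk bits, the 24-bit marker, 40 payload bits, 3 padding bits.
bits = 0b10110
bits = (bits << 24) | MARKER
bits = (bits << 40) | int.from_bytes(b'hello', 'big')
bits <<= 3
stream = io.BufferedReader(io.BytesIO(bits.to_bytes(9, 'big')))

# Slide a 4-byte window forward until it contains the marker.
while True:
    window = stream.peek(4)[:4]
    if not window:
        raise EOFError('marker not found')
    try:
        offset = find_bits(MARKER, int.from_bytes(window, 'big'), 3)
    except IndexError:
        stream.read(1)
    else:
        break

marker = align_read(stream, 3, offset)   # consumes the marker itself
payload = align_read(stream, 5, offset)  # data following the marker
print(offset, marker, payload)
```

The same bit offset is reused for every later read, since everything after the marker shares the marker's alignment.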

edited Nov 15 '18 at 14:27

answered Nov 15 '18 at 13:17

MisterMiyagi

8,0162446

This will not work because the beginning of the file is not byte aligned. Meaning that I could read several bytes and come to the sync marker but read just a few bits of it. A subsequent read of one byte would read another misaligned part of the sync marker. Therefore, the marker could be read and not recognized. Thanks for your suggestion.

– GAF
Nov 15 '18 at 13:23

@GAF Sorry, missed that one. AFAIK Python does not support a resolution smaller than bytes - neither for open nor mmap nor other means. You will have to bit-shift each chunk.

– MisterMiyagi
Nov 15 '18 at 13:33

Thanks. Appreciate the follow up.

– GAF
Nov 15 '18 at 14:15

@GAF Added a (working) draft of how to handle the shifting to find the offset and re-align the data. It is probably worth using Cython if your file is large and you read only small chunks at a time.

– MisterMiyagi
Nov 15 '18 at 14:29


MisterMiyagi's answer is a good solution. Another solution uses the bitstring module.

import bitstring

aFile = open(someFilePath, 'rb')
aBinaryStream = bitstring.ConstBitStream(aFile)
# the sync marker 0xfaf330 in binary; find() returns an empty tuple if it is absent
aTuple = aBinaryStream.find('0b111110101111001100110000')

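If the marker is found, the stream position is moved to the start of the match, so the marker itself is the first 24 bits read from there. Reads from that position are aligned to the marker rather than to the file's byte boundaries.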

aBuffer = aBinaryStream.read('bytes:1024') # to read 1024 bytes

answered Nov 15 '18 at 20:16

GAF

7624

