79386397

Date: 2025-01-25 08:24:58
Score: 1.5

I found the reason for this strange behavior and thought I would share some insight into how changing bases works in Python.

tl;dr: I need to zero-pad my base 16 output to 64 hex digits, the width of a 32 byte container.
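A minimal sketch of that padding, assuming a non-negative value. The names `hex_width`, `padded` and `via_bytes` are only illustrative:

```python
# Sketch: zero-pad a non-negative integer to the hex width of a fixed-size container.
value = 199
container_size = 32  # in bytes; every byte is written as two hex digits

hex_width = container_size * 2  # 64 hex digits for 32 bytes

# Format spec "0{width}x": lowercase hex, left-padded with zeros to the given width.
padded = format(value, f"0{hex_width}x")

# The same string obtained by going through a big-endian bytes object.
via_bytes = value.to_bytes(container_size, "big").hex()

print(padded)
print(len(padded), padded == via_bytes)
```

Both routes should give the same 64 character string, and `int(padded, 16)` recovers the original value.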

base10 = 199
container_size = 32  # in bytes
bits_in_byte = 8
container_bits = container_size * bits_in_byte  # 256
bits_per_hex_digit = 4  # 16 == 2**4, so one base 16 digit encodes 4 bits
base16_container_length = container_bits // bits_per_hex_digit  # 64

# bin(x)[2:] and hex(x)[2:] drop the '0b' / '0x' prefix. str.lstrip('0b') strips a
# set of characters rather than a prefix, so bin(0).lstrip('0b') would return ''.
xb = bin(base10)[2:]                     # unpadded base 2 digits
xbp = xb.zfill(container_bits)           # base 2 digits padded to 256
xh = hex(base10)[2:]                     # unpadded base 16 digits
xhp = xh.zfill(base16_container_length)  # base 16 digits padded to 64

bytes_string = base10.to_bytes(container_size, 'big')
print(f"Let's convert the base 10 value of {base10} into a {container_size} byte container, using int({base10}).to_bytes({container_size}, 'big')=\n{bytes_string}\n")

print(f"NOTICE: the leading zeros due to using big-endian format (with little-endian, they would be trailing instead of leading),\nand that this has a length of {container_size}. I.e. len(int({base10}).to_bytes({container_size}, 'big')) = {len(bytes_string)}\n")

print(f"NOTICE: the leading zeros do not appear when converting the base 10 value {base10} to base 2 with bin({base10})[2:]\n(no container size is specified here for {base10} to be placed into)\n")
print(f"bin({base10})={xb}, a length of {len(xb)}")
print(f"Hence, we may fill with leading zeros (given big-endian format) until the base 2 value reaches a length of {container_size}x{bits_in_byte}={container_bits}, equaling the container size of {container_size} bytes")
print(f"I.e.: {xb} and \n{xbp}\nwill be parsed by python as the same value.")
print("\nPROOF:")
print(f"Are\n{xb}\nand\n{xbp}\nequal when parsed from base 2 into base 10?")
print(int(xb, 2) == int(xbp, 2))

print(f"\nBut for serialization, in my circumstances, I want to serialize the values padded up to their container size of {container_size} bytes, but in base 16\n")

print(f"The problem arises when changing the base from 2 to 16 (hexadecimal) and calculating what the length of a {container_size} byte container should be when representing a value in base 16\n")
print(f"The value of {base10} (base 10), in base 16 representation: {xh}")
print(f"But given a container size of {container_size} bytes, what should the length of the container be in base 16?\n")
print(f"To go from base 2 to base 16, raise 2 to the power of {bits_per_hex_digit} (2**{bits_per_hex_digit}=16), so each base 16 digit encodes {bits_per_hex_digit} bits\n")
print(f"This means the value {base10} (base 10) written in base 16, given a container size of {container_size} bytes, will have a length of {container_bits}/{bits_per_hex_digit}={base16_container_length}.\nI.e. the container length changes between bases, for the same container size")
print(f"I.e.\n{xh} and\n {xhp}\n will be parsed by python as the same value\n")
print("PROOF:")
print(f"Are\n{xh}\nand\n{xhp}\nequal when parsed from base 16 into base 10?")
print(int(xh, 16) == int(xhp, 16))

The output for this should be

Let's convert the base 10 value of 199 into a 32 byte container, using int(199).to_bytes(32, 'big')=
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc7'

NOTICE: the leading zeros due to using big-endian format (with little-endian, they would be trailing instead of leading),
and that this has a length of 32. I.e. len(int(199).to_bytes(32, 'big')) = 32

NOTICE: the leading zeros do not appear when converting the base 10 value 199 to base 2 with bin(199)[2:]
(no container size is specified here for 199 to be placed into)

bin(199)=11000111, a length of 8
Hence, we may fill with leading zeros (given big-endian format) until the base 2 value reaches a length of 32x8=256, equaling the container size of 32 bytes
I.e.: 11000111 and 
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011000111
will be parsed by python as the same value.

PROOF:
Are
11000111
and
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011000111
equal when parsed from base 2 into base 10?
True

But for serialization, in my circumstances, I want to serialize the values padded up to their container size of 32 bytes, but in base 16

The problem arises when changing the base from 2 to 16 (hexadecimal) and calculating what the length of a 32 byte container should be when representing a value in base 16

The value of 199 (base 10), in base 16 representation: c7
But given a container size of 32 bytes, what should the length of the container be in base 16?

To go from base 2 to base 16, raise 2 to the power of 4 (2**4=16), so each base 16 digit encodes 4 bits

This means the value 199 (base 10) written in base 16, given a container size of 32 bytes, will have a length of 256/4=64.
I.e. the container length changes between bases, for the same container size
I.e.
c7 and
 00000000000000000000000000000000000000000000000000000000000000c7
 will be parsed by python as the same value

PROOF:
Are
c7
and
00000000000000000000000000000000000000000000000000000000000000c7
equal when parsed from base 16 into base 10?
True
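
The same idea can be wrapped in a small helper so the digit count is derived from the base instead of being worked out by hand. This is only a sketch under stated assumptions: `to_padded`, `BITS_PER_DIGIT` and `FORMAT_CODE` are hypothetical names, it handles only bases 2 and 16, and it assumes non-negative values.

```python
# Hypothetical helper (not part of the original script): zero-pad a non-negative
# integer to the number of digits a fixed-size container needs in base 2 or 16.
BITS_PER_DIGIT = {2: 1, 16: 4}   # bits encoded by one digit in each base
FORMAT_CODE = {2: "b", 16: "x"}  # format() presentation types, no prefix emitted


def to_padded(value: int, container_size: int, base: int) -> str:
    """Return value in the given base, left-padded with zeros to the container width."""
    container_bits = container_size * 8
    width = container_bits // BITS_PER_DIGIT[base]
    digits = format(value, FORMAT_CODE[base])
    if value < 0 or len(digits) > width:
        raise ValueError("value does not fit in the container")
    return digits.zfill(width)


for base in (2, 16):
    padded = to_padded(199, 32, base)
    # The padded string has the container's width and parses back to 199.
    print(base, len(padded), int(padded, base) == 199)

# format() keeps the single digit of zero, while lstrip('0b') strips it away.
print(repr(format(0, "b")), repr(bin(0).lstrip("0b")))
```

Using `format()` rather than stripping characters from the output of `bin()` or `hex()` avoids the prefix entirely, which is why the value 0 still produces a digit.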
Reasons:
  • Blacklisted phrase (0.5): I need
  • RegEx Blacklisted phrase (1): I want
  • Long answer (-1):
  • Has code block (-0.5):
  • Self-answer (0.5):
  • Low reputation (1):
Posted by: Akin Wilson