You can construct a new (flattened) dtype for the array and reinterpret the data as that type with ndarray.view.
import numpy as np

def flatten_dtype(dtype: np.dtype, join_char: str = '_') -> np.dtype:
    if not dtype.fields:
        # Not a structured type, so there is nothing to flatten
        return dtype
    fields = {
        'names': [],
        'formats': [],
        'offsets': [],
        'itemsize': dtype.itemsize,
    }
    for field_name, (field_dtype, field_offset) in dtype.fields.items():
        # Recurse so that arbitrarily deep nesting is flattened as well
        flattened_dtype = flatten_dtype(field_dtype, join_char)
        if not flattened_dtype.fields:
            # Field is not a structured type, just add it as is
            fields['names'].append(field_name)
            fields['formats'].append(field_dtype)
            fields['offsets'].append(field_offset)
            continue
        for flattened_field_name, (flattened_field_dtype, flattened_field_offset) in flattened_dtype.fields.items():
            # Field is a structured type, so break it down into its subtypes,
            # joining names and accumulating byte offsets
            fields['names'].append(f'{field_name}{join_char}{flattened_field_name}')
            fields['formats'].append(flattened_field_dtype)
            fields['offsets'].append(field_offset + flattened_field_offset)
    return np.dtype(fields)
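For instance, on a small made-up nested dtype (purely illustrative), the sub-field names are joined with join_char and the original byte offsets are preserved:

nested = np.dtype([('a', np.float64),
                   ('point', [('x', np.float64), ('y', np.float64)])])
flat = flatten_dtype(nested)
print(flat.names)     # ('a', 'point_x', 'point_y')
print(flat.itemsize)  # 24, same as nested.itemsize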
In the given example, this could be used as follows:
import pandas as pd
# print(example)
# print(example.dtype)
flattened_example = example.view(flatten_dtype(example.dtype))
# print(flattened_example)
# print(flattened_example.dtype)
df = pd.DataFrame(flattened_example)
print(df)
which gives output as desired:
state variability target measured_mean measured_low measured_hi var_mid var_low var_hi
0 4.0 0.0 0.51 0.52 0.41 0.68 0.6 0.2 0.2
1 5.0 0.0 0.89 0.80 0.71 1.12 0.6 0.2 0.2
2 4.0 -1.0 0.59 0.62 0.46 0.78 0.6 0.2 0.2
3 5.0 -1.0 0.94 1.10 0.77 1.19 0.6 0.2 0.2
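If you don't have the original array at hand, an example array with matching data can be reconstructed as follows; the exact nesting of the measured and var fields is an assumption inferred from the flattened column names:

import numpy as np

# Assumed nested layout, inferred from the flattened column names above
nested_dtype = np.dtype([
    ('state', np.float64),
    ('variability', np.float64),
    ('target', np.float64),
    ('measured', [('mean', np.float64), ('low', np.float64), ('hi', np.float64)]),
    ('var', [('mid', np.float64), ('low', np.float64), ('hi', np.float64)]),
])
example = np.array([
    (4.0,  0.0, 0.51, (0.52, 0.41, 0.68), (0.6, 0.2, 0.2)),
    (5.0,  0.0, 0.89, (0.80, 0.71, 1.12), (0.6, 0.2, 0.2)),
    (4.0, -1.0, 0.59, (0.62, 0.46, 0.78), (0.6, 0.2, 0.2)),
    (5.0, -1.0, 0.94, (1.10, 0.77, 1.19), (0.6, 0.2, 0.2)),
], dtype=nested_dtype)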
This solution has the advantage of operating only on the array's dtype rather than its contents: view reinterprets the existing buffer without copying any data. For large arrays this will likely be more efficient than any solution that processes columns individually.
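You can check that the flattened array is indeed a view over the same buffer rather than a copy (assuming the arrays from the usage example above):

print(np.shares_memory(example, flattened_example))  # True: no data was copied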