79272070

Date: 2024-12-11 14:33:27
Score: 0.5

You can achieve this by processing each group in b separately: filter the table for each unique value of b, compute the cumulative sum of a within that group, and collect the results:

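import pyarrow as pa
import pyarrow.compute as pc

table = pa.table({'a': [1, 2, 3, 4, 5, 6], 'b': ['x']*3 + ['y']*3})

# Distinct values of 'b', one per group
unique_b = pc.unique(table['b'])

cumsum_list = []
b_list = []

for value in unique_b:
    # Keep only the rows that belong to the current group
    mask = pc.equal(table['b'], value)
    group = table.filter(mask)
    # Running total of 'a' within this group
    cumsum = pc.cumulative_sum(group['a'])
    # Collect plain Python values so pa.table can infer the column types
    cumsum_list.extend(cumsum.to_pylist())
    b_list.extend(group['b'].to_pylist())

# Rows come out grouped by 'b', not necessarily in the original row order
final_result = pa.table({'a': cumsum_list, 'b': b_list})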

To inspect the result, you can convert it back to pandas:

print(final_result.to_pandas())

which prints the cumulative sum of a within each group of b:

    a  b
0   1  x
1   3  x
2   6  x
3   4  y
4   9  y
5  15  y

Reasons:
  • Probably link only (1):
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Low reputation (0.5):
Posted by: Kelo