I ran an experiment on this topic and came up with observations that I need to understand in light of the prior analyses reported above. I have a DF with three columns named A, B, and C. My goal is to see whether groupby stores a copy of the DF. My test code snippet is as follows:
# Make DF with columns A, B, C.
grp = df.groupby(by=['A', 'B'])
del df
print(grp.transform(lambda x: x))  # This outputs the whole DF.
The above snippet seems to indicate that grp contains the DF, because the original DF has been deleted and grp can still produce it. Is this conclusion true?
Maybe grp maintains a reference to the DF, and after the del operation the reference count does not go to zero, so the data stays in memory for grp to use. Can this be true?
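One way I could probe the reference-count idea is with a weak reference, which observes an object without keeping it alive. The following is only a minimal sketch: the column values are made up for illustration (only the names A, B, and C come from my setup), and the variable names df_ref, alive_while_grp_exists, and alive_after_grp_deleted are my own.

```python
import gc
import weakref

import pandas as pd

# Hypothetical example data; only the column names A, B, C come from the question.
df = pd.DataFrame({"A": [1, 1, 2], "B": ["x", "y", "x"], "C": [10, 20, 30]})

# A weak reference observes the DataFrame without adding to its reference count.
df_ref = weakref.ref(df)

grp = df.groupby(by=["A", "B"])
del df  # removes the name "df", not necessarily the underlying object
gc.collect()

# If grp holds a reference to the original object, the weak reference is still live.
alive_while_grp_exists = df_ref() is not None
print("DataFrame alive while grp exists:", alive_while_grp_exists)

del grp
gc.collect()

# Once grp is gone as well, check whether anything else still keeps the object alive.
alive_after_grp_deleted = df_ref() is not None
print("DataFrame alive after deleting grp:", alive_after_grp_deleted)
```

If the first check reports that the original object is still alive, that would be consistent with grp holding a reference to the same DF rather than a separate copy, although this check alone does not show whether any additional copying happens internally.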
My pandas version is 2.2.2. Thanks in advance for any clarification.