Reports

I find your method very interesting. However, when you know nothing about your data, and therefore don't know the optimal distance threshold, how would you approach the problem? (You also can't choose the threshold by intuition without a thorough study, since there may be many clusters with widely varying spread, some very dense and others very sparse.) My goal is: given a one-dimensional numerical variable, run agglomerative clustering, find the optimal number of clusters, then use those clusters to bin the variable and convert it into a categorical one. I'm thinking about using some information-purity indicators to measure the impact of each merge, but I haven't figured it out yet.
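To make the goal concrete, here is a minimal sketch of one threshold-free way to do this. It is an assumption, not the method from the answer being discussed: it merges only neighbouring clusters of the sorted values using a Ward-style cost, and it picks the number of clusters with the mean silhouette width rather than an information-purity indicator. All function names (`ward_cost`, `hierarchy_1d`, `silhouette`, `best_partition`, `bin_edges`) and the sample data are made up for illustration.

```python
from bisect import bisect_right


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2


def hierarchy_1d(values):
    """Merge neighbouring clusters bottom-up; return {k: partition into k clusters}.

    Assumption: only adjacent clusters may merge, so every cluster stays a
    contiguous interval, which is what binning needs.
    """
    clusters = [[x] for x in sorted(values)]
    partitions = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        i = min(range(len(clusters) - 1),
                key=lambda j: ward_cost(clusters[j], clusters[j + 1]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        partitions[len(clusters)] = [list(c) for c in clusters]
    return partitions


def silhouette(partition):
    """Mean silhouette width; points in singleton clusters score 0."""
    scores = []
    for ci, cluster in enumerate(partition):
        for x in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # Mean distance to the other members of the same cluster.
            a = sum(abs(x - y) for y in cluster) / (len(cluster) - 1)
            # Mean distance to the closest other cluster.
            b = min(sum(abs(x - y) for y in other) / len(other)
                    for cj, other in enumerate(partition) if cj != ci)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_partition(values):
    """Pick the partition with the highest silhouette (needs >= 3 values)."""
    partitions = hierarchy_1d(values)
    candidates = range(2, len(values))  # silhouette is defined for 2 <= k <= n - 1
    k = max(candidates, key=lambda c: silhouette(partitions[c]))
    return partitions[k]


def bin_edges(partition):
    """Midpoints between neighbouring clusters, usable as bin boundaries."""
    return [(left[-1] + right[0]) / 2
            for left, right in zip(partition, partition[1:])]


# Made-up sample: two dense groups and one sparse group.
data = [1.0, 1.1, 1.2, 5.0, 5.1, 5.3, 20.0, 21.0, 22.0]
clusters = best_partition(data)
edges = bin_edges(clusters)
labels = [bisect_right(edges, x) for x in data]  # categorical bin index per value
print(len(clusters), edges, labels)
```

The silhouette criterion is only one option; the purity-based measure I mentioned could replace `silhouette` without changing the rest. The sketch is also quadratic in the number of points, so it is meant for illustration, not for large data.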

Reasons:

Blacklisted phrase (1): how would you
Long answer (-0.5):
No code block (0.5):
Contains question mark (0.5):
Single line (0.5):
Low reputation (1):

Posted by: Zakaria El Kazdam

79685870