79785868

Date: 2025-10-08 21:25:36
Score: 2
Natty:
Report link

Here is the reasoning from the inventor of the F1 score, C. J. van Rijsbergen, in his 1979 book Information Retrieval.

Define the set A of all positive items (|A| = TP+FN) and the set B of all items classified as positive (|B| = TP+FP). The symmetric difference A Δ B is the set of items that appear in A or B but not both. These are the false positives and false negatives (|A Δ B| = FP+FN).

We want to minimize the size of A Δ B, which ranges from 0 (A and B coincide) to |A|+|B| (A and B are disjoint). Van Rijsbergen argues that we should in fact minimize a normalized size of A Δ B, defined as

$$E = \frac{|A \,\triangle\, B|}{|A| + |B|}$$

Since we are looking for a "performance metric", i.e. something to maximize, let's instead define F = 1 - E and maximize that. Plugging in the definitions and working through the algebra, we find that this F is indeed the F1 score, as shown below.

$$F = 1 - E = 1 - \frac{FP + FN}{2\,TP + FP + FN} = \frac{2\,TP}{2\,TP + FP + FN} = F_1$$
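For a quick numerical sanity check of the identity, here is a minimal Python sketch. The function names `f_from_sets` and `f1_from_counts` and the toy item ids are made up for illustration; they do not come from van Rijsbergen's text or from any library.

```python
def f_from_sets(actual_pos, predicted_pos):
    """Return F = 1 - E, with E = |A symmetric-difference B| / (|A| + |B|)."""
    a, b = set(actual_pos), set(predicted_pos)
    if not a and not b:
        raise ValueError("E is undefined when both sets are empty")
    e = len(a ^ b) / (len(a) + len(b))  # ^ is symmetric difference on sets
    return 1 - e


def f1_from_counts(tp, fp, fn):
    """Usual F1 formula written with counts: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# Toy data (hypothetical): ids of items
actual = {1, 2, 3, 4, 5}   # A: items that are truly positive
predicted = {4, 5, 6}      # B: items classified as positive

tp = len(actual & predicted)   # in both A and B
fp = len(predicted - actual)   # in B only
fn = len(actual - predicted)   # in A only

f_sets = f_from_sets(actual, predicted)
f_counts = f1_from_counts(tp, fp, fn)
print(f_sets, f_counts)
```

Both values agree on this example, as the algebra predicts. The set-based version never mentions precision or recall, which is the point of the derivation: F1 falls out of the normalized symmetric difference alone.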

Reasons:
  • Blacklisted phrase (1): enter image description here
  • Long answer (-0.5):
  • No code block (0.5):
  • Low reputation (1):
Posted by: Gabriel Gomes