Reports

Here is the reasoning from the inventor of the F1 score, C.J. Rijsbergen, in his 1979 PhD thesis.

Define the set A of all positive items (|A|=TP+FN) and the set B of all items classified as positive (|B|=TP+FP). The "symmetric difference" A-B is all items that appear in A or B but not both. These are the false positives and false negatives (|A-B|=FP+FN).

We want to minimize the size of A-B, which ranges from 0 to |A|+|B|. Rijsbergen argues that in fact we want to minimize a normalized size of A-B, defined as enter image description here

Since we are looking for a "performance metric", ie something to maximize, let's instead define F=1-E and maximize that. Plugging in the definitions and crunching the algebra we get that this F is indeed the F1 score, as shown below.

enter image description here

79785868