79589115

Date: 2025-04-23 17:02:47
Score: 1.5
Natty:
Report link

If your paths have a limit to how nested they are, it might be a good idea to unroll them into separate columns, with the higher paths as SYMBOLs (if there aren’t too many distinct values). Filtering on SYMBOL columns is much faster than on strings.

Do you have an estimate of the cardinality i.e. how many entries you expect to be in those columns?

If each are less than perhaps 100,000, symbols could be a good option.

Filtering on SYMBOL columns (or numbers) can be JITed i.e. compiled to native code. Filtering against VARCHAR will use Java-side string comparisons.

You can also try searching using LIKE on the single path column, and see how fast it is. Maybe it performs better than a substring.

It is better to use VARCHAR rather than STRING if you are not using SYMBOL. This will half the amount of storage you need if your text is mostly ASCII.

Reasons:
  • RegEx Blacklisted phrase (2.5): Do you have an
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Contains question mark (0.5):
  • Self-answer (0.5):
  • High reputation (-1):
Posted by: Javier Ramirez