79602114

Date: 2025-05-01 15:37:42
Score: 2

OK, time to answer my own question with what happened so that future Andrews who might also have this problem know how to solve it:

The hardware that we were running Tableau Server on was quite old (budget constraints), so the scheduled cleanup jobs weren't running properly. Old extracts that were supposed to be deleted were being left on disk.

I discovered this by querying the PostgreSQL database on the Tableau server itself, seeing which extracts were present in the database, and comparing that to the files that existed on the hard drive. IIRC, the files were named with GUIDs and stored the way Squid stores its files, in subdirectories "00" to "FF" in the \data\TabSvc\DataEngine\Extract directory.

So I created a C# program to query the internal Postgres database, get a list of extracts, iterate over the files in the extract subdirectories, and delete any file whose GUID wasn't in the database.

Please note that when we migrated to newer hardware with a faster CPU, this was no longer an issue: old extracts were being harvested correctly (yes, they call it "harvest" or "harvesting" old extracts).
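For anyone who wants a starting point, the cleanup logic described above could look roughly like this. It is a minimal Python sketch, not the original C# program, and it makes several assumptions: each file name starts with the extract's GUID (anything after the first dot is ignored), the shard directories are named with two hex digits, and known_guids (a hypothetical parameter) holds the GUIDs you already pulled from the Postgres database. The query itself is left out because the table and column names depend on your Tableau version.

```python
import re
import uuid
from pathlib import Path
from typing import Iterable, List, Optional

# Assumption: shard directories are named with two hex digits, "00" to "FF".
SHARD_DIR = re.compile(r"^[0-9A-Fa-f]{2}$")


def guid_from_name(path: Path) -> Optional[uuid.UUID]:
    """Return the GUID at the start of the file name, or None if there is none."""
    try:
        # Assumption: the GUID is everything before the first dot.
        return uuid.UUID(path.name.split(".")[0])
    except ValueError:
        return None


def find_orphans(extract_root: Path, known_guids: Iterable[str]) -> List[Path]:
    """List GUID-named files under the shard directories that the database does not know."""
    known = {uuid.UUID(str(g)) for g in known_guids}
    orphans = []
    for shard in sorted(extract_root.iterdir()):
        if not (shard.is_dir() and SHARD_DIR.match(shard.name)):
            continue  # ignore anything that is not a shard directory
        for entry in sorted(shard.iterdir()):
            if not entry.is_file():
                continue
            guid = guid_from_name(entry)
            # Files that are not named after a GUID are left alone.
            if guid is not None and guid not in known:
                orphans.append(entry)
    return orphans


def delete_orphans(orphans: Iterable[Path], dry_run: bool = True) -> None:
    """Print what would be deleted, or actually delete when dry_run is False."""
    for path in orphans:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
```

Doing a dry run first and reading through the list before deleting anything seems prudent, since any extract the query misses would be treated as an orphan.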

For reference, the table was "public.extracts" (or something like that; consult Tableau's documentation for your specific version):
https://tableau.github.io/tableau-data-dictionary/2025.1/data_dictionary.htm#public.extracts_anchor

Reasons:
  • Blacklisted phrase (1): how to solve
  • RegEx Blacklisted phrase (2): know how to solve
  • Long answer (-1):
  • Has code block (-0.5):
  • Self-answer (0.5):
Posted by: AndrewJacksonZA