No need to use a regex for this at all! They are a solution of "last resort" - certainly they are very powerful, but they are also resource heavy! As a rule of thumb, you're better off using PostgreSQL's built-in functions in preference to regexes.
Sorry about the state of the results parts of my post (see fiddle) but this new SO interface is God awful!
So, I did the following (all of the code below is available on the fiddle here):
SELECT
label,
REVERSE(label),
SPLIT_PART(REVERSE(label), ' ', 1),
RIGHT(SPLIT_PART(REVERSE(label), ' ', 1), 1),
CASE
WHEN RIGHT(SPLIT_PART(REVERSE(label), ' ', 1), 1) IN ('R', 'W')
THEN 'KO'
ELSE 'OK'
END AS result,
should
FROM
t;
Result:
label reversesplit_partrightresultshouldThomas Hawk AQWS456654SWQA kwaH samohT654SWQAAOKOKCecile Star RQWS456654SWQR ratS eliceC654SWQRRKOKOMickey Mouse WQWS456654SWQW esuoM yekciM654SWQWWKOKODonald Duck SQWS456654SWQS kcuD dlanoD654SWQSSOKOK
It's more general than the other code, because it'll pick out the first character of the last string, no matter how many strings precede the final (target) string.
So, to check my assertion that the "ordinary" functions are better than regexes, I wrote the following function (https://stackoverflow.com/a/14328164/470530) (Erwin Brandstetter to the rescue yet again - I also found this thread (<https://stackoverflow.com/questions/24938311/create-a-function-declaring-a-predefined-text-array >) helpful).
CREATE OR REPLACE FUNCTION random_pick(arr TEXT[])
RETURNS TEXT
LANGUAGE sql VOLATILE PARALLEL SAFE AS
$func$
SELECT (arr)[trunc((random() * (CARDINALITY($1)) + 1))::int];
$func$;
It returns a random element from an array.
So, I have to construct a table to test against - you can see the code in the fiddle - the last few lines are:
SELECT
random_pick((SELECT w1 FROM words_1)) || ' ' ||
random_pick((SELECT w1 FROM words_1)) || ' ' ||
random_pick((SELECT w2 FROM words_2)) AS label
FROM
GENERATE_SERIES(1,25)
)
SELECT
*,
CASE
WHEN RIGHT(SPLIT_PART(REVERSE(t.label), ' ', 1), 1) IN ('R', 'W')
THEN 'KO'
ELSE 'OK'
END AS result
FROM
t;
Result (snipped for brevity):
labelresultMouse Mouse WQWS456KOStar Cecile WQWS456KOMickey Star SQWS456OKStar Cecile RQWS456KOStar Hawk AQWS456OK
I also test to see that my records are inserted and that the data looks OK1
Now, down to the nitty-gritty! I'm using PostgreSQL 18 beta because I want to look at the new features in EXPLAIN ANALYZE VERBOSE
. Here's the output of that functionality for my query:
Seq Scan on public.lab (cost=0.00..2369.64 rows=86632 width=64) (actual time=0.013..49.298 rows=100000.00 loops=1) Output: label, CASE WHEN ("right"(split_part(reverse(label), ' '::text, 1), 1) = ANY ('{R,W}'::text[])) THEN 'KO'::text ELSE 'OK'::text END Buffers: shared hit=637Planning Time: 0.027 msExecution Time: 54.068 ms
Note: Planning Time: 0.024 ms Execution Time: 50.476 ms
The output for Guillaume's query are similar (trumpets sound) except for the last 2 lines:
Planning Time: 0.044 ms Execution Time: 150.038 ms
Over a few runs, my query takes approx. 33 - 35% of the time that his does. So, my original assertion holds true - regexes - Caveat Emptor!