After lots of trial and error I got these two variants which both work:
INSERT INTO test_csv
SELECT
a,
cast(splitByChar(',', b) as Array(Int16)) AS c
FROM file('test.csv', 'CSVWithNames', 'auto')
SETTINGS schema_inference_make_columns_nullable = 0
INSERT INTO test_csv
SELECT
a,
arrayMap(x -> toInt16(x), splitByChar(',', b)) AS c
FROM file('test.csv', 'CSVWithNames', 'auto')
SETTINGS schema_inference_make_columns_nullable = 0
Also, if there are values where the column b is empty (such as a line y,""
) then replace b in the statements above with:
CASE WHEN b = '' THEN '0' ELSE b END
I am not sure why adding the SELECT * around the entire query does not work. I guess the nested select causes confusion for ClickHouse? I think this may be a bug?