79618320

Date: 2025-05-12 17:06:43
Score: 0.5

I figured it out. The reason I was getting a NULL row after each data row was hidden characters in the CSV file, specifically its line endings:

  1. Windows ends lines with Carriage Return + Line Feed (\r\n)

  2. Unix/Linux uses just Line Feed (\n)

If a file created on Windows is read on a Unix/Linux system (like Hadoop/Hive), the \r character can:

  1. Show up as "invisible junk" (like ^M or CR)

  2. Break parsers or formatters (like Hive or awk), resulting in:

    • Extra blank rows

    • All NULL columns

    • Malformed data
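To see the stray character for yourself, here is a minimal Python sketch. The sample bytes and the function names (`has_crlf`, `split_unix_style`) are made up for illustration, and this only mimics a reader that splits on \n alone. It is not how Hive parses files internally.

```python
from typing import List

# Hypothetical sample: a small CSV saved with Windows (CRLF) line endings.
raw = b"id,name\r\n1,alice\r\n2,bob\r\n"


def has_crlf(data: bytes) -> bool:
    """Return True if the data contains at least one Windows line ending."""
    return b"\r\n" in data


def split_unix_style(data: bytes) -> List[List[bytes]]:
    """Split on LF only, the way a Unix-oriented line reader would."""
    lines = data.split(b"\n")
    # Skip the empty piece left after the final newline.
    return [line.split(b",") for line in lines if line]


rows = split_unix_style(raw)
# The last field of every row still carries the carriage return,
# e.g. rows[1][1] is b"alice\r" rather than b"alice".
```

That leftover \r on the last field is the "invisible junk" described above.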

That is why I was getting an empty, all-NULL row after each valid data row.

Solution: I ran dos2unix on the files, which converts them to Unix (LF) line endings, and after that I got the expected result.
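If dos2unix is not available, roughly the same conversion can be sketched in Python. This is not what I ran and not how dos2unix is implemented. The function names (`crlf_to_lf`, `convert_file`) are hypothetical, it only rewrites \r\n pairs, and it reads the whole file into memory, so it assumes the file is reasonably small.

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Replace every Windows line ending (CRLF) with a Unix one (LF)."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with LF line endings.

    Assumption: the file fits in memory. Work on a copy if the
    original must be preserved.
    """
    p = Path(path)
    p.write_bytes(crlf_to_lf(p.read_bytes()))
```

Run `convert_file` on the CSV before loading it, the same way you would run dos2unix on it.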

Reasons:
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Self-answer (0.5):
  • Low reputation (1):
Posted by: Akhilesh