79220951

Date: 2024-11-24 19:52:12
Score: 0.5
Natty:
Report link
from pyspark.sql.functions import col, lit, to_timestamp, when


def handle_default_values(df):
    # Replace nulls with a per-type default; df.dtypes yields (name, type string) pairs.
    for column, dtype in df.dtypes:
        if dtype == 'int':
            default = lit(-1)
        elif dtype in ('float', 'double', 'decimal(18,2)'):
            # Cast so the column keeps its original type instead of widening to double.
            default = lit(0.0).cast(dtype)
        elif dtype == 'string':
            default = lit('UNK')
        elif dtype == 'timestamp':
            # to_date would yield a date; build a timestamp to match the column type.
            default = to_timestamp(lit('1900-01-01'))
        else:
            continue  # other types are left unchanged
        df = df.withColumn(column, when(col(column).isNull(), default).otherwise(col(column)))
    return df
Reasons:
  • Long answer (-0.5):
  • Has code block (-0.5):
  • Unregistered user (0.5):
  • Low reputation (1):
Posted by: pugazhendhi