In case someone else runs into this: after much troubleshooting and experimentation, I realised that KCL is not designed to work this way.
KCL scales your application automatically by spinning up multiple threads on the same machine, based on the number of Kinesis shards.
KCL is not designed to work directly with the platform's auto scaling.
Setting up auto scaling on AWS will not help. For example, say you only have 5 shards, but the message volume is high enough that auto scaling spins up 10 more instances. Only 5 of those instances will be used by KCL (1 per shard), while the rest sit idle.
If the stream's throughput is still too high, auto scaling will keep provisioning until it reaches the maximum number of instances, but none of the extra instances will be useful.
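To make the arithmetic concrete, here is a toy model in Python. It is only a sketch and not KCL's actual lease logic: the function names `assign_shards` and `idle_instances` are made up for illustration, and the only assumption it encodes is that each shard is processed by exactly one instance at a time.

```python
def assign_shards(num_shards: int, num_instances: int) -> dict[int, list[int]]:
    """Hand out shards round-robin; each shard goes to exactly one instance.

    Hypothetical helper for illustration only, not part of KCL.
    """
    if num_instances <= 0:
        raise ValueError("need at least one instance")
    assignment: dict[int, list[int]] = {i: [] for i in range(num_instances)}
    for shard in range(num_shards):
        assignment[shard % num_instances].append(shard)
    return assignment


def idle_instances(num_shards: int, num_instances: int) -> list[int]:
    """Return the instances that ended up with no shard to process."""
    assignment = assign_shards(num_shards, num_instances)
    return [i for i, shards in assignment.items() if not shards]


if __name__ == "__main__":
    # 5 shards, and auto scaling has grown the fleet to 15 instances.
    print(len(idle_instances(5, 15)), "instances have nothing to do")
    # With fewer instances than shards, every instance gets work.
    print(assign_shards(5, 2))
```

In this model, adding instances beyond the shard count never increases the number of busy instances, which is why scaling out alone does not reduce the backlog.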
If you can tolerate a bit of extra lag, SQS is a better approach, since any number of consumers can poll the same queue.