79763514

Date: 2025-09-13 05:16:00
Score: 1
Natty:
Report link
  1. There are two major options IMO: a microservice or a message bus like Kafka. Which one fits depends on your use cases and on what is feasible in your tech stack. If the handler is just a forwarder, a message bus like Kafka works best for forwarding the data to Spark for processing. The choice also depends on how often the data needs to be processed (i.e. whether it is streaming or batch). Spark can even read directly from S3, so you don't need handlers in between at all. The only thing you need to take care of is reading just the newly arrived files, by applying appropriate filters.

  2. Adding to my first point: a microservice, be it .NET or Python, needs a lot more work around it. For example, how do you send the data to Spark? Will there be intermediate storage, or, if the data is sent directly, will it go over sockets, JDBC, etc.? Scaling might also become difficult as your data grows over time.

  3. The first two points lay out the pros and cons. I don't see many pros to using .NET as a handler to send data to Spark, given the scaling and maintainability constraints.

  4. Kafka, Event Hubs, or any similar message queue would scale much better IMO. Kafka is open source too.

    https://dev.to/matteojoliveau/microservices-communications-why-you-should-switch-to-message-queues--48ia
    https://medium.com/@Shamimw/connect-to-aws-s3-and-read-files-using-apache-spark-186943a5169a
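
To make the "read only the newly arrived files" part of point 1 a bit more concrete, here is a minimal sketch, not a definitive implementation. It assumes you keep a watermark (the modification time of the newest file already processed) between runs. The names `select_new_files` and `read_new_files`, the JSON input format, and the bucket paths are hypothetical. `modifiedAfter` is a file-source option for batch reads in Spark 3.1+, and the timestamp is interpreted in the Spark session time zone, so check that against your setup.

```python
from datetime import datetime


def select_new_files(listing, watermark):
    """Pick files modified strictly after the watermark.

    listing   -- iterable of (path, modified_time) pairs, e.g. from an S3 listing
    watermark -- modification time of the newest file already processed
    Returns (new_paths ordered oldest first, updated watermark).
    """
    new = sorted(
        ((path, mtime) for path, mtime in listing if mtime > watermark),
        key=lambda item: item[1],
    )
    paths = [path for path, _ in new]
    # Keep the old watermark when nothing new has arrived.
    new_watermark = new[-1][1] if new else watermark
    return paths, new_watermark


def read_new_files(spark, path, watermark):
    """Let Spark do the filtering itself (assumes Spark 3.1+ and JSON input).

    spark -- an existing SparkSession, passed in by the caller
    """
    # modifiedAfter expects the format YYYY-MM-DDTHH:mm:ss
    ts = watermark.strftime("%Y-%m-%dT%H:%M:%S")
    return spark.read.option("modifiedAfter", ts).json(path)


if __name__ == "__main__":
    listing = [
        ("s3a://my-bucket/in/a.json", datetime(2025, 9, 1, 10, 0, 0)),
        ("s3a://my-bucket/in/b.json", datetime(2025, 9, 2, 10, 0, 0)),
    ]
    paths, watermark = select_new_files(listing, datetime(2025, 9, 1, 12, 0, 0))
    print(paths, watermark)
```

For a streaming job you may not need any of this, since Structured Streaming's file source keeps track of which files it has already processed through its checkpoint. The watermark approach is mainly useful for scheduled batch runs.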

Reasons:
  • Blacklisted phrase (1): how do you
  • Blacklisted phrase (0.5): medium.com
  • Long answer (-1):
  • No code block (0.5):
Posted by: Vindhya G