For your small project involving batch ingestion into an HDFS data lake from sources like RDBMS tables, CSV, and flat files, here are recommendations based on the details you shared:
Talend:
Talend is an excellent choice for batch data ingestion. It supports a wide range of sources, including RDBMS tables and flat files, and offers a low-code interface for building pipelines. Its native HDFS components make it a strong fit for your use case.
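If you want a feel for what a Talend job automates under the hood, here is a minimal Python sketch of the extract step: pulling rows from an RDBMS table and staging them as a CSV file ready for HDFS. The database file, table, and column names are hypothetical placeholders, and sqlite3 stands in for whatever RDBMS driver you actually use.

```python
import csv
import sqlite3

# Hypothetical source table; swap sqlite3 for your RDBMS driver
# (e.g. psycopg2 or mysql-connector-python) in a real pipeline.
conn = sqlite3.connect("orders.db")
cursor = conn.execute("SELECT order_id, customer_id, amount FROM orders")

# Stage the result set as a CSV file ready for upload to HDFS.
with open("orders_staged.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)

conn.close()
```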
Hevo:
Hevo simplifies data ingestion with its no-code platform. It supports batch loads and ships 150+ pre-built connectors for diverse sources, including RDBMS and CSV files. Its drag-and-drop interface makes it beginner-friendly.
Apache Kafka:
Although Kafka is better known for real-time streaming, it can also handle batch ingestion. On its own it does not write to HDFS; the usual pattern pairs producers with the Kafka Connect HDFS sink connector, which lands topic data in HDFS. Its scalability makes it a reliable option for your project.
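As a rough sketch of that batch pattern, assuming a local broker at localhost:9092 and the kafka-python client, a producer can replay a staged CSV file into a topic; a Kafka Connect HDFS sink connector (a separate, commonly used component) would then consume the topic and write to HDFS. The topic and file names here are placeholders.

```python
import csv
import json
from kafka import KafkaProducer  # pip install kafka-python

# Assumes a broker on localhost:9092; adjust for your cluster.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Replay a CSV file into a topic as one batch; a Kafka Connect
# HDFS sink connector would consume this topic and land it in HDFS.
with open("orders_staged.csv", newline="") as f:
    for row in csv.DictReader(f):
        producer.send("orders-batch", value=row)

producer.flush()  # block until the whole batch is delivered
```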
Estuary Flow:
Estuary Flow offers real-time and batch processing capabilities. With minimal coding required, it’s an excellent choice for ingesting CSV and flat files into HDFS efficiently.
For your specific project, Talend and Hevo stand out for their simplicity and direct HDFS integration. Choose whichever aligns best with your familiarity and project requirements.
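Whichever tool you pick, the core operation is the same: land a file at an HDFS path. For reference, here is what that step looks like by hand over WebHDFS using the hdfs Python package; the NameNode URL, user, and paths are placeholders for your environment.

```python
from hdfs import InsecureClient  # pip install hdfs

# Assumes WebHDFS is enabled on the NameNode (default port 9870
# on Hadoop 3.x); the URL, user, and paths are placeholders.
client = InsecureClient("http://namenode:9870", user="hadoop")

# Upload the staged CSV into the data lake's raw zone,
# overwriting any previous run's output.
client.upload(
    "/data/lake/raw/orders/orders_staged.csv",
    "orders_staged.csv",
    overwrite=True,
)

# Quick sanity check: list the target directory.
print(client.list("/data/lake/raw/orders"))
```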