Position Overview:
We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. The ideal candidate will have extensive experience with AWS Glue, Apache Airflow, Kafka, SQL, Python, and DataOps tools and technologies. Knowledge of SAP HANA and Snowflake is a plus. This role is critical for designing, developing, and maintaining our client’s data pipeline architecture, ensuring the efficient and reliable flow of data across the organization.
Key Responsibilities:
- Design, Develop, and Maintain Data Pipelines:
  - Develop robust and scalable data pipelines using AWS Glue, Apache Airflow, and other relevant technologies.
  - Integrate various data sources, including SAP HANA, Kafka, and SQL databases, to ensure seamless data flow and processing.
  - Optimize data pipelines for performance and reliability.
- Data Management and Transformation:
  - Design and implement data transformation processes to clean, enrich, and structure data for analytical purposes.
  - Utilize SQL and Python for data extraction, transformation, and loading (ETL) tasks.
  - Ensure data quality and integrity through rigorous testing and validation processes.
- Collaboration and Communication:
  - Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet their needs.
  - Collaborate with cross-functional teams to implement DataOps practices and improve data life cycle management.
- Monitoring and Optimization:
  - Monitor data pipeline performance and implement improvements to enhance efficiency and reduce latency.
  - Troubleshoot and resolve data-related issues, ensuring minimal disruption to data workflows.
  - Implement and manage monitoring and alerting systems to proactively identify and address potential issues.
- Documentation and Best Practices:
  - Maintain comprehensive documentation of data pipelines, transformations, and processes.
  - Adhere to best practices in data engineering, including code versioning, testing, and deployment procedures.
  - Stay up-to-date with the latest industry trends and technologies in data engineering and DataOps.
Required Skills and Qualifications:
- Extensive experience with AWS Glue for data integration and transformation.
- Proficiency in Apache Airflow for workflow orchestration.
- Strong knowledge of Kafka for real-time data streaming and processing.
- Advanced SQL skills for querying and managing relational databases.
- Proficiency in Python for scripting and automation tasks.
- Experience with SAP HANA for data storage and management (a plus).
- Familiarity with DataOps tools and methodologies for continuous integration and delivery in data engineering.
- Knowledge of Snowflake for cloud-based data warehousing solutions (a plus).
- Experience with other AWS data services such as Redshift, S3, and Athena.
- Familiarity with big data technologies such as Hadoop, Spark, and Hive.
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Detail-oriented with a commitment to data quality and accuracy.
- Ability to work independently and manage multiple projects simultaneously.