Lead Data Engineer
Posted On: 13 Apr 2026
Location: Noida, UP, India
Company: Iris Software
Are you ready to do the best work of your career at one of India’s Top 25 Best Workplaces in the IT industry? Do you want to grow in an award-winning culture that truly values your talent and ambitions?
Join Iris Software — one of the fastest-growing IT services companies — where you own and shape your success story.
At Iris Software, our vision is to be our clients’ most trusted technology partner, and the first choice for the industry’s top professionals to realize their full potential.
At Iris, every role is more than a job — it’s a launchpad for growth.
Job Description
We are seeking a highly skilled Senior Data Engineer with 9+ years of 100% hands-on experience building and maintaining enterprise-grade data pipelines. This is a pure Individual Contributor role focused on writing production-quality code, developing scalable ETL/ELT solutions using PySpark and AWS, and orchestrating workflows with Airflow. If you thrive on solving complex technical problems and shipping robust, well-tested code, this role is for you.
Key Responsibilities
• Develop and maintain robust, scalable ETL/ELT pipelines using PySpark on AWS EMR
• Build data ingestion and transformation workflows from diverse sources (S3, EMR, RDS, Kafka, APIs) into AWS-based data lakes and warehouses
• Write clean, modular, testable Python code following best practices and coding standards
• Implement comprehensive unit tests using pytest/unittest with mocking, fixtures, and high code coverage
• Design and build production-grade Airflow DAGs for workflow orchestration, scheduling, and monitoring
• Optimize Spark jobs for performance, memory efficiency, and cost reduction
• Implement CI/CD pipelines for automated testing and deployment using Jenkins, GitHub Actions, or AWS CodePipeline
• Troubleshoot and debug complex data pipeline issues in production environments
• Collaborate with Data Scientists, Analysts, and Platform Engineers to deliver data solutions
• Ensure data quality, security, and compliance standards are met
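The unit-testing expectations in the responsibilities above (pytest/unittest with mocking and fixtures) can be illustrated with a minimal, self-contained sketch using the standard-library `unittest` and `unittest.mock`. The `load_to_s3` helper, its arguments, and the mocked S3 client are hypothetical names invented for this example, not part of the role description:

```python
import unittest
from unittest.mock import MagicMock

def load_to_s3(records, s3_client, bucket, key):
    """Hypothetical pipeline step: serialize records as CSV and upload to S3."""
    if not records:
        raise ValueError("no records to load")
    body = "\n".join(",".join(map(str, r)) for r in records)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return len(records)

class LoadToS3Test(unittest.TestCase):
    def setUp(self):
        # Fixture: a mocked S3 client, so no real AWS call happens in tests.
        self.s3 = MagicMock()

    def test_uploads_serialized_records(self):
        count = load_to_s3([(1, "a"), (2, "b")], self.s3, "my-bucket", "out.csv")
        self.assertEqual(count, 2)
        # Verify the interaction with the mocked dependency.
        self.s3.put_object.assert_called_once()
        kwargs = self.s3.put_object.call_args.kwargs
        self.assertEqual(kwargs["Bucket"], "my-bucket")
        self.assertIn(b"1,a", kwargs["Body"])

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            load_to_s3([], self.s3, "my-bucket", "out.csv")
```

Run with `python -m unittest`; the same structure translates directly to pytest fixtures and `pytest-mock`.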
Required Skills & Qualifications
• 9+ years of hands-on data engineering experience (no management responsibilities required)
• Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
• Expert-level Python programming – OOP, design patterns, clean code practices
• Advanced PySpark/Spark skills – partitioning strategies, shuffle optimization, memory tuning, broadcast joins
• Strong unit testing expertise using pytest/unittest – mocking, parametrized tests, fixtures, TDD mindset
• Hands-on Airflow experience – DAG design, custom operators, sensors, XComs, debugging failed tasks
• Deep AWS experience: S3, EMR, Glue, Redshift, Lambda, Step Functions, IAM, CloudWatch
• Solid understanding of data lake and warehouse architectures (medallion architecture, Delta Lake)
• Strong SQL skills – complex queries, window functions, query optimization
• Proficiency with Git, code reviews, and collaborative development workflows
• Experience with CI/CD pipelines and automated testing frameworks
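As one concrete instance of the SQL skills listed above, the sketch below uses Python’s built-in `sqlite3` to run a window-function query, ranking pipeline runs by duration within each pipeline and comparing each run to its pipeline’s average. The table name and sample data are invented for illustration:

```python
import sqlite3

# In-memory database with a hypothetical table of pipeline run durations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (pipeline TEXT, run_date TEXT, duration_s REAL);
INSERT INTO runs VALUES
  ('sales_etl',  '2026-04-10', 120.0),
  ('sales_etl',  '2026-04-11',  95.0),
  ('sales_etl',  '2026-04-12', 110.0),
  ('events_etl', '2026-04-10', 300.0),
  ('events_etl', '2026-04-11', 280.0);
""")

# Window functions: rank each run within its pipeline by duration,
# and show the pipeline-wide average alongside each row.
query = """
SELECT pipeline,
       run_date,
       duration_s,
       RANK() OVER (PARTITION BY pipeline ORDER BY duration_s) AS speed_rank,
       AVG(duration_s) OVER (PARTITION BY pipeline)            AS avg_duration_s
FROM runs
ORDER BY pipeline, speed_rank;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that window functions require SQLite 3.25+; the same query runs unchanged on Redshift, which is part of the AWS stack named in this role.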
Nice to Have (Preferred)
• Familiarity with Docker for containerized data workloads
• Exposure to streaming data (Kafka, Spark Streaming)
• Knowledge of data quality frameworks
• Background in financial services or regulated industries
• Understanding of data security and privacy practices (GDPR)
Perks and Benefits for Irisians
Iris provides world-class benefits for a personalized employee experience. These benefits are designed to support the financial, health, and well-being needs of Irisians for holistic professional and personal growth.