Data Engineer - Lead
Posted On: 18 May 2026
Location: Noida, UP, India
Company: Iris Software
Are you ready to do the best work of your career at one of India’s Top 25 Best Workplaces in the IT industry? Do you want to grow in an award-winning culture that truly values your talent and ambitions?
Join Iris Software — one of the fastest-growing IT services companies — where you own and shape your success story.
At Iris Software, our vision is to be our clients’ most trusted technology partner, and the first choice for the industry’s top professionals to realize their full potential.
At Iris, every role is more than a job — it’s a launchpad for growth.
Job Description
PRIMARY RESPONSIBILITIES
- Design and implement robust, scalable data pipelines to support AI/ML model development and deployment
- Clean, transform, and curate structured and unstructured data from diverse sources to ensure model-ready quality
- Collaborate with data scientists, ML engineers, and business teams to ensure data readiness, usability, and alignment with AI objectives
- Develop and maintain metadata management, data lineage, and data quality frameworks to support AI governance and compliance
- Enable advanced feature engineering capabilities and implement real-time data streaming solutions for AI applications
- Design and deploy data collection instruments and oversee data aggregation processes
- Ensure data compliance, privacy, and ethical use standards across all AI workflows
- Support enterprise-wide data initiatives including business glossary development, taxonomy creation, and the DARE program's automation goals
KNOWLEDGE/SKILLS
- Programming & Technical Skills: Advanced proficiency in Python, SQL, and distributed computing frameworks (Spark, Hadoop)
- Cloud Platforms: Hands-on experience with cloud data platforms (AWS, Azure, GCP) and their data services
- Data Engineering Tools: Expertise with data orchestration tools (Airflow, Prefect), ETL/ELT frameworks, and data pipeline optimization
- AI/ML Data Specialization: Experience with ML feature stores, data versioning, model-ready data design, and real-time streaming technologies
- Data Management: Strong understanding of data governance, metadata management, data quality frameworks, and lineage tracking
- Compliance & Ethics: Knowledge of data privacy regulations, compliance standards, and ethical AI data practices
- Collaboration: Excellent communication skills and ability to work effectively with cross-functional teams including data scientists, ML engineers, and business stakeholders
EDUCATION AND EXPERIENCE
Required:
- Bachelor's degree in Computer Science, Data Engineering, Computer Engineering, or related technical field
- 3+ years of experience in data engineering with demonstrated experience in AI/ML environments
- Proficiency in Python and SQL for data manipulation and pipeline development
- Experience with cloud data platforms (AWS, Azure, or GCP)
- Hands-on experience with data orchestration tools (e.g., Airflow, Prefect) and ETL frameworks
Preferred:
- Master's degree in Computer Science, Data Engineering, or related field
- 5+ years of experience in data engineering, preferably in AI/ML environments
- Experience with real-time data processing (Kafka, Kinesis)
- Exposure to LLMs and generative AI data preparation
- Knowledge of MLOps and integration with ML lifecycle tools
- Familiarity with BI tools and semantic layer design
Must Have:
• Python & advanced SQL
• Spark / distributed processing
• Cloud platforms (AWS)
• Airflow / orchestration tools
• Data pipeline automation
• Data governance & quality
Nice to Have:
• Kafka/Kinesis streaming
• Feature stores
• LLM-ready data pipelines
• MLOps exposure
Mandatory Competencies
Perks and Benefits for Irisians
Iris provides world-class benefits for a personalized employee experience. These benefits are designed to support the financial, health, and well-being needs of Irisians for holistic professional and personal growth.