Description
Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Senior / Lead Data Engineer
We are seeking talented engineers for two roles, Lead Data Engineer and Senior Data Engineer, to join Mastercard Foundry R&D. You will help shape our innovation roadmap by exploring new technologies and building scalable, data‑driven prototypes and products. The ideal candidate is hands‑on, curious, adaptable, and motivated to experiment and learn.
Lead Data Engineer
What You'll Do
* Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
* Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems (see the orchestration sketch after this list).
* Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
* Provide Technical Leadership: Offer hands‑on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well‑tested code. Introduce improvements in development processes and tooling.
* Cross‑Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
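To give a flavor of the pipeline work above, here is a minimal sketch of a daily ETL workflow using Airflow's TaskFlow API (assumes Airflow 2.4+). The task bodies and data are hypothetical placeholders, not Mastercard systems or datasets.

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a hypothetical upstream source.
        return [{"id": 1, "amount": 42.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # A trivial transformation; real pipelines would do far more.
        return [{**r, "amount_cents": int(r["amount"] * 100)} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # Write to a hypothetical target; printed here for brevity.
        print(f"loading {len(rows)} rows")

    load(transform(extract()))


example_etl()
```

Airflow infers the extract → transform → load dependency graph from the function calls; scheduling and orchestrating this kind of DAG reliably is the day-to-day substance of the pipeline responsibilities above.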
What You'll Bring
* Extensive Data Engineering Experience: 8–12+ years in data engineering or backend engineering, including senior/lead roles. Experience designing end‑to‑end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production.
* Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands‑on work with AWS, Azure, or GCP using cloud‑native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost‑efficient workloads for experimental and variable R&D environments.
* AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning—dataset preparation, feature/label management, and supporting real‑time or batch training pipelines. Experience with feature stores or streaming data is useful.
* Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
* Problem‑Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
* Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 8–12+ years of proven experience architecting and operating production‑grade data systems, especially those supporting analytics or ML workloads.
* Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
* Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
* Big Data Technologies: Hands‑on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
* Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud‑based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
* Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation (see the masking sketch after this list). Experience with metadata management or data cataloging tools is a plus.
* Collaboration & Agile Delivery: Strong communication skills and experience working with cross‑functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
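As an illustration of the Spark and governance skills above, here is a minimal PySpark sketch that hashes a PII column and applies a simple quality gate before writing. The bucket paths and column names are hypothetical examples, not real datasets.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("mask-and-validate").getOrCreate()

# Hypothetical input location and schema.
df = spark.read.parquet("s3://example-bucket/raw/customers/")

# One-way hash of the PII column so downstream consumers never see raw values.
masked = df.withColumn("email", F.sha2(F.col("email"), 256))

# Simple quality gate: fail fast if the primary key is ever null.
null_keys = masked.filter(F.col("customer_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"{null_keys} rows have a null customer_id")

masked.write.mode("overwrite").parquet("s3://example-bucket/clean/customers/")
```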
Preferred Skills
* Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
* Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud‑native monitoring.
* DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time‑travel; see the sketch after this list) and supporting continuous delivery for ML systems.
* Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
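As a small illustration of the Delta Lake time‑travel feature mentioned above, a sketch of versioned reads; it assumes a SparkSession configured with the open-source delta-spark package, and the table path is a hypothetical placeholder.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-time-travel")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)

# Read the table as of an earlier version, e.g. to reproduce a training set.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/events")

# Or pin to a point in time instead of a version number.
snapshot = (
    spark.read.format("delta")
    .option("timestampAsOf", "2024-01-01")
    .load("/tmp/events")
)
```

Pinning dataset versions this way is what makes ML training runs reproducible, which is why the posting pairs data versioning with continuous delivery for ML systems.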
Senior Data Engineer
What You'll Do
* Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
* Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems.
* Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
* Provide Technical Leadership: Offer hands‑on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well‑tested code. Introduce improvements in development processes and tooling.
* Cross‑Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
What You'll Bring
* Data Engineering Experience: A background in data engineering or backend engineering. Designing end‑to‑end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production would be a plus.
* Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands‑on work with AWS, Azure, or GCP using cloud‑native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost‑efficient workloads for experimental and variable R&D environments.
* AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning—dataset preparation, feature/label management, and supporting real‑time or batch training pipelines. Experience with feature stores or streaming data is useful.
* Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
* Problem‑Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
* Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 5+ years of proven experience architecting and operating production‑grade data systems, especially those supporting analytics or ML workloads.
* Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
* Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
* Big Data Technologies: Hands‑on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
* Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud‑based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
* Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
* Collaboration & Agile Delivery: Strong communication skills and experience working with cross‑functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
Preferred Skills
* Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
* Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud‑native monitoring.
* DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time‑travel) and supporting continuous delivery for ML systems.
* Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:
Abide by Mastercard's security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.