
Description
Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
SRE Team Leader
Who is Mastercard?
Mastercard is a global technology company in the payments industry. Our mission is to connect and power an inclusive, digital economy that benefits everyone, everywhere by making transactions safe, simple, smart, and accessible. Using secure data and networks, partnerships and passion, our innovations and solutions help individuals, financial institutions, governments, and businesses realize their greatest potential.
Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. With connections across more than 210 countries and territories, we are building a sustainable world that unlocks priceless possibilities for all.
Overview:
Dynamic Yield by Mastercard is seeking a Site Reliability Engineer Team Lead to oversee and drive the availability, performance, and scalability of our systems and services. In this leadership role, you will guide and mentor a team of SREs while collaborating closely with development teams to design and implement robust, reliable solutions. You will be responsible for enhancing our automation infrastructure, spearheading automation initiatives that streamline operations, and continuously optimizing system performance to meet evolving needs. Additionally, you will play a crucial role in strategic planning to ensure our infrastructure can support growth and adapt to changing technological demands.
Role:
• Lead a team of SREs in maintaining and enhancing system reliability and performance. Develop strategic plans to meet and exceed established SLAs and SLOs and drive initiatives that align with business objectives.
• Oversee the implementation and optimization of monitoring systems to detect anomalies and deviations in real time. Ensure the team continuously reviews metrics and trends to proactively address emerging issues before they affect users.
• Champion the development and implementation of automation tools and processes. Drive efforts to improve operational efficiency, minimize manual intervention, and eliminate repetitive tasks across the team.
• Lead capacity planning and performance tuning initiatives. Oversee resource utilization monitoring and work with your team to forecast future needs and ensure systems can handle anticipated loads.
• Collaborate closely with development teams to embed reliability best practices into system design and feature implementation. Provide expert guidance on system architecture, deployment strategies, and reliability engineering principles.
• Identify and spearhead opportunities for system and process improvements. Promote initiatives that enhance system reliability, scalability, and performance, ensuring that the team is always pushing the boundaries of excellence.
About You:
• 5+ years of experience in Site Reliability Engineering, DevOps, or related roles, with a proven track record of leading teams and projects.
• Deep proficiency in AWS cloud platforms, Kubernetes, and scripting languages (e.g., Python, Bash). Extensive experience with system administration, configuration management tools (e.g., Ansible, Puppet, Chef), and monitoring/logging tools (e.g., Prometheus, Grafana, ELK stack).
• Strong understanding of incident management processes and best practices, with experience leading incident response and resolution efforts.
• Expertise in automation tools and practices for deployment and infrastructure management. Demonstrated ability to implement and advocate for effective automation strategies.
• Exceptional communication and collaboration skills. Proven ability to lead, mentor, and work effectively within a team environment, driving a culture of teamwork and continuous learning.
• Strong analytical and problem-solving abilities. Capable of troubleshooting complex issues and guiding the team through resolution.
Preferred Qualifications:
• Relevant certifications such as AWS Certified Solutions Architect or Google Professional Data Engineer.
• Familiarity with advanced topics like distributed systems, microservices architecture, and network protocols.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
Abide by Mastercard's security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.