The Team

The IT Consumer group is accountable for the architecture/design, develop/industrialize, integrate/deploy, engineer/operationalize, and sustain/improve delivery areas. It is a very diverse group, with our key people located in London, UK; Jakarta, Indonesia; and Buenos Aires, Argentina. We look after corporate web platforms, global commerce platforms, digital engagement platforms, consumer lifecycle platforms and omnichannel services for B2B and B2C channels. We also partner closely with Infosys in Mohali and Hyderabad, India, who are a big and important part of our family too.

Our team is called Engineering & Ops and is responsible for the delivery, run and improvement of the technology landscape that underpins our products and services. As well as providing application support and management, the team is responsible for the run and improvement of all technology in the production environment and for ensuring an effective service operation across all platform services.

Role Summary

We are looking for a world-class Site Reliability Engineer who will be responsible for the availability, reliability, latency, performance, efficiency, monitoring, emergency response and capacity planning of our products and services, and who has the skill set to write software that replaces previously manual work. This position is an integral member of the wider technology function; it will be highly visible and will work across multiple teams to deliver reliable solutions and drive both efficiency and effectiveness. It is essential that the role holder is a highly collaborative individual. We are seeking individuals who enjoy automating and reducing manual work. Quality and time to market are important to us, so it is key that we have people who truly believe in this direction.

Key Responsibilities

• Collaborate with different technology groups to deliver services and solutions for the technology stack
• Design and implement logging, monitoring and alerting solutions, increasing systems visibility and enabling faster recovery from incidents
• Automate systems management, focusing on performance and scalability, improving utilization and reducing toil
• Design and manage cloud architectures, guaranteeing high availability, top performance and reliability
• Optimise systems for performance and scalability, building infrastructure and eliminating work through automation
• Ensure all services and solutions are built in adherence to PMI’s InfoSec policies and are fully industrialized for consumption by customers and technology groups
• Research and develop tooling and/or processes to enable delivery, operations and infrastructure teams
• Enable consumption of our features and functions via self-service
• Be involved in engineering and applications operations
• Design cost-effective solutions and services, ensuring that value is measured, tracked and realized at the end of key delivery points
• Drive on-boarding and adoption of our core automation tools, services and apps
• Participate in and promote SRE across PMI IT communities

Essential Skills & Experience

• Deep understanding of performance monitoring and web application profiling
• Extensive experience of integrating logging, monitoring and alerting technologies, such as ELK, New Relic and CloudWatch, and of driving significant change in the customer experience
• Excellent skills and experience in configuration management via Puppet, Chef, Ansible or others
• Understanding of Internet infrastructure services including DNS, DHCP, LDAP, server virtualization, server monitoring and cloud services
• Demonstrated history of automating operations processes
• Consistent track record of troubleshooting and resolving issues in live production environments and implementing strategies to eliminate them
• Driven approach to continually improving service levels
• Experience with DNS and Content Distribution Networks (Akamai is highly desirable)
• Extensive experience of deploying technology solutions across cloud-native platforms. We predominantly use Amazon Web Services, Salesforce Cloud, Adobe Cloud and SAP Hybris Cloud
• Demonstrable experience of integrating and industrializing technology platforms (SaaS, PaaS, IaaS) on a global scale, reducing operational and process waste
• Fluency in one or more high-level programming languages such as Java, Python, Go, Ruby or equivalent
• Experience with microservices architectures
• Knowledge of data platforms, including but not limited to: Apache, Kafka, Solr, Redis, MySQL, Cassandra, Hadoop
• Strong ability and enthusiasm to learn new technologies in a short period of time. We seek a visionary self-starter with leadership capabilities
• Demonstrable understanding of security and networking principles in a cloud-native environment
• Experience of working within delivery methodologies such as Scrum, Kanban and Waterfall, and an appreciation of the need to apply them appropriately to the work being delivered