Technical Engineer, Second Line Support, BSS Operations [DevOps & Platform Maintenance] - BSS Ops Department (BSOPD)
Salary not provided
Rakuten

Job Description:

Business Overview
The Technology Platforms Division (TPD) drives the growth of Rakuten's ecosystem by delivering innovative, high-quality technology platforms characterized by integrated control and strategic partnerships. TPD is responsible for building and operating the infrastructure and ecosystem platforms that power the Rakuten Group.

Department Overview
Our department, the BSS Ops Department (BSOPD), provides operational services for BSS applications, both B2B and B2C, and is also responsible for maintaining the IT infrastructure (on-premise and cloud environments) for the BSS platform.

Position:

Why We Hire
We are looking for entrepreneurial, innovative, growth-oriented, and customer-obsessed individuals to join our growing team to build the Telco of the Future. We are a truly global organization, with team members from Japan, India, North America, South America, Europe, China, Korea, Australia, Africa, and more, shifting to a fast-paced, agile way of working.

Position Details
This role contributes to the operational excellence of Rakuten's DevOps and Observability platforms. Key responsibilities include:
- Providing proactive system monitoring, initial alert handling, and first-level operational assistance.
- Conducting complex troubleshooting and developing automation solutions for platform efficiency.
- Actively managing application environments across DevOps and Observability platforms.
- Ensuring service continuity through diligent adherence to established runbooks and escalation processes.
- Collaborating closely with L3 support and development teams to ensure platform stability, efficiency, and continuous improvement.
- Participating in the on-call rotation for critical incidents, leading service restoration, and performing detailed Root Cause Analyses (RCA).
- Contributing to post-incident reviews, driving automation for recurring issues, and continuously enhancing system resilience.
- Creating and maintaining runbooks, dashboards, and knowledge base documentation for operational readiness and training.
- Supporting regular maintenance, feature rollouts, and security patching for production and pre-production environments.

Mandatory Qualifications:

1) Technical Expertise
- Containerization & Orchestration: Hands-on experience with Docker and Kubernetes (K8s) for managing containerized applications, including familiarity with Helm charts, ConfigMaps, and Secrets management for environment configuration.
- Certification: Kubernetes certification (CKA/CKAD) is a plus.
- CI/CD Automation: Proficiency in designing and implementing CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or Azure DevOps. Knowledge of artifact management (e.g., Nexus, Artifactory) and understanding of CI/CD concepts to verify deployment success.
- Monitoring & Observability: Strong understanding of monitoring tools such as Prometheus and Grafana for real-time system observation, metrics collection, and alert tuning for production systems.
- Logging & Tracing: Skilled in centralized logging with ELK Stack (Elasticsearch, Logstash, Kibana) or Splunk, and experience with basic tracing tools like Dynatrace or AppDynamics. Ability to perform basic log analysis to identify potential issues.
- Scripting & Automation: Experience in scripting languages (Bash or Python) for automating deployment, maintenance tasks, and executing standard health-check scripts.
- Infrastructure as Code (IaC): Familiarity with IaC principles and tools such as Terraform or Ansible.
- Cloud Platforms: Exposure to cloud services on AWS, Azure, or GCP.
- Certification: Certification in Cloud DevOps tools (e.g., AWS Certified DevOps Engineer) is a plus.
- Incident Management & Processes: Familiarity with incident tracking systems (e.g., ServiceNow, JIRA) and understanding of ITIL processes. Experience assisting with Root Cause Analysis (RCA) and system optimization.
- Operating Systems & Networking: Exposure to Linux system operations and basic networking concepts.
- Basic Container Knowledge: Basic knowledge of containerized environments (Docker/Kubernetes).

2) Domain & Methodological Knowledge
- Telecom BSS/OSS Systems: Strong understanding of customer-facing portals, CRM, order workflows, and the broader telecommunications BSS/OSS landscape.
- Site Reliability Engineering (SRE): Ability to define and monitor SLOs, SLIs, and SLAs to ensure service reliability and uptime targets. Familiarity with SRE best practices (e.g., Google SRE model) and error budget management.
- Hybrid/Multi-Cloud: Experience managing Kubernetes clusters and deploying applications in hybrid cloud or multi-cloud environments (AWS EKS, Rakuten Cloud Platform).
- Cost Optimization & Capacity Planning: Experience with cost optimization strategies and capacity planning in cloud environments.
- IT Governance: Familiarity with ITIL and ISO 27001 standards.

3) Professional Competencies
- Problem-Solving: Exceptional analytical and troubleshooting capabilities to resolve complex, time-sensitive issues efficiently.
- Communication: Excellent verbal and written communication skills to articulate technical issues to both technical teams and non-technical stakeholders (e.g., business users, L1 support).
- Adaptability: The ability to quickly learn and adapt to new front-end technologies, frameworks, and evolving business processes within a dynamic environment.
- Customer Focus: A strong commitment to ensuring a positive and efficient user experience for both customers and internal agents.

4) Experience & Education
- Bachelor's degree in Computer Science, Information Technology, or a related technical field.
- Typically, 4-7 years of experience in an L2 or equivalent technical support role, ideally within the telecommunications sector.
- Proven experience with ITSM methodologies and ticketing tools such as ServiceNow or Jira.

Desired Qualifications:
- Proactive approach to problem-solving.
- Strong organizational skills and experience with budget management.
- Knowledge of industry standards and compliance requirements.
- Ability to work independently and as part of a team.
- Commitment to continuous learning and professional development.

Other Information:
Additional information on location: Rakuten Crimson House (Head office)
#engineer #developmentsupport #technologyplatformdiv

Languages: English (Overall - 4 - Fluent)

In Japanese, Rakuten stands for ‘optimism.’ It means we believe in the future. It’s an understanding that, with the right mind-set, we can make the future better by what we do today. So we challenge ourselves to evolve, innovate and experiment, to create a better, brighter future for everyone. Today, our 70+ businesses span e-commerce, digital content, communications and fintech, bringing the joy of discovery to almost 1.3 billion members across the world.

If you have any trouble logging in, please contact Rakuten Group, Inc.: rakuten-recruiting-info@mail.rakuten.com

Please read the Application Requirements (available in English and Japanese) before applying.

Our Diversity & Inclusion Policy and Application Documents
Rakuten’s corporate mission is to “contribute to society by creating value through innovation and entrepreneurship.” We foster a culture that provides equal opportunities to those who share this founding philosophy and take on the challenge to transform society, regardless of age, gender, nationality, or any other status. Diversity is one of Rakuten's core strategies and a driving force for innovation. Because of this, you are not required to submit any of the following information in order to apply for our job positions.
- Gender
- Age
- Photo
- Nationality
- Information not related to business, such as ideological beliefs, family structure, etc.
* For legal compliance, we may ask you about your work eligibility.