Job Application for Senior MLOps at Technology & Product
Salary not provided
Kubernetes, Azure, PyTorch, Git, Docker, AWS, GCP, Python, TensorFlow
Language: English only
Minimum years of experience: 5
ZEALS
Senior MLOps Engineer
Location: Japan – Remote
About the Role
As a Senior MLOps Engineer, you will be at the forefront of deploying, optimizing, and monitoring Large Language Models (LLMs) in production environments. Your responsibilities include building and maintaining scalable pipelines, ensuring low-latency inference, and implementing best practices in monitoring and observability. You will work with state-of-the-art tools like Hugging Face and MLflow to fine-tune models and integrate them into robust AI solutions.
Key Responsibilities
Model Deployment & Management
- Develop and maintain scalable pipelines for deploying LLMs, with an emphasis on efficient, low-latency inference.
- Use tools such as Hugging Face and MLflow for seamless model integration and version control.
- Automate deployment processes (e.g., model validation, CI/CD).
Monitoring & Observability
- Implement comprehensive monitoring frameworks to track performance and reliability of production models.
- Use advanced observability tools to proactively detect and address performance issues.
- Deploy alerting systems for rapid response to anomalies in model behavior.
Infrastructure Optimization
- Architect and optimize cloud and on-premises infrastructure to support large-scale LLM operations.
- Collaborate with cloud providers (AWS, Azure, GCP) to optimize costs and performance.
- Coordinate with backend engineers for smooth model integration into conversational platforms.
Collaboration & Documentation
- Partner with AI engineers and data scientists to align on project objectives and deployment strategies.
- Document MLOps processes, best practices, and tools.
- Provide training and support to team members on MLOps methodologies and tools.
What You’ll Need
Experience
- 5+ years in MLOps, DevOps, or related fields, with a focus on deploying and managing LLMs or large-scale machine learning models.
- Experience with tools like Hugging Face, MLflow, Docker, and Kubernetes.
- Strong background with cloud platforms (AWS, Azure, GCP) and infrastructure as code (Terraform).
- Proven track record in reducing inference latency and optimizing AI infrastructure.
Technical Skills
- Proficiency in Python and experience with ML libraries such as TensorFlow and PyTorch.
- Expertise in CI/CD pipelines, version control (Git), and orchestration tools.
- Familiarity with Generative AI, prompt engineering, and deploying models at scale.
Soft Skills
- Excellent problem-solving skills and ability to tackle complex challenges independently.
- Strong communication skills, able to translate technical concepts for non-technical stakeholders.
- Proactive mindset with a focus on continuous learning and keeping up with industry trends.
Tech Stack
- Backend Languages: Go, Python, Elixir
- Frontend: HTML, CSS, JavaScript (TypeScript, React.js, Recoil, Zod, TanStack, etc.)
- Infrastructure: Google Cloud Platform, Pub/Sub, Kubernetes, MongoDB, MySQL, Postgres, BigQuery, Elasticsearch, Qdrant
- Configuration Management: Terraform
- CI/CD: GitHub Actions, Argo CD, CircleCI
- Monitoring: Grafana, GCP Cloud Monitoring, Logging, Cloud Trace, Opsgenie
- Data: BigQuery, Parquet, Spark, Scala, Python, dbt
- Knowledge Tools: Confluence
- Other: GitHub, Slack, Jira
- Process: Scrum
Benefits & Perks
- Salary Range: Competitive (performance review every 6 months)
- Paid Holidays: 10 days during the first year, weekends off, national holidays, summer and winter breaks
- Visa Support: Full support provided for the right candidates
- Flexible Working: Highly flexible, remote-first international environment
- Work from Anywhere: For residents of Japan, including interim work from overseas and full flex time
- Housing Allowance: For those living within 1.5 km of the office
- Club Activity Allowance
- Shuffle Lunch Allowance: Cross-department lunches
- Bar Nights: Bi-monthly free-flow beer party
Why Join?
- Opportunity to make meaningful changes at a fast-growing tech company
- Highly flexible and inclusive work environment
- Build innovative solutions in conversational commerce technology
- Twice recognized as a LinkedIn Top Startup
Ready to roll up your sleeves? Be part of an organization that values hospitality, embraces unique needs, and is ready to change the status quo for a better employee experience!