Database Reliability Engineer (DBRE) - Mercari

Salary not provided

Go, PHP, Perl, GCP, Docker, Kubernetes, Nginx, Memcached, MySQL, Python, Ruby, AWS
English: Fluent

Employment Status: Full-time
Work Hours: Full Flextime (no core time)
Office: Roppongi


About

"Circulate all forms of value to unleash the potential in all people."

The mission is to create opportunities for anyone to realize their dreams and contribute to society by circulating all forms of value using technology, connecting people globally.
For more information about the mission and culture, see the Culture Doc.


Organization / Team Mission

Engineering Principles:

  • Passion For The Product
  • Grow Together
  • Solve Through Mechanisms
  • Collaborate Openly

The DBRE Team owns the reliability, scalability, and cost efficiency of the production databases (currently primarily TiDB Cloud, expanding to AlloyDB, CloudSQL, Spanner, and others). The goal is to drive cost optimization, automation, and tighter integration with product and platform teams as it transitions into a multi-engine DBRE team.


Work Responsibilities

  • Operate production databases (TiDB, MySQL, PostgreSQL, cloud-managed DBs) under a 99.99% availability SLA, including on-call rotations.
  • Independently lead end-to-end epics, from design through rollout, monitoring, and post-launch tuning.
  • Design and execute database migrations (engine/version changes, on-prem to cloud cutovers) with zero or minimal downtime.
  • Build and operate Change Data Capture (CDC) and streaming pipelines (Debezium, TiCDC, Kafka).
  • Implement Infrastructure-as-Code (Terraform, Ansible) and automation (Go, Python) for operational scalability.
  • Build and maintain monitoring, alerting, and SLO/SLI dashboards (Datadog, Grafana).
  • Diagnose and resolve production incidents, lead post-mortems, and implement permanent fixes.
  • Tune performance (query plans, schema, indexing, replication topology).
  • Partner with product and platform teams on schema changes, capacity planning, and new DB adoption.
  • Contribute to the team's expansion into multi-engine DBRE by learning and adopting new database technologies.

Unique Challenges

  • Operating TiDB at large scale (hundreds of thousands of QPS, multi-TB datasets) with a 99.99% SLA.
  • Transitioning from a TiDB-centric to a multi-engine operating model (including AlloyDB, CloudSQL, Spanner).
  • Balancing roughly 40–50% reactive work (alerts, tickets, DDL operations, incident response) with project delivery.
  • Driving cost optimization while maintaining reliability.
  • Working in a bilingual environment: daily team syncs primarily in Japanese; cross-team collaboration in English.

Qualifications

Required

  • Production DBA/DBRE experience with MySQL, TiDB, PostgreSQL, or equivalent RDBMS (including internals: replication, indexing, query planning).
  • Experience operating databases at scale (>10K QPS or >1TB datasets) under production SLAs.
  • Hands-on experience with GCP or AWS managed database services (CloudSQL, AlloyDB, RDS, Aurora, etc.).
  • Experience with Infrastructure-as-Code (Terraform, Ansible) and scripting (Go, Python, or shell).
  • Experience with database monitoring and observability (Datadog, Grafana, etc.).
  • Ownership of incident response, performance tuning, and on-call rotations.
  • Ability to lead tasks end-to-end independently.
  • Willingness to learn and adopt new database technologies (AlloyDB, Spanner, PostgreSQL, etc.).

Preferred

  • Experience with distributed databases (sharding, replication, consensus) or experience operating TiDB, Spanner, Vitess, or CockroachDB.
  • Experience with CDC / streaming pipelines (Debezium, TiCDC, Kafka Connect, etc.).
  • Kubernetes operational experience.
  • Experience leading major database migrations (engine, version, or data center).
  • Experience with cross-team coordination for schema changes, migrations, or platform adoption.

Language

  • Japanese: Independent (CEFR – B1)
  • English: Independent (CEFR – B1)



Japanese Version (English translation)

Employment Status: Full-time
Work Hours: Flextime (no core time)
Office: Roppongi


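Work Responsibilities

  • System monitoring and capacity planning
  • Improving performance and scalability
  • Failure detection and incident response
  • Operating and performance-tuning CloudSQL (MySQL, PostgreSQL), AlloyDB, TiDB, Spanner, and various middleware
  • Developing software to automate operations
  • Establishing guidance for database usage and operation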

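Unique Challenges

  • Operating large, critical database clusters that handle tens of terabytes of data and hundreds of thousands of QPS
  • Building a foundation for stable operation and future scalability in a diverse, large-scale, mission-critical environment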

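Required

  • Alignment with the mission and values
  • Experience in database operations and performance tuning
  • Knowledge of SQL, and experience with table design and data operations
  • Knowledge of networking, such as TCP/IP and HTTP
  • Development and operations experience with Go, PHP, Perl, Python, Ruby, Node.js, or similar
  • Knowledge of SRE practices, such as SLI/SLO design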

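Preferred

  • Experience using cloud platforms such as GCP or AWS
  • Experience operating on-premises environments
  • Experience using orchestration tools such as Ansible or Terraform
  • Experience using DBaaS offerings such as Cloud Spanner or Amazon Aurora
  • Experience using Docker and Kubernetes
  • Experience developing and operating microservices
  • Experience using monitoring tools such as Datadog
  • Experience operating middleware and RDBMSs such as nginx, memcached, and MySQL
  • Experience developing and operating large-scale web services
  • Knowledge of distributed computing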

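Selection Criteria

  • Possession of the necessary technical knowledge
  • Problem-solving ability in complex situations
  • Taking the initiative to act for the team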

Language

  • Japanese: Basic (CEFR - A2) preferred
  • English: Independent (CEFR - B2) required

Equal Opportunity

This opportunity is open regardless of background, age, gender, sexual orientation, race, religion, physical ability, or other factors.

For details, see the I&D statement.

Please read the Privacy Policy prior to applying.