Data Scientist (Creative Vision)
Salary: ¥6,500,000 – ¥18,000,000 per year
English: Fluent
Japanese: Basic
SB Intuitions
Key Responsibilities
- Work on data-related tasks, including data processing, curation, and captioning.
Required Skills and Experience
- Data Platform & Pipelines:
  - Design large-scale multimodal pipelines (ingest → dedupe → filter → shard → snapshot/version).
  - Provide standardized dataset APIs and high-throughput loaders (streaming, caching, sampling, etc.) for model training, fine-tuning, and evaluation.
- Captioning & Human-in-the-Loop Labeling:
  - Build captioning/annotation workflows (schemas, multilingual coverage).
  - Manage raters, gold sets, agreement metrics, QC dashboards.
  - Bootstrap and verify auto-caption systems (e.g., CLIP/VLM-assisted).
- Curation & Quality Control:
  - Implement deduplication, clustering, quality/aesthetic scoring, and policy filters (NSFW, violence, PII, watermarks).
  - Balance data mixtures across domains/styles/locales; measure impact of dense captions and synthetic data.
- Research-Driven Data Science:
  - Run ablations (mixture composition, caption density, synthetic:real ratios).
  - Prototype quality/safety scorers, produce internal reports.
- Collaboration & Enablement:
  - Work with research and product teams to align data mixes to roadmap.
  - Document schemas, manifests, and SLAs for multi-team reuse.
Preferred Skills and Experience
- Training-Time Quality Tracking & Evaluation:
  - Build evaluation hooks (fixed prompts, deterministic seeds).
  - Track CLIP alignment, aesthetics, safety rate, edit-specific metrics.
  - Operate frozen test sets and checkpoint gating.
- Safety, Governance & Provenance:
  - Maintain data lineage (source, license, consent), takedown flows, and isolation of customer data.
  - Enforce policy filters and NSFW audits.
Ideal Candidate
- Shares the team's mission and is motivated to embrace new business opportunities.
Mission
Collaborate with researchers and engineers to build an efficient data codebase that supports large-scale foundation model training.
What We Offer
- Opportunity to start with the local market and progress to large multimodal generation model projects.
- Apply research to real-world applications with measurable impact.
- Work within a diverse, international team.
- Access to the largest computational resources in Japan.
- Competitive compensation package.
Working Hours
- Flexible work schedule available
- Standard hours: 9:00 AM – 5:45 PM (1 hour break)
- Overtime: Required as needed
Salary and Bonuses
- Monthly Salary: ¥541,667 – ¥1,500,000
- Estimated Annual Salary: ¥6,500,000 – ¥18,000,000
Upper limit is not guaranteed. Salary includes base pay plus fixed overtime allowance (35 hours). Actual overtime exceeding 35 hours will be paid additionally. Incentives may be provided separately.
Allowances
- Commuting allowance: up to ¥150,000 per month
- Non-managerial employees: eligible for overtime, night shift, holiday work, and commuting allowances
- Managerial employees: eligible for management, night shift, and commuting allowances
Holidays and Leave
- Two days off per week (Saturday and Sunday)
- National holidays
- Year-end and New Year holidays (December 29 – January 3)
- Company-designated holidays
- Annual paid leave: 6–21 days (depending on the month of joining)
Benefits
- Health insurance, employment insurance, workers’ compensation, and welfare pension plan
- Employee support programs (e.g., Benefit One, premium discounts)
- Savings and retirement plans (property accumulation savings, defined contribution pension plan)
- Various group and corporate insurance programs
- Comprehensive welfare group term insurance