Staff Data Engineer
Job ID: 10129825
Location: Nicasio, California, United States
Company: Lucasfilm
Date posted: 2025/09/16
Job Summary:
The Skywalker Sound Development Group is seeking an experienced Data Engineer to specialize in the creation, management, and optimization of data pipelines to support cutting-edge AI/ML research. This is a critical role in preparing high-quality datasets for the training, retraining, and evaluation of machine learning models tailored to immersive and multichannel audio applications.
As a Data Engineer, you will focus on developing robust pipelines for processing complex media datasets, enabling AI/ML researchers to build transformative solutions for speech processing, style transfer, and source separation. Your work will directly contribute to creating innovative soundtrack workflows for global media production.
This role is considered Hybrid, which means the employee will work 2-3 days onsite at our Nicasio, CA office and occasionally from home.
What You'll Do
Design, implement, and maintain scalable, automated data pipelines for the ingestion, preprocessing, and transformation of large-scale audio datasets.
Ensure pipelines support efficient model training and retraining workflows, enabling continuous improvement of AI/ML models.
Collaborate with AI/ML researchers to define data requirements and integrate feedback to improve data pipeline functionality.
Develop advanced preprocessing techniques for immersive and multichannel audio formats (e.g., Dolby Atmos, high-order ambisonics).
Automate data cleaning, normalization, and augmentation processes to prepare datasets for various model architectures, including foundational models and transformers.
Integrate external datasets and APIs while ensuring compliance with legal and ethical data usage standards.
Monitor and optimize pipeline performance to handle complex and dynamic data structures effectively.
Create tools and workflows for annotating, labeling, and curating datasets, including the use of active learning methods.
Perform exploratory data analysis to uncover trends, validate dataset quality, and identify data gaps.
What We’re Looking For
Master’s degree, with a preference for a PhD, in Data Engineering/Science, Computer Science, Signal Processing, or a related field.
8+ years of experience in data engineering or data science with a focus on building pipelines for AI/ML applications.
Proficiency in Python, with expertise in data manipulation libraries such as Pandas, NumPy, and PyTorch’s data utilities.
Hands-on experience with audio processing libraries and tools (e.g., Librosa, FFmpeg, SoX) for handling complex audio formats.
Familiarity with scalable pipeline tools like GitLab, Apache Spark, Airflow, or Luigi, and experience with containerized workflows (Docker, Kubernetes).
Strong understanding of data pipeline requirements for model training, retraining, and evaluation in iterative research workflows.
Experience with immersive and multichannel audio formats.
Knowledge of cloud-based platforms and tools for storage and processing, such as AWS S3, Redshift, or Google BigQuery.
Strong problem-solving skills, with a proactive mindset for addressing evolving data challenges.
Preferred Qualifications
Experience integrating data pipelines with AI/ML workflows, including active learning and model retraining.
Familiarity with audio-specific datasets and metadata management strategies.
Knowledge of machine learning principles and how data quality impacts model performance.
Experience with distributed training pipelines and large-scale dataset processing.
Contributions to open-source projects or published research in the fields of data science or audio processing.
Experience with visualization tools (e.g., Tableau, Matplotlib) for quality assurance and exploratory data analysis.
Expertise in designing systems to support AI/ML model monitoring and retraining over time.
The hiring range for this position in Nicasio, CA is $166,800 to $223,600 per year. The base pay actually offered will take into account internal equity and also may vary depending on the candidate’s geographic region, job-related knowledge, skills, and experience among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.
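About Lucasfilm:
Lucasfilm is a global leader in film, television, and digital entertainment production. In addition to its motion picture and television production, Lucasfilm's activities include visual effects, audio post-production, cutting-edge digital animation, interactive entertainment software, and merchandising for its entertainment properties, including the STAR WARS and INDIANA JONES franchises. Lucasfilm Ltd. is headquartered in Northern California.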
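About The Walt Disney Company:
The Walt Disney Company, together with its subsidiaries and affiliates, is a diversified international company that leads the world of family entertainment and media through three core business segments: Disney Entertainment, ESPN, and Disney Experiences. From its start as a small animation studio in the 1920s, Disney has grown into a preeminent name in today's entertainment industry. Disney will continue to create world-class stories and experiences that every member of the family, from children to adults, can enjoy. Disney's stories, characters, and experiences reach consumers and guests in every corner of the world. In more than 40 countries, our employees and cast members work together to create entertainment experiences that are welcomed both globally and locally.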
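This position is with Lucasfilm Ent Co Ltd, LLC Payroll Svc, which is part of a business we call Lucasfilm.
Lucasfilm Ent Co Ltd, LLC Payroll Svc is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, religion, color, sex, sexual orientation, gender, gender identity, gender expression, national origin, ancestry, age, marital status, military or veteran status, medical condition, genetic information or disability, or any other basis prohibited by federal, state, or local law. Disney champions a business environment where ideas and decisions from all people help us grow, innovate, create the best stories, and be relevant in a constantly evolving world.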