As a Machine Learning Engineer on the LLM Optimization team at Apple, you will be part of an innovative ML organization that enables LLMs for Apple products. The LLM Optimization team focuses on designing and implementing ML-based solutions to improve runtime latency, training time, memory usage, time to first token, and decoding speed across all Apple applications. The team is strategically positioned to make significant contributions both in the short term (on well-known Apple products) and in the long term (on highly ambitious, high-risk, high-reward projects). This role emphasizes shipping ML-based features and products.
As a Full Stack ML Engineer, you will innovate across the entire end-to-end ML production pipeline. Your responsibilities will include, but are not limited to:
* Designing new neural network architectures
* Developing efficient model training and fine-tuning methods
* Enhancing on-device and server-side inference
Our ideal team member is fearless in trying new things and willing to iterate on ideas. We value team members who can quickly prototype and iterate towards high-quality implementations.
As a Full Stack ML Engineer on our team, you will leverage your background to:
* Design and implement ML-based solutions to improve runtime latency, training time, memory usage, time to first token, and decoding speed for Apple applications
* Innovate across the entire end-to-end ML production pipeline, including dataset creation, neural network architecture design, model training, fine-tuning methods, training time optimization, and on-device and server-side inference
* Quickly prototype and iterate to achieve high-quality implementations for pioneering machine learning algorithms
* Collaborate with hardware and software teams to integrate research findings into market-ready solutions
* Translate theoretical ideas into tangible innovations, demonstrating their industrial applicability