Apple Machine Learning
machinelearning.apple.com/
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Uncertainty Quantification (UQ) in Language Models (LMs) is key to improving their safety and reliability. Evaluations often use metrics…

Normalizing Flows are Capable Generative Models

Trade-offs in Data Memorization via Strong Data Processing Inequalities
Recent research demonstrated that training large language models involves memorization of a significant fraction of training data. Such…

Aligning LLMs by Predicting Preferences from User Writing Samples
Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent…

INRFlow: Flow Matching for INRs in Ambient Space
Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on irregular or…

Variational Rectified Flow Matching
We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity…

Apple Machine Learning Research at CVPR 2025
Apple researchers are advancing AI and ML through fundamental research, and to support the broader research community and help accelerate…

Updates to Apple's On-Device and Server Foundation Language Models
With Apple Intelligence, we're integrating powerful generative AI right into the apps and experiences people use every day, all while…

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes…

Proxy-FDA: Proxy-Based Feature Distribution Alignment for Fine-Tuning Vision Foundation Models Without Forgetting
Vision foundation models pre-trained on massive data encode rich representations of real-world concepts, which can be adapted to downstream…

Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect
Perceptual voice quality dimensions describe key characteristics of atypical speech and other speech modulations. Here we develop and…

Improve Vision Language Model Chain-of-thought Reasoning
Chain-of-thought (CoT) reasoning in vision language models (VLMs) is crucial for improving interpretability and trustworthiness…

Beyond Text Compression: Evaluating Tokenizers Across Scales
Tokenizer design significantly impacts language model performance, yet evaluating tokenizer quality remains challenging. While text…

Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Input Representations Matter
Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the…

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
Apple is sponsoring the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), which will take place in person from June…

Distillation Scaling Laws
We propose a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the…