Last updated: 2025/06/21 06:00

Variational Rectified Flow Matching
We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity…

INRFlow: Flow Matching for INRs in Ambient Space
Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on irregular or…

Aligning LLMs by Predicting Preferences from User Writing Samples
Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent…

Trade-offs in Data Memorization via Strong Data Processing Inequalities
Recent research demonstrated that training large language models involves memorization of a significant fraction of training data. Such…

Normalizing Flows are Capable Generative Models

Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Uncertainty Quantification (UQ) in Language Models (LMs) is key to improving their safety and reliability. Evaluations often use metrics…

Fault Tolerant Llama: training with 2000 synthetic failures every ~15 seconds and no checkpoints on Crusoe L40S

Build a scalable AI video generator using Amazon SageMaker AI and CogVideoX
In recent years, the rapid advancement of artificial intelligence and machine learning (AI/ML) technologies has revolutionized various aspects of digital content creation. One particularly exciting development is the emergence of video generation capabilities, which offer unprecedented opportunities for companies across diverse industries. This technology allows for the creation of short video clips that can be […]

Building trust in AI: The AWS approach to the EU AI Act
The EU AI Act establishes comprehensive regulations for AI development and deployment within the EU. AWS is committed to building trust in AI through various initiatives including being among the first signatories of the EU's AI Pact, providing AI Service Cards and guardrails, and offering educational resources while helping customers understand their responsibilities under the new regulatory framework.

Update on the AWS DeepRacer Student Portal
Starting July 14, 2025, the AWS DeepRacer Student Portal will enter a maintenance phase in which new registrations are disabled. Existing users will retain full access to their content and training materials until September 15, 2025, with updates limited to critical security fixes; after that date, the portal will no longer be available.

Accelerate foundation model training and inference with Amazon SageMaker HyperPod and Amazon SageMaker Studio
In this post, we discuss how SageMaker HyperPod and SageMaker Studio can improve and speed up the development experience of data scientists by combining the IDEs and tooling of SageMaker Studio with the scalability and resiliency of SageMaker HyperPod with Amazon EKS. The solution also simplifies setup for administrators of the centralized system by using the governance and security capabilities offered by AWS services.

6 inspiring, real-world AI activities for educators
Explore AI activities for educators using the Microsoft Education AI Toolkit. Access resources to get started with AI in education.

7 Best AI Agent Builders: An Expert Market Breakdown
Explore 7 top AI agent builders: workflow-native, AI-native & hybrid. Compare AI agent builder platforms by use case, integration & flexibility!

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

PyTorch Docathon 2025: Wrap Up

Meeting summarization and action item extraction with Amazon Nova
In this post, we present a benchmark of different understanding models from the Amazon Nova family available on Amazon Bedrock, to provide insights on how you can choose the best model for a meeting summarization task.

Building a custom text-to-SQL agent using Amazon Bedrock and Converse API
Developing robust text-to-SQL capabilities is a critical challenge in natural language processing (NLP) and database management, and the difficulty grows with complex queries and database structures. In this post, we introduce a straightforward but powerful text-to-SQL solution, with accompanying code, built as a custom agent implementation on Amazon Bedrock and the Converse API.
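
A minimal sketch of the kind of agent loop involved, assuming a hypothetical run_sql tool backed by a local SQLite database; the model ID, schema, and prompts are illustrative, not the post's actual implementation:

    # Minimal sketch of a text-to-SQL agent loop on the Bedrock Converse API.
    # The model ID, table schema, and run_sql() helper are illustrative assumptions.
    import sqlite3
    import boto3

    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # assumed model choice

    def run_sql(query: str) -> list:
        # Execute the generated SQL against an example database.
        with sqlite3.connect("example.db") as conn:
            return [list(row) for row in conn.execute(query).fetchall()]

    tool_config = {"tools": [{"toolSpec": {
        "name": "run_sql",
        "description": "Run a read-only SQL query against the orders table and return the rows.",
        "inputSchema": {"json": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]}}}}]}

    messages = [{"role": "user",
                 "content": [{"text": "How many orders were placed last month?"}]}]

    while True:
        resp = bedrock.converse(
            modelId=MODEL_ID,
            system=[{"text": "Translate the user's question into SQL and call run_sql."}],
            messages=messages,
            toolConfig=tool_config,
        )
        message = resp["output"]["message"]
        messages.append(message)
        if resp["stopReason"] != "tool_use":
            print(message["content"][0]["text"])  # final natural-language answer
            break
        # Execute each requested tool call and return the rows to the model.
        results = []
        for block in message["content"]:
            if "toolUse" in block:
                rows = run_sql(block["toolUse"]["input"]["query"])
                results.append({"toolResult": {
                    "toolUseId": block["toolUse"]["toolUseId"],
                    "content": [{"json": {"rows": rows}}]}})
        messages.append({"role": "user", "content": results})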

Accelerate threat modeling with generative AI
In this post, we explore how generative AI can revolutionize threat modeling practices by automating vulnerability identification, generating comprehensive attack scenarios, and providing contextual mitigation strategies.

Search Live: Talk, listen and explore in real time with AI Mode
Search Live with voice facilitates back-and-forth conversations in AI Mode.

AI strategies from the frontlines of higher education
Explore the latest strategies from higher education institutions and how they’re creating AI-ready campuses with Microsoft AI solutions.

Hear a podcast discussion about Gemini’s coding capabilities.
The latest episode of the Google AI: Release Notes podcast focuses on how the Gemini team built one of the world’s leading AI coding models. Host Logan Kilpatrick chats w…

Toward understanding and preventing misalignment generalization
We study how training on incorrect responses can cause broader misalignment in language models and identify an internal feature driving this behavior—one that can be reversed with minimal fine-tuning.

Preparing for future AI risks in biology
As our models grow more capable in biology, we’re layering in safeguards and partnering with global experts, including hosting a biodefense summit this July.

What AI Companies Actually Need Right Now
At Cline, we've scaled to 500k+ users and raised significant funding from top-tier VCs. As Head of AI, I recently interviewed a strong ML engineer candidate. Despite their solid background, I voted "no hire." Let me explain why - it reveals a broader pattern about what AI companies actually need right now, and getting this wrong can be a $200k+ mistake. The $200k Mistake: Why Hiring MLEs Too Early Kills AI Startups Here's a pattern I see repeatedly in well-funded AI startups: 1. Raise a sub

The Local LLM Reality Check: What Actually Happens When You Try to Run AI Models on Your Computer
If you've used DeepSeek's R1 (or V3 for that matter), you've probably been impressed by its performance for the price. And if you've run into issues with its API recently, your next thought was probably, “Hey, I’ve got a decent computer—maybe I can run this locally myself!” Then reality hits: the full DeepSeek R1 model needs about 1,342 GB of VRAM—no, that’s not a typo. It’s designed to run on a cluster of 16 NVIDIA A100 GPUs, each with 80 GB of memory (source). Let’s break down what…
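
The headline figure follows from simple parameter arithmetic; here is a back-of-the-envelope check, assuming roughly 671B parameters stored at 2 bytes each and ignoring activations and KV cache:

    # Rough VRAM estimate for holding the weights alone; the parameter count and
    # 16-bit precision are assumptions for illustration.
    params = 671e9            # ~671B parameters for the full DeepSeek R1/V3 models
    bytes_per_param = 2       # FP16/BF16 weights
    weight_gb = params * bytes_per_param / 1e9
    print(f"~{weight_gb:,.0f} GB just for the weights")    # ~1,342 GB
    print(f"a 16x A100-80GB cluster offers {16 * 80} GB")  # 1,280 GB, so even that is tight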

DeepSeek's Wild Week: A View from the Developer Trenches
Last week, Chinese AI startup, DeepSeek, caused the biggest single-day drop in NVIDIA's history, wiping nearly $600 billion from the chip giant's market value. But while Wall Street panicked about DeepSeek's cost claims, Cline users in our community were discovering a more nuanced reality. The Promise vs The Reality "R1 is so hesitant to open and read files while Claude just bulldozes through them," observed one of our users. This perfectly captures the gap between DeepSeek's impressive bench

Best AI Coding Assistant 2025: Complete Guide to Cline and Cursor
Updated March 4, 2025 article to reflect recent developments Remember when GitHub Copilot first launched and we thought AI-assisted coding couldn't get more revolutionary? Two years later, we're seeing a fascinating divergence in how AI coding assistants approach development. With recent releases from both Cline (now 3.5) and Cursor (0.46), we're witnessing not just a battle of features, but a philosophical split in how AI should partner with developers. I've watched both tools mature. Let's c

The Developer's Guide to MCP: From Basics to Advanced Workflows
Picture this: You're deep into development with your AI assistant, trying to juggle multiple tools – GitHub issues need updating, tests need running, and documentation needs reviewing. But instead of the seamless workflow you imagined, you're stuck with manual context switching and disconnected tools. Your AI assistant, brilliant as it is, feels trapped in its chat window. This is where the Model Context Protocol (MCP) changes everything. It's not just another developer tool – it's a fundamenta

Everyone's Talking About R1 vs o1 Benchmarks. But Here's What Really Matters.
In an interesting coincidence, DeepSeek released R1 on the same day we launched Plan & Act modes in Cline. And something fascinating started happening immediately: developers began naturally using R1 for planning phases and 3.5-Sonnet for implementation. Not because anyone suggested it – it just made sense. 0:00 /0:54 1× What's Actually Happening Here's what developers discovered works best: 1. Start new tasks in Plan mode using R1 ($0.55/M tokens)

Why AI Engineers Need Planning More Than Perfect Prompts
The best AI engineers I know follow a specific pattern. They don't obsess over prompt crafting – they obsess over planning. There's a reason for this, and it's not what most people think. The Reality Check Here's what typically happens when someone starts working with AI: 1. They throw requirements at the model 2. They get mediocre outputs 3. They blame their prompting skills 4. They spend hours "optimizing" prompts 5. They still get mediocre results Sound familiar? But here's what eli

DeepNVMe: Affordable I/O scaling for Deep Learning Applications

Gemini 2.5: Updates to our family of thinking models
Explore the latest Gemini 2.5 model updates with enhanced performance and accuracy: Gemini 2.5 Pro and Flash generally available and stable, and the new Flash-Lite in preview.

We’re expanding our Gemini 2.5 family of models
Gemini 2.5 Flash and Pro are now generally available, and we’re introducing 2.5 Flash-Lite, our most cost-efficient and fastest 2.5 model yet.

How Anomalo solves unstructured data quality issues to deliver trusted assets for AI with AWS
In this post, we explore how you can use Anomalo with Amazon Web Services (AWS) AI and machine learning (AI/ML) services to profile, validate, and cleanse unstructured data collections, transforming your data lake into a trusted source for production-ready AI initiatives.

An innovative financial services leader finds the right AI solution: Robinhood and Amazon Nova
In this post, we share how Robinhood delivers democratized finance and real-time market insights using generative AI and Amazon Nova.

Build conversational interfaces for structured data using Amazon Bedrock Knowledge Bases
This post provides instructions to configure a structured data retrieval solution, with practical code examples and templates. It covers implementation samples and additional considerations, empowering you to quickly build and scale your conversational data interfaces.

Advancing healthcare AI innovation for global impact at HLTH Europe 2025
At HLTH Europe 2025, Microsoft will showcase our commitment to advancing the next frontier of health AI innovation. Learn more.

MCP will be the death of low-code automation, and other spooky stories
Explore the state and future of MCP in AI agents—from vendor security and model risks to cost and orchestration. MCP shows promise but faces adoption hurdles due to immaturity, security flaws, and backward compatibility challenges.

AI in sales: Applying historical lessons to modern challenges
See the latest AI sales transformation offering from Microsoft and agents to help sales teams nurture and close deals.

4 ways Microsoft Copilot empowers financial services employees
In the rapidly evolving landscape of financial services, staying ahead of the curve with technological innovation is not simply an advantage—it's a necessity.

How and when to build multi-agent systems
Late last week two great blog posts were released with seemingly opposite titles. “Don’t Build Multi-Agents” by the Cognition team, and “How we built our multi-agent research system” by the Anthropic team. Despite their opposing titles, I would argue they actually have a lot in common and contain some

Groq on Hugging Face Inference Providers 🔥

Why PLAID Japan builds agents on their Google Cloud infrastructure with Mastra
How PLAID Japan migrated from GUI-based AI tools to Mastra for better collaboration and productivity for their engineering team building on Google Cloud.

The Last AI Coding Agent
It feels like every month there's a new "must-have" AI coding tool. The FOMO is real; but so is the fatigue of constantly switching, learning new workflows, and migrating settings. It’s exhausting, but that's the price for developers who want to be armed with the greatest leverage powered by AI. The magic of AI coding isn't just in the tool itself; it's in the power of the underlying model. And the "best" model is a moving target. One year ago, GPT-4o led the way. Then Anthropic's Claude 3.5 So

ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization

Get an audio overview of Search results in Labs, then click through to learn more.
Today, we’re launching a new Search experiment in Labs – Audio Overviews, which uses our latest Gemini models to generate quick, conversational audio overviews for certa…

Behind “ANCESTRA:” combining Veo with live-action filmmaking
We partnered with Darren Aronofsky, Eliza McNitt and a team of more than 200 to make ANCESTRA.

Mastra Changelog 2025-06-13
Cross-thread memory recall, universal schema support, and enhanced workflow observability.

The Hidden Metric That Determines AI Product Success
Co-authored by Assaf Elovic and Harrison Chase. You can also find a version of this post published on Assaf's Medium. Why do some AI products explode in adoption while others struggle to gain traction? After a decade of building AI products and watching hundreds of launches across the industry, we…

Building efficient MCP servers
MCP is becoming the standard for building AI model integrations. See how you can use Vercel's open-source MCP adapter to quickly build your own MCP server, like the teams at Zapier, Composio, and Solana.

Don’t Build Multi-Agents
Frameworks for LLM Agents have been surprisingly disappointing. I want to offer some principles for building agents based on our own trial & error, and explain why some tempting ideas are actually quite bad in practice.

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

Featherless AI on Hugging Face Inference Providers 🔥

The Complete MCP Experience: Full Specification Support in VS Code
VS Code now supports the complete Model Context Protocol specification, including authorization, prompts, resources, and sampling.

Google for Nonprofits will expand to 100+ new countries and launch 10+ new no-cost AI features
Google for Nonprofits is expanding to 100 more countries, and introducing new Workspace for Nonprofits and Ad Grants AI features.

Get started with agents for finance: Learnings from 2025 Gartner® CFO & Finance Executive Conference
Based on conversations at the 2025 Gartner CFO Conference, here are three things finance leaders should know about getting started with agents and AI. Learn more.

Benchmarking Multi-Agent Architectures
By Will Fu-Hinthorn. In this blog, we explore a few common multi-agent architectures. We discuss both the motivations and constraints of different architectures. We benchmark their performance on a variant of the Tau-bench dataset. Finally, we discuss improvements we made to our “supervisor” implementation that yielded a nearly 50% increase

Introducing Training Cluster as a Service - a new collaboration with NVIDIA

Why Vetnio powers their AI veterinary technician with Mastra
How Vetnio uses Mastra's workflow orchestrator to build specialized veterinary AI assistants

How we used generative media at I/O 2025
From the keynote countdown to speaker title cards and beyond, generative AI took the stage at I/O 2025.

Empower your teams to grow their AI skills and boost adoption
Microsoft has developed a series of best practices and resources that now guide our employee AI skill-building initiatives. Learn more.

How we’re adapting SEO for LLMs and AI search
AI is changing how content gets discovered. Now, SEO ranking ≠ LLM visibility. No one has all the answers, but here's how we're adapting our approach to SEO for LLMs and AI search.

Apple Machine Learning Research at CVPR 2025
Apple researchers are advancing AI and ML through fundamental research, and to support the broader research community and help accelerate…

How we built one of the most ambitious datasets in brain activity research
Learn how Google Research’s team worked with collaborators at HHMI Janelia and Harvard University to build a dataset that tracks both the neural activity and nanoscale s…

Here’s the next cohort of the Google.org Accelerator: Generative AI
Meet the 20 organizations using generative AI to address tough societal issues.

Building secure AI agents
Learn how to design secure AI agents that resist prompt injection attacks. Understand tool scoping, input validation, and output sanitization strategies to protect LLM-powered systems.
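
As a generic illustration of tool scoping plus input and output checks (not the post's code; the tool names, schemas, and rules below are made up), an agent's tool layer might look like the following sketch:

    # Sketch: scope an agent to an explicit tool allow-list and validate inputs
    # before execution; tool names and schemas are illustrative assumptions.
    import re

    ALLOWED_TOOLS = {
        "read_file": {"path": str},
        "search_docs": {"query": str},
    }
    SAFE_PATH = re.compile(r"^[\w./-]+$")

    def validate_call(tool: str, args: dict) -> dict:
        # Reject tools outside the agent's scope and malformed or suspicious arguments.
        if tool not in ALLOWED_TOOLS:
            raise PermissionError(f"Tool '{tool}' is not in the agent's scope")
        schema = ALLOWED_TOOLS[tool]
        if set(args) != set(schema):
            raise ValueError(f"Unexpected arguments for {tool}: {sorted(args)}")
        for name, typ in schema.items():
            if not isinstance(args[name], typ):
                raise TypeError(f"{tool}.{name} must be {typ.__name__}")
        if tool == "read_file" and (".." in args["path"] or not SAFE_PATH.match(args["path"])):
            raise ValueError("Suspicious path rejected")
        return args

    def sanitize_output(text: str) -> str:
        # Strip instruction-like lines from tool output before it re-enters the prompt.
        return "\n".join(line for line in text.splitlines()
                         if not line.lower().lstrip().startswith(("ignore previous", "system:")))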

Observability added to AI Gateway alpha
Vercel Observability now includes a dedicated AI section to surface metrics related to the AI Gateway.

v0-1.5-md & v0-1.5-lg now in beta on the Models API
Try v0-1.5-md and v0-1.5-lg in beta on the v0 Models API, now offering two new model sizes for more flexible performance and accuracy. Ideal for everything from quick responses to deep analysis.

Updates to Apple's On-Device and Server Foundation Language Models
With Apple Intelligence, we're integrating powerful generative AI right into the apps and experiences people use every day, all while…

Newsroom
Discover Claude 4's breakthrough AI capabilities. Experience more reliable, interpretable assistance for complex tasks across work and learning.

Why Human Intent Matters More as AI Capabilities Grow
Remember when AI coding meant tab autocomplete? The human did 95% of the work: navigating the codebase, finding the right files, locating the exact spot to edit, beginning to type, and only then could AI offer a helpful suggestion. The human was the driver, AI was barely a passenger. Today's agentic AI can search codebases, read files, write entire modules, refactor systems, and orchestrate complex changes across multiple files. With tools like Cline, AI has eaten up nearly the entire coding pi

HuggingFace Safetensors Support in PyTorch Distributed Checkpointing

Introducing Evaluations for AI workflows
Distilling the complexity of AI Evaluations frameworks into a practical paradigm

Optimizing LLM-based trip planning

ScreenSuite - The most comprehensive evaluation suite for GUI Agents!

Mastra Changelog 2025-06-06
Mastra 101, Mastra Auth, and more

Introducing the PyTorch Ecosystem Working Group and Project Spotlights

Try new data visualizations and graphs for finance queries in AI Mode.
Today, we’re starting to roll out interactive chart visualizations in AI Mode in Labs to help bring financial data to life for questions on stocks and mutual funds. Now, …

The latest AI news we announced in May
Here are Google’s latest AI updates from May 2025

Portraits: personalized AI coaching built alongside real experts
Our first Portrait features Kim Scott, bestselling author of “Radical Candor.”

Zooming in: Efficient regional environmental risk assessment with generative AI

Try the latest Gemini 2.5 Pro before general availability.
We’re introducing an upgraded preview of Gemini 2.5 Pro, our most intelligent model yet. Building on the version we released in May and showed at I/O, this model will be…

Beyond Text Compression: Evaluating Tokenizers Across Scales
Tokenizer design significantly impacts language model performance, yet evaluating tokenizer quality remains challenging. While text…

Improve Vision Language Model Chain-of-thought Reasoning
Chain-of-thought (CoT) reasoning in vision language models (VLMs) is crucial for improving interpretability and trustworthiness…

Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect
Perceptual voice quality dimensions describe key characteristics of atypical speech and other speech modulations. Here we develop and…

Proxy-FDA: Proxy-Based Feature Distribution Alignment for Fine-Tuning Vision Foundation Models Without Forgetting
Vision foundation models pre-trained on massive data encode rich representations of real-world concepts, which can be adapted to downstream…

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes…

Open Source AI is Transforming the Economy—Here’s What the Data Shows

AI breakthroughs are bringing hope to cancer research and treatment
Read Ruth Porat's remarks on AI and cancer research at the American Society of Clinical Oncology.

Build Responsible AI Products with your own Yellow Teaming LLM

The no-nonsense approach to AI agent development
Learn how to build reliable, domain-specific AI agents by simulating tasks manually, structuring logic with code, and optimizing with real-world feedback. A clear, hands-on approach to practical automation.

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
Apple is sponsoring the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), which will take place in person from June…

Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Input Representations Matter
Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the…

What Makes a Coding Agent?
When developers first encounter Cline, they often describe it as their "AGI moment" – that pivotal instant when they realize AI has crossed from helpful suggestion tool to genuine coding partner. But what exactly separates a true coding agent from the growing crowd of AI-powered development tools? The answer lies in understanding what the word "agent" actually means. Defining the Agent OpenAI defines an agent as "a system that independently accomplishes tasks on your behalf." Anthropic takes

Why We Built Cline to Never Hold You Hostage
Yesterday, Windsurf users lost access to Claude 3.x models with five days' notice. OpenAI's acquisition of Windsurf created competitive tensions with Anthropic, and developers got caught in the crossfire. Picture this: you're deep into a critical project, and suddenly your AI coding assistant is crippled by corporate politics. Free tier users lost access entirely; paid subscribers face severe capacity constraints. This validates why we built Cline differently from day one. When Corporate War

Cline 3.17.9: Enhanced Claude 4 Support (Experimental), Upgraded Task Timeline & CSV/XLSX Support
Hello Cline community 🫡 We've been burning the candle at both ends to make Cline work as well as possible with the new Claude 4 family of models, and we're excited to share that Cline 3.17.9 includes experimental Claude 4 support that addresses reliability issues, an upgraded task timeline with scrolling navigation, and expanded file upload capabilities for data analysis. Experimental Claude 4 Support If you've been using Claude 4 with Cline, you've probably noticed some frustrating edit fai

KV Cache from scratch in nanoVLM

All the Azure news you don’t want to miss from Microsoft Build 2025
We’ve pulled together the top 25 announcements at Microsoft Build 2025 across the Azure business—spanning Azure AI Foundry, Azure infrastructure, Azure app platform, Azure databases and Microsoft Fabric, and our GitHub family. Learn more.

NotebookLM is adding a new way to share your own notebooks publicly.
Many people who use NotebookLM already share their notebooks with classmates, coworkers, students and friends. Today, we're making sharing and curation easier — with pub…

Statement on Anthropic Model Availability
Anthropic's decision to cut off capacity does not change our commitment to providing the best product for our users.

Learning to clarify: Multi-turn conversations with Action-Based Contrastive Self-Training

Distillation Scaling Laws
We propose a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the…

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Introducing the v0 composite model family
Learn how v0's composite AI models combine RAG, frontier LLMs, and AutoFix to build accurate, up-to-date web app code with fewer errors and faster output.

How Google is driving a new era of American innovation in Iowa.
Google is investing an additional $7 billion in Iowa within the next two years in cloud and AI infrastructure, as well as in expanded workforce development programs, mea…

Fluid compute: Evolving serverless for AI workloads
Fluid, our newly announced compute model, eliminates wasted compute by maximizing resource efficiency. Instead of launching a new function for every request, it intelligently reuses available capacity, ensuring that compute isn’t sitting idle.

An Inflection Point for U.S. Government
Windsurf is uniquely positioned to address the U.S. Government’s challenges

Highlights from the Dialogues stage at I/O 2025
The Dialogues stage at Google I/O 2025 brought together Google leaders and visionaries.

CodeAgents + Structure: A Better Way to Execute Actions

PyTorch Hangzhou Meetup Recap: Exploring the AI Open Source Ecosystem and Cutting-Edge Technology Practices

🐯 Liger GRPO meets TRL

Fine-tuning LLMs with user-level differential privacy

Build an AI Agent Powered by MongoDB Atlas for Memory and Vector Search (+ Free Workflow Template)
Build context-aware AI agents in n8n using MongoDB Vector Store & Chat Memory—no code needed. Power assistants with search + memory in one flow.

Dell Enterprise Hub is all you need to build AI on premises

Tiny Agents in Python: a MCP-powered agent in ~70 lines of code

Google Research at Google I/O 2025

Listen to a podcast recap of everything we announced at I/O.
At this week’s I/O, we announced our very latest products, tools and research designed to make AI even more helpful with Gemini. The latest episode of the Google AI: Rel…

Why do I need LangGraph Platform for agent deployment?
This blog dives into technical details for why agent deployment is difficult, and how we built a platform to solve those challenges (LangGraph Platform).

Watch our AI talks at I/O 2025
At Google I/O 2025, we shared what we've been working on, the future of AI on the web, and demonstrated how our partners are making use of client-side AI.

The DeepWiki MCP Server
The DeepWiki MCP server is here. Get codebase context and answers about any public GitHub repository in seconds.

From sea to sky: Microsoft’s Aurora AI foundation model goes beyond weather forecasting
Aurora, an AI foundation model, revolutionizes weather and environmental forecasting with accuracy, speed, and efficiency.

nanoVLM: The simplest repository to train your VLM in pure PyTorch

11 Game-Changing Open-Source AI Tools Every Dev Should Know About
11 must-try AI open-source tools that redefined how we build and think about code—prepare to be inspired!

Announcing new innovations for SAP on Microsoft Cloud
This week at SAP Sapphire 2025 we are excited to share several new announcements and key strategic updates from Microsoft and SAP for our joint customers. Learn more.

Introducing the AI Gateway
With the AI Gateway, build with any model instantly. No API keys, no configuration, no vendor lock-in.

How Webtoon Entertainment built agentic workflows with LangGraph to scale story understanding
See how Webtoon is transforming storytelling with agent workflows built on LangGraph for content discovery, helping marketing, translation, and recommendation teams.

Better forecasts ahead as Met Office transitions to a supercomputer in Azure cloud
Britain’s national weather and climate service, and one of its most trusted institutions, is stepping into the future with high-performance computing in Azure, as well as a host of AI applications. In partnership with Microsoft, the Met Office can now do its forecasting and research with greater safety and resiliency. It’s a scalable platform on which they can build the future of their scientific research and customer delivery.

From skepticism to success: How AI is helping teachers transform classrooms in Peru
See how Microsoft Copilot and the World Bank are helping elevate the educational experience of schools in Lima, Peru with AI.

Introducing NLWeb: Bringing conversational interfaces directly to the web

Microsoft and Hugging Face expand collaboration

Introducing Codex
Introducing Codex: a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. With Codex, developers can simultaneously deploy multiple agents to independently handle coding tasks such as writing features, answering questions about your codebase, fixing bugs, and proposing pull requests for review.

Addendum to o3 and o4-mini system card: Codex
Codex is a cloud-based coding agent. Codex is powered by codex-1, a version of OpenAI o3 optimized for software engineering. codex-1 was trained using reinforcement learning on real-world coding tasks in a variety of environments to generate code that closely mirrors human style and PR preferences, adheres precisely to instructions, and iteratively runs tests until passing results are achieved.

Cline: Beginner, Intermediate, Advanced – Finding Your Ideal Setup
When you first dive into a powerful tool like Cline, it's natural to wonder, "What's the best way to set this up?" The beauty of Cline is that it's designed to adapt and grow with you. There isn't a one-size-fits-all answer, but rather a spectrum of configurations that can match your immediate needs and the complexity of the tasks you're tackling. This post will walk you through three common setup approaches – Beginner, Intermediate, and Advanced. My goal is to help you find what works best for

AI Engineering Isn't Magic, It's Method: Key Strategies for Building Better Software Faster
If you've been in tech for a while, you've seen the cycles. Skepticism greets nearly every major shift. We've certainly observed some resistance from engineers, a common echo of the "pessimistic streak" that often surfaces in tech hubs when new technologies emerge. It's reminiscent of the doubts that once surrounded Linux, the cloud, and even JavaScript on the server. History tends to rhyme, and the current wave of AI in software development is no different. But here’s the thing: while some deb

How Fern delivers 6M+ monthly views and 80% faster docs with Vercel
Fern used Vercel and Next.js to achieve efficient multi-tenancy, faster development cycles, and 50-80% faster load times

45% faster build initialization
Customers on all plans can now benefit from faster build cache restoration times. We've made architectural improvements to builds to help customers build faster.

How to Run a Local LLM: Complete Guide to Setup & Best Models (2025)
Learn how to run LLMs locally, explore top tools like Ollama & GPT4All, and integrate them with n8n for private, cost-effective AI workflows.
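
As a small example of the local-LLM workflow the guide describes: once a model has been pulled (e.g. with "ollama pull llama3"), Ollama exposes a local HTTP API that can be called from Python. The model name and prompt below are assumptions for the sketch.

    # Query a locally running Ollama server (default port 11434).
    # Assumes a model has already been pulled, e.g. "ollama pull llama3".
    import json
    import urllib.request

    payload = {"model": "llama3",
               "prompt": "Explain what a KV cache does in two sentences.",
               "stream": False}
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["response"])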

SWE-1: Our First Frontier Models
Introducing our first Frontier Models!

Recap of Interrupt 2025: The AI Agent Conference by LangChain
Hear more about the product launches, keynote themes, and exciting news from our first-ever conference.

Devin 2.1
We’re announcing some major changes in Devin 2.1. Devin now reports its confidence that it can complete tasks using 🟢 🟡 🔴 and is more likely to succeed in large codebases with better codebase context 🧠

The Transformers Library: standardizing model definitions

LangGraph Platform is now Generally Available: Deploy & manage long-running, stateful Agents
LangGraph Platform, our infrastructure for deploying and managing agents at scale, is now generally available. Learn how to deploy

Deeper insights into retrieval augmented generation: The role of sufficient context

How Consensys rebuilt MetaMask.io with Vercel and Next.js
Learn how Consensys modernized MetaMask.io using Vercel and Next.js—cutting deployment times, improving collaboration across teams, and unlocking dynamic content with serverless architecture.

AI powers Expedia’s marketing evolution
A conversation with Jochen Koedijk, Chief Marketing Officer of Expedia Group.

PyTorch/XLA 2.7 Release: Usability, vLLM boosts, JAX bridge, GPU Build

Improving Hugging Face Model Access for Kaggle Users

Beyond the tools, adding MCP in VS Code
Bring your own tools to VS Code

Updated v0 pricing
More flexible pricing for v0 that scales with your usage and lets you pay on-demand through credits.

Resources tab allows instant searching and filtering of functions, middleware, and static assets
The Resources tab is replacing the Functions tab for deployments in the Vercel Dashboard. It displays middleware, static assets, and functions, and lets you search and filter them.

Proxied responses now cacheable via CDN-Cache-Control headers
Vercel's CDN now supports CDN-Cache-Control headers for external backends, giving you simple, powerful caching control without any configuration changes.
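
For illustration, an external backend only needs to emit the standard CDN-Cache-Control header for the fronting CDN to cache its proxied responses. The tiny Python origin below and its cache lifetime are assumptions for the sketch, not Vercel's own example.

    # Sketch of an external origin setting CDN-Cache-Control so a fronting CDN
    # caches the proxied response; the route and max-age value are illustrative.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b'{"status": "ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("CDN-Cache-Control", "max-age=300")  # cache at the CDN for 5 minutes
            self.send_header("Cache-Control", "no-store")         # keep browsers from caching
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()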

New one-click AI bot managed ruleset
Protect your content from unauthorized AI crawlers with Vercel's new AI bot managed ruleset, offering one-click protection against known AI bots while automatically updating to catch new crawlers without any maintenance.

Differential privacy on trust graphs

Blazingly fast whisper transcriptions with Inference Endpoints

MetaShuffling: Accelerating Llama 4 MoE Inference

Self-Hosted Deployment Maintenance Mode
We have decided to place our self-hosted offering in maintenance mode, and offer a new single-tenant hosted offering that can expose our agentic capabilities.

The spring 2025 cohort of Vercel’s Open Source Program
Announcing the spring 2025 cohort of Vercel's Open Source Program. Open source community frameworks, libraries, and tools we rely on every day to build the web,

Bringing 3D shoppable products online with generative AI

Introducing HealthBench
HealthBench is a new evaluation benchmark for AI in healthcare which evaluates models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.

Vision Language Models (Better, Faster, Stronger)

LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?

Cline v3.15: Unveiling Task Timeline, Gemini Implicit Caching & New Community Docs
Welcome to Cline v3.15! This update is all about refining the experience of Cline as your daily driver. Let's get right into it. Task Timeline = Story board of your conversation Ever wished you could instantly see the "story" of how Cline tackled a complex task? With v3.15, we're introducing the Task Timeline, a sleek, visual "storyboard" integrated directly into your task header. This isn't just a log; it's an intuitive series of blocks charting every key step Cline takes – from tool calls t

New quick actions in Observability
Copy, filter, or exclude any value in Observability queries with new one-click actions, making it faster to analyze incoming traffic.

April 2025 (version 1.100)
Learn what is new in the Visual Studio Code April 2025 Release (1.100)

CDN origin timeout increased to two minutes
Vercel’s CDN proxy read timeout has been increased to 120 seconds across all plans, enabling long-running AI workloads and reducing 504 gateway timeout errors. Available immediately at no cost, including on Hobby (free) plans.

Up to 80% pricing reduction for Web Analytics
Hobby and Pro teams on Vercel now have higher usage limits on Web Analytics, including reduced costs and smaller billable increments.

New usage dashboard for Enterprise users
We’ve launched a new usage dashboard for Enterprise teams to analyze Vercel usage and costs with detailed breakdowns and export options.

Wave 8: UX Features + Plugins Update
Introducing Wave 8, our biggest batch of updates yet! This is part 3 of 3.

A new light on neural connections

OpenAI Expands Leadership with Fidji Simo
Read the message Sam shared with the company earlier today.

LangSmith Incident on May 1, 2025
Requests to the US LangSmith API from both the web application and SDKs experienced an elevated error rate for 28 minutes on May 1, 2025 (starting at 14:35 UTC and ending at 15:03 UTC). During the incident window, approximately 55% of all API requests failed with a connection

OpenAI’s response to the Department of Energy on AI infrastructure
Why infrastructure is destiny and how the US can seize it.

Introducing data residency in Asia
Data residency builds on OpenAI’s enterprise-grade data privacy, security, and compliance programs supporting customers worldwide.

Introducing the Flags Explorer, first-party integrations, and updates to the Flags SDK
Introducing first-party integrations, the Flags Explorer, and improvements to the Flags SDK to improve feature flag workflow on Vercel.

Faster builds now available with compute upgrades on paid plans
Enhanced Builds can now be enabled on demand per project for Pro and Enterprise teams. These builds offer double the compute. Customers already using Enhanced Builds are seeing, with no action required, up to 25% reductions in build times.

Flags Explorer is now generally available
View and override feature flags in your browser with Flags Explorer – now generally available for all customers

Bot activity and crawler insights now in Observability
Find out which AI crawlers or search engines are scraping your content, then act on it with the Vercel Firewall if needed.

MCP server support on Vercel
Run MCP servers with Next.js or Node.js in your Vercel project, with first-class support for Anthropic's MCP SDK.

Wave 8: Cascade Customization Features
Introducing Wave 8, our biggest batch of updates yet! This is part 2 of 3.

The San Antonio Spurs use ChatGPT to scale impact on and off the court
Discover how the San Antonio Spurs are using custom GPTs to enhance fan engagement, streamline operations, and drive innovation across teams.

Lowe’s puts project expertise into every hand
Lowe’s partnered with OpenAI to build Mylow and Mylow Companion, AI-powered tools that bring expert help to both customers and store associates—making complex home improvement projects easier to plan, navigate, and complete.

Introducing OpenAI for Countries
A new initiative to support countries around the world that want to build on democratic AI rails.

Plan Smarter, Code Faster: Cline's Plan & Act is the Paradigm for Agentic Coding
The allure of AI in development often conjures images of instant solutions: a quick prompt, a flash of code, and on to the next task. But as many developers have discovered, this path can quickly become a frustrating loop of near-misses, endless re-prompting, and AI-generated code that feels strangely disconnected from the project's core. This isn't merely inefficient; it represents a fundamental misunderstanding of how to truly harness the power of coding agents. At Cline, we see a more robust

Join the Vercel AI Accelerator
A six-week program to help you scale your AI company, offering over $4M in credits from Vercel, v0, AWS, and leading AI platforms.

Windsurf Wave 8: Teams & Enterprise Features
Introducing Wave 8, our biggest batch of updates yet! This is part 1 of 3.

Introducing AI stories: daily benefits shine a light on bigger opportunities
Sam Altman has written that we are entering the Intelligence Age, a time when AI will help people become dramatically more capable. The biggest problems of today—across science, medicine, education, national defense—will no longer seem intractable, but will in fact be solvable. New horizons of possibility and prosperity will open up.

Making complex text understandable: Minimally-lossy text simplification with Gemini

Why Patterns Matter: Lessons From Hyper-Growth Startups
I've heard it said that "patterns are what companies talk about when they're dying". This may be true, but it is also what companies talk about once they've hit PMF. Don't fool yourself, do you really need scale? If you haven't experienced a hyper growth startup before, it can be easy to dismiss things like composability, scalability, security, testability, redundancy, CI/CD, and the like as a needless waste of time. It's true that most startups think about these things way too early. Until y

Kevin-32B: Multi-Turn RL for Writing CUDA Kernels
We explore reinforcement learning in a multi-turn setting, using intermediate feedback from the environment, and masking model thoughts to avoid exploding context over multiple turns.
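
A toy illustration of the thought-masking idea (the <think> tag convention and message layout are assumptions, not the authors' code): strip earlier turns' reasoning before building the next prompt, so only the generated kernels and environment feedback stay in context.

    # Drop earlier reasoning traces from the rolling conversation so multi-turn
    # context stays bounded; tag convention and message format are assumptions.
    import re

    THINK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

    def mask_thoughts(messages):
        # Keep prior assistant answers (e.g. generated kernels) but remove their
        # chain-of-thought before the history is fed back to the model.
        return [
            {"role": m["role"], "content": THINK.sub("", m["content"]).strip()}
            if m["role"] == "assistant" else m
            for m in messages
        ]

    history = [
        {"role": "user", "content": "Write a CUDA kernel for elementwise add."},
        {"role": "assistant",
         "content": "<think>long chain of thought...</think>\n__global__ void add(...) { ... }"},
        {"role": "user", "content": "It compiles but is slow; try vectorized loads."},
    ]
    print(mask_thoughts(history))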

AI helps John Deere transform agriculture
John Deere’s Justin Rose talks about transforming agriculture with AI and shares how the company is scaling innovation to help farmers work smarter, more efficiently, and sustainably.

Track a request's full lifecycle with session tracing
Vercel session tracing is now available on all plans, providing insight into steps Vercel's infrastructure took to serve a request alongside user code spans.

Evolving OpenAI’s structure
An update from the OpenAI board on transitioning its for-profit entity to a Public Benefit Corporation, reinforcing its mission-driven structure under nonprofit oversight while enabling greater impact and long-term alignment with the public good.

Lowe’s leverages AI to power home improvement retail
A conversation with Chandhu Nair, Senior Vice President of Data, AI, and Innovation.

DeepWiki: AI docs for any repo
Turn any public GitHub repo into up-to-date documentation you can talk to, for free.

How Outshift by Cisco achieved a 10x productivity boost with their Agentic AI Platform Engineer
See how Cisco Outshift built an AI Platform Engineer to boost productivity 10x, taking tasks like setting up CI/CD pipelines from a week to under an hour.

Cline v3.14: Improved Gemini Caching, /newrule Command, Enhanced Checkpoints & Key Updates
We're excited to announce Cline v3.14, bringing significant improvements to Gemini caching and cost transparency, more flexible checkpoints, a new command for rule creation, LaTeX support, and a host of quality-of-life updates driven by community feedback. Let's dive into the highlights: Improved Gemini Caching & Transparency We know accurate cost tracking and efficient model usage are crucial. While Gemini's context caching offers potential savings, ensuring its effectiveness and providing
