Last updated: 2026/01/08 05:00
[...] the reality is that 75% of the people on our engineering team lost their jobs here yesterday because of the brutal impact AI has had on our business. And …
The best AI coding tools for developers in 2026. From IDEs to code review, find tools that work in real codebases without breaking your workflow.

This is the AI Shift TECH BLOG. We share information about AI technology and how to put it to use.
AGI is here! When exactly it arrived, we’ll never know; whether it was one company’s Pro or another company’s Pro Max (Eddie Bauer Edition) that tip-toed first across the line …
This guide to the current sandboxing landscape by Luis Cardoso is comprehensive, dense and absolutely fantastic. He starts by differentiating between containers (which share the host kernel), microVMs (their own …

Stop shipping chat UIs. Learn how AG-UI uses an event-driven protocol to build real AI apps with streaming, tools, and shared state.
I joined the Oxide and Friends podcast last year to predict the next 1, 3 and 6 years(!) of AI developments. With hindsight I did very badly, but they're inviting …
It genuinely feels to me like GPT-5.2 and Opus 4.5 in November represent an inflection point - one of those moments where the models get incrementally better in a way …
Something I like about our weird new LLM-assisted world is the number of people I know who are coding again, having mostly stopped as they moved into management roles or …

The Top AI Papers of the Week (December 29 - January 4)
I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I …

LLMs in 2025, YOLO in the Sandbox, Plan Caching for Agents, DeepTutor
Depending on how you measure it, the tempo of Harder, Better, Faster, Stronger appears to be 123.45 beats per minute. This is one of those things that's so cool I'm …
My experience is that real AI adoption on real problems is a complex blend of: domain context on the problem, domain experience with AI tooling, and old-fashioned IT issues. I’m …
I sent the December edition of my sponsors-only monthly newsletter. If you are a sponsor (or if you start a sponsorship now) you can access a copy here. In the …
[Claude Code] has the potential to transform all of tech. I also think we’re going to see a real split in the tech industry (and everywhere code is written) between …
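We are conducting a technical review of the book currently being written, 『作って学ぶAIエージェント』 (Learn AI Agents by Building Them), at the manuscript text stage. Purpose: the main goals of this review are * detecting technically critical errors * pointing out missing premises in explanations and misleading wording * checking for inconsistencies with current tools and APIs. We also welcome requests and suggestions for improving the book's structure and content. (This is optional; contribute as much as you are able.) What we would like you to do: * read the book and, where needed, actually run the sample code * report anything that does not work, is hard to follow, or lacks necessary premises * share any requests or misgivings about the content or structure. The depth and scope of the review are left to each participant. Note: any API usage fees incurred during the technical review are, unfortunately, borne by each participant. Simply reading the manuscript (Markdown/code) is sufficient for the review. Process: * A time-limited private GitHub repository has been prepared for the review * The review covers the manuscript text (Markdown/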

This is the third in my annual series reviewing everything that happened in the LLM space over the past 12 months. For previous years see Stuff we figured out about …

It looks like OpenAI's Codex cloud (the cloud version of their Codex coding agent) was quietly rebranded to Codex web at some point in the last few days. Here's a …
[...] The puzzle is still there. What’s gone is the labor. I never enjoyed hitting keys, writing minimal repro cases with little insight, digging through debug logs, or trying to …
In essence a language model changes you from a programmer who writes lines of code, to a programmer that manages the context the model has access to, prunes irrelevant things, …
The hard part of computer programming isn't expressing what we want the machine to do in code. The hard part is turning human thinking -- with all its wooliness and …

AI-first debugging augments traditional debugging. Learn where AI helps, where it fails, and how to use it safely in production.
Jevons paradox is coming to knowledge work. By making it far cheaper to take on any type of task that we can possibly imagine, we’re ultimately going to be doing …
Today in extremely niche projects, I got fed up of Claude Code creating GitHub Actions workflows for me that used stale actions: actions/setup-python@v4 when the latest is actions/setup-python@v6 for example. …
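The stale-action problem described above can be sketched as a small checker. This is a minimal illustration, not the author's project: the `LATEST_MAJOR` table, the regex, and the function name are assumptions, and only the `actions/setup-python` v4 versus v6 pair comes from the text.

```python
import re

# Hypothetical lookup table of latest major versions. Only the
# actions/setup-python entry (v6) is taken from the text above.
LATEST_MAJOR = {"actions/setup-python": 6}

# Matches lines such as "uses: actions/setup-python@v4".
USES_RE = re.compile(r"uses:\s*([\w.-]+/[\w.-]+)@v(\d+)")


def find_stale_actions(workflow_text, latest=LATEST_MAJOR):
    """Return (action, pinned_major, latest_major) for each outdated reference."""
    stale = []
    for action, major in USES_RE.findall(workflow_text):
        newest = latest.get(action)
        if newest is not None and int(major) < newest:
            stale.append((action, int(major), newest))
    return stale


workflow = """
steps:
  - uses: actions/setup-python@v4
    with:
      python-version: "3.13"
"""

print(find_stale_actions(workflow))  # [('actions/setup-python', 4, 6)]
```

A real tool would have to fetch the latest versions rather than hard-code them, since a static table goes stale in exactly the way the excerpt complains about.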

The Top AI Papers of the Week (December 22-28)
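Having integrated AI into my development flow in stages since 2023, and after a 2025 of trial and error, major changes in tooling, and the shift to agents, the way I do software development has clearly changed. By "change" I do not simply mean that work got faster or more convenient. More concretely, I mean that "the share of indirect work and of abstract thinking and logic has grown relative to the time spent typing code." More serious is the increase in the amount of text I have to enter. As a result, I have taken up primitive activities such as speaking into a microphone and practicing typing. This change is not unique to me. In "Beyond Vibe Coding", Addy Osmani writes that "the developer's role is shifting from writing code to directing it" and urges a focus on systems thinking such as architecture and design patterns. Swyx of Latent Space likewise points out that "a software engineer's strength is being best at raising the level of abstraction." In response to this trend, "coding has become boring …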

I just sent out the latest edition of the newsletter version of this blog. It's a long one! Turns out I wrote a lot of stuff in the past 10 …
In advocating for LLMs as useful and important technology despite how they're trained I'm beginning to feel a little bit like John Cena in Pluribus. Pluribus spoiler (episode 6) Given …

MiniMax-M2.1, LLM Coding Workflows, GLM-4.7, LaMer Meta-RL, Google's 2025 AI Breakthroughs
A year ago, Claude struggled to generate bash commands without escaping issues. It worked for seconds or minutes at a time. We saw early signs that it may become broadly …

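The techniques for making use of AI are not particularly new; they are often built on past knowledge. That is exactly why, in the AI era, there is value in books that let you learn the fundamentals systematically. This article introduces several books I read in 2025 that left a particularly strong impression.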

Rob Pike (that Rob Pike) is furious. Here’s a Bluesky link for if you have an account there and a link to it in my thread viewer if you don’t. …

Socket CTO Ahmad Nassri shares practical AI coding techniques, tools, and team workflows, plus what still feels noisy and why shipping remains human-l...

Build privacy-first agentic AI using small language models. Learn how local-first architectures replace cloud LLMs for reasoning, RAG, and orchestration.

I’ve been having an absurd amount of fun recently using LLMs for cooking. I started out using them for basic recipes, but as I’ve grown more confident in their culinary …

I just had my first success using a browser agent - in this case the Claude in Chrome extension - to solve an actual problem. A while ago I set …

The Top AI Papers of the Week (December 15-21)
Gemini 3 Flash, GPT Image 1.5, Mistral OCR 3, GPT-5.2-Codex, NVIDIA Nemotron 3, Budget-aware Agent Scaling
In 2025, Reinforcement Learning from Verifiable Rewards (RLVR) emerged as the de facto new major stage to add to this mix. By training LLMs against automatically verifiable rewards across a …
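To make "automatically verifiable rewards" concrete, here is a toy sketch of a reward function that a program can check without human judgment. The function name and the "last integer in the output" convention are assumptions made for illustration; this is not any lab's actual training setup.

```python
import re


def verifiable_reward(model_output: str, expected: int) -> float:
    """Return 1.0 if the last integer in the output equals the expected answer, else 0.0."""
    numbers = re.findall(r"-?\d+", model_output)
    if not numbers:
        return 0.0
    return 1.0 if int(numbers[-1]) == expected else 0.0


print(verifiable_reward("12 * 12 = 144", 144))    # 1.0
print(verifiable_reward("I think it's 143", 144))  # 0.0
```

The point of the sketch is only that the reward is computed mechanically from the output, which is what lets it be applied at scale.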
Sam Rose is one of my favorite authors of explorable interactive explanations - here's his previous collection. Sam joined ngrok in September as a developer educator. Here's his first big …

2025 Year in Review of LLM paradigm changes

The latest in OpenAI's Codex family of models (not the same thing as their Codex CLI or Codex Cloud coding agent tools). GPT‑5.2-Codex is a version of GPT‑5.2 further optimized …
Anthropic have turned their skills mechanism into an "open standard", which I guess means it lives in an independent agentskills/agentskills GitHub repository now? I wouldn't be surprised to see this …

A hands-on comparison of five AI coding CLIs, tested by building the same React Todo app.
Mehmet Ince describes a very elegant chain of attacks against the PostHog analytics platform, combining several different vulnerabilities (now all reported and fixed) to achieve RCE - Remote Code Execution …
Anil Madhavapeddy is running an Advent of Agentic Humps this year, building a new useful OCaml library every day for most of December. Inspired by Emil Stenström's JustHTML and my …

It continues to be a busy December, if not quite as busy as last year. Today’s big news is Gemini 3 Flash, the latest in Google’s “Flash” line of faster …
AI tools are everywhere, but trust is falling. Learn how engineers become orchestrators in 2026, choosing which agents to scaffold, ship, and maintain.

We're excited to introduce Mini-SGLang, a lightweight yet high-performance inference framework for Large Language Models (LLMs). Derived ...

OpenAI shipped an update to their ChatGPT Images feature - the feature that gained them 100 million new users in a week when they first launched it back in March, …
New release of my s3-credentials CLI tool for managing credentials needed to access just one S3 bucket. Here are the release notes in full: New commands get-bucket-policy and set-bucket-policy. #91 …

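I built site2skill, a tool that turns any web documentation into Claude Agent Skills. Using the PAY.JP documentation as an example, I explain the flow of Claude Code developing while referring to the docs. (GitHub: laiso/site2skill) When using a library the LLM does not know: LLMs have a knowledge cutoff (the end date of their training data). When you try to use a new library or an obscure API, the LLM lacks accurate information. For example, Claude Opus 4.5's knowledge dates to August 2025, so it cannot generate accurate code for libraries released after that or for documentation not included in its training data. In such cases you end up repeatedly summarizing the web documentation, pasting it into the chat, and having the model implement based on that. This approach works, but every time, the docu…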
Oh, so we're seeing other people now? Fantastic. Let's see what the "competition" has to offer. I'm looking at these notes on manifest.json and content.js. The suggestion to remove scripting …
I’ve been watching junior developers use AI coding assistants well. Not vibe coding—not accepting whatever the AI spits out. Augmented coding: using AI to accelerate learning while maintaining quality. [...] …

Slop lost to "brain rot" for Oxford Word of the Year 2024 but it's finally made it this year thanks to Merriam-Webster! Merriam-Webster’s human editors have chosen slop as the …

We are excited to announce that SGLang supports the latest highly efficient NVIDIA Nemotron 3 Nano model on Day 0! Nemotron 3 Nano, part of the new...

I recently came across JustHTML, a new Python library for parsing HTML released by Emil Stenström. It’s a very interesting piece of software, both as a useful library and as …

The Top AI Papers of the Week (December 8-14)
Brian Merchant has been collecting personal stories for his series AI Killed My Job - previously covering tech workers, translators, and artists - and this latest piece includes anecdotes from …

GPT-5.2, Devstral 2, Measuring Agents in Production, Gemini Deep Research API, Deep RL Course
If the part of programming you enjoy most is the physical act of writing code, then agents will feel beside the point. You’re already where you want to be, even …
How to use a skill (progressive disclosure): After deciding to use a skill, open its SKILL.md. Read only enough to follow the workflow. If SKILL.md points to extra folders such …

One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just …
I released a new version of my LLM Python library and CLI tool for interacting with Large Language Models. Highlights from the release notes: New OpenAI models: gpt-5.1, gpt-5.1-chat-latest, gpt-5.2 …
This article is the day 13 entry of the MoonBit Advent Calendar 2025. MoonBit Maria (hereafter Maria) is MoonBit's official AI agent project. It is published in a GitHub repository and implemented on top of MoonBit's async. GitHub - moonbitlang/maria: moon agent rewritten in async. Contribute to moonbitlang/

OpenAI reportedly declared a “code red” on the 1st of December in response to increasingly credible competition from the likes of Google’s Gemini 3. It’s less than two weeks later …

Socket CEO Feross Aboukhadijeh joins Software Engineering Daily to discuss modern software supply chain attacks and rising AI-driven security risks.

This thought-provoking essay from Johann Rehberger directly addresses something that I’ve been worrying about for quite a while: in the absence of any headline-grabbing examples of prompt injection vulnerabilities causing …
Discover the best MCP servers for devs in 2026 to code, ship, and automate safely with powerful, standardized tooling. Turn AI agents into real assistants.

Discover what's new in The Replay, LogRocket's newsletter for dev and engineering leaders, in the December 10th issue.

I've never been particularly invested in dark vs. light mode but I get enough people complaining that this site is "blinding" that I decided to see if Claude Code for web …

A vibe coding thought exercise on what it might look like for LLMs to scour human historical data at scale and in retrospect.

TOON is a compact alternative to JSON that cuts token usage by 30–60% in LLM prompts. This guide shows how it works, where the savings come from, and when to use it.
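One way to see where savings of this kind can come from is to compare JSON with a header-plus-rows encoding that states field names once instead of once per object. The sketch below is an illustration of that general idea only: the layout is an assumption, not a verified implementation of the TOON specification, it does no quoting or escaping, and it compares character counts as a crude stand-in for tokens.

```python
import json


def tabular_encode(name, rows):
    """Encode a list of uniform dicts as one header line plus one line per row."""
    fields = list(rows[0].keys())
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        lines.append("  " + ",".join(str(row[field]) for field in fields))
    return "\n".join(lines)


rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
compact = tabular_encode("users", rows)
as_json = json.dumps({"users": rows})

print(compact)
print(len(compact), "characters versus", len(as_json), "for JSON")
```

The gap widens as the number of rows grows, because JSON repeats every key for every object while the tabular form does not.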

Andrew Evans, principal engineer and tech lead at CarMax discusses five ways to fix AI-generated code and help you debug, test, and ship safely.

Two new models from Mistral today: Devstral 2 and Devstral Small 2 - both focused on powering coding agents such as Mistral's newly released Mistral Vibe which I wrote about …

I talked to Brendan Samek about Canada Spends, a project from Build Canada that makes Canadian government financial data accessible and explorable using a combination of Datasette, a neat custom …
Announced today as a new foundation under the parent umbrella of the Linux Foundation (see also the OpenJS Foundation, Cloud Native Computing Foundation, OpenSSF and many more). The AAIF was …

Here's the Apache 2.0 licensed source code for Mistral's new "Vibe" CLI coding agent, released today alongside Devstral 2. It's a neat implementation of the now standard terminal coding agent …
I found the problem and it's really bad. Looking at your log, here's the catastrophic command that was run: rm -rf tests/ patches/ plan/ ~/ See that ~/ at the …
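The failure in that log, a stray `~/` among the `rm -rf` targets, is the kind of thing a pre-execution check could flag. Below is a minimal sketch under stated assumptions: the function name is hypothetical, paths are treated as POSIX, and only the plain `rm` form is handled (no `sudo`, pipes, variables, or globbing).

```python
import posixpath
import shlex


def dangerous_rm_targets(command: str, home: str) -> list[str]:
    """Return rm arguments that resolve to the home directory or one of its ancestors."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "rm":
        return []
    home = posixpath.normpath(home)
    flagged = []
    for arg in tokens[1:]:
        if arg.startswith("-"):
            continue  # skip flags such as -rf
        # Expand a leading "~" the way a shell would for the current user.
        expanded = home + arg[1:] if arg == "~" or arg.startswith("~/") else arg
        resolved = posixpath.normpath(expanded)
        if not posixpath.isabs(resolved):
            continue  # relative paths stay inside the working directory
        if resolved == home or home.startswith(resolved.rstrip("/") + "/"):
            flagged.append(arg)
    return flagged


print(dangerous_rm_targets("rm -rf tests/ patches/ plan/ ~/", "/home/alice"))  # ['~/']
```

A check like this is a backstop rather than a guarantee; sandboxing the agent remains the more robust defense.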
Martin Kleppmann makes the case for formal verification languages (things like Dafny, Nagini, and Verus) to finally start achieving more mainstream usage. Code generated by LLMs can benefit enormously from …

Now I want to talk about how they're selling AI. The growth narrative of AI is that AI will disrupt labor markets. I use "disrupt" here in its most disreputable, …
Thoughtful guidance from Bryan Cantrill, who evaluates applications of LLMs against Oxide's core values of responsibility, rigor, empathy, teamwork, and urgency.
What to try first? Run Claude Code in a repo (whether you know it well or not) and ask a question about how something works. You'll see how it looks …

The Top AI Papers of the Week (Dec 1 - 7)
Anthropic Sandbox Runtime (srt) is a proof of concept (PoC) of a lightweight sandbox developed by Anthropic for cloud environments such as Claude Code on the web. Making Claude Code more secure and autonomous with sandboxing: Learn how Claude Code's new sandboxing feature protects developers with filesystem and network isolation, reducing permission prompts and increasing user safety. Quite a few Claude Code users …

TanStack AI is a lightweight AI framework for TypeScript developed by the TanStack team. It abstracts the interfaces of LLM providers and offers tool calling and chat features. This article gives an overview of TanStack AI and its basic usage.

MCP tool calls come with problems such as context pollution and inference overhead. This article explains how Claude's programmatic tool calling feature can address them.
Chris Lewis decompiles N64 games. He wrote about this previously in Using Coding Agents to Decompile Nintendo 64 Games, describing his efforts to decompile Snowboard Kids 2 (released in 1999) …

OpenRouter State of AI, Mistral 3, DeepSeek-V3.2, Google Workspace Studio, Puppeteer Multi-Agent RL, and more
If you work slowly, you will be more likely to stick with your slightly obsolete work. You know that professor who spent seven years preparing lecture notes twenty years ago? …

The latest Codex CLI release, v0.65.0, introduces support for Skills, although it is still experimental [1]. codex/docs/skills.md at main · openai/codex [1]: https://github.com/openai/codex/pull/7412 Because skills are loaded simply by placing directories in the same format as Claude Skills, setup takes almost no effort. To configure it, add the following to config.toml: [features] skills = true For skill packages, ~/.codex/

Launched today at WIRED’s The Big Interview event, this manifesto (of which I'm a founding signatory) pushes for a positive framework for thinking about building hyper-personalized AI-powered software. This part …

Check out Google's latest AI releases, Gemini and the Antigravity AI IDE. Understand what's new, how they work, and how they can reshape your development workflow.

TL;DR: We have added FSDP to Miles (https://github.com/radixark/miles) as a more flexible trainin...
Anthropic just acquired the company behind the Bun JavaScript runtime, which they adopted for Claude Code just in July. Their announcement includes an impressive revenue update on Claude Code: In …

Four new models from Mistral today: three in their "Ministral" smaller model series (14B, 8B, and 3B) and a new Mistral Large 3 MoE model with 675B parameters, 41B active. …

Richard Weiss managed to get Claude 4.5 Opus to spit out this 14,000 token document which Claude called the "Soul overview". Richard says: While extracting Claude 4.5 Opus' system message …

(Updated on Dec 2) We are thrilled to announce a major new feature in SGLang: native support for https://github.com/NVIDIA/TensorRT-Model-...

Two new open weight (MIT licensed) models from DeepSeek today: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, both 690GB, 685B parameters. Here's the PDF tech report. DeepSeek-V3.2 is DeepSeek's new flagship model, now running …
I just sent out the November edition of my sponsors-only monthly newsletter. If you are a sponsor (or if you start a sponsorship now) you can access a copy here. …

TL;DR: Speculative decoding boosts LLM inference, but traditional methods require a separate, inefficient draft model. Vertex AI utilizes...
I am increasingly worried about AI in the video game space in general. [...] I'm not sure that the CEOs and the people making the decisions at these sorts of …
It's ChatGPT's third birthday today. It's fun looking back at Sam Altman's low key announcement thread from November 30th 2022: today we launched ChatGPT. try talking with it here: chat.openai.com …

The Top AI Papers of the Week (November 24 - 30)

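With MCP, large numbers of tool definitions crowd the LLM's context. Claude's tool search tool lets you supply the LLM with only the relevant tools as needed, easing that pressure. This article walks through an example of actually using the tool search tool with Claude's TypeScript client.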
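The idea of exposing only the relevant tools can be sketched with a toy search over tool descriptions. Everything here is hypothetical: the tool catalogue, the word-overlap scoring, and the function name are invented for illustration, and this is not the Claude API or its tool search tool.

```python
# Hypothetical tool catalogue; a real one might hold hundreds of definitions.
TOOLS = {
    "get_weather": {"description": "get the current weather for a city"},
    "send_email": {"description": "send an email message to a recipient"},
    "query_database": {"description": "run a sql query against the sales database"},
}


def search_tools(query: str, tools: dict, limit: int = 2) -> list[str]:
    """Rank tools by how many query words appear in their description."""
    words = set(query.lower().split())
    scored = []
    for name, spec in tools.items():
        overlap = len(words & set(spec["description"].lower().split()))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


query = "what is the weather in the city of Paris"
print(search_tools(query, TOOLS, limit=1))  # ['get_weather']
```

Only the definitions of the returned tools would then be placed in the model's context. Plain word overlap is easily fooled by common words such as "the", so a practical version would need a better relevance measure.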

On the space of minds and the optimizations that give rise to them.

Claude Opus 4.5, OmniScientist, FLUX.2, General Agentic Memory
Matt Webb coins the term context plumbing to describe the kind of engineering needed to feed agents the right context at the right time: Context appears at disparate sources, by …
Large language models (LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia …
In June 2025 Sam Altman claimed about ChatGPT that "the average query uses about 0.34 watt-hours". In March 2020 George Kamiya of the International Energy Agency estimated that "streaming a …
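As a quick sense of scale for the 0.34 watt-hour figure quoted above, the arithmetic can be written out directly. Only that one number comes from the text; the helper names are made up for this sketch, and no comparison figure is assumed.

```python
WH_PER_QUERY = 0.34  # watt-hours per query, as quoted in the item above


def queries_per_kwh(wh_per_query: float = WH_PER_QUERY) -> float:
    """Number of queries that add up to one kilowatt-hour (1000 Wh)."""
    return 1000.0 / wh_per_query


def energy_wh(num_queries: int, wh_per_query: float = WH_PER_QUERY) -> float:
    """Total energy in watt-hours for a given number of queries."""
    return num_queries * wh_per_query


print(round(queries_per_kwh()))  # 2941
print(energy_wh(100))
```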

I've been having a lot of fun hacking on my Bluesky Thread Viewer JavaScript tool with Claude Code recently. Here it renders a thread (complete with demo video) talking about …
To evaluate the model’s capability in processing long-context inputs, we construct a video “Needle-in-a-Haystack” evaluation on Qwen3-VL-235B-A22B-Instruct. In this task, a semantically salient “needle” frame—containing critical visual evidence—is inserted …
New on Hugging Face, a specialist mathematical reasoning LLM from DeepSeek. This is their entry in the space previously dominated by proprietary models from OpenAI and Google DeepMind, both of …

A hands-on comparison of five AI code review tools – Qodo, Traycer, CodeRabbit, Sourcery, and CodeAnt AI, tested on the same codebase to see which one actually delivers.

Alexandra Spalato, fractional AI officer, shares a practical framework to help devs decide when and how to use AI and agents.
Anthropic has released "Programmatic Tool Calling" (PTC) as a new feature of the Claude (model) API in public beta. Introducing advanced tool use on the Claude Developer Platform: Claude can now discover, learn, and execute tools dynamically to enable agents that take action in the real world. Here's how. In a nutshell, this is a feature in which "Claude generates the tool-calling logic as Python code and runs it inside a sandbox provided by Anthropic." With conventional Tool Use, Claude decided on the next action every time it called a single tool and added all of the results to the context window. Chaining 10 tool calls meant 10 rounds of inference and …
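The contrast this item draws, every intermediate tool result landing in the context versus a generated script running the whole chain, can be sketched in plain Python. The tools, data, and function names below are hypothetical, and no sandbox or Anthropic API is involved; the list called `context` simply stands in for what the model would have to read.

```python
# Hypothetical tools standing in for real integrations.
def get_order_ids(customer: str) -> list[int]:
    return [101, 102, 103]


def get_order_total(order_id: int) -> float:
    return {101: 20.0, 102: 35.5, 103: 4.5}[order_id]


def traditional_tool_use(customer: str) -> list[str]:
    """Each tool result is appended to the context the model must re-read."""
    context = []
    order_ids = get_order_ids(customer)
    context.append(f"order ids: {order_ids}")
    total = 0.0
    for order_id in order_ids:
        amount = get_order_total(order_id)
        context.append(f"order {order_id}: {amount}")
        total += amount
    context.append(f"total: {total}")
    return context


def programmatic_tool_calling(customer: str) -> list[str]:
    """The chain runs as one script; only its final output reaches the context."""
    total = sum(get_order_total(oid) for oid in get_order_ids(customer))
    return [f"total: {total}"]


print(len(traditional_tool_use("alice")))   # 5 context entries
print(programmatic_tool_calling("alice"))   # ['total: 60.0']
```

Both paths compute the same total; the difference is how much intermediate material accumulates along the way.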
PromptArmor demonstrate a concerning prompt injection chain in Google's new Antigravity IDE: In this attack chain, we illustrate that a poisoned web source (an integration guide) can manipulate Gemini into …
Substantial LLVM contribution from Trail of Bits. Timing attacks against cryptography algorithms are a gnarly problem: if an attacker can precisely time a cryptographic algorithm they can often derive details …
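For readers unfamiliar with why timing matters, the sketch below contrasts an early-exit comparison, whose running time depends on where the first mismatch occurs, with Python's standard `hmac.compare_digest`, which is designed to avoid that dependence. It illustrates the general problem only; it is unrelated to the LLVM work described in this item, and the function names are chosen for this example.

```python
import hmac


def naive_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: returns as soon as a byte differs, leaking its position via timing."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison whose duration does not depend on where the inputs differ."""
    return hmac.compare_digest(a, b)


secret = b"s3cr3t-token"
print(naive_equal(secret, b"s3cr3t-tokeX"))          # False
print(constant_time_equal(secret, b"s3cr3t-token"))  # True
```

Both functions return the same answers; they differ only in what an attacker measuring response times could learn.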

New plugin release adding support for Claude Opus 4.5, including the new thinking_effort option: llm install -U llm-anthropic llm -m claude-opus-4.5 -o thinking_effort low 'muse on pelicans' This took longer …

Here's a delightful project by Tom Gally, inspired by my pelican SVG benchmark. He asked Claude to help create more prompts of the form Generate an SVG of [A] [doing] …

TL;DR: We have implemented fully FP8-based sampling and training in RL. Experiments show that for MoE models, the larger the model, the more ...
If the person is unnecessarily rude, mean, or insulting to Claude, Claude doesn't need to apologize and can insist on kindness and dignity from the person it’s talking with. Even …

Anthropic released Claude Opus 4.5 this morning, which they call “best model in the world for coding, agents, and computer use”. This is their attempt to retake the crown for …

The Top AI Papers of the Week (November 17 - 23)
Armin Ronacher presents a cornucopia of lessons learned from building agents over the past few months. There are several agent abstraction libraries available now (my own LLM library is edging …

We're proud to launch the LMSYS Fellowship Program! This year, the program will provide funding to full-time PhD students in the United States who ...

Olmo is the LLM series from Ai2—the Allen Institute for AI. Unlike most open weight models these are notable for including the full training data, training process and checkpoints along …

Gemini 3, Nano Banana Pro, Antigravity, Agent-R1 RL Framework, Meta's SAM 3, OLMo 3

Hot on the heels of Tuesday’s Gemini 3 Pro release, today it’s Nano Banana Pro, also known as Gemini 3 Pro Image. I’ve had a few days of preview access …
Previously, when malware developers wanted to go and monetize their exploits, they would do exactly one thing: encrypt every file on a person's computer and request a ransom to decrypt …

Hot on the heels of yesterday's Gemini 3 Pro release comes a new model from OpenAI called GPT-5.1-Codex-Max. (Remember when GPT-5 was meant to bring in a new era of …
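Alongside Gemini 3, Google announced Antigravity, a new AI coding editor. Google Antigravity - Build the new way. In terms of internals, it is Google's own application, built by licensing the Codeium engine inside Windsurf, which is a fork of Microsoft's VS Code, which in turn runs on Electron, a framework that embeds Google's Chromium and V8 engine (full circle!). However, while the surface implementation leans toward a heavily modified Windsurf, the product design swings closer to Kiro, foregrounding its own abstraction layers such as task orientation and artifact management centered on AI agents. What is interesting about Antigravity's design is that it is not merely a VS Code fork, but the "context engineering" philosophy of how to handle SDD (Spec-Driven Development)-style intermediate artifacts …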

"A journey of a thousand miles is made one small step at a time." We're excited to introduce Miles, an enterprise...
New release of my LLM plugin for Google's Gemini models: Support for nested schemas in Pydantic, thanks Bill Pugh. #107 Now tests against Python 3.14. Support for YouTube URLs as …

Inspired by this conversation on Hacker News I decided to upgrade MacWhisper to try out NVIDIA Parakeet and the new Automatic Speaker Recognition feature. It appears to work really well! …

Google's other major release today to accompany Gemini 3 Pro. At first glance Antigravity is yet another VS Code fork Cursor clone - it's a desktop application you install that …
Three years ago, we were impressed that a machine could write a poem about otters. Less than 1,000 days later, I am debating statistical methodology with an agent that built …

Google released Gemini 3 Pro today. Here’s the announcement from Sundar Pichai, Demis Hassabis, and Koray Kavukcuoglu, their developer blog announcement from Logan Kilpatrick, the Gemini 3 Pro Model Card, …

Nolan Lawson asks if LLM assistance means that the category of tiny open source libraries like his own blob-util is destined to fade away. Why take on additional supply chain …

The impact of verifiability on the jagged frontier of LLMs

A practical tutorial on building real-time AI interactions in Next.js. Stream text, show reasoning, handle edge cases, and create a ChatGPT-style UX with the Vercel AI SDK.
With AI now, we are able to write new programs that we could never hope to write by hand before. We do it by specifying objectives (e.g. classification accuracy, reward …

The Top AI Papers of the Week (November 10 - 16)
New release of my llm-anthropic plugin: Support for Claude's new structured outputs feature for Sonnet 4.5 and Opus 4.1. #54 Support for the web search tool using -o web_search 1 …

Omnilingual ASR, GPT-5.1, SIMA 2, Context Engineering Whitepaper, Mini-Agent, Marble World Model
Neat MLX project by Senstella bringing NVIDIA's Parakeet ASR (Automatic Speech Recognition, like Whisper) model to Apple's MLX framework. It's packaged as a Python CLI tool, so you can …
I was confused about whether the new "adaptive thinking" feature of GPT-5.1 meant they were moving away from the "router" mechanism where GPT-5 in ChatGPT automatically selected a model for …

OpenAI announced GPT-5.1 yesterday, calling it a smarter, more conversational ChatGPT. Today they've added it to their API. We actually got four new models today: gpt-5.1 gpt-5.1-chat-latest gpt-5.1-codex gpt-5.1-codex-mini There …

Max Woolf provides an exceptional deep dive into Google's Nano Banana aka Gemini 2.5 Flash Image model, still the best available image manipulation LLM tool three months after its initial …
Learn how to create parallax scrolling effects with vanilla JavaScript, no libraries needed. Covers layer setup, scroll handling, & AI-assisted development.
On Monday, this Court entered an order requiring OpenAI to hand over to the New York Times and its co-plaintiffs 20 million ChatGPT user conversations [...] OpenAI is unaware of …

Almost every time I share a new example of an SVG of a pelican riding a bicycle a variant of this question pops up: how do you know the labs …

A developer's retrospective on creating an AI video transcription agent with Mastra, an open-source TypeScript framework for building AI agents.
The fact that MCP is a different surface from your normal API allows you to ship MUCH faster to MCP. This has been unlocked by inference at runtime. Normal APIs …
Top 10 AI tools I actually use as a PM: from user calls to PRDs to prototypes. Real workflows, measurable time savings, and honest takes on what works.

Robert Glaser took my pelican riding a bicycle benchmark and applied an agentic loop to it, seeing if vision models could draw a better pelican if they got the chance …

I've been upgrading a ton of Datasette plugins recently for compatibility with the Datasette 1.0a20 release from last week - 35 so far. A lot of the work is very …

MCP is the ultimate bridge that redefines how AI connects to the open web. Here's how it lets agents act across APIs and automate workflows.
Netflix asks partners to consider the following guiding principles before leveraging GenAI in any creative workflow: The outputs do not replicate or substantially recreate identifiable characteristics of unowned or copyrighted …
Discover the best AI tools designers are using in 2026 to speed up workflows, generate designs, and connect directly with real design systems.

Secure AI agents beyond login screens with Auth0’s Auth for GenAI; from token management and human approval to fine-grained authorization.

Learn how to build a fully compliant AI chatbot with FTC-mandated safeguards – age verification, safety monitoring, consent systems, and audit logging.

beetle_b ran this prompt against a bunch of recent LLMs: Write a POV-Ray file that shows a pelican riding on a bicycle. This turns out to be a harder challenge …

The Top AI Papers of the Week (November 3 - 9)

OpenAI partially released a new model yesterday called GPT-5-Codex-Mini, which they describe as "a more compact and cost-efficient version of GPT-5-Codex". It’s currently only available via their Codex CLI tool …

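As MCP spreads, the problem of numerous tool definitions crowding the LLM's context has come to the fore. This article explains practical solutions, including providing minimal information through progressive disclosure, making tool calls more efficient through code execution with MCP, and reducing context with a single search tool, with examples from Claude Skills and Cloudflare Code Mode.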
The big advantage of MCP over OpenAPI is that it is very clear about auth. [...] Maybe an agent could read the docs and write code to auth. But we …

Context Engineering 2.0, Kimi K2 Thinking, Windsurf Codemaps, Google File Search, Tool-to-Agent Retrieval
I have AiDHD. It has never been easier to build an MVP and in turn, it has never been harder to keep focus. When new features always feel like they're …
My hunch is that existing LLMs make it easier to build a new programming language in a way that captures new developers. Most programming languages are similar enough to existing …

Build autonomous AI agents with Autogen and Crew AI. Learn how agentic AI enables multi-agent systems, tools, and workflows in action.
Inspired by a YouTube comment I wrote up how I run OpenAI's Codex CLI coding agent against the gpt-oss:120b model running in Ollama on my NVIDIA DGX Spark via a …
Thomas Ptacek on the Fly blog: Agents are the most surprising programming experience I’ve had in my career. Not because I’m awed by the magnitude of their powers — I …
My trepidation extends to complex literature searches. I use LLMs as secondary librarians when I’m doing research. They reliably find primary sources (articles, papers, etc.) that I miss in my …

We are excited to introduce SGLang Diffusion, which brings SGLang's state-of-the-art performance to accelerate image and video generation for diffusion mo...

Chinese AI lab Moonshot's Kimi K2 established itself as one of the largest open weight models - 1 trillion parameters - back in July. They've now released the Thinking version, …
At the start of the year, most people loosely following AI probably knew of 0 [Chinese] AI labs. Now, and towards wrapping up 2025, I’d say all of DeepSeek, Qwen, …
AI dev tool power rankings & comparison [Nov 2025]
Compare the top AI development tools and models of November 2025. View updated rankings, feature breakdowns, and find the best fit for you.
Discover the best VS Code extensions for 2026 to boost productivity, code quality, and collaboration with faster, smarter tools for every developer.