Last updated: 2025/07/16 21:01
Reflections on OpenAI
Calvin French-Owen spent just over a year working at OpenAI, during which time the organization grew from 1,000 to 3,000 people and Calvin found himself in "the top 30% by …
xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated"
They continue: One was that if you ask it "What is your surname?" it doesn't have one so it searches the internet leading to undesirable results, such as when its …

AI compliance: A core product competency you shouldn’t skip
AI governance is now a product feature. Learn how to embed trust, transparency, and compliance into your build cycles.

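Trying out Kiro, AWS's agentic IDE
Kiro is an AI coding agent built into an IDE, developed by AWS. What sets Kiro apart is that it goes beyond plain vibe coding: you can use specs to build applications through spec-driven development. This article walks through the application development flow with Kiro.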

Application development without programmers
This book by James Martin, published in 1982, includes the following in the preface: Applications development did not change much for 20 years, but now a new wave is crashing …
ccusage
Claude Code logs detailed usage information to the ~/.claude/ directory. ccusage is a neat little Node.js tool which reads that information and shows you a readable summary of your usage …

Cost Effective Deployment of DeepSeek R1 with Intel® Xeon® 6 CPU on SGLang
The impressive performance of DeepSeek R1 marked a rise of giant Mixture of Experts (MoE) models in Large Language Models (LLM). However, its massive mode...

🥇Top AI Papers of the Week
The Top AI Papers of the Week (July 7 - 13)

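Container Use: providing sandbox environments via an MCP server
AI coding agents are convenient, but because they can run arbitrary Bash commands they can affect the user's system. Container Use runs as an MCP server and provides sandbox environments to AI coding agents. This article shows how to use Container Use.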

🤖 AI Agents Weekly: Grok 4, Context Engineering Guide, Kimi K2, SmolLM3, MedGemma 27B, AI SDK 5
Grok 4, Context Engineering Guide, Kimi K2, SmolLM3, MedGemma 27B, AI SDK 5

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
METR - for Model Evaluation & Threat Research - are a non-profit research institute founded by Beth Barnes, a former alignment researcher at OpenAI (see Wikipedia). They've previously contributed to …

Grok 4 Heavy won't reveal its system prompt
Grok 4 Heavy is the "think much harder" version of Grok 4 that's currently only available on their $300/month plan. Jeremy Howard relays a report from a Grok 4 Heavy …
Quoting @grok
On the morning of July 8, 2025, we observed undesired responses and immediately began investigating. To identify the specific language in the instructions causing the undesired behavior, we conducted multiple …
Musk’s latest Grok chatbot searches for billionaire mogul’s views before answering questions
I got quoted a couple of times in this story about Grok searching for tweets from:elonmusk by Matt O’Brien for the Associated Press. “It’s extraordinary,” said Simon Willison, an independent …

moonshotai/Kimi-K2-Instruct
Colossal new open weights model release today from Moonshot AI, a two year old Chinese AI lab with a name inspired by Pink Floyd’s album The Dark Side of the …
Quoting Django’s security policies
Following the widespread availability of large language models (LLMs), the Django Security Team has received a growing number of security reports generated partially or entirely using such tools. Many of …

GitHub Copilot NES internals published, and the AI editor wars, continued
What is Copilot NES? Copilot NES (Next Edit Suggestions) is an internal GitHub Copilot feature released in February 2025. It predicts the next edits that a code change will require and proposes fixes across multiple locations while you just keep pressing Tab. Where ordinary code completion predicts the code that follows the cursor position, Copilot NES predicts and completes the continuation in units of "edit operations in the editor". This mechanism was first made practical by Cursor Tab (Copilot++), the inspiration for Copilot NES, but Cursor is proprietary software, so its internal details …
Generationship: Ep. #39, Simon Willison
I recorded this podcast episode with Rachel Chalmers a few weeks ago. We talked about the resurgence of blogging, the legacy of Google Reader, learning in public, LLMs as weirdly …

Grok: searching X for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)"
If you ask the new Grok 4 for opinions on controversial questions, it will sometimes run a search to find out Elon Musk’s stance before providing you with an answer. …

Grok 4
Released last night, Grok 4 is now available via both API and a paid subscription for end-users. Key characteristics: image and text input, text output. 256,000 context length (twice that …

Gemini CLI tutorial — Will it replace Windsurf and Cursor?
Discover how to use Gemini CLI, Google's new open-source AI agent that brings Gemini directly to your terminal.

Grok 4 released
xAI's Grok 4 has been released. xAI (@xai), July 10, 2025: "Introducing Grok 4, the world's most powerful AI model. Watch the livestream now: https://t.co/59iDX5s2ck". Model card: the context window is 256,000 tokens; Claude 4 Sonnet has 200,000 tokens. What is "Grok 4 Code"? It is the name of a coding model. It does not appear to be a Claude Code-style CLI; in OpenAI terms it corresponds to Codex (the model). According to a Reddit thread, a message saying it "can be used in Cursor" appeared in the console. …

Stress-testing AI products: A red-teaming playbook
Red-teaming reveals how AI fails at scale. Learn to embed adversarial testing into your sprints before your product becomes a headline.

Grok 4 announcement roundup, and trying it out

Leader Spotlight: Building a human-focused AI product, with Cory Bishop
Cory Bishop talks about the role of human-centered design and empathy in Bubble’s no-code AI development product.

Infinite Monkey
Mihai Parparita's Infinite Mac lets you run classic MacOS emulators directly in your browser. Infinite Monkey is a new feature which taps into the OpenAI Computer Use and Claude Computer …

Devin vs Cursor Background Agents: comparing the performance of fully autonomous AI agents
Introduction: Cursor's Background Agents reached GA, which raised the question "how well can it compete with Devin?" So I have it take on Devin at the task of solving all 101 TypeScript quiz problems. Claude Code Action joins as a super-sub to make it a three-way contest. Let's crown a champion. The problem set is the exercism/typescript repository, forked by the author for agent tasks (laiso/exercism-typescript). Exercism is a programming learning site; the exercises and test code it publishes on GitHub are used in benchmarks for real agent products such as Aider Polyglot and Roo Code, which makes them well suited to comparing agents. …

slime: An SGLang-Native Post-Training Framework for RL Scaling
Vision that drives slime ...

OME: Revolutionizing LLM Infrastructure with Model-Driven Architecture
The tale of two teams: why model serving is broken ...

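On the Cursor pricing change controversy
In June 2025 Cursor substantially changed its pricing structure, switching the $20/month Pro plan from "request count limits" to "token usage limits" and adding a new $200/month Ultra plan (announced in "Updates to Ultra and Pro": "In collaboration with the model providers, we're introducing a $200 / mo tier for power users."). According to Cursor's explanation, the previous limit was 500 requests per month, with no account taken of token usage per request. Because the number of tokens consumed by a single request now varies widely, a simple request count limit could no longer reflect costs accurately. Cursor therefore moved to API-based token usage billing: the Pro plan includes $20 of token credit per month, and usage beyond that is billed additionally. Unfortunately, in order to present this change positively, Cursor …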
Quoting Aphyr
I strongly suspect that Market Research Future, or a subcontractor, is conducting an automated spam campaign which uses a Large Language Model to evaluate a Mastodon instance, submit a plausible …
Become a command-line superhero with Simon Willison's llm tool
Christopher Smith ran a mini hackathon in Albany New York at the weekend around uses of my LLM - the first in-person event I'm aware of dedicated to that project! …
The Best AI Coding Tools in 2025
Discover the best AI tools for coding in 2025 and transform how you build with these powerful coding assistants.
Adding a feature because ChatGPT incorrectly thinks it exists
Adrian Holovaty describes how his SoundSlice service saw an uptick in users attempting to use their sheet music scanner to import ASCII-art guitar tab... because it turned out ChatGPT had …
I Shipped a macOS App Built Entirely by Claude Code
Indragie Karunaratne has "been building software for the Mac since 2008", but recently decided to try Claude Code to build a side project: Context, a native Mac app for debugging …

🥇Top AI Papers of the Week
The Top AI Papers of the Week (June 30 - July 6)
Quoting Nineteen Eighty-Four
There was a whole chain of separate departments dealing with proletarian literature, music, drama, and entertainment generally. Here were produced rubbishy newspapers containing almost nothing except sport, crime and astrology, …
Supabase MCP can leak your entire SQL database
Here's yet another example of a lethal trifecta attack, where an LLM system combines access to private data, exposure to potentially malicious instructions and a mechanism to communicate data back …

Context Engineering Guide
Prompt engineering is being rebranded as context engineering

🤖 AI Agents Weekly: DeepSWE, Cursor 1.2, Evaluating Multi-Agent Systems, Prover Agent, Top AI Devs News
DeepSWE, Cursor 1.2, Evaluating Multi-Agent Systems, Prover Agent, Top AI Devs News
Cursor: Clarifying Our Pricing
Cursor changed their pricing plan on June 16th, introducing a new $200/month Ultra plan with "20x more usage than Pro" and switching their $20/month Pro plan from "request limits to …
Identify, solve, verify
The more time I spend using LLMs for code, the less I worry for my career - even as their coding capabilities continue to improve. Using LLMs as part of …
awwaiid/gremllm
Delightfully cursed Python library by Brock Wilcox, built on top of LLM: from gremllm import Gremllm counter = Gremllm("counter") counter.value = 5 counter.increment() print(counter.value) # 6? print(counter.to_roman_numerals()) # VI? You …

How to build a web-based AI agent with Stagehand and Gemini
Learn how to build a browser-based AI agent with Stagehand and Gemini to automate tasks like navigation, extraction, and interaction using natural language.
Quoting Adam Gordon Bell
I think that a lot of resistance to AI coding tools comes from the same place: fear of losing something that has defined you for so long. People are reacting …
Frequently Asked Questions (And Answers) About AI Evals
Hamel Husain and Shreya Shankar have been running a paid, cohort-based course on AI Evals For Engineers & PMs over the past few months. Here Hamel collects answers to the …
Trial Court Decides Case Based On AI-Hallucinated Caselaw
Joe Patrice writing for Above the Law: [...] it was always only a matter of time before a poor litigant representing themselves fails to know enough to sniff out and …
Sandboxed tools in a loop
Something I've realized about LLM tool use is that it means that if you can reduce a problem to something that can be solved by an LLM in a sandbox …

Getting started with Claude 4 API: A developer’s walkthrough
This guide explores how to use Anthropic's Claude 4 models, including Opus 4 and Sonnet 4, to build AI-powered applications.
Table saws
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
Quoting Charles Babbage
On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out ?" In one case a …

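t-wada vs テスト大好郎 ("Test-Lover")
The other day a tip made the rounds among some Claude Code users: write "please follow the approach recommended by t-wada" in the prompt and it will practice test-driven development. Takuto Wada (@t_wada) commented on June 25, 2025: "I see: the terms TDD and test-driven development spread so widely that their meaning was diluted, they got conflated with automated testing and test-first under a vague understanding, and that also affected LLM training data; giving a person's name gives the LLM a 'concrete reference point' and has the effect of narrowing it to a more specific programming style." This is certainly an interesting phenomenon, and if you ask Claude directly you can see that it does have knowledge of t-wada. Thinking it would be interesting if that association could be used as a trigger for Claude Code to do TDD, I tried various things. (Incidentally, the next day Kent Beck, who has recently gotten hooked on vibe coding and is writing a Smalltalk library with LLMs, also, regarding the title of his own book, …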
AI dev tool power rankings & comparison [July 2025 edition]
Which AI frontend dev tool reigns supreme in July 2025? Check out our power rankings and use our interactive comparison tool to find out.

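Evaluating an MCP that lets an AI agent ask a human for advice on Slack when it is stuck on an answer
This is the AI Shift TECH BLOG. We share information on AI technology and how to put it to use.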

Mandelbrot in x86 assembly by Claude
Inspired by a tweet asking if Claude knew x86 assembly, I decided to run a bit of an experiment. I prompted Claude Sonnet 4: Write me an ascii art mandelbrot …
TIL: Using Playwright MCP with Claude Code
Inspired by Armin ("I personally use only one MCP - I only use Playwright") I decided to figure out how to use the official Playwright MCP server with Claude Code. …
Quoting Kevin Webb
One of the best examples of LLM developer tooling I've heard is from a team that supports software from the 80s-90s. Their only source of documentation is video interviews with …
A custom template system from the mid-2000s era
Using LLMs for code archaeology is pretty fun. I stumbled across this blog entry from 2003 today, in which I had gotten briefly excited about ColdFusion and implemented an experimental …

Potemkin Understanding in LLMs: New Study Reveals Flaws in AI Benchmarks
New research reveals that LLMs often fake understanding, passing benchmarks but failing to apply concepts or stay internally consistent.
Quoting mrmincent
To misuse a woodworking metaphor, I think we’re experiencing a shift from hand tools to power tools. You still need someone who understands the basics to get the good results …
June newsletter for sponsors has been sent
I just sent out the second edition of my sponsors only monthly newsletter. Anyone who sponsors me for $10/month or more on GitHub gets this carefully hand-curated summary of the …
Using Claude Code to build a GitHub Actions workflow
I wanted to add a small feature to one of my GitHub repos - an automatically updated README index listing other files in the repo - so I decided to …
llvm: InstCombine: improve optimizations for ceiling division with no overflow - a PR by Alex Gaynor and Claude Code
Alex Gaynor maintains rust-asn1, and recently spotted a missing LLVM compiler optimization while hacking on it, with the assistance of Claude (Alex works for Anthropic). He describes how he confirmed …

Leader Spotlight: Adopting and championing responsible AI, with Asma Syeda
Asma Syeda shares the importance of responsible AI and best practices for companies to ensure their AI technology remains ethical.

Agentic Coding: The Future of Software Development with Agents
Armin Ronacher delivers a 37 minute YouTube talk describing his adventures so far with Claude Code and agentic coding methods. A friend called Claude Code catnip for programmers and it …
How to Fix Your Context
Drew Breunig has been publishing some very detailed notes on context engineering recently. In How Long Contexts Fail he described four common patterns for context rot, which he summarizes like …

🥇Top AI Papers of the Week
The Top AI Papers of the Week (June 23 - 29)

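Parallel execution of Claude Code's Task tool (parallelTasksCount) is meant for analysis tasks
With Claude Code's Task tool, child agents spawned from the parent agent's processing make Messages API calls asynchronously, and the number of children at that point is the parallelTasksCount setting. The default is "1". The command to override it is `claude config set -g parallelTasksCount 2`; note that raising the value increases token consumption. parallelTasksCount changes behavior when the Task tool runs. A simple way to test it is to ask Claude Code directly to use the Task tool: the console then prints "Initializing N parallel agents…" according to the parallelTasksCount value. A post by Tyler Burnam explains that this degree of parallelism contributes to task completion speed, but from what the author found, that was not accurate. In the Task tool's parallel execution, the parent agent, internally called the Synthesis Agent, … to its children …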
🤖 AI Agents Weekly: Gemini CLI, Qodo Gen CLI, Context Engineering, Claude Apps, AlphaGenome
Gemini CLI, Qodo Gen CLI, Context Engineering, Claude Apps, AlphaGenome

Trying out MCP Structured tool output
The 2025-06-18 version of MCP added support for Structured tool output. A tool definition can declare the schema of its output with `outputSchema` and return structured output in the `structuredContent` field. This article tries out Structured tool output using the MCP TypeScript SDK.
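To make the relationship between `outputSchema` and `structuredContent` a little more concrete, here is a minimal Python sketch of the data shapes involved. It does not use any MCP SDK (the article itself uses the TypeScript SDK): the tool name `get_weather`, its fields, and the helpers `make_result` and `conforms` are hypothetical, and `conforms` is a deliberately simplified stand-in for real JSON Schema validation.

```python
import json

# Hypothetical tool definition; `outputSchema` is a JSON Schema describing the result.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return current weather for a city (illustrative only)",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "temperature": {"type": "number"},
            "conditions": {"type": "string"},
        },
        "required": ["temperature", "conditions"],
    },
}

# Maps the JSON Schema type names used above to Python types.
_PY_TYPES = {"number": (int, float), "string": str}


def conforms(value: dict, schema: dict) -> bool:
    """Simplified check: required keys present and declared property types match."""
    if any(key not in value for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key in value and not isinstance(value[key], _PY_TYPES[spec["type"]]):
            return False
    return True


def make_result(data: dict) -> dict:
    """Build a tool result carrying both structured and text forms of the output."""
    if not conforms(data, WEATHER_TOOL["outputSchema"]):
        raise ValueError("structured output does not match outputSchema")
    return {
        # Serialized copy as text, for clients that do not read structuredContent.
        "content": [{"type": "text", "text": json.dumps(data)}],
        "structuredContent": data,
    }


if __name__ == "__main__":
    result = make_result({"temperature": 22.5, "conditions": "Partly cloudy"})
    print(json.dumps(result, indent=2))
```

Returning the serialized JSON in a text block alongside `structuredContent` is a backwards-compatibility choice: older clients can still read the text, while newer ones can validate the structured form against the declared schema.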
Context engineering
The term context engineering has recently started to gain traction as a better alternative to prompt engineering. I like it. I think this one may have sticking power. Here's an …
Continuous AI
GitHub Next have coined the term "Continuous AI" to describe "all uses of automated AI to support software collaboration on any platform". It's intended as an echo of Continuous Integration …

Project Vend: Can Claude run a small shop? (And why does that matter?)
In "what could possibly go wrong?" news, Anthropic and Andon Labs wired Claude 3.7 Sonnet up to a small vending machine in the Anthropic office, named it Claudius and told …

[This week's topic] Gemini CLI released
Gemini CLI, Google's long-rumored official Gemini CLI coding agent, has been released. Like Claude Code, Gemini CLI is a tool you use from the terminal (CLI). Gemini 2.5 Pro is available for free as the default model, and it runs on Windows without WSL. GitHub: google-gemini/gemini-cli, "An open-source AI agent that brings the power of Gemini directly into your terminal." …

Introducing Gemma 3n: The developer guide
Extremely consequential new open weights model release from Google today: Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered …

Geminiception
Yesterday Anthropic got a bunch of buzz out of their new window.claude.complete() API which allows Claude Artifacts to run their own API calls. It turns out Gemini had beaten them …
New sandboxes from Cloudflare and Vercel
Two interesting new products for running code in a sandbox today. Cloudflare launched their Containers product in open beta, and added a new Sandbox library for Cloudflare Workers that can …

Build and share AI-powered apps with Claude
Anthropic have added one of the most important missing features to Claude Artifacts: apps built as artifacts now have the ability to run their own prompts against Claude via a …
Quoting Christoph Niemann
Creating art is a nonlinear process. I start with a rough goal. But then I head into dead ends and get lost or stuck. The secret to my process is …

Gemini CLI
First there was Claude Code in February, then OpenAI Codex (CLI) in April, and now Gemini CLI in June. All three of the largest AI labs now have their own …

A quick Gemini CLI tutorial

gemini-cli's google_web_search is great

Your AI has agency — here’s how to architect its frontend
Explore how to create UI frameworks that visualize and manage intelligent AI agents with agency and real-time feedback.
Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books
Major USA legal news for the AI industry today. Judge William Alsup released a "summary judgement" (a legal decision that results in some parts of a case skipping a trial) …

How to design apps with Apple Intelligence in mind
Explore the core features of Apple Intelligence, and consider do's and don'ts for designing with Apple Intelligence in mind.

Phoenix.new is Fly's entry into the prompt-driven app development space
Here’s a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in …
Disclosures
I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent …

🥇Top AI Papers of the Week
The Top AI Papers of the Week (June 16 - 22)
Quoting Kent Beck
So you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I …

My First Open Source AI Generated Library
Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library based on his design: It wrote ~1100 …
model.yaml
From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an

🤖 AI Agents Weekly: Software 3.0, Gemini 2.5 Updates, Safer AI Agents, Deep Research Tutorial & Benchmark
Software 3.0, Gemini 2.5 Updates, Safer AI Agents, Deep Research Tutorial & Benchmark

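use-mcp: a React hook for connecting to MCP servers from the browser
use-mcp is a React hook for connecting to remote MCP servers. It makes calling tools and handling authentication easy. This article explains how to connect to an MCP server and call tools with use-mcp, and how to implement OAuth authentication.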
Quoting FAQ for Your Brain on ChatGPT
Is it safe to say that LLMs are, in essence, making us "dumber"? No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. …

AbsenceBench: Language Models Can't Tell What's Missing
Here's another interesting result to file under the

Magenta RealTime: An Open-Weights Live Music Model
Fun new

Agentic Misalignment: How LLMs could be insider threats
One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying that (1) the model will soon be taken …

python-importtime-graph
I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option causes …
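As a rough illustration of the `python -X importtime` feature mentioned above, the sketch below runs the interpreter in a subprocess with that option and parses the timing lines it writes to stderr. The helper name `import_times` and the choice of `json` as the module to import are assumptions made for this example; this is not how the python-importtime-graph tool itself is implemented.

```python
import subprocess
import sys


def import_times(module: str) -> list[tuple[str, int, int]]:
    """Return (name, self_us, cumulative_us) for each import the interpreter reports."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        # Timing lines look like: "import time: self [us] | cumulative | imported package"
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        self_us, cumulative_us, name = (p.strip() for p in parts)
        if not self_us.isdigit():
            continue  # skip the header row
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


if __name__ == "__main__":
    # Show the five imports with the largest cumulative time.
    slowest = sorted(import_times("json"), key=lambda r: r[2], reverse=True)[:5]
    for name, self_us, cumulative_us in slowest:
        print(f"{cumulative_us:>8} us cumulative  {self_us:>8} us self  {name}")
```

Sorting by cumulative time surfaces the top-level imports that dominate startup, which is usually the first thing to look at when a command-line tool is slow to launch.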
Mistral-Small 3.2
Released on Hugging Face a couple of hours ago, so far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty quickly. This …

Natural-language assertions with AI
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk
Stop me if you've heard this one before: A threat actor (acting as an external user) submits a malicious support ticket. An internal user, linked to a tenant, invokes an …

How OpenElections Uses LLMs
The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are …
Clarified zucchini consommé
I continue to have fun running fantasy cooking prompts through LLMs - this time I tried
Quoting Arvind Narayanan
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which …
Quoting Workaccount2 on Hacker News
They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output …
Coding agents require skilled operators
I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The
Introducing Fusion: Vibe Code at Any Scale
Fusion is the first AI-powered visual canvas for entire teams to build, edit, and ship code at any scale using existing codebase, design systems and workflows

OpenHands Cloud Self-hosted: Secure, Convenient Deployment of AI Software Development Agents
OpenHands Cloud is now available as a self-hosted solution, providing secure, convenient deployment of AI software development agents. It features a 30-day free trial and is available as a source-available Helm Chart.
I counted all of the yurts in Mongolia using machine learning
Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile …

Frontend devs: Here’s how to get the most out of Cursor
Explore Cursor AI, one of the hottest tools in AI-assisted coding. Uncover the features you might be missing and practical workflows that actually work.
It's a trap
That memvid thing that

Trying out the new Gemini 2.5 model family
After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model …

The OpenHands CLI: AI-Powered Development in Your Terminal
Experience the full power of OpenHands development agents directly from your command line. No Docker required, just install and start coding with AI assistance in seconds.
Quoting Donghee Na
The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...] …

Deploying DeepSeek on GB200 NVL72 with PD and Large Scale EP (Part I): 2.7x Higher Decoding Throughput
The GB200 NVL72 is the world's most advanced hardware for AI training and inference. In this blog post, we're excited to share early results from running ...

🥇Top AI Papers of the Week
The Top AI Papers of the Week (June 9 - 15)

🤖AI Agents Weekly: Magistral, Agent Bricks, Code Researcher, Automating Workflow Generation, Verified Superintelligence
Magistral, Agent Bricks, Code Researcher, Automating Workflow Generation, Verified Superintelligence

Witness the technological singularity brought about by Claude Code

Running complex tasks step by step with an Orchestrator for Claude Code

Giving AI LSP refactoring capabilities with typescript-mcp

🥇Top AI Papers of the Week
The Top AI Papers of the Week (June 2 - 8)

Letting Claude Code call a local MCP server

🤖 AI Agents Weekly: Self-Improving Agents, Eleven v3, /Search, Deep Research Updates, Top AI Devs News, Agents SDK for TypeScript
Self-Improving Agents, Eleven v3, /Search, Deep Research Updates, Top AI Devs News, Agents SDK for TypeScript

Watching the inference process of a diffusion language model

🥇Top AI Papers of the Week
The Top AI Papers of the Week (May 26 - June 1)

⚡AI Agents Weekly: Mistral Agents API, FLUX.1 Kontext, DeepSeek-R1 Update, Codestral Embed, AgentSeek
Mistral Agents API, FLUX.1 Kontext, DeepSeek-R1 Update, Codestral Embed, AgentSeek

State-Of-The-Art Prompting For AI Agents
Best prompting techniques for building AI agents

A survey of the latest trends in E2E spoken dialogue APIs and build platforms, and the outlook for autonomous spoken dialogue systems

🥇Top AI Papers of the Week
The Top AI Papers of the Week (May 19 - 25)
🤖AI Agents Weekly: Gemini 2.5 Updates, Claude 4, II-Agent, Gemma 3n, MCIP, Veo 3
Gemini 2.5 Updates, Claude 4, II-Agent, Gemma 3n, MCIP, Veo 3

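Mastra's A2A protocol support
Mastra supports the A2A protocol. Building a Mastra server gives you a server that conforms to the A2A protocol. This article shows how to build an A2A-compliant server with Mastra and communicate according to the A2A protocol specification using Mastra's client SDK.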

Roundup of announcements from Anthropic's Code with Claude

Roundup of Google I/O announcements

🥇Top AI Papers of the Week
The Top AI Papers of the Week (May 12 - 18)

React Router's Server Components support
React Router now supports Server Components as a preview. This makes it possible to pass components when returning data from loaders and actions, and to create Server Components-first server component routes. This article tries out React Router's Server Components support in practice.

🐙AI Agents Weekly: AlphaEvolve, codex-1, SWE-1, AI Agents vs. Agentic AI, OpenMemory MCP
AlphaEvolve, codex-1, SWE-1, AI Agents vs. Agentic AI, OpenMemory MCP

Trying out Codex, OpenAI's SWE agent

From a sheer-willpower migration off create-react-app to automating frontend refactoring with AI (for 株式会社イルシル)

Looking into the "aha moment" in LLM reasoning
AI for grown-ups
AI tools often generate quick but messy code. Let's examine how AI can integrate with professional workflows to be productive without sacrificing quality.

Pasting images with Ctrl-V in VS Code Markdown with the GitHub + Zenn integration

LLMs Get Lost in Multi-turn Conversation
Main reasons LLMs get "lost" in multi-turn conversations and mitigation strategies
Figma Sites: The built-in Figma to website tool
Use Figma Sites to publish designs as websites quickly. Learn how it works, its best use cases, and shortcomings.
Figma Make: Bring designs to life with AI
Figma Make enables AI-powered prototyping from designs. We examine how it works and compare it to tools for generating production-ready code from Figma designs.

🥇Top AI Papers of the Week
The Top AI Papers of the Week (May 5 - 11)

Building an MCP server on Vercel
The 2025-03-26 Model Context Protocol (MCP) specification added Streamable HTTP, drawing attention to remote MCP servers. This article shows how to build an MCP server on Vercel using Next.js.

Building an MCP server on Cloudflare
The 2025-03-26 Model Context Protocol (MCP) specification added Streamable HTTP, drawing attention to remote MCP servers. This article shows how to build an MCP server on Cloudflare using the `agents` framework.

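Error.isError(): a method for checking whether a value is an Error instance
Error.isError() is a method for determining whether an object is an Error instance. Until now you could check with the instanceof operator, but that can produce false positives and false negatives. Like Array.isArray(), Error.isError() uses an internal slot for the check, which makes it more robust.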

🔥AI Agents Weekly: Mistral Medium 3, Gemini 2.5 Pro (Update), Deep Research Guide, Wave 8, Kevin-32B, Top AI Dev News
Mistral Medium 3, Gemini 2.5 Pro (Update), Deep Research Guide, Wave 8, Kevin-32B, Top AI Dev News

Trying out Docker's MCP Toolkit
Docker's MCP Toolkit is a Docker Desktop extension for integrating containerized MCP servers with AI agents. It runs MCP servers in a containerized environment and makes it easy to install MCP tools from the trusted Docker MCP catalog.
What is the Agent2Agent (A2A) Protocol?
The Agent2Agent (A2A) Protocol enables AI agents to collaborate across networks, streamlining discovery, authentication, and data exchange.

Running ESLint as an MCP server
Starting with ESLint v9.26.0, ESLint can run as an MCP server. This lets LLMs (large language models) fix code using ESLint's rules.

Deploying DeepSeek with PD Disaggregation and Large-Scale Expert Parallelism on 96 H100 GPUs
DeepSeek is a popular open-source large language model (LLM) praised for its strong performance. However, its large size and unique architecture, which us...

Trying out Rabbit-TEA, a declarative UI library for Moonbit

🥇Top AI Papers of the Week
The Top AI Papers of the Week (April 28 - May 4)

The era when Cloudflare + Next.js (OpenNext) + Prisma 6.7.0 (No Rust) works has finally arrived

🤖AI Agents Weekly: Qwen3, mem0, Llama API, Bamba, Qwen2.5-Omni-3B
Qwen3, mem0, Llama API, Bamba, Qwen2.5-Omni-3B

Calling AI models from GitHub Actions workflows with actions/ai-inference
actions/ai-inference is the official action for calling AI models from GitHub Actions workflows. It makes AI models easy to use from CI/CD workflows. This article presents a practical example: having AI review an article on a pull request.

Anthropic Development Partner Program: use Claude Code at a 30% discount

Trying out Inception Labs' diffusion language model
Fine-tune an LLM: Why, when, and how
Fine-tuning LLMs can save tokens, guarantee output formats, & bake in edge-case fixes. Learn when to fine-tune & how to do it effectively for your AI projects.

🥇Top AI Papers of the Week
The Top AI Papers of the Week (April 21 - 27)

Vibe coding MenuGen
Work log of vibe coding menugen app

React's Activity component for hiding parts of the UI
`<Activity>` has been added as a new experimental React component. It is used to toggle UI visibility. Unlike conventional conditional rendering, it preserves state even when unmounted.

🔥AI Agents Weekly: GPT-Image-1, ADK Guide, Multi-Agent Builder, UXAgent, Building Code Agents
GPT-Image-1, ADK Guide, Multi-Agent Builder, UXAgent, Building Code Agents

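A candid talk with AI: what happens to engineering work from here?
Through an interview between an AI and a human, this explores the question "how will the rise of AI coding agents change the way engineers work?"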

Let's play with vibe coding
The Perfect Cursor AI setup for React and Next.js
A comprehensive guide to optimize Cursor AI for React and Next.js development. Learn settings, features, & workflows to boost coding efficiency and quality.
Visual Editor 3.0: Prompt, Design, and Develop on One Canvas
Visual Editor 3.0 merges design & development with AI, allowing teams to create apps from prompts, import Figma designs, & integrate with existing codebases.

I wrote exception-handling rules as a deno lint plugin

🥇Top AI Papers of the Week
The Top AI Papers of the Week (April 14 - April 20)

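Cloudflare Workers for use at work (WIP)
The first half aims to grasp the key points by running things hands-on; the second half aims to grasp the concepts and learn what is possible.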

Introduction to Cloudflare Workers for use at work, Day 5 - ServiceBinding

Introduction to Cloudflare Workers for use at work, Day 4 - R2/KV/D1/Vectorize

Introduction to Cloudflare Workers for use at work, Day 3 - Queue/CronTriggers/Workflow

⚡AI Agents Weekly: o4, Gemini 2.5 Flash, Embed 4, GUI-R1, FastAPI-MCP
o4, Gemini 2.5 Flash, Embed 4, GUI-R1, FastAPI-MCP

Writing CLI apps in React with Ink
Ink is a library for writing CLI apps in React. Because it uses Yoga, a Flexbox layout engine, you can build UIs with CSS much like a web application. The CLI apps of coding agents such as Codex and Claude Code are written with Ink.

Trying out the Streamable HTTP transport for MCP servers
MCP defines two transports: stdio and Streamable HTTP. The TypeScript SDK has shipped the Streamable HTTP transport since v1.10.0. This article builds an MCP server and tries out the Streamable HTTP transport.

How to verify an OpenAI Organization
Figma to Android: Convert designs to mobile apps in seconds
Convert Figma designs to Android apps using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.

A prompting guide for Codex CLI

Running LangGraph CodeAct in E2B's secure virtual environment
How to Build Your Own MCP Server
Learn to build a Model Context Protocol (MCP) server for AI assistants. This tutorial covers creating tools, resources, and prompts for a CSS tutor example.

Building a voicebot at blazing speed with FastRTC

ICASSP 2025 presentation report @Hyderabad, India
Figma to iOS: Convert designs to mobile apps in seconds
Convert Figma designs to iOS apps using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.

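Trying out the A2A protocol, which standardizes collaboration between AI agents
The Agent2Agent protocol (A2A) was announced to standardize collaboration between AI agents. The A2A protocol is designed to be independent of the underlying framework or vendor and to let agents communicate with each other securely. This article uses sample code to try out AI agent collaboration with the A2A protocol.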
Designing Agentic AI Systems: A Web Dev’s Guide
Web developers' skills in system design and API integration are key for building effective AI agent systems. Learn about core concepts and common patterns.
Introducing the Visual Copilot CLI: Smart & Granular Figma to Code
A CLI tool that integrates Figma designs into your codebase, generating production-ready code that matches your existing components and styling patterns.

Power to the people: How LLMs flip the script on technology diffusion
Yes
Figma to Jetpack Compose: Convert designs to Kotlin in seconds
Visual Copilot turns Figma designs to Jetpack Compose code, automating UI implementation for Android developers. It generates responsive layouts & Kotlin code.

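Let's build our own coding AI agent
Like it or not, the use of AI is one of the important paradigm shifts in software development. An AI agent autonomously selects and carries out tasks based on the user's instructions. This article walks through the process of building a coding AI agent from scratch.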
What is the Model Context Protocol (MCP)?
MCP aims to standardize AI tool integration, potentially simplifying development and enabling more capable AI assistants. Current adoption is limited.

Using MCP clients as tools with the Vercel AI SDK
MCP (Model Context Protocol) is a standardized protocol for providing additional context to LLMs. The Vercel AI SDK has supported MCP since v4.2, and MCP clients can be used as tools. This article shows how to use MCP tools with the Vercel AI SDK.
How to make any design convert better with Builder.io
Learn practical Figma auto layout techniques to create responsive, conversion-focused designs that seamlessly translate to code with Builder.io's plugin.
Convert a Website to Figma Designs for Free
Transform any website into editable Figma designs instantly. Import, customize, and generate on-brand variations with AI.
What is an AI Agent?
AI agents use models, loops, memory, and tools to automate complex tasks. Learn how they work, their evolution, and their practical uses and limitations.

LLM application tracing with Arize Phoenix

Finding the Best Sleep Tracker
Finding the best sleep tracker with data
Figma to Flutter: Convert designs to clean Flutter code
Convert Figma designs to Flutter code automatically with AI using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.