Reflections on OpenAI

Calvin French-Owen spent just over a year working at OpenAI, during which time the organization grew from 1,000 to 3,000 people and Calvin found himself in "the top 30% by …

Simon Willison's Blog
api library tool
xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated"

They continue: One was that if you ask it "What is your surname?" it doesn't have one so it searches the internet leading to undesirable results, such as when its …

Simon Willison's Blog
tool
AI compliance: A core product competency you shouldn’t skip

AI governance is now a product feature. Learn how to embed trust, transparency, and compliance into your build cycles.

logrocket-dev
api tool
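Trying out Kiro, AWS's agentic IDE

Kiro is an AI coding agent built into an IDE, developed by AWS. What sets Kiro apart is that it goes beyond simple vibe coding: you can use specs to build applications through spec-driven development. This article walks through the flow of developing an application with Kiro.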

azukiazusa のテックブログ2
library tool
Application development without programmers

This book by James Martin, published in 1982, includes the following in the preface: Applications development did not change much for 20 years, but now a new wave is crashing …

Simon Willison's Blog
api framework tool
ccusage

Claude Code logs detailed usage information to the ~/.claude/ directory. ccusage is a neat little Node.js tool which reads that information and shows you a readable summary of your usage …

Simon Willison's Blog
tool
Cost Effective Deployment of DeepSeek R1 with Intel® Xeon® 6 CPU on SGLang

The impressive performance of DeepSeek R1 marked a rise of giant Mixture of Experts (MoE) models in Large Language Models (LLM). However, its massive mode...

LMSYS Blog
library tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (July 7 - 13)

Elvis Saravia's NLP Blog
platform
Container Use: providing sandbox environments via an MCP server

AI coding agents are convenient, but because they can run arbitrary Bash commands they can affect the user's system. Container Use runs as an MCP server and provides AI coding agents with a sandbox environment. This article explains how to use Container Use.

azukiazusa のテックブログ2
api tool
🤖 AI Agents Weekly: Grok 4, Context Engineering Guide, Kimi K2, SmolLM3, MedGemma 27B, AI SDK 5

Grok 4, Context Engineering Guide, Kimi K2, SmolLM3, MedGemma 27B, AI SDK 5

Elvis Saravia's NLP Blog
platform
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

METR - for Model Evaluation & Threat Research - are a non-profit research institute founded by Beth Barnes, a former alignment researcher at OpenAI (see Wikipedia). They've previously contributed to …

Simon Willison's Blog
tool
Grok 4 Heavy won't reveal its system prompt

Grok 4 Heavy is the "think much harder" version of Grok 4 that's currently only available on their $300/month plan. Jeremy Howard relays a report from a Grok 4 Heavy …

Simon Willison's Blog
platform
Quoting @grok

On the morning of July 8, 2025, we observed undesired responses and immediately began investigating. To identify the specific language in the instructions causing the undesired behavior, we conducted multiple …

Simon Willison's Blog
platform
Musk’s latest Grok chatbot searches for billionaire mogul’s views before answering questions

I got quoted a couple of times in this story about Grok searching for tweets from:elonmusk by Matt O’Brien for the Associated Press. “It’s extraordinary,” said Simon Willison, an independent …

Simon Willison's Blog
tool
moonshotai/Kimi-K2-Instruct

Colossal new open weights model release today from Moonshot AI, a two year old Chinese AI lab with a name inspired by Pink Floyd’s album The Dark Side of the …

Simon Willison's Blog
tool
Quoting Django’s security policies

Following the widespread availability of large language models (LLMs), the Django Security Team has received a growing number of security reports generated partially or entirely using such tools. Many of …

Simon Willison's Blog
security
GitHub Copilot NES internals published, and the AI editor wars continue

What is Copilot NES? Copilot NES (Next Edit Suggestions) is an internal feature of GitHub Copilot released in February 2025. It predicts the next edits that a code change makes necessary and proposes fixes across multiple locations as you simply keep pressing the Tab key. Whereas ordinary code completion predicts the code that follows the cursor position, Copilot NES predicts and completes what comes next in units of "edit operations in the editor". (GitHub Next | Copilot Next Edit Suggestions: "Can we improve Copilot code completion by suggesting the next logical change, wherever it is in your project?") This mechanism was first made practical by Cursor Tab (Copilot++), the inspiration for Copilot NES, but because Cursor is proprietary software, its internal details …

Lai.so Blog
library tool
Generationship: Ep. #39, Simon Willison

I recorded this podcast episode with Rachel Chalmers a few weeks ago. We talked about the resurgence of blogging, the legacy of Google Reader, learning in public, LLMs as weirdly …

Simon Willison's Blog
podcast
Grok: searching X for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)"

If you ask the new Grok 4 for opinions on controversial questions, it will sometimes run a search to find out Elon Musk’s stance before providing you with an answer. …

Simon Willison's Blog
platform
Grok 4

Released last night, Grok 4 is now available via both API and a paid subscription for end-users. Key characteristics: image and text input, text output. 256,000 context length (twice that …

Simon Willison's Blog
api tool
Gemini CLI tutorial — Will it replace Windsurf and Cursor?

Discover how to use Gemini CLI, Google's new open-source AI agent that brings Gemini directly to your terminal.

logrocket-dev
api tool
Grok 4 released

xAI's Grok 4 has been released. "Introducing Grok 4, the world's most powerful AI model. Watch the livestream now: https://t.co/59iDX5s2ck" (xAI (@xai), July 10, 2025) Model card: the context window is 256,000 tokens; Claude 4 Sonnet has 200,000 tokens. (Models / Grok 4) What is "Grok 4 Code"? It is the name of a coding model. It does not appear to be a CLI like Claude Code; it corresponds to what OpenAI calls Codex (the model). According to a Reddit thread, a message saying it "can be used in Cursor" reportedly appeared in the console. Grok 4 by …

Lai.so Blog
api tool
Stress-testing AI products: A red-teaming playbook

Red-teaming reveals how AI fails at scale. Learn to embed adversarial testing into your sprints before your product becomes a headline.

logrocket-dev
api tool
Grok 4 announcement summary and hands-on

Zenn schroneko
api tool
Leader Spotlight: Building a human-focused AI product, with Cory Bishop

Cory Bishop talks about the role of human-centered design and empathy in Bubble’s no-code AI development product.

logrocket-dev
tool
Infinite Monkey

Mihai Parparita's Infinite Mac lets you run classic MacOS emulators directly in your browser. Infinite Monkey is a new feature which taps into the OpenAI Computer Use and Claude Computer …

Simon Willison's Blog
tool
Devin vs Cursor Background Agents: a performance comparison of fully autonomous AI agents

Introduction: Cursor's Background Agents have reached GA, which made me wonder how well they can compete with Devin. So I have them face off against Devin on the task of solving all 101 TypeScript quiz problems. Claude Code Action also joins as a super-sub to make it a three-way battle. Let's settle who the champion is... The problems come from the exercism/typescript repository, which the author forked for agent tasks. Exercism is a programming learning site, and the problem sets and test code it publishes on GitHub are used in benchmarks for real agent products such as Aider Polyglot and Roo Code, which makes them well suited to comparing agents. GitHub - laiso/exercism-typescript: Exercism exercises in TypeScript. Contribute to laiso/exercism-t

Lai.so Blog
api tool
slime: An SGLang-Native Post-Training Framework for RL Scaling

LMSYS Blog
framework tool
OME: Revolutionizing LLM Infrastructure with Model-Driven Architecture

LMSYS Blog
cloud platform tool
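On the uproar over Cursor's pricing change

In June 2025, Cursor substantially changed its pricing structure, switching the $20/month Pro plan from a "request count limit" to a "token usage limit" and adding a new $200/month Ultra plan. (Updates to Ultra and Pro | Cursor - The AI Code Editor: "In collaboration with the model providers, we're introducing a $200 / mo tier for power users.") According to Cursor's explanation, the previous limit was 500 requests per month, and token usage per request was not taken into account. Because the number of tokens consumed by a single request now varies widely, a simple request-count limit could no longer reflect costs accurately. Cursor therefore moved to API-based token usage billing: the Pro plan includes $20 of token credit per month, and usage beyond that is billed additionally. Unfortunately, in order to present this change positively, Cursor …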

Lai.so Blog
tool
Quoting Aphyr

I strongly suspect that Market Research Future, or a subcontractor, is conducting an automated spam campaign which uses a Large Language Model to evaluate a Mastodon instance, submit a plausible …

Simon Willison's Blog
platform
Become a command-line superhero with Simon Willison's llm tool

Christopher Smith ran a mini hackathon in Albany New York at the weekend around uses of my LLM - the first in-person event I'm aware of dedicated to that project! …

Simon Willison's Blog
api tool
The Best AI Coding Tools in 2025

Discover the best AI tools for coding in 2025 and transform how you build with these powerful coding assistants.

Builder.io Blog
api library tool
Adding a feature because ChatGPT incorrectly thinks it exists

Adrian Holovaty describes how his SoundSlice service saw an uptick in users attempting to use their sheet music scanner to import ASCII-art guitar tab... because it turned out ChatGPT had …

Simon Willison's Blog
api tool
I Shipped a macOS App Built Entirely by Claude Code

Indragie Karunaratne has "been building software for the Mac since 2008", but recently decided to try Claude Code to build a side project: Context, a native Mac app for debugging …

Simon Willison's Blog
library tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (June 30 - July 6)

Elvis Saravia's NLP Blog
platform
Quoting Nineteen Eighty-Four

There was a whole chain of separate departments dealing with proletarian literature, music, drama, and entertainment generally. Here were produced rubbishy newspapers containing almost nothing except sport, crime and astrology, …

Simon Willison's Blog
platform
Supabase MCP can leak your entire SQL database

Here's yet another example of a lethal trifecta attack, where an LLM system combines access to private data, exposure to potentially malicious instructions and a mechanism to communicate data back …

Simon Willison's Blog
database security
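The three ingredients named in the Supabase MCP entry above (private data, untrusted instructions, and a way to send data back out) can be written down as a simple configuration check. This is only an illustrative sketch: the `AgentConfig` class, its field names, and the example configurations are hypothetical and are not taken from Supabase MCP or any real tool.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    """Hypothetical description of what an LLM-based agent is allowed to do."""
    reads_private_data: bool       # can it access data that should stay private?
    sees_untrusted_content: bool   # can text written by outsiders reach its prompt?
    can_send_data_out: bool        # can it write somewhere an outsider can read?


def has_lethal_trifecta(config: AgentConfig) -> bool:
    """True only when all three risky capabilities are combined."""
    return (
        config.reads_private_data
        and config.sees_untrusted_content
        and config.can_send_data_out
    )


risky = AgentConfig(reads_private_data=True, sees_untrusted_content=True, can_send_data_out=True)
safer = AgentConfig(reads_private_data=True, sees_untrusted_content=True, can_send_data_out=False)

print(has_lethal_trifecta(risky), has_lethal_trifecta(safer))
```

Removing any one of the three capabilities makes the check return False, which is the usual way this kind of risk is reduced.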
Context Engineering Guide

Prompt engineering is being rebranded as context engineering

Elvis Saravia's NLP Blog
api tool
🤖 AI Agents Weekly: DeepSWE, Cursor 1.2, Evaluating Multi-Agent Systems, Prover Agent, Top AI Devs News

DeepSWE, Cursor 1.2, Evaluating Multi-Agent Systems, Prover Agent, Top AI Devs News

Elvis Saravia's NLP Blog
library tool
Cursor: Clarifying Our Pricing

Cursor changed their pricing plan on June 16th, introducing a new $200/month Ultra plan with "20x more usage than Pro" and switching their $20/month Pro plan from "request limits to …

Simon Willison's Blog
api tool
Identify, solve, verify

The more time I spend using LLMs for code, the less I worry for my career - even as their coding capabilities continue to improve. Using LLMs as part of …

Simon Willison's Blog
platform
awwaiid/gremllm

Delightfully cursed Python library by Brock Wilcox, built on top of LLM: from gremllm import Gremllm counter = Gremllm("counter") counter.value = 5 counter.increment() print(counter.value) # 6? print(counter.to_roman_numerals()) # VI? You …

Simon Willison's Blog
library tool
How to build a web-based AI agent with Stagehand and Gemini

Learn how to build a browser-based AI agent with Stagehand and Gemini to automate tasks like navigation, extraction, and interaction using natural language.

logrocket-dev
api tool
Quoting Adam Gordon Bell

I think that a lot of resistance to AI coding tools comes from the same place: fear of losing something that has defined you for so long. People are reacting …

Simon Willison's Blog
platform
Frequently Asked Questions (And Answers) About AI Evals

Hamel Husain and Shreya Shankar have been running a paid, cohort-based course on AI Evals For Engineers & PMs over the past few months. Here Hamel collects answers to the …

Simon Willison's Blog
platform
Trial Court Decides Case Based On AI-Hallucinated Caselaw

Joe Patrice writing for Above the Law: [...] it was always only a matter of time before a poor litigant representing themselves fails to know enough to sniff out and …

Simon Willison's Blog
platform
Sandboxed tools in a loop

Something I've realized about LLM tool use is that it means that if you can reduce a problem to something that can be solved by an LLM in a sandbox …

Simon Willison's Blog
tool
Getting started with Claude 4 API: A developer’s walkthrough

This guide explores how to use Anthropic's Claude 4 models, including Opus 4 and Sonnet 4, to build AI-powered applications.

logrocket-dev
api tool
Table saws

Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.

Simon Willison's Blog
platform
Quoting Charles Babbage

On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out ?" In one case a …

Simon Willison's Blog
platform
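t-wada vs テスト大好郎 ("Mr. Test Lover")

The other day, a tip was making the rounds among some Claude Code users: writing "Please follow the approach recommended by t-wada" in the prompt makes it practice test-driven development. "I see: the terms TDD and test-driven development spread so widely that a 'dilution of meaning' occurred, they were conflated with automated testing and test-first under a vague understanding, and that affected LLM training data too; but giving a person's name gives the LLM a 'concrete reference point' and has the effect of narrowing it to a more specific programming style. pic.twitter.com/p6SCPj8YdA" (Takuto Wada (@t_wada), June 25, 2025) This is certainly an interesting phenomenon, and indeed if you ask Claude directly you can see that it knows about t-wada. That made me think it would be interesting if this could be used as a trigger to make Claude Code do TDD, so I tried various things. (Incidentally, the following day Kent Beck, who has recently gotten hooked on vibe coding and is writing a Smalltalk library with an LLM, also … the title of his own book …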

Lai.so Blog
api testing tool
AI dev tool power rankings & comparison [July 2025 edition]

Which AI frontend dev tool reigns supreme in July 2025? Check out our power rankings and use our interactive comparison tool to find out.

logrocket-dev
tool
Testing an MCP that lets an AI agent ask humans for advice on Slack when it is stuck on an answer

This is AI Shift's TECH BLOG. We share information about AI technology and how to put it to use.

AI-Shift Tech Blog
api tool
Mandelbrot in x86 assembly by Claude

Inspired by a tweet asking if Claude knew x86 assembly, I decided to run a bit of an experiment. I prompted Claude Sonnet 4: Write me an ascii art mandelbrot …

Simon Willison's Blog
library tool
TIL: Using Playwright MCP with Claude Code

Inspired by Armin ("I personally use only one MCP - I only use Playwright") I decided to figure out how to use the official Playwright MCP server with Claude Code. …

Simon Willison's Blog
api tool
Quoting Kevin Webb

One of the best examples of LLM developer tooling I've heard is from a team that supports software from the 80s-90s. Their only source of documentation is video interviews with …

Simon Willison's Blog
tool
A custom template system from the mid-2000s era

Using LLMs for code archaeology is pretty fun. I stumbled across this blog entry from 2003 today, in which I had gotten briefly excited about ColdFusion and implemented an experimental …

Simon Willison's Blog
library tool
Potemkin Understanding in LLMs: New Study Reveals Flaws in AI Benchmarks

New research reveals that LLMs often fake understanding, passing benchmarks but failing to apply concepts or stay internally consistent.

Socket
platform
Quoting mrmincent

To misuse a woodworking metaphor, I think we’re experiencing a shift from hand tools to power tools. You still need someone who understands the basics to get the good results …

Simon Willison's Blog
platform
June newsletter for sponsors has been sent

I just sent out the second edition of my sponsors only monthly newsletter. Anyone who sponsors me for $10/month or more on GitHub gets this carefully hand-curated summary of the …

Simon Willison's Blog
tool
Using Claude Code to build a GitHub Actions workflow

I wanted to add a small feature to one of my GitHub repos - an automatically updated README index listing other files in the repo - so I decided to …

Simon Willison's Blog
tool
llvm: InstCombine: improve optimizations for ceiling division with no overflow - a PR by Alex Gaynor and Claude Code

Alex Gaynor maintains rust-asn1, and recently spotted a missing LLVM compiler optimization while hacking on it, with the assistance of Claude (Alex works for Anthropic). He describes how he confirmed …

Simon Willison's Blog
library tool
Leader Spotlight: Adopting and championing responsible AI, with Asma Syeda

Asma Syeda shares the importance of responsible AI and best practices for companies to ensure their AI technology remains ethical.

logrocket-dev
platform tool
Agentic Coding: The Future of Software Development with Agents

Armin Ronacher delivers a 37 minute YouTube talk describing his adventures so far with Claude Code and agentic coding methods. A friend called Claude Code catnip for programmers and it …

Simon Willison's Blog
api cloud tool
How to Fix Your Context

Drew Breunig has been publishing some very detailed notes on context engineering recently. In How Long Contexts Fail he described four common patterns for context rot, which he summarizes like …

Simon Willison's Blog
tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (June 23 - 29)

Elvis Saravia's NLP Blog
platform
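Parallel execution of Claude Code's Task tool (parallelTasksCount) is suited to analysis tasks

With Claude Code's Task tool, child agents make Messages API calls asynchronously from the parent agent's process that spawned them, and the number of children at that point is the value of the parallelTasksCount setting. The default is "1". The command to override it is: claude config set -g parallelTasksCount 2. Note that raising the value increases token consumption. parallelTasksCount changes how the Task tool behaves when it runs. A simple way to test it is to ask Claude Code directly to use the Task tool. The console prints "Initializing N parallel agents…" according to the parallelTasksCount value. Tyler Burnam's post explains that this degree of parallelism contributes to task completion speed, but according to the author's investigation that was not accurate. In the Task tool's parallel execution, the parent agent (internally called the Synthesis Agent) … to the children …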

Lai.so Blog
api tool
🤖 AI Agents Weekly: Gemini CLI, Qodo Gen CLI, Context Engineering, Claude Apps, AlphaGenome

Gemini CLI, Qodo Gen CLI, Context Engineering, Claude Apps, AlphaGenome

Elvis Saravia's NLP Blog
tool
Trying out MCP's Structured tool output

The 2025-06-18 version of MCP added support for Structured tool output. A tool definition can declare the schema of its output with `outputSchema` and return structured output in the `structuredContent` field. This article tries out Structured tool output using the MCP TypeScript SDK.

azukiazusa のテックブログ2
api tool
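To make the `outputSchema` / `structuredContent` pairing in the entry above concrete, here is a minimal sketch in Python that uses plain dictionaries rather than any MCP SDK. The tool name `get_weather`, its fields, the canned data, and the tiny `matches_schema` checker are all hypothetical; a real server would normally rely on an SDK and a full JSON Schema validator.

```python
import json

# Hypothetical tool definition: `outputSchema` is a JSON Schema describing the result.
weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "temperature": {"type": "number"},
            "conditions": {"type": "string"},
        },
        "required": ["temperature", "conditions"],
    },
}

JSON_TYPES = {"string": str, "number": (int, float), "object": dict}


def matches_schema(value, schema):
    """Check a very small subset of JSON Schema: type, properties, required."""
    expected = JSON_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(value, expected) or isinstance(value, bool):
        return False
    if schema["type"] == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            matches_schema(value[key], sub_schema)
            for key, sub_schema in schema.get("properties", {}).items()
            if key in value
        )
    return True


def call_get_weather(city):
    """Build a tool result for the hypothetical get_weather tool."""
    data = {"temperature": 22.5, "conditions": "Partly cloudy"}  # canned demo data
    if not matches_schema(data, weather_tool["outputSchema"]):
        raise ValueError("tool output does not match outputSchema")
    return {
        # Text copy of the same data, for clients that ignore structuredContent.
        "content": [{"type": "text", "text": json.dumps(data)}],
        "structuredContent": data,
    }


result = call_get_weather("Tokyo")
print(result["structuredContent"]["temperature"])
```

Returning the same data both as text and as `structuredContent` is a design choice in this sketch so that older clients still receive something readable.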
Context engineering

The term context engineering has recently started to gain traction as a better alternative to prompt engineering. I like it. I think this one may have sticking power. Here's an …

Simon Willison's Blog
platform
Continuous AI

GitHub Next have coined the term "Continuous AI" to describe "all uses of automated AI to support software collaboration on any platform". It's intended as an echo of Continuous Integration …

Simon Willison's Blog
api cloud tool
Project Vend: Can Claude run a small shop? (And why does that matter?)

In "what could possibly go wrong?" news, Anthropic and Andon Labs wired Claude 3.7 Sonnet up to a small vending machine in the Anthropic office, named it Claudius and told …

Simon Willison's Blog
tool
[This week's topic] Gemini CLI released

The long-rumored "Gemini CLI", Google's official CLI coding agent for Gemini, has been released. Like Claude Code, Gemini CLI is a tool you use from the terminal (CLI). By default the Gemini 2.5 Pro model can be used for free, and it runs on Windows without WSL. (GitHub - google-gemini/gemini-cli: An open-source AI agent that brings the power of Gemini directly into your terminal.) Gemini …

Lai.so Blog
api tool
Introducing Gemma 3n: The developer guide

Extremely consequential new open weights model release from Google today: Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered …

Simon Willison's Blog
tool
Geminiception

Yesterday Anthropic got a bunch of buzz out of their new window.claude.complete() API which allows Claude Artifacts to run their own API calls. It turns out Gemini had beaten them …

Simon Willison's Blog
api tool
New sandboxes from Cloudflare and Vercel

Two interesting new products for running code in a sandbox today. Cloudflare launched their Containers product in open beta, and added a new Sandbox library for Cloudflare Workers that can …

Simon Willison's Blog
tool
Build and share AI-powered apps with Claude

Anthropic have added one of the most important missing features to Claude Artifacts: apps built as artifacts now have the ability to run their own prompts against Claude via a …

Simon Willison's Blog
api tool
Quoting Christoph Niemann

Creating art is a nonlinear process. I start with a rough goal. But then I head into dead ends and get lost or stuck. The secret to my process is …

Simon Willison's Blog
platform
Gemini CLI

First there was Claude Code in February, then OpenAI Codex (CLI) in April, and now Gemini CLI in June. All three of the largest AI labs now have their own …

Simon Willison's Blog
api tool
A quick Gemini CLI tutorial

Zenn schroneko
api tool
gemini-cli's google_web_search is fantastic

Zenn mizchi
api tool
Your AI has agency — here’s how to architect its frontend

Explore how to create UI frameworks that visualize and manage intelligent AI agents with agency and real-time feedback.

logrocket-dev
api cloud tool
Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books

Major USA legal news for the AI industry today. Judge William Alsup released a "summary judgement" (a legal decision that results in some parts of a case skipping a trial) …

Simon Willison's Blog
api cloud tool
How to design apps with Apple Intelligence in mind

Explore the core features of Apple Intelligence, and consider do's and don'ts for designing with Apple Intelligence in mind.

logrocket-dev
tool ui
Phoenix.new is Fly's entry into the prompt-driven app development space

Here’s a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in …

Simon Willison's Blog
framework tool
Disclosures

I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent …

Simon Willison's Blog
tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (June 16 - 22)

Elvis Saravia's NLP Blog
platform
Quoting Kent Beck

So you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I …

Simon Willison's Blog
tool
My First Open Source AI Generated Library

Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library based on his design: It wrote ~1100 …

Simon Willison's Blog
library
model.yaml

From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an

Simon Willison's Blog
library
🤖 AI Agents Weekly: Software 3.0, Gemini 2.5 Updates, Safer AI Agents, Deep Research Tutorial & Benchmark

Software 3.0, Gemini 2.5 Updates, Safer AI Agents, Deep Research Tutorial & Benchmark

Elvis Saravia's NLP Blog
platform
use-mcp: a React hook for connecting to MCP servers from the browser

use-mcp is a React hook for connecting to remote MCP servers. It makes calling tools and handling authentication easy. This article explains how to use use-mcp to connect to an MCP server and call tools, and how to implement OAuth authentication.

azukiazusa のテックブログ2
api tool
Quoting FAQ for Your Brain on ChatGPT

Is it safe to say that LLMs are, in essence, making us "dumber"? No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. …

Simon Willison's Blog
platform
AbsenceBench: Language Models Can't Tell What's Missing

Here's another interesting result to file under the

Simon Willison's Blog
platform
Magenta RealTime: An Open-Weights Live Music Model

Fun new

Simon Willison's Blog
tool
Agentic Misalignment: How LLMs could be insider threats

One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying that (1) the model will soon be taken …

Simon Willison's Blog
platform
python-importtime-graph

I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option causes …

Simon Willison's Blog
library tool
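As a rough illustration of what the `python -X importtime` option mentioned above produces, the sketch below parses its stderr lines (of the form `import time: self [us] | cumulative | imported package`) and lists the slowest imports first. The helper name `parse_importtime` and the sample text are hypothetical, the indentation rule is assumed from typical output, and this is not how python-importtime-graph itself is implemented.

```python
import re

# Each data line printed by `python -X importtime` (on stderr) looks like:
#   import time:       123 |        456 |   package.module
LINE_RE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|(\s*)(\S+)\s*$")


def parse_importtime(text):
    """Return a list of (module, self_us, cumulative_us, depth) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skips the header line and any unrelated stderr output
        self_us, cumulative_us, indent, module = match.groups()
        # Assumption: nested imports are indented by two extra spaces per level.
        depth = max(0, (len(indent) - 1) // 2)
        rows.append((module, int(self_us), int(cumulative_us), depth))
    return rows


# Made-up sample in the same shape as real output; the timings are invented.
SAMPLE = """\
import time: self [us] | cumulative | imported package
import time:       120 |        120 |   _io
import time:        80 |        200 | io
import time:      1500 |       1700 | mytool
"""

rows = parse_importtime(SAMPLE)
slowest = sorted(rows, key=lambda row: row[2], reverse=True)
for module, self_us, cumulative_us, depth in slowest:
    print(f"{cumulative_us:>8} us  {'  ' * depth}{module}")
```

To try it on real data, one option is to capture the output with `python -X importtime -c "import json" 2> importtime.log` and pass the file contents to `parse_importtime`.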
Mistral-Small 3.2

Released on Hugging Face a couple of hours ago, so far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty quickly. This …

Simon Willison's Blog
platform
Natural-language assertions with AI

Zenn mizchi
api tool
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk

Stop me if you've heard this one before: A threat actor (acting as an external user) submits a malicious support ticket. An internal user, linked to a tenant, invokes an …

Simon Willison's Blog
api security
How OpenElections Uses LLMs

The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are …

Simon Willison's Blog
api tool
Clarified zucchini consommé

I continue to have fun running fantasy cooking prompts through LLMs - this time I tried

Simon Willison's Blog
tool
Quoting Arvind Narayanan

Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which …

Simon Willison's Blog
platform
Quoting Workaccount2 on Hacker News

They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output …

Simon Willison's Blog
platform
Coding agents require skilled operators

I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The

Simon Willison's Blog
platform
Introducing Fusion: Vibe Code at Any Scale

Fusion is the first AI-powered visual canvas for entire teams to build, edit, and ship code at any scale using existing codebase, design systems and workflows

Builder.io Blog
api tool ui
OpenHands Cloud Self-hosted: Secure, Convenient Deployment of AI Software Development Agents

OpenHands Cloud is now available as a self-hosted solution, providing secure, convenient deployment of AI software development agents. It features a 30-day free trial and is available as a source-available Helm Chart.

All Hands Blog
ai cloud platform
I counted all of the yurts in Mongolia using machine learning

Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile …

Simon Willison's Blog
tool
Frontend devs: Here’s how to get the most out of Cursor

Explore Cursor AI, one of the hottest tools in AI-assisted coding. Uncover the features you might be missing and practical workflows that actually work.

logrocket-dev
library tool
It's a trap

That memvid thing that

Simon Willison's Blog
security
Trying out the new Gemini 2.5 model family

After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model …

Simon Willison's Blog
tool
The OpenHands CLI: AI-Powered Development in Your Terminal

Experience the full power of OpenHands development agents directly from your command line. No Docker required, just install and start coding with AI assistance in seconds.

All Hands Blog
ai platform tool
Quoting Donghee Na

The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...] …

Simon Willison's Blog
platform runtime tool
Deploying DeepSeek on GB200 NVL72 with PD and Large Scale EP (Part I): 2.7x Higher Decoding Throughput

The GB200 NVL72 is the world's most advanced hardware for AI training and inference. In this blog post, we're excited to share early results from running ...

LMSYS Blog
library tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (June 9 - 15)

Elvis Saravia's NLP Blog
platform
🤖AI Agents Weekly: Magistral, Agent Bricks, Code Researcher, Automating Workflow Generation, Verified Superintelligence

Magistral, Agent Bricks, Code Researcher, Automating Workflow Generation, Verified Superintelligence

Elvis Saravia's NLP Blog
api tool
Witness the Technological Singularity Brought by Claude Code

Zenn mizchi
tool
Running Complex Tasks Step by Step with an Orchestrator for Claude Code

Zenn mizchi
api tool
Giving AI LSP Refactoring Capabilities with typescript-mcp

Zenn mizchi
library tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (June 2 - 8)

Elvis Saravia's NLP Blog
platform
Letting Claude Code Call a Local MCP Server

Zenn mizchi
api tool
🤖 AI Agents Weekly: Self-Improving Agents, Eleven v3, /Search, Deep Research Updates, Top AI Devs News, Agents SDK for TypeScript

Self-Improving Agents, Eleven v3, /Search, Deep Research Updates, Top AI Devs News, Agents SDK for TypeScript

Elvis Saravia's NLP Blog
api tool
A Look at the Inference Process of Diffusion Language Models

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
platform
🥇Top AI Papers of the Week

The Top AI Papers of the Week (May 26 - June 1)

Elvis Saravia's NLP Blog
platform
⚡AI Agents Weekly: Mistral Agents API, FLUX.1 Kontext, DeepSeek-R1 Update, Codestral Embed, AgentSeek

Mistral Agents API, FLUX.1 Kontext, DeepSeek-R1 Update, Codestral Embed, AgentSeek

Elvis Saravia's NLP Blog
api tool
State-Of-The-Art Prompting For AI Agents

Best prompting techniques for building AI agents

Elvis Saravia's NLP Blog
api tool
A Survey of the Latest Trends in E2E Spoken Dialogue APIs and Build Platforms, and the Outlook for Autonomous Spoken Dialogue Systems

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
api cloud tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (May 19 - 25)

Elvis Saravia's NLP Blog
platform
🤖AI Agents Weekly: Gemini 2.5 Updates, Claude 4, II-Agent, Gemma 3n, MCIP, Veo 3

Gemini 2.5 Updates, Claude 4, II-Agent, Gemma 3n, MCIP, Veo 3

Elvis Saravia's NLP Blog
api cloud tool
A2A Protocol Support in Mastra

Mastra supports the A2A protocol. Building a Mastra server brings up a server that conforms to the A2A protocol. This article shows how to build an A2A-compliant server with Mastra and how to communicate according to the A2A specification using Mastra's client SDK.

azukiazusa のテックブログ2
framework tool
Summary of Announcements from Anthropic's Code with Claude

Zenn schroneko
api library tool
Summary of Google I/O Announcements

Zenn schroneko
api tool
🥇Top AI Papers of the Week

The Top AI Papers of the Week (May 12 - 18)

Elvis Saravia's NLP Blog
platform
Server Components Support in React Router

React Router now supports Server Components as a preview. This makes it possible to pass components when returning data from loaders and actions, and to create Server Components-first server component routes. This article tries out React Router's Server Components support hands-on.

azukiazusa のテックブログ2
🐙AI Agents Weekly: AlphaEvolve, codex-1, SWE-1, AI Agents vs. Agentic AI, OpenMemory MCP

AlphaEvolve, codex-1, SWE-1, AI Agents vs. Agentic AI, OpenMemory MCP

Elvis Saravia's NLP Blog
Trying Out Codex, OpenAI's SWE Agent

Zenn schroneko
From a Determined Migration off create-react-app to AI-Automated Frontend Refactoring (for 株式会社イルシル)

Zenn mizchi
Looking into the “Aha Moment” in LLM Reasoning

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
AI for grown-ups

AI tools often generate quick but messy code. Let's examine how AI can integrate with professional workflows to be productive without sacrificing quality.

Builder.io Blog
Pasting Images with Ctrl-V into VS Code Markdown with the GitHub + Zenn Integration

Zenn mizchi
LLMs Get Lost in Multi-turn Conversation

Main reasons LLMs get "lost" in multi-turn conversations and mitigation strategies

Elvis Saravia's NLP Blog
Figma Sites: The built-in Figma to website tool

Use Figma Sites to publish designs as websites quickly. Learn how it works, its best use cases, and shortcomings.

Builder.io Blog
Figma Make: Bring designs to life with AI

Figma Make enables AI-powered prototyping from designs. We examine how it works and compare it to tools for generating production-ready code from Figma designs.

Builder.io Blog
🥇Top AI Papers of the Week

The Top AI Papers of the Week (May 5 - 11)

Elvis Saravia's NLP Blog
Building an MCP Server on Vercel

The 2025-03-26 revision of the Model Context Protocol (MCP) specification added Streamable HTTP, drawing attention to remote MCP servers. This article shows how to build an MCP server on Vercel using Next.js.

azukiazusa のテックブログ2
Building an MCP Server on Cloudflare

The 2025-03-26 revision of the Model Context Protocol (MCP) specification added Streamable HTTP, drawing attention to remote MCP servers. This article shows how to build an MCP server on Cloudflare using the `agents` framework.

azukiazusa のテックブログ2
The Error.isError() Method for Checking Whether a Value Is an Error Instance

Error.isError() is a method for determining whether an object is an Error instance. The instanceof operator could already be used for this check, but it can produce false positives and false negatives. Like Array.isArray(), Error.isError() checks an internal slot, which makes the check more robust.

azukiazusa のテックブログ2
🔥AI Agents Weekly: Mistral Medium 3, Gemini 2.5 Pro (Update), Deep Research Guide, Wave 8, Kevin-32B, Top AI Dev News

Mistral Medium 3, Gemini 2.5 Pro (Update), Deep Research Guide, Wave 8, Kevin-32B, Top AI Dev News

Elvis Saravia's NLP Blog
Trying Out Docker's MCP Toolkit

Docker's MCP Toolkit is a Docker Desktop extension for integrating containerized MCP servers with AI agents. It runs MCP servers in a containerized environment and makes it easy to install MCP tools from the trusted Docker MCP Catalog.

azukiazusa のテックブログ2
What is the Agent2Agent (A2A) Protocol?

The Agent2Agent (A2A) Protocol enables AI agents to collaborate across networks, streamlining discovery, authentication, and data exchange.

Builder.io Blog
Running ESLint as an MCP Server

Starting with v9.26.0, ESLint can be run as an MCP server. This lets LLMs (large language models) fix code using ESLint's rules.

azukiazusa のテックブログ2
Deploying DeepSeek with PD Disaggregation and Large-Scale Expert Parallelism on 96 H100 GPUs

DeepSeek is a popular open-source large language model (LLM) praised for its strong performance. However, its large size and unique architecture, which us...

LMSYS Blog
Trying Out Rabbit-TEA, a Declarative UI Library for Moonbit

Zenn mizchi
🥇Top AI Papers of the Week

The Top AI Papers of the Week (April 28 - May 4)

Elvis Saravia's NLP Blog
At Last, Cloudflare + Next.js (OpenNext) + Prisma 6.7.0 (No Rust) Works

Zenn mizchi
🤖AI Agents Weekly: Qwen3, mem0, Llama API, Bamba, Qwen2.5-Omni-3B

Qwen3, mem0, Llama API, Bamba, Qwen2.5-Omni-3B

Elvis Saravia's NLP Blog
Calling AI Models from GitHub Actions Workflows with actions/ai-inference

actions/ai-inference is the official action for calling AI models from GitHub Actions workflows. It makes AI models easy to use from CI/CD workflows. This article walks through a practical example: having AI review an article on a pull request.

azukiazusa のテックブログ2
The Anthropic Development Partner Program: Claude Code at a 30% Discount

Zenn schroneko
Trying Out Inception Labs' Diffusion Language Model

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
Fine-tune an LLM: Why, when, and how

Fine-tuning LLMs can save tokens, guarantee output formats, & bake in edge-case fixes. Learn when to fine-tune & how to do it effectively for your AI projects.

Builder.io Blog
🥇Top AI Papers of the Week

The Top AI Papers of the Week (April 21 - 27)

Elvis Saravia's NLP Blog
Vibe coding MenuGen

Work log of vibe coding menugen app

Andrej Karpathy's Blog
React's Activity Component for Hiding Parts of the UI

React has added `<Activity>` as a new experimental component, used to toggle the visibility of parts of the UI. Unlike conventional conditional rendering, it preserves state even when unmounted.

azukiazusa のテックブログ2
🔥AI Agents Weekly: GPT-Image-1, ADK Guide, Multi-Agent Builder, UXAgent, Building Code Agents

GPT-Image-1, ADK Guide, Multi-Agent Builder, UXAgent, Building Code Agents

Elvis Saravia's NLP Blog
A Candid Talk with AI: What Will Happen to Engineering Jobs?

Explores the question "How will the rise of AI coding agents change the way engineers work?" through an interview between an AI and a human.

azukiazusa のテックブログ2
Let's Play with Vibe Coding

Zenn schroneko
The Perfect Cursor AI setup for React and Next.js

A comprehensive guide to optimize Cursor AI for React and Next.js development. Learn settings, features, & workflows to boost coding efficiency and quality.

Builder.io Blog
Visual Editor 3.0: Prompt, Design, and Develop on One Canvas

Visual Editor 3.0 merges design & development with AI, allowing teams to create apps from prompts, import Figma designs, & integrate with existing codebases.

Builder.io Blog
I Wrote Exception-Handling Rules as a deno lint Plugin

Zenn mizchi
🥇Top AI Papers of the Week

The Top AI Papers of the Week (April 14 - April 20)

Elvis Saravia's NLP Blog
Cloudflare Workers for Use at Work (WIP)

The first half aims to grasp the key points while running things locally; the second half aims to understand the concepts and learn what is possible.

Zenn mizchi
Introduction to Cloudflare Workers for Use at Work, Day 5 - ServiceBinding

Zenn mizchi
Introduction to Cloudflare Workers for Use at Work, Day 4 - R2/KV/D1/Vectorize

Zenn mizchi
Introduction to Cloudflare Workers for Use at Work, Day 3 - Queue/CronTriggers/Workflow

Zenn mizchi
⚡AI Agents Weekly: o4, Gemini 2.5 Flash, Embed 4, GUI-R1, FastAPI-MCP

o4, Gemini 2.5 Flash, Embed 4, GUI-R1, FastAPI-MCP

Elvis Saravia's NLP Blog
Writing CLI Apps in React with Ink

Ink is a library for writing CLI apps in React. Because it uses Yoga, a Flexbox layout engine, you can build UIs with CSS much like in a web application. The CLI apps of coding agents such as Codex and Claude Code are written with Ink.

azukiazusa のテックブログ2
Trying Out the Streamable HTTP Transport for MCP Servers

MCP defines two transports: stdio and Streamable HTTP. The TypeScript SDK has shipped the Streamable HTTP transport since v1.10.0. This article builds an MCP server and tries out the Streamable HTTP transport.

azukiazusa のテックブログ2
How to Verify an OpenAI Organization

Zenn schroneko
Figma to Android: Convert designs to mobile apps in seconds

Convert Figma designs to Android apps using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.

Builder.io Blog
Prompting Guide for Codex CLI

Zenn schroneko
Running LangGraph CodeAct in E2B's Secure Virtual Environment

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
How to Build Your Own MCP Server

Learn to build a Model Context Protocol (MCP) server for AI assistants. This tutorial covers creating tools, resources, and prompts for a CSS tutor example.

Builder.io Blog
Building a Voicebot Blazingly Fast with FastRTC

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
ICASSP2025 Presentation Report @Hyderabad, India

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
Figma to iOS: Convert designs to mobile apps in seconds

Convert Figma designs to iOS apps using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.

Builder.io Blog
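Trying Out the A2A Protocol, Which Standardizes Collaboration Between AI Agents

The Agent2Agent (A2A) protocol was announced to standardize collaboration between AI agents. It is designed to be independent of the underlying framework or vendor so that agents can communicate with each other securely. This article tries out AI agent collaboration over the A2A protocol using sample code.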

azukiazusa のテックブログ2
Designing Agentic AI Systems: A Web Dev’s Guide

Web developers' skills in system design and API integration are key for building effective AI agent systems. Learn about core concepts and common patterns.

Builder.io Blog
Introducing the Visual Copilot CLI: Smart & Granular Figma to Code

A CLI tool that integrates Figma designs into your codebase, generating production-ready code that matches your existing components and styling patterns.

Builder.io Blog
Power to the people: How LLMs flip the script on technology diffusion

Yes

Andrej Karpathy's Blog
Figma to Jetpack Compose: Convert designs to Kotlin in seconds

Visual Copilot turns Figma designs to Jetpack Compose code, automating UI implementation for Android developers. It generates responsive layouts & Kotlin code.

Builder.io Blog
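Let's Build Our Own Coding AI Agent

Like it or not, the use of AI is one of the important paradigm shifts in software development. An AI agent autonomously selects and executes tasks based on the user's instructions. This article walks through the process of building a coding AI agent from scratch.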

azukiazusa のテックブログ2
What is the Model Context Protocol (MCP)?

MCP aims to standardize AI tool integration, potentially simplifying development and enabling more capable AI assistants. Current adoption is limited.

Builder.io Blog
Using an MCP Client as a Tool with the Vercel AI SDK

MCP (Model Context Protocol) is a standardized protocol for providing additional context to LLMs. The Vercel AI SDK has supported MCP since v4.2, so MCP clients can be used as tools. This article shows how to use MCP tools with the Vercel AI SDK.

azukiazusa のテックブログ2
How to make any design convert better with Builder.io

Learn practical Figma auto layout techniques to create responsive, conversion-focused designs that seamlessly translate to code with Builder.io's plugin.

Builder.io Blog
Convert a Website to Figma Designs for Free

Transform any website into editable Figma designs instantly. Import, customize, and generate on-brand variations with AI.

Builder.io Blog
What is an AI Agent?

AI agents use models, loops, memory, and tools to automate complex tasks. Learn how they work, their evolution, and their practical uses and limitations.

Builder.io Blog
Tracing LLM Applications with Arize Phoenix

The AI Shift tech blog, sharing information on AI technology and how to put it to use.

AI-Shift Tech Blog
Finding the Best Sleep Tracker

Finding the best sleep tracker with data

Andrej Karpathy's Blog
Figma to Flutter: Convert designs to clean Flutter code

Convert Figma designs to Flutter code automatically with AI using Builder's Visual Copilot plugin. Save time and maintain pixel-perfect designs across screens.

Builder.io Blog