Simon Willison's Blog
simonwillison.net/Quoting Thomas Dohmke
You know what else we noticed in the interviews? Developers rarely mentioned “time saved” as the core benefit of working in this new way with agents. They were all about …
When a Jira Ticket Can Steal Your Secrets
Zenity Labs describe a classic lethal trifecta attack, this time against Cursor, MCP, Jira and Zendesk. They also have a short video demonstrating the issue. Zendesk support emails are often …

My Lethal Trifecta talk at the Bay Area AI Security Meetup
I gave a talk on Wednesday at the Bay Area AI Security Meetup about prompt injection, the lethal trifecta and the challenges of securing systems that use MCP. It wasn’t …
Quoting @pearlmania500
I have a toddler. My biggest concern is that he doesn't eat rocks off the ground and you're talking to me about ChatGPT psychosis? Why do we even have that? …
Quoting Sam Altman
GPT-5 rollout updates: We are going to double GPT-5 rate limits for ChatGPT Plus users as we finish rollout. We will let Plus users choose to continue to use 4o. …
The surprise deprecation of GPT-4o for ChatGPT consumers
I’ve been dipping into the r/ChatGPT subreddit recently to see how people are reacting to the GPT-5 launch, and so far the vibes there are not good. This AMA thread …

Previewing GPT-5 at OpenAI's office
A couple of weeks ago I was invited to OpenAI's headquarters for a "preview event", for which I had to sign both an NDA and a video release waiver. I …

GPT-5: Key characteristics, pricing and model card
I’ve had preview access to the new GPT-5 model family for the past two weeks, and have been using GPT-5 as my daily-driver. It’s my new favorite model. It’s still …
Jules, our asynchronous coding agent, is now available for everyone
I wrote about the Jules beta back in May. Google's version of the OpenAI Codex PR-submitting hosted coding tool graduated from beta today. I'm mainly linking to this now because …
Qwen3-4B Instruct and Thinking
Yet another interesting model from Qwen - these are tiny compared to their other recent releases (just 4B parameters, 7.5GB on Hugging Face and even smaller when quantized) but with …
Quoting Artificial Analysis
gpt-oss-120b is the most intelligent American open weights model, comes behind DeepSeek R1 and Qwen3 235B in intelligence but offers efficiency benefits [...] We’re seeing the 120B beat o3-mini but …
No, AI is not Making Engineers 10x as Productive
Colton Voege on "curing your AI 10x engineer imposter syndrome". There's a lot of rhetoric out there suggesting that if you can't 10x your productivity through tricks like running a …

OpenAI's new open weight (Apache 2) models are really good
The long promised OpenAI open weight models are here, and they are very impressive. They’re available under proper open source licenses—Apache 2.0—and come in two sizes, 120B and 20B. OpenAI’s …

Claude Opus 4.1
Surprise new model from Anthropic today - Claude Opus 4.1, which they describe as "a drop-in replacement for Opus 4". My favorite thing about this model is the version number …
Quoting greyduet on r/teachers
I teach HS Science in the south. I can only speak for my district, but a few teacher work days in the wave of enthusiasm I'm seeing for AI tools …

ChatGPT agent's user-agent
I was exploring how ChatGPT agent works today. I learned some interesting things about how it exposes its identity through HTTP headers, then made a huge blunder in thinking it …

Usage charts for my LLM tool against OpenRouter
OpenRouter proxies requests to a large number of different LLMs and provides high level statistics of which models are the most popular among their users. Tools that call OpenRouter can …

Qwen-Image: Crafting with Native Text Rendering
Not content with releasing six excellent open weights LLMs in July, Qwen are kicking off August with their first ever image generation model. Qwen-Image is a 20 billion parameter MMDiT …

Quoting @himbodhisattva
for services that wrap GPT-3, is it possible to do the equivalent of sql injection? like, a prompt-injection attack? make it think it's completed the task and then get access …
I Saved a PNG Image To A Bird
Benn Jordan provides one of the all time great YouTube video titles, and it's justified. He drew an image in an audio spectrogram, played that sound to a talented starling …
Quoting Nick Turley
This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year.

XBai o4
Yet another open source (Apache 2.0) LLM from a Chinese AI lab. This model card claims: XBai o4 excels in complex reasoning capabilities and has now completely surpassed OpenAI-o3-mini in …
Faster inference
Two interesting examples of inference speed as a flagship feature of LLM services today. First, Cerebras announced two new monthly plans for their extremely high speed hosted model service: Cerebras …

Deep Think in the Gemini app
Google released Gemini 2.5 Deep Think this morning, exclusively to their Ultra ($250/month) subscribers: It is a variation of the model that recently achieved the gold-medal standard at this year's …
July newsletter for sponors is out
This morning I sent out the third edition of my LLM digest newsletter for my $10/month and higher sponsors on GitHub. It included the following section headers: Claude Code Model …
Quoting Logan Kilpatrick
Gemini Deep Think, our SOTA model with parallel thinking that won the IMO Gold Medal 🥇, is now available in the Gemini App for Ultra subscribers!! [...] Quick correction: this …

Reverse engineering some updates to Claude
Anthropic released two major new features for their consumer-facing Claude apps in the past couple of days. Sadly, they don’t do a very good job of updating the release notes …
Quoting Christina Wodtke
The old timers who built the early web are coding with AI like it's 1995. Think about it: They gave blockchain the sniff test and walked away. Ignored crypto (and …
More model releases on 31st July
Here are a few more model releases from today, to round out a very busy July: Cohere released Command A Vision, their first multi-modal (image input) LLM. Like their others …

Trying out Qwen3 Coder Flash using LM Studio and Open WebUI and LLM
Qwen just released their sixth model(!) for this July called Qwen3-Coder-30B-A3B-Instruct—listed as Qwen3-Coder-Flash in their chat.qwen.ai interface. It’s 30.5B total parameters with 3.3B active at any one time. This means …

Ollama's new app
Ollama has been one of my favorite ways to run local models for a while - it makes it really easy to download models, and it's smart about keeping them …
Quoting Steve Krouse
When you vibe code, you are incurring tech debt as fast as the LLM can spit it out. Which is why vibe coding is perfect for prototypes and throwaway projects: …

The best available open weight LLMs now come from China
Something that has become undeniable this month is that the best available open weight models now come from the Chinese AI labs. I continue to have a lot of love …

Qwen3-30B-A3B-Thinking-2507
Yesterday was Qwen3-30B-A3B-Instruct-2507. Qwen are clearly committed to their new split between reasoning and non-reasoning models (a reversal from Qwen 3 in April), because today they released the new reasoning …
OpenAI: Introducing study mode
New ChatGPT feature, which can be triggered by typing /study or by visiting chatgpt.com/studymode. OpenAI say: Under the hood, study mode is powered by custom system instructions we’ve written in …

Qwen/Qwen3-30B-A3B-Instruct-2507
New model update from Qwen, improving on their previous Qwen3-30B-A3B release from late April. In their tweet they said: Smarter, faster, and local deployment-friendly. ✨ Key Enhancements: ✅ Enhanced reasoning, …
Quoting Nilay Patel
Our plan is to build direct traffic to our site. and newsletters just one kind of direct traffic in the end. I don’t intend to ever rely on someone else’s …
Quoting Anthropic
We’re rolling out new weekly rate limits for Claude Pro and Max in late August. We estimate they’ll apply to less than 5% of subscribers based on current usage. [...] …

GLM-4.5: Reasoning, Coding, and Agentic Abililties
Another day, another significant new open weight model release from a Chinese frontier AI lab. This time it's Z.ai - who rebranded (at least in English) from Zhipu AI a …
Enough AI copilots! We need AI HUDs
Geoffrey Litt compares Copilots - AI assistants that you engage in dialog with and work with you to complete a task - with HUDs, Head-Up Displays, which enhance your working …
Official statement from Tea on their data leak
Tea is a dating safety app for women that lets them share notes about potential dates. The other day it was subject to a truly egregious data leak caused by …

Qwen3-235B-A22B-Thinking-2507
The third Qwen model release week, following Qwen3-235B-A22B-Instruct-2507 on Monday 21st and Qwen3-Coder-480B-A35B-Instruct on Tuesday 22nd. Those two were both non-reasoning models - a change from the previous models in …

Using GitHub Spark to reverse engineer GitHub Spark
GitHub Spark was released in public preview yesterday. It’s GitHub’s implementation of the prompt-to-app pattern also seen in products like Claude Artifacts, Lovable, Vercel v0, Val Town Townie and Fly.io’s …
Quoting Recurse Center
[...] You learn best and most effectively when you are learning something that you care about. Your work becomes meaningful and something you can be proud of only when you …

Instagram Reel: Veo 3 paid preview
@googlefordevs on Instagram published this reel featuring Christina Warren with prompting tips for the new Veo 3 paid preview (mp4 copy here). (Christine checked first if I minded them using …

TimeScope: How Long Can Your Video Large Multimodal Model Go?
New open source benchmark for evaluating vision LLMs on how well they handle long videos: TimeScope probes the limits of long-video capabilities by inserting several short (~5-10 second) video clips---our …

1KB JS Numbers Station
Terence Eden built a neat and weird 1023 byte JavaScript demo that simulates a numbers station using the browser SpeechSynthesisUtterance, which I hadn't realized is supported by every modern browser …
Quoting Dave White
like, one day you discover you can talk to dogs. it's fun and interesting so you do it more, learning the intricacies of their language and their deepest customs. you …
Quoting ICML 2025
Submitting a paper with a "hidden" prompt is scientific misconduct if that prompt is intended to obtain a favorable review from an LLM. The inclusion of such a prompt is …

Qwen3-Coder: Agentic Coding in the World
It turns out that as I was typing up my notes on Qwen3-235B-A22B-Instruct-2507 the Qwen team were unleashing something much bigger: Today, we’re announcing Qwen3-Coder, our most agentic code model …

Qwen/Qwen3-235B-A22B-Instruct-2507
Significant new model release from Qwen, published yesterday without much fanfare. This is a follow-up to their April release of the full Qwen 3 model family, which included a Qwen3-235B-A22B …

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data
This new alignment paper from Anthropic wins my prize for best illustrative figure so far this year: The researchers found that fine-tuning a model on data generated by another model …

Our contribution to a global environmental standard for AI
Mistral have released environmental impact numbers for their largest model, Mistral Large 2, in more detail than I have seen from any of the other large AI labs. The methodology …
Gemini 2.5 Flash-Lite is now stable and generally available
The last remaining member of the Gemini 2.5 trio joins Pro and Flash in General Availability today. Gemini 2.5 Flash-Lite is the cheapest of the 2.5 family, at $0.10/million input …
Textual v4.0.0: The Streaming Release
Will McGugan may no longer be running a commercial company around Textual, but that hasn't stopped his progress on the open source project. He recently released v4 of his Python …
tidwall/pogocache
New project from Josh Baker, author of the excellent tg C geospatial libarry (covered previously) and various other interesting projects: Pogocache is fast caching software built from scratch with a …
Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad
OpenAI beat them to the punch in terms of publicity by publishing their results on Saturday, but a team from Google Gemini achieved an equally impressive result on this year's …
Quoting Daniel Litt
An AI tool that gets gold on the IMO is obviously immensely impressive. Does it mean math is “solved”? Is an AI-generated proof of the Riemann hypothesis clearly on the …
Coding with LLMs in the summer of 2025 (an update)
Salvatore Sanfilippo describes his current AI-assisted development workflow. He's all-in on LLMs for code review, exploratory prototyping, pair-design and writing "part of the code under your clear specifications", but warns …
Quoting Armin Ronacher
Every day someone becomes a programmer because they figured out how to make ChatGPT build something. Lucky for us: in many of those cases the AI picks Python. We should …
Quoting Tim Sweeney
There’s a bigger opportunity in computer science and programming (academically conveyed or self-taught) now than ever before, by far, in my opinion. The move to AI is like replacing shovels …
OpenAI's gold medal performance on the International Math Olympiad
OpenAI research scientist Alexander Wei: I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s …
New tags
A few months I added a tool to my blog for bulk-applying tags to old content. It works as an extension to my existing search interface, letting me run searches …
Quoting Steve Yegge
So one of my favorite things to do is give my coding agents more and more permissions and freedom, just to see how far I can push their productivity without …
Quoting Paul Kedrosky
One analyst recently speculated (via Ed Conard) that, based on Nvidia's latest datacenter sales figures, AI capex may be ~2% of US GDP in 2025, given a standard multiplier. [...] …
How to run an LLM on your laptop
I talked to Grace Huckins for this piece from MIT Technology Review on running local models. Apparently she enjoyed my dystopian backup plan! Simon Willison has a plan for the …
Voxtral
Mistral released their first audio-input models yesterday: Voxtral Small and Voxtral Mini. These state‑of‑the‑art speech understanding models are available in two sizes—a 24B variant for production-scale applications and a 3B …
common-pile/caselaw_access_project
Enormous openly licensed (I believe this is almost all public domain) training dataset of US legal cases: This dataset contains 6.7 million cases from the Caselaw Access Project and Court …
Reflections on OpenAI
Calvin French-Owen spent just over a year working at OpenAI, during which time the organization grew from 1,000 to 3,000 people and Calvin found himself in "the top 30% by …
xAI: "We spotted a couple of issues with Grok 4 recently that we immediately investigated & mitigated"
They continue: One was that if you ask it "What is your surname?" it doesn't have one so it searches the internet leading to undesirable results, such as when its …

Application development without programmers
This book by James Martin, published in 1982, includes the following in the preface: Applications development did not change much for 20 years, but now a new wave is crashing …
ccusage
Claude Code logs detailed usage information to the ~/.claude/ directory. ccusage is a neat little Node.js tool which reads that information and shows you a readable summary of your usage …

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
METR - for Model Evaluation & Threat Research - are a non-profit research institute founded by Beth Barnes, a former alignment researcher at OpenAI (see Wikipedia). They've previously contributed to …

Grok 4 Heavy won't reveal its system prompt
Grok 4 Heavy is the "think much harder" version of Grok 4 that's currenly only available on their $300/month plan. Jeremy Howard relays a report from a Grok 4 Heavy …
Quoting @grok
On the morning of July 8, 2025, we observed undesired responses and immediately began investigating. To identify the specific language in the instructions causing the undesired behavior, we conducted multiple …
Musk’s latest Grok chatbot searches for billionaire mogul’s views before answering questions
I got quoted a couple of times in this story about Grok searching for tweets from:elonmusk by Matt O’Brien for the Associated Press. “It’s extraordinary,” said Simon Willison, an independent …

moonshotai/Kimi-K2-Instruct
Colossal new open weights model release today from Moonshot AI, a two year old Chinese AI lab with a name inspired by Pink Floyd’s album The Dark Side of the …
Quoting Django’s security policies
Following the widespread availability of large language models (LLMs), the Django Security Team has received a growing number of security reports generated partially or entirely using such tools. Many of …
Generationship: Ep. #39, Simon Willison
I recorded this podcast episode with Rachel Chalmers a few weeks ago. We talked about the resurgence of blogging, the legacy of Google Reader, learning in public, LLMs as weirdly …

Grok: searching X for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)"
If you ask the new Grok 4 for opinions on controversial questions, it will sometimes run a search to find out Elon Musk’s stance before providing you with an answer. …

Grok 4
Released last night, Grok 4 is now available via both API and a paid subscription for end-users. Key characteristics: image and text input, text output. 256,000 context length (twice that …

Infinite Monkey
Mihai Parparita's Infinite Mac lets you run classic MacOS emulators directly in your browser. Infinite Monkey is a new feature which taps into the OpenAI Computer Use and Claude Computer …
Quoting Aphyr
I strongly suspect that Market Research Future, or a subcontractor, is conducting an automated spam campaign which uses a Large Language Model to evaluate a Mastodon instance, submit a plausible …
Become a command-line superhero with Simon Willison's llm tool
Christopher Smith ran a mini hackathon in Albany New York at the weekend around uses of my LLM - the first in-person event I'm aware of dedicated to that project! …
Adding a feature because ChatGPT incorrectly thinks it exists
Adrian Holovaty describes how his SoundSlice service saw an uptick in users attempting to use their sheet music scanner to import ASCII-art guitar tab... because it turned out ChatGPT had …
I Shipped a macOS App Built Entirely by Claude Code
Indragie Karunaratne has "been building software for the Mac since 2008", but recently decided to try Claude Code to build a side project: Context, a native Mac app for debugging …
Quoting Nineteen Eighty-Four
There was a whole chain of separate departments dealing with proletarian literature, music, drama, and entertainment generally. Here were produced rubbishy newspapers containing almost nothing except sport, crime and astrology, …
Supabase MCP can leak your entire SQL database
Here's yet another example of a lethal trifecta attack, where an LLM system combines access to private data, exposure to potentially malicious instructions and a mechanism to communicate data back …
Cursor: Clarifying Our Pricing
Cursor changed their pricing plan on June 16th, introducing a new $200/month Ultra plan with "20x more usage than Pro" and switching their $20/month Pro plan from "request limits to …
Identify, solve, verify
The more time I spend using LLMs for code, the less I worry for my career - even as their coding capabilities continue to improve. Using LLMs as part of …
awwaiid/gremllm
Delightfully cursed Python library by Brock Wilcox, built on top of LLM: from gremllm import Gremllm counter = Gremllm("counter") counter.value = 5 counter.increment() print(counter.value) # 6? print(counter.to_roman_numerals()) # VI? You …
Quoting Adam Gordon Bell
I think that a lot of resistance to AI coding tools comes from the same place: fear of losing something that has defined you for so long. People are reacting …
Frequently Asked Questions (And Answers) About AI Evals
Hamel Husain and Shreya Shankar have been running a paid, cohort-based course on AI Evals For Engineers & PMs over the past few months. Here Hamel collects answers to the …
Trial Court Decides Case Based On AI-Hallucinated Caselaw
Joe Patrice writing for Above the Law: [...] it was always only a matter of time before a poor litigant representing themselves fails to know enough to sniff out and …
Sandboxed tools in a loop
Something I've realized about LLM tool use is that it means that if you can reduce a problem to something that can be solved by an LLM in a sandbox …
Table saws
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
Quoting Charles Babbage
On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out ?" In one case a …

Mandelbrot in x86 assembly by Claude
Inspired by a tweet asking if Claude knew x86 assembly, I decided to run a bit of an experiment. I prompted Claude Sonnet 4: Write me an ascii art mandelbrot …
TIL: Using Playwright MCP with Claude Code
Inspired by Armin ("I personally use only one MCP - I only use Playwright") I decided to figure out how to use the official Playwright MCP server with Claude Code. …
Quoting Kevin Webb
One of the best examples of LLM developer tooling I've heard is from a team that supports software from the 80s-90s. Their only source of documentation is video interviews with …
A custom template system from the mid-2000s era
Using LLMs for code archaeology is pretty fun. I stumbled across this blog entry from 2003 today, in which I had gotten briefly excited about ColdFusion and implemented an experimental …
Quoting mrmincent
To misuse a woodworking metaphor, I think we’re experiencing a shift from hand tools to power tools. You still need someone who understands the basics to get the good results …
June newsletter for sponsors has been sent
I just sent out the second edition of my sponsors only monthly newsletter. Anyone who sponsors me for $10/month or more on GitHub gets this carefully hand-curated summary of the …
Using Claude Code to build a GitHub Actions workflow
I wanted to add a small feature to one of my GitHub repos - an automatically updated README index listing other files in the repo - so I decided to …
llvm: InstCombine: improve optimizations for ceiling division with no overflow - a PR by Alex Gaynor and Claude Code
Alex Gaynor maintains rust-asn1, and recently spotted a missing LLVM compiler optimization while hacking on it, with the assistance of Claude (Alex works for Anthropic). He describes how he confirmed …

Agentic Coding: The Future of Software Development with Agents
Armin Ronacher delivers a 37 minute YouTube talk describing his adventures so far with Claude Code and agentic coding methods. A friend called Claude Code catnip for programmers and it …
How to Fix Your Context
Drew Breunig has been publishing some very detailed notes on context engineering recently. In How Long Contexts Fail he described four common patterns for context rot, which he summarizes like …
Context engineering
The term context engineering has recently started to gain traction as a better alternative to prompt engineering. I like it. I think this one may have sticking power. Here's an …
Continuous AI
GitHub Next have coined the term "Continuous AI" to describe "all uses of automated AI to support software collaboration on any platform". It's intended as an echo of Continuous Integration …

Project Vend: Can Claude run a small shop? (And why does that matter?)
In "what could possibly go wrong?" news, Anthropic and Andon Labs wired Claude 3.7 Sonnet up to a small vending machine in the Anthropic office, named it Claudius and told …

Introducing Gemma 3n: The developer guide
Extremely consequential new open weights model release from Google today: Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs. Optimized for on-device: Engineered …

Geminiception
Yesterday Anthropic got a bunch of buzz out of their new window.claude.complete() API which allows Claude Artifacts to run their own API calls. It turns out Gemini had beaten them …
New sandboxes from Cloudflare and Vercel
Two interesting new products for running code in a sandbox today. Cloudflare launched their Containers product in open beta, and added a new Sandbox library for Cloudflare Workers that can …

Build and share AI-powered apps with Claude
Anthropic have added one of the most important missing features to Claude Artifacts: apps built as artifacts now have the ability to run their own prompts against Claude via a …
Quoting Christoph Niemann
Creating art is a nonlinear process. I start with a rough goal. But then I head into dead ends and get lost or stuck. The secret to my process is …

Gemini CLI
First there was Claude Code in February, then OpenAI Codex (CLI) in April, and now Gemini CLI in June. All three of the largest AI labs now have their own …
Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books
Major USA legal news for the AI industry today. Judge William Alsup released a "summary judgement" (a legal decision that results in some parts of a case skipping a trial) …

Phoenix.new is Fly's entry into the prompt-driven app development space
Here’s a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in …
Disclosures
I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent …
Quoting Kent Beck
So you you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I …

My First Open Source AI Generated Library
Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library based on his design: It wrote ~1100 …
model.yaml
From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an
Quoting FAQ for Your Brain on ChatGPT
Is it safe to say that LLMs are, in essence, making us "dumber"? No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. …

AbsenceBench: Language Models Can't Tell What's Missing
Here's another interesting result to file under the

Magenta RealTime: An Open-Weights Live Music Model
Fun new

Agentic Misalignment: How LLMs could be insider threats
One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying that (1) the model will soon be taken …

python-importtime-graph
I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option causes …
Mistral-Small 3.2
Released on Hugging Face a couple of hours ago, so far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty quickly. This …
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk
Stop me if you've heard this one before: A threat actor (acting as an external user) submits a malicious support ticket. An internal user, linked to a tenant, invokes an …

How OpenElections Uses LLMs
The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are …
Clarified zucchini consommé
I continue to have fun running fantasy cooking prompts through LLMs - this time I tried
Quoting Arvind Narayanan
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which …
Quoting Workaccount2 on Hacker News
They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output …
Coding agents require skilled operators
I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The
I counted all of the yurts in Mongolia using machine learning
Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile …
It's a trap
That memvid thing that

Trying out the new Gemini 2.5 model family
After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model …
Quoting Donghee Na
The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...] …