AI capability has outrun judgment

June 15, 2026 #AI

AI has handed teams enormous code-generation capability, but the judgment needed to use it has not kept pace.

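Operational and reliability teams bear the brunt of this gap. Meanwhile, the niche, instantly recalled "L1 cache" knowledge that AI has democratized (YAML dialects, PromQL, obscure config languages) is losing its value.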

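As AI advances, the balance between capability and intent is being called into question. Dave (デイブ・カーニー), VP of Engineering at Astronomer, discusses the current state of AI development and the challenges ahead.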

Capability has advanced, but intent lags behind

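AI has greatly increased the capacity to generate code, and that capability now outruns intent. Almost anyone can build almost anything, but few stop to ask whether it is actually needed. Operational and reliability teams sit downstream of those hasty decisions and have to deal with the consequences.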

The democratization of expertise

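Operations work used to depend on niche expertise such as YAML dialects, PromQL, and obscure config languages. AI has made that knowledge accessible to anyone. The monitoring expert therefore needs to find ways to add value beyond accumulated rote knowledge.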

Trust is a verification problem

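Trust in the software supply chain is a verification problem: there is no human in the loop to trust, only people abdicating the responsibility to verify. Incidents such as the LiteLLM/Trivy vulnerabilities, where attackers acted within minutes, suggest that current ways of trusting software and responding to incidents are breaking down. As an alternative, Dave leans toward buying this verification capability from vendors such as Chainguard or Cloudsmith rather than having every company build it.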
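As a minimal sketch of what "verify rather than trust" can mean in practice, the Python below checks a downloaded artifact against a SHA-256 digest recorded at review time. This is an illustration only and is not taken from the interview: the function names `sha256_hex` and `verify_artifact`, and the workflow of pinning a digest in advance, are assumptions. Real supply-chain verification (signatures, provenance attestations, vendor services) covers far more than a single hash comparison.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_digest: str) -> bool:
    """Accept the artifact only if its digest equals the pinned value.

    `pinned_digest` is assumed to have been recorded earlier, at the time
    the artifact was reviewed (hypothetical workflow, not from the article).
    """
    actual = sha256_hex(data)
    # Normalize the pinned value, then compare in constant time.
    return hmac.compare_digest(actual, pinned_digest.strip().lower())


if __name__ == "__main__":
    reviewed = b"example artifact contents"  # stand-in for a downloaded package
    pinned = sha256_hex(reviewed)            # digest recorded at review time

    print(verify_artifact(reviewed, pinned))
    print(verify_artifact(reviewed + b" tampered", pinned))
```

The point of the design is that the check fails closed: an artifact whose bytes changed after review is rejected regardless of where it came from or who published it.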

Summary

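The evolution of AI is forcing the question of how to balance capability and intent. The challenges ahead appear to lie in verifying trust rather than assuming it, and in choosing the technology that is actually needed.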

Opening of the original article (English, first three paragraphs only)

About Dave: Dave is currently VP of Engineering at Astronomer, where he drives Reliability, Automation, and what he calls “general good sense across Engineering.” Before that: VP of Engineering at Twilio leading SRE, Sr. Director at Elastic building and scaling Elastic Cloud, and a 17-year career at Google (2004–2021) spanning SRE engineer to Engineering Director, including global lead for Storage and Databases SRE and head of Engineering at Google Ireland.

He authored Chapter 29 of the canonical Google SRE Book (“Dealing with Interrupts”), ran the first two Reddit AMAs ever done by Google’s SRE team, and has spoken at SREcon Europe multiple times (2015, 2021, 2022, 2023, 2024).

Dave isn’t a starry-eyed AI hype merchant. He comes from the discipline that has to clean up what the code generators leave behind: reliability, platform engineering, on-call. That’s a rare lens.

Timeline

01:49 Capability outruns intent
02:28 Expertise gets democratized
04:11 Heuristics and T-shaped skills win
08:57 Supply chain is about verifying, not trusting
12:22 MTTR is overrated
21:19 We’re in the pets.com phase
25:38 “Good enough” beats “better”
41:42 The dopamine and dissatisfaction trap
44:43 The missed signal: providers get expensive

1. Capability is now unmatched with intent

Dave’s central thesis: AI has handed teams enormous code-generation capability, but that capability has outrun judgment. People can build almost anything now, but few stop to ask whether they should. Operational and reliability teams are “bearing the brunt” downstream of all the decisions being made too fast.

2. The democratization of obscure expertise

Operationally-focused teams historically guarded access and expertise, niche knowledge of YAML dialects, PromQL, obscure config languages. AI has democratized that “L1 cache” knowledge, so the value of being “the person who knows the config language” is collapsing. The monitoring guru should get ahead of it and find where they add value beyond rote knowledge.

3. Heuristics and T-shaped skills matter more, not less

The mental models, problem decomposition, stakeholder management and architecture-level thinking, the “boring” translation of intent into capability, are becoming the crucial differentiators. The worry: the tech and remote-work shifts (like COVID) are disabusing junior engineers of the slog that builds senior judgment, making it unclear how the next generation matures.

4. Supply chain trust is a verification problem, not a trust problem

“Trust” is the wrong frame. There’s no human in the loop to trust, just people abdicating the responsibility to verify. Incidents like the LiteLLM/Trivy vulnerabilities showed attackers acting in minutes (even mid-key-rotation), signalling that current methods of trusting software and doing incident response are breaking. He leans toward buying this capability (Chainguard, Cloudsmith) rather than every company building it.

5. MTTR is one of the least interesting metrics to optimize

Driving down Mean Time To Recovery just plugs leaks faster while ignoring how many leaks you have. More interesting are mean time between failures and, crucially, postmortems aimed at systemic change so failures recur in novel rather than boring, repetitive ways. “Just get everyone to not do that” is a non-solution, since people churn and newcomers repeat the same mistakes.

6. You can’t turn everything to 11

Ask any CEO their target for SLOs, CVEs, customer satisfaction and they’ll say “100%, zero, perfect.” But security, reliability, and satisfaction can’t all be maxed simultaneously. It’s fundamentally a management trade-off requiring human judgment about how much the company actually cares about each vector.

7. We’re in the pets.com phase

Dave draws a direct parallel to the late-90s dot-com boom: money is being thrown around to “see what sticks.” A reckoning will come, and what survives will be what’s foundationally useful to humans, just as Google emerged from the ashes by simply making things findable. His advice: “run towards the fire,” solve the genuinely hard problems that persist regardless of how AI shakes out.

8. “Is it good enough?” beats “Is it better?”

The frontier models aren’t as architecturally different as marketing implies. Much of the differentiation is branding. With per-token pricing (e.g. Copilot’s shift, ~5x between cheaper and premium models) and examples like Harvey reportedly finding an open model far cheaper per task, the real question for businesses isn’t whether a model is better but whether it’s good enough for cheaper.

9. The dopamine and dissatisfaction problem

Engineers became engineers to build and enjoy flow state. AI has replaced deep, single-problem focus with 10 to 15 parallel sessions and constant context-switching, and humans are “bad machines” at that. The output may be great but the experience is unsatisfying; people want to think about one problem for more than five minutes. Leaders must operationalize intent and teach the discipline of no, understanding what “non-work” is.

10. The signal people are missing: model providers are about to get expensive

The big bet Dave sees coming: major providers will get significantly pricier, and how companies respond will define the next few years. He’s excited about self-hosting “good enough,” ring-fenced, focused models a company can own, and about the post-adjustment wave of genuinely human-centric products that filter noise, surface the one or two things worth a human’s judgment, and “make life easier” rather than pretending to converse and making you anxious.

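Note: Out of respect for copyright, the quotation is limited to the opening three paragraphs. Please see the original article for the rest.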

Read the original article ↗
