<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>AI - Category - Botmonster Tech</title><link>https://botmonster.com/ai/</link><description>AI - Category - Botmonster Tech</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Fri, 15 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://botmonster.com/ai/" rel="self" type="application/rss+xml"/><item><title>Claude Code in CI/CD: Automate PR Reviews and Issue Fixes with GitHub Actions</title><link>https://botmonster.com/ai/claude-code-ci-cd-automate-pr-reviews-github-actions/</link><pubDate>Fri, 15 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/claude-code-ci-cd-automate-pr-reviews-github-actions/</guid><description><![CDATA[<div class="featured-image">
                <img src="/claude-code-ci-cd-automate-pr-reviews-github-actions.png" referrerpolicy="no-referrer">
            </div><p>Anthropic ships <a href="https://github.com/anthropics/claude-code-action" target="_blank" rel="noopener noreferrer ">claude-code-action</a>
, an official GitHub Action that runs the full <a href="https://code.claude.com/" target="_blank" rel="noopener noreferrer ">Claude Code</a>
 runtime inside your CI/CD pipeline. It reviews pull requests, builds features from issues when someone types <code>@claude</code>, writes tests, updates docs, and drafts release notes. It also respects your repo&rsquo;s <code>CLAUDE.md</code> coding rules. The runtime runs on a GitHub Actions runner, with tool use, file reads, and multi-step reasoning.</p>
<p>It ships with four auth backends: Anthropic API, AWS Bedrock, Google Vertex AI, and Microsoft Foundry. A sister <code>claude-code-security-review</code> action handles vuln scans, GitLab CI/CD is supported natively, and real deployments already exist: <a href="https://derivai.substack.com/p/automated-security-code-reviews-claude-code-github-actions" target="_blank" rel="noopener noreferrer ">Deriv</a>
 runs it across 700+ repos, handling 100+ PRs per week. So this has moved past the demo stage. Teams now wire it into merge gates next to linters and test suites.</p>]]></description></item><item><title>OpenClaw Texted My Ex and Why iMessage Access Is a Trap</title><link>https://botmonster.com/ai/openclaw-imessage-access-trap/</link><pubDate>Thu, 14 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/openclaw-imessage-access-trap/</guid><description><![CDATA[<div class="featured-image">
                <img src="/openclaw-imessage-access-trap.png" referrerpolicy="no-referrer">
            </div><p>The viral <a href="https://www.reddit.com/r/ChatGPT/comments/1sng426/my_openclaw_texted_my_ex/" target="_blank" rel="noopener noreferrer ">r/ChatGPT &ldquo;my OpenClaw texted my ex&rdquo; post</a>
 reads like a joke, but the comments treat it as a warning sign. Keep <a href="https://openclaw.ai/" target="_blank" rel="noopener noreferrer ">OpenClaw&rsquo;s</a>
 iMessage, SMS, and contacts skills off your personal Mac. Wait until LTS ships and the <a href="https://openclaw.ai/blog/openclaw-rough-week" target="_blank" rel="noopener noreferrer ">founder&rsquo;s &ldquo;rough week&rdquo; supply-chain fixes</a>
 land. Scope write-access skills to a disposable VPS instead.</p>
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>The viral &ldquo;texted my ex&rdquo; post is a leading indicator, not just a meme.</li>
<li>iMessage, SMS, and contacts are write-heavy skills that touch your real social graph.</li>
<li>Forgetful agents plus unsupervised cron jobs turn wrong-recipient sends into expected behavior.</li>
<li>Run write-heavy OpenClaw skills on a disposable VPS, not your personal Mac.</li>
<li>Wait for the LTS release before treating OpenClaw as personal-machine infrastructure.</li>
</ul>

  </div>
</section>

<h2 id="the-viral-openclaw-meme-is-not-just-a-meme">The viral OpenClaw meme is not just a meme</h2>
<p>A screenshot of OpenClaw happily reporting that it had texted the OP&rsquo;s ex hit 4.8K upvotes and 176 comments on r/ChatGPT in about three weeks. The top replies are jokes (&ldquo;Of all the things that didn&rsquo;t happen, this happened the didn&rsquo;test&rdquo;). The serious comments point at a real safety category that is forming in real time.</p>]]></description></item><item><title>AI Code Review in 2026: Why Human Review Skills Matter More Than Ever</title><link>https://botmonster.com/ai/ai-code-review-2026-human-review-skills-ai-generated-commits/</link><pubDate>Wed, 13 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/ai-code-review-2026-human-review-skills-ai-generated-commits/</guid><description><![CDATA[<div class="featured-image">
                <img src="/ai-code-review-2026-human-review-skills-ai-generated-commits.png" referrerpolicy="no-referrer">
            </div><p>AI writes about 41% of all committed code in 2026, and some teams report well above 50%. AI review tools have cut PR cycle times by as much as 59%. Yet when <a href="https://www.sonarsource.com/" target="_blank" rel="noopener noreferrer ">Sonar</a>
 asked 1,149 developers for their <a href="https://www.sonarsource.com/blog/state-of-code-developer-survey-report-the-current-reality-of-ai-coding" target="_blank" rel="noopener noreferrer ">2026 State of Code report</a>
, 47% ranked &ldquo;reviewing and validating AI-generated code for quality and security&rdquo; the top skill in the AI era, above prompting at 42%. The paradox: the more code AI writes, the more vital human review becomes.</p>]]></description></item><item><title>Ditching Claude Opus for GLM 5.1 in OpenClaw at $18/Mo</title><link>https://botmonster.com/ai/openclaw-glm-claude-opus-cheap-stack/</link><pubDate>Wed, 13 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/openclaw-glm-claude-opus-cheap-stack/</guid><description><![CDATA[<div class="featured-image">
                <img src="/openclaw-glm-claude-opus-cheap-stack.png" referrerpolicy="no-referrer">
            </div><p>Anthropic&rsquo;s third-party tool rules priced agent users off Claude Opus 4.6. The cheapest working <a href="https://openclaw.ai" target="_blank" rel="noopener noreferrer ">OpenClaw</a>
 stack now is <a href="https://z.ai" target="_blank" rel="noopener noreferrer ">Z.ai&rsquo;s</a>
 $18/mo GLM 5 Turbo plan. Next rungs: <a href="https://ollama.com" target="_blank" rel="noopener noreferrer ">Ollama-cloud&rsquo;s</a>
 $20/mo GLM 5.1, then <a href="https://www.minimax.io/pricing" target="_blank" rel="noopener noreferrer ">MiniMax&rsquo;s</a>
 $40/mo high-speed tier. Kimi 2.6 stays API-only since local setup needs about 750 GB of RAM.</p>
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>Z.ai&rsquo;s $18/mo plan running GLM 5 Turbo is the cheapest OpenClaw backend that actually works.</li>
<li>MiniMax&rsquo;s high-speed tier at $40/mo handles heavier workloads without the four-figure surprise bills; a cost sketch follows below these takeaways.</li>
<li>Kimi 2.6 needs around 750 GB of RAM to self-host, so almost everyone runs it through the API.</li>
<li>Keep Claude on the planner role; route scheduled jobs to the cheap backends.</li>
<li>China-hosted models save dollars but trade away privacy on iMessage, contacts, and email skills.</li>
</ul>

  </div>
</section>
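<p>The surprise-bill takeaway is easy to sanity-check. Here is a back-of-the-envelope sketch in Python using the Opus 4.6 list prices cited in the next section ($15 and $75 per million input and output tokens); the daily token volumes are illustrative assumptions, not measurements from any real OpenClaw fleet.</p>
<pre><code class="language-python"># Back-of-the-envelope: why an always-on agent loop at Opus 4.6 list
# pricing lands in four-figure territory. Prices come from Anthropic's
# list pricing quoted below; the daily volumes are assumptions.
INPUT_PRICE_PER_M = 15.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 75.00   # USD per 1M output tokens

daily_input_tokens = 2_500_000   # assumed: repeated context, tool results
daily_output_tokens = 200_000    # assumed: plans, tool calls, replies

daily_cost = (
    daily_input_tokens / 1e6 * INPUT_PRICE_PER_M
    + daily_output_tokens / 1e6 * OUTPUT_PRICE_PER_M
)
monthly_cost = daily_cost * 30
print(f"about ${daily_cost:.0f}/day, ${monthly_cost:.0f}/month")
# With these assumptions: about $52/day, roughly $1,575/month -- versus a
# flat $18-$40 plan for the same cron traffic on a rented GLM or MiniMax tier.
</code></pre>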

<h2 id="why-1500mo-opus-bills-pushed-users-to-glm">Why $1,500/mo Opus Bills Pushed Users to GLM</h2>
<p>The pressure here is simple. Once Anthropic&rsquo;s third-party tool rules kicked in, OpenClaw users on the Claude Pro CLI got nudged onto pay-per-token API access. At Opus 4.6 list pricing of $15 per million input tokens and $75 per million output tokens, agent loops add up fast. The OP of the <a href="https://www.reddit.com/r/openclaw/comments/1svmq20/psa_anthropic_clarified_the_openclaw_ban_you_can/" target="_blank" rel="noopener noreferrer ">r/openclaw PSA thread</a>
 tracked his own bill at about $1,500/mo before he switched. That figure is the anchor most cost threads on the sub now cite. The pricing pain did not ease with the next model either: the <a href="/ai/claude-opus-4-7-x-reddit-reception/" rel="">community reception of Opus 4.7</a>
 leaned on token-burn complaints from power users hitting caps in minutes, which is exactly the pattern that turns an OpenClaw cron fleet into a four-figure surprise.</p>]]></description></item><item><title>OpenClaw vs Hermes and Why Memory Kills Agent Loyalty</title><link>https://botmonster.com/ai/openclaw-vs-hermes-memory-problem/</link><pubDate>Tue, 12 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/openclaw-vs-hermes-memory-problem/</guid><description><![CDATA[<div class="featured-image">
                <img src="/openclaw-vs-hermes-memory-problem.png" referrerpolicy="no-referrer">
            </div><p><a href="https://github.com/NousResearch" target="_blank" rel="noopener noreferrer ">Hermes Agent</a>
, built by Nous Research, has taken about 30% of <a href="https://openclaw.ai" target="_blank" rel="noopener noreferrer ">OpenClaw&rsquo;s</a>
 user base by fixing one failure: memory. The <a href="https://kilo.ai/openclaw/vs-hermes" target="_blank" rel="noopener noreferrer ">Kilo.ai synthesis of 1,300+ r/openclaw comments</a>
 confirms the figure. OpenClaw still wins on multi-agent breadth and 100+ skills. The right answer depends on which failure mode hurts you more.</p>
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>About 30% of r/openclaw users have switched to Hermes Agent, mainly for memory reliability.</li>
<li>Memory failures, not features, are the top reason people leave OpenClaw.</li>
<li>Hermes ships with memory that works by default; OpenClaw needs heavy prompt-engineering to behave.</li>
<li>OpenClaw still wins for multi-bot setups across Telegram, Slack, and Discord.</li>
<li>A growing minority skip both and use OpenAI Codex business-tier instead.</li>
</ul>

  </div>
</section>

<h2 id="why-ropenclaw-is-migrating-to-hermes">Why r/openclaw Is Migrating to Hermes</h2>
<p>The most-cited migration thread on the subreddit is the 167-comment <a href="https://www.reddit.com/r/openclaw/comments/1swc620/openclaw_vs_hermes/" target="_blank" rel="noopener noreferrer ">OpenClaw vs Hermes thread</a>
. The top-voted answer to &ldquo;is Hermes worth a look&rdquo; reads as a clean defection notice. The poster ran OpenClaw for weeks on the same workload, then switched in an afternoon:</p>]]></description></item><item><title>AI Web Search Backends: Who Owns, Who Rents</title><link>https://botmonster.com/ai/ai-web-search-backends-who-owns-who-rents/</link><pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/ai-web-search-backends-who-owns-who-rents/</guid><description><![CDATA[<div class="featured-image">
                <img src="/ai-web-search-backends-who-owns-who-rents.png" referrerpolicy="no-referrer">
            </div><p>Only Google Gemini and Microsoft Copilot run on a search index their parent company crawls itself. Anthropic Claude rents <a href="https://search.brave.com/" target="_blank" rel="noopener noreferrer ">Brave Search</a>
, Mistral Le Chat rents Brave too, OpenAI ChatGPT rents <a href="https://www.bing.com/" target="_blank" rel="noopener noreferrer ">Bing</a>
 plus its own crawler, and Meta AI rents both. The key clue: Claude&rsquo;s <code>web_search</code> tool exposes a literal <code>BraveSearchParams</code> field, and citation overlap with Brave runs about 86.7%.</p>
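<p>That 86.7% is an overlap rate, not a disclosed contract. One plausible way to measure a number like it: run the same queries through the assistant and through the Brave Search API, then count how often the assistant&rsquo;s cited domains appear in Brave&rsquo;s top results. The sketch below is illustrative only; the helper functions, placeholder URLs, and sampling approach are assumptions, not the methodology behind the published figure.</p>
<pre><code class="language-python"># Hypothetical overlap check: what share of an assistant's cited domains
# also appear in Brave Search's top results for the same query?
from urllib.parse import urlparse

def domains(urls):
    """Normalize a list of URLs to bare hostnames."""
    return {urlparse(u).netloc.removeprefix("www.") for u in urls}

def overlap_rate(samples):
    """samples: list of (assistant_citation_urls, brave_result_urls) pairs."""
    hits = 0
    total = 0
    for cited, brave in samples:
        brave_domains = domains(brave)
        for d in domains(cited):
            total += 1
            if d in brave_domains:
                hits += 1
    return hits / total if total else 0.0

# Placeholder data standing in for real query logs.
samples = [
    (["https://example.org/a", "https://docs.python.org/3/"],
     ["https://example.org/a", "https://realpython.com/x"]),
]
print(f"citation overlap: {overlap_rate(samples):.1%}")
</code></pre>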
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>Only Google and Microsoft own a web-scale search index.</li>
<li>Claude and Mistral both reportedly run on the Brave Search API.</li>
<li>ChatGPT uses Bing, OpenAI&rsquo;s own crawler, and publisher deals.</li>
<li>IndexNow helps Bing-backed AI products, not Brave or Google.</li>
<li>Brave now acts as AI&rsquo;s third search pole beside Google and Bing.</li>
</ul>

  </div>
</section>

<h2 id="only-five-companies-actually-crawl-the-open-web">Only Five Companies Actually Crawl the Open Web</h2>
<p>Before mapping each AI lab to its backend, the key constraint is simple: only five operators crawl the open web at scale. Everything else sold as a &ldquo;search engine&rdquo; resells one of those indexes. The five are Google, Microsoft Bing, Yandex, Baidu, and Brave Search, with <a href="https://www.mojeek.com/" target="_blank" rel="noopener noreferrer ">Mojeek</a>
 as a much smaller niche sixth.</p>]]></description></item><item><title>Claude Code vs COBOL: The AI Migration Controversy That Crashed IBM's Stock 13%</title><link>https://botmonster.com/ai/claude-code-vs-cobol-ai-migration-controversy-ibm-stock/</link><pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/claude-code-vs-cobol-ai-migration-controversy-ibm-stock/</guid><description><![CDATA[<div class="featured-image">
                <img src="/claude-code-vs-cobol-ai-migration-controversy-ibm-stock.png" referrerpolicy="no-referrer">
            </div><p>On February 23, 2026, Anthropic published a blog post titled <a href="https://claude.com/blog/how-ai-helps-break-cost-barrier-cobol-modernization" target="_blank" rel="noopener noreferrer ">&ldquo;How AI Helps Break the Cost Barrier to COBOL Modernization&rdquo;</a>
. It shipped with a <a href="https://resources.anthropic.com/code-modernization-playbook" target="_blank" rel="noopener noreferrer ">Code Modernization Playbook</a>
. By market close, IBM&rsquo;s stock had fallen 13.2% to $223.35 per share. That was IBM&rsquo;s worst single day since October 2000. More than $31 billion in market cap vanished. Accenture fell 6.5%. Cognizant dropped 6%. One blog post had shaken the whole legacy migration sector.</p>]]></description></item><item><title>OpenClaw on Your $20 Claude Sub After Anthropic Banned It</title><link>https://botmonster.com/ai/openclaw-claude-sub-after-anthropic-ban/</link><pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/openclaw-claude-sub-after-anthropic-ban/</guid><description><![CDATA[<div class="featured-image">
                <img src="/openclaw-claude-sub-after-anthropic-ban.png" referrerpolicy="no-referrer">
            </div><p>OpenClaw&rsquo;s bundled <code>claude-cli</code> backend is officially sanctioned by Anthropic, while OAuth-token extraction tools stay blocked. The carve-out works because shelling out to <code>claude -p</code> preserves prompt caching, so a $20 Pro or $200 Max sub routes through OpenClaw without four-figure API bills. The catch: a usage cap on each roughly 5-hour window that cron jobs exhaust in minutes.</p>
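<p>Mechanically, the sanctioned path is unglamorous: the backend shells out to the locally authenticated CLI instead of calling the API with a lifted OAuth token. A minimal sketch of that pattern in Python, assuming <code>claude</code> is installed and logged in on the host; the prompt, timeout, and error handling are illustrative, and OpenClaw&rsquo;s real backend does more than this.</p>
<pre><code class="language-python"># Minimal "CLI backend" sketch: hand a prompt to the local Claude Code
# install via `claude -p` and read the reply from stdout. Billing and
# prompt caching stay on the subscription the CLI is logged into.
import subprocess

def ask_claude(prompt, timeout=300):
    result = subprocess.run(
        ["claude", "-p", prompt],   # -p runs a single non-interactive turn
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

if __name__ == "__main__":
    print(ask_claude("Summarize the open TODOs in this repo in five bullets."))
</code></pre>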
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>OpenClaw&rsquo;s CLI backend is allowed by Anthropic; the older OAuth-token tools are not.</li>
<li>The reason it is allowed: it preserves Anthropic&rsquo;s prompt caching exactly like Claude Code does.</li>
<li>Pro and Max plans meter usage in roughly 5-hour windows, so cron jobs need a cheaper backup.</li>
<li>Use Claude for planning and chat, route automated tasks to GLM, MiniMax, or Codex.</li>
<li>Setup is three commands and one config edit on any Mac or Linux host running Claude Code.</li>
</ul>

  </div>
</section>

<h2 id="what-changed-in-anthropics-third-party-tool-policy">What Changed in Anthropic&rsquo;s Third-Party Tool Policy?</h2>
<p>Most users found out about the policy change when their Anthropic bill jumped, not from a press release. Heavy agentic workflows that previously billed against <a href="/posts/is-claude-max-worth-200-month-developer-cost-analysis/" rel="">a flat Pro or Max subscription</a>
 suddenly tracked toward $1,500 a month on Opus 4.6 once Anthropic forced third-party orchestrators onto the pay-per-token API. The original concern was narrower than the community assumed. Anthropic&rsquo;s target was a specific class of tool that extracts the OAuth token from a local <a href="https://www.anthropic.com/claude-code" target="_blank" rel="noopener noreferrer ">Claude Code</a>
 install and calls the Anthropic API directly under that identity. That pattern bypasses Anthropic&rsquo;s prompt caching and pushes load to the API tier without the caching benefit Anthropic gets when Claude Code itself runs the request.</p>]]></description></item><item><title>1,000 OpenClaw Deploys Later</title><link>https://botmonster.com/ai/openclaw-1000-deploys-news-digests-only/</link><pubDate>Sun, 10 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/openclaw-1000-deploys-news-digests-only/</guid><description><![CDATA[<div class="featured-image">
                <img src="/openclaw-1000-deploys-news-digests-only.png" referrerpolicy="no-referrer">
            </div><p>After publishing a 7-minute <a href="https://openclaw.ai/" target="_blank" rel="noopener noreferrer ">OpenClaw</a>
 deploy video and watching roughly 1,000 isolated VMs spin up afterward, one r/LocalLLaMA cloud-infra operator concluded the only OpenClaw workflow that survives unsupervised execution is a daily news digest. Memory is the load-bearing failure mode, not a fixable bug. OpenClaw sits at 370K+ GitHub stars, but the working-workflow count has barely moved.</p>
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>A cloud-infra operator watched roughly 1,000 OpenClaw deploys and found one reliable use case.</li>
<li>Memory unreliability is built into how the agent works, not a bug a patch can fix.</li>
<li>Daily news digests are the exception because they keep no state between runs.</li>
<li>The same digest can be built with a cron job and any LLM API in about ten lines; a sketch follows below these takeaways.</li>
<li>OpenClaw&rsquo;s founder has already called the recent release cycle a &ldquo;rough week&rdquo;.</li>
</ul>

  </div>
</section>
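<p>The ten-lines claim from the takeaways holds up. A cron-friendly sketch in Python: pull headlines from any RSS feed, hand them to whatever LLM endpoint you already pay for, print the digest, keep no state. The feed URL and the <code>summarize()</code> stub are placeholders; swap in your own API client.</p>
<pre><code class="language-python"># Stateless daily digest: fetch headlines, summarize, print. Run from cron.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://news.ycombinator.com/rss"   # any RSS feed works

def headlines(url, limit=20):
    xml_bytes = urllib.request.urlopen(url, timeout=30).read()
    titles = [t.text for t in ET.fromstring(xml_bytes).iter("title")]
    return titles[1:limit + 1]   # skip the feed's own title

def summarize(items):
    prompt = "Write a five-bullet digest of these headlines:\n" + "\n".join(items)
    # Placeholder: send `prompt` to whatever LLM API you already pay for and
    # return its reply; returning the prompt keeps this sketch runnable offline.
    return prompt

if __name__ == "__main__":
    print(summarize(headlines(FEED_URL)))
</code></pre>
<p>Pointed at cron (<code>0 7 * * *</code>), that is the whole workflow, with no memory to corrupt between runs.</p>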

<h2 id="the-1000-deploy-post-that-broke-the-consensus">The 1,000-Deploy Post That Broke the Consensus</h2>
<p>The contrarian thesis is anchored to one specific source: an r/LocalLLaMA post titled <a href="https://www.reddit.com/r/LocalLLaMA/comments/1skce14/openclaw_has_250k_github_stars_the_only_reliable/" target="_blank" rel="noopener noreferrer ">&ldquo;OpenClaw has 250K GitHub stars. The only reliable use case I&rsquo;ve found is daily news digests&rdquo;</a>
, with 335 comments and 891 votes. The OP is not a casual skeptic. He runs cloud infrastructure where strangers spin up Linux VMs, published a deploy walkthrough that took off, and now has a dataset most reviewers do not have access to.</p>]]></description></item><item><title>DeepSeek V4 Tech Report: 3 Tricks That Cut Compute 73%</title><link>https://botmonster.com/ai/deepseek-v4-tech-report-3-revolutionary-tricks-chinese-ai/</link><pubDate>Fri, 08 May 2026 00:00:00 +0000</pubDate><author>Botmonster</author><guid>https://botmonster.com/ai/deepseek-v4-tech-report-3-revolutionary-tricks-chinese-ai/</guid><description><![CDATA[<div class="featured-image">
                <img src="/deepseek-v4-tech-report-3-revolutionary-tricks-chinese-ai.png" referrerpolicy="no-referrer">
            </div><p>DeepSeek V4 is a 1.6 trillion parameter open-weight Mixture-of-Experts model. It reads 1M tokens at once. It uses 27% of V3.2&rsquo;s inference FLOPs and 10% of its KV cache. The <a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf" target="_blank" rel="noopener noreferrer ">DeepSeek V4 tech report</a>
 credits three moves: hybrid CSA plus HCA attention, Manifold-Constrained Hyper-Connections, and the Muon optimizer in place of AdamW.</p>
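<p>The efficiency claims are easier to feel with the arithmetic written out. A quick sketch using only figures quoted in this article (total and active parameter counts plus the 27% FLOPs and 10% KV-cache ratios); nothing here is new data, just the stated numbers turned into ratios.</p>
<pre><code class="language-python"># MoE sparsity and efficiency ratios from the DeepSeek V4 figures in this article.
pro_total, pro_active = 1_600e9, 49e9      # V4-Pro: 1.6T total, 49B active
flash_total, flash_active = 284e9, 13e9    # V4-Flash: 284B total, 13B active

print(f"V4-Pro active share:   {pro_active / pro_total:.1%}")     # ~3.1%
print(f"V4-Flash active share: {flash_active / flash_total:.1%}") # ~4.6%

# Relative to V3.2, per the tech report: 27% of inference FLOPs, 10% of KV cache.
flops_ratio, kv_ratio = 0.27, 0.10
print(f"Compute reduction:  {1 / flops_ratio:.1f}x fewer inference FLOPs")
print(f"KV-cache reduction: {1 / kv_ratio:.0f}x smaller cache per token")
</code></pre>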
<section class="key-takeaways">
  <h2 id="key-takeaways" class="key-takeaways-title">
    <svg class="key-takeaways-icon" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true">
      <rect x="4" y="3" width="16" height="18" rx="2" ry="2"></rect>
      <path d="M8 8h5"></path>
      <path d="M8 13h8"></path>
      <path d="M8 17h8"></path>
      <path d="M15.5 7l1.2 1.2L19 6"></path>
    </svg>
    <span>Key Takeaways</span>
  </h2>
  <div class="key-takeaways-body">
<ul>
<li>DeepSeek V4 is a free, open-weight AI that goes toe-to-toe with the top closed models from OpenAI, Anthropic, and Google.</li>
<li>It reads 1 million tokens in one prompt, enough for several full books or a long agent run without losing track.</li>
<li>It runs on roughly a quarter of the compute its previous version needed, making long-context AI affordable to operate.</li>
<li>A smaller team built it without access to top NVIDIA chips, proving clever engineering can rival raw GPU spend.</li>
<li>It scored a perfect 120 out of 120 on the 2025 Putnam math competition and beats Google&rsquo;s Gemini 3.1 Pro at 1M-token recall.</li>
</ul>

  </div>
</section>

<h2 id="deepseek-v4-at-a-glance">DeepSeek V4 at a Glance</h2>
<p>The <a href="https://api-docs.deepseek.com/news/news260424" target="_blank" rel="noopener noreferrer ">official launch announcement</a>
 on April 24, 2026 framed the release as &ldquo;the era of cost-effective 1M context length.&rdquo; It shipped two checkpoints under the MIT license. DeepSeek-V4-Pro runs at 1.6T total and 49B active parameters. DeepSeek-V4-Flash runs at 284B total and 13B active. Both models read 1M tokens at once. Both ship as open weights on <a href="https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro" target="_blank" rel="noopener noreferrer ">Hugging Face</a>
. The routed expert weights use FP4 math, and most other weights use FP8.</p>]]></description></item></channel></rss>