For the first time in the history of personal computing, NVIDIA has built a chip that exists for one purpose alone: to run artificial intelligence agents on your laptop, without the internet, without the cloud, and without any external dependency whatsoever. The company unveiled RTX Spark at GTC Taipei during Computex on June 1, 2026. It is a new superchip designed from the ground up to run AI agents locally on Windows laptops.
This is not an incremental upgrade to existing PC technology. This is a fundamental reimagining of what a personal computer is supposed to be. CEO Jensen Huang framed the technology as transforming PCs from mere tools into “collaborative teammates.” That vision sounds like corporate marketing until you understand what the hardware actually does.
RTX Spark features up to 1 petaflop of AI compute and 128GB of unified memory to meet the processing demands of on-device agents. A petaflop is a quadrillion floating-point operations per second. That is the kind of computational power that previously required a data center. Now it fits in a laptop.
The Specs Are Genuinely Absurd
To understand why this matters, you need to understand what is packed into this chip. The RTX Spark features 6,144 CUDA cores, one petaflop of FP4 AI performance and 20 Grace CPU cores, all of which can be paired with up to 128 GB of LPDDR5X unified memory running at 300 GB/s.
That unified memory architecture is the real breakthrough. Previous laptops had separate memory for the GPU and the CPU, which meant constant data shuffling back and forth. RTX Spark eliminates that bottleneck by giving the GPU and CPU a shared pool of 128 GB that both can access at full speed. That unified architecture is what allows the chip to run massive AI models without constantly stalling to wait for data to move between different memory locations.
What does that mean in practice? This allows a 120B-parameter model with a 1M context window to run locally, letting users run agents and AI workflows on a laptop without relying on external services or subscriptions. A 120-billion-parameter model is far larger than anything consumer laptops have been able to run, and closer in scale to the large models behind commercial AI services, whose exact parameter counts are mostly undisclosed. Your laptop can now run models of that sophistication on its own.
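The sizing claim can be checked with simple arithmetic. The sketch below is a back-of-envelope estimate, not a benchmark. It takes the 120B parameter count, 128 GB capacity and 300 GB/s bandwidth from the figures quoted in this article, and assumes a dense model whose weights are all read once per generated token. The function names are made up for illustration.

```python
# Back-of-envelope sizing. Only the 120B / 128 GB / 300 GB/s figures come
# from the article; everything else is a simplifying assumption.

def weight_gigabytes(params_billion: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def decode_tokens_per_second(weight_gb: float, bandwidth_gb_s: float) -> float:
    """Rough ceiling for a dense model that reads every weight once per token."""
    return bandwidth_gb_s / weight_gb


UNIFIED_MEMORY_GB = 128
BANDWIDTH_GB_S = 300

for bits in (16, 8, 4):
    size = weight_gigabytes(120, bits)
    verdict = "fits" if size <= UNIFIED_MEMORY_GB else "does not fit"
    ceiling = decode_tokens_per_second(size, BANDWIDTH_GB_S)
    print(f"{bits:>2}-bit weights: {size:.0f} GB, {verdict}, "
          f"at most ~{ceiling:.1f} tokens/s if bandwidth-bound")
```

Under these assumptions, only the 4-bit version (60 GB) leaves meaningful headroom in 128 GB, and the key-value cache for a long context needs additional memory on top of the weights. The tokens-per-second ceiling applies to dense models; a mixture-of-experts model reads only its active parameters for each token, so it can exceed that bound.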
The Partner Ecosystem Is Staggering
What makes RTX Spark extraordinary is not just the hardware, but the speed with which the entire industry mobilized around it. Microsoft did not wait. NVIDIA and Microsoft collaborated to deliver a native Windows experience for personal agents, including new security primitives and NVIDIA OpenShell to run agents securely on primary devices.
That is not a minor engineering effort. Windows has been designed for 40 years around the assumption that applications run independently of each other. Reengineering the entire operating system to enable secure, sandboxed execution of autonomous AI agents is a massive undertaking, and Microsoft committed to it immediately.
Adobe did not wait either. The company announced it is rebuilding Photoshop and Premiere Pro specifically for RTX Spark. That means the next version of Photoshop will assume that the GPU has access to massive amounts of memory and can handle complex generative tasks locally. You will not need a cloud subscription to use AI features in Photoshop. You will not need an internet connection. The AI runs on your machine.
Laptops and desktops from ASUS, Dell, HP, Lenovo, and Microsoft Surface are set to launch this fall. Every major PC manufacturer is building hardware around this chip. The retail availability timeline is measured in months, not years.
The Architecture Is Built on Blackwell
The RTX Spark superchip features an NVIDIA Blackwell RTX GPU with 6,144 CUDA cores and fifth-generation Tensor Cores with FP4 precision, connected via the NVIDIA NVLink-C2C chip-to-chip interconnect to a high-performance, 20-core NVIDIA Grace CPU.
NVIDIA’s Blackwell architecture has been the company’s trump card in the AI infrastructure race. Data centers are being built around Blackwell chips. Now that same architecture is compressed into a form factor that fits in an ultrabook.
The FP4 precision support is critical. For years, the assumption has been that you need 32-bit or 16-bit floating-point precision to run AI models reliably. FP4 is a 4-bit format, so the same weights take a quarter of the memory of 16-bit values and an eighth of 32-bit values, with correspondingly less data to move and compute on. With FP4 support and Tensor Cores optimized for it, RTX Spark can run models at speeds that would have required multiple GPUs just two years ago.
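To make the 4-bit idea concrete, here is a simplified sketch of block-scaled FP4 quantization. It assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit), whose representable magnitudes are listed in the code, plus one shared scale per block. Production formats define specific block sizes and scale encodings; this is an illustration of the principle, not NVIDIA's implementation, and the function names are hypothetical.

```python
# Magnitudes representable by a 4-bit E2M1 float; the sign bit doubles the set.
FP4_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(values):
    """Return (scale, codes): one shared scale plus one signed FP4 value per weight."""
    largest = max(abs(v) for v in values)
    # Map the largest weight onto 6.0, the largest FP4 magnitude.
    scale = largest / 6.0 if largest > 0 else 1.0
    codes = []
    for v in values:
        scaled = abs(v) / scale
        magnitude = min(FP4_MAGNITUDES, key=lambda m: abs(scaled - m))
        codes.append(magnitude if v >= 0 else -magnitude)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate weights from the scale and the FP4 codes."""
    return [scale * c for c in codes]


weights = [0.8, -3.0, 0.2, 1.3]
scale, codes = quantize_block(weights)
print(scale, codes)                    # 0.5 [1.5, -6.0, 0.5, 3.0]
print(dequantize_block(scale, codes))  # [0.75, -3.0, 0.25, 1.5]
```

Each weight is stored in 4 bits plus a small share of the block's scale, and the recovered values are close to the originals but not identical. That rounding error is the price of the memory savings, which is why hardware support and careful scaling matter.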
What This Means for Software Developers
The real revolution here is not in the hardware but in what it enables for software architecture. NVIDIA collaborated with the llama.cpp community to enable features and optimizations such as multi-token prediction (MTP) — a speculative decoding technique where a smaller draft model proposes multiple tokens at a time that the target model verifies in a single pass.
This, coupled with other optimizations such as programmatic dependent launch, delivers 2x performance on Qwen 3.6 and 3.5 27B, and a 1.6x performance boost on Qwen 3.6 and 3.5 35B. Those gains matter in practice: 2x faster inference cuts an agent’s response time in half, which is the difference between an agent that feels responsive and one that feels sluggish.
Open-source projects are already adopting this hardware. Two of the most popular agent projects, Hermes Agent and OpenClaw, are integrating OpenShell and the Microsoft security primitives into their upcoming native Windows apps. The developer community is not waiting for corporate applications. They are building agents directly on top of RTX Spark.
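The draft-and-verify loop described above can be sketched in a few lines. This is a toy, greedy version with stand-in "models" that are plain Python functions. It is not llama.cpp's implementation, and the names `speculative_step`, `toy_target` and `toy_draft` are invented for illustration.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Draft proposes k tokens; the target keeps the agreeing run plus one of its own."""
    proposed: List[int] = []
    context = list(prefix)
    for _ in range(k):
        token = draft(context)
        proposed.append(token)
        context.append(token)

    accepted: List[int] = []
    context = list(prefix)
    for token in proposed:
        # A real system scores all k positions in one batched forward pass;
        # this loop checks them one by one for clarity.
        expected = target(context)
        if expected != token:
            accepted.append(expected)  # the target's token replaces the first mismatch
            return accepted
        accepted.append(token)
        context.append(token)
    accepted.append(target(context))   # every proposal accepted: one bonus token
    return accepted


def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10       # counts upward, wrapping at 10


def toy_draft(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 3 else (prefix[-1] + 1) % 10  # wrong after a 3


print(speculative_step([0], toy_draft, toy_target, k=4))  # [1, 2, 3, 4]
print(speculative_step([3], toy_draft, toy_target, k=4))  # [4]
```

In the first call the draft gets three tokens right, so one verification round yields four tokens. In the second call the draft is wrong immediately and the round yields only the target's own token. Either way, the output matches what the target would have produced alone, so the speedup depends entirely on how often the draft agrees with the target.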
The Security Question
One of the most important aspects of this announcement is what NVIDIA and Microsoft are doing about security. Running autonomous agents on your primary computer introduces significant attack surface. An agent that can interact with your applications, access your files, and execute commands needs to be constrained in ways that previous PC architectures never required.
NVIDIA and Microsoft are partnering to deliver a robust, secure Windows platform for on-device agents built on new OS security primitives and NVIDIA OpenShell. The security model being developed for RTX Spark will eventually become a template for how other platforms handle autonomous AI agents.
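The announcement does not describe how these primitives work, so the following is only a generic illustration of the kind of constraint an agent runtime needs: an allowlist checked before every file read or command. The `AgentPolicy` class and its fields are hypothetical and are not part of OpenShell or any Windows API.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist: which folders an agent may read, which programs it may run."""
    readable_roots: Tuple[str, ...]
    allowed_commands: FrozenSet[str]

    def may_read(self, path: str) -> bool:
        target = PureWindowsPath(path)
        if ".." in target.parts:       # refuse traversal out of an allowed folder
            return False
        return any(target.is_relative_to(root) for root in self.readable_roots)

    def may_run(self, argv: List[str]) -> bool:
        # Only the program name is checked; arguments are not inspected here.
        return bool(argv) and argv[0] in self.allowed_commands


policy = AgentPolicy(
    readable_roots=("C:/Users/me/Projects",),
    allowed_commands=frozenset({"git"}),
)

print(policy.may_read("C:/Users/me/Projects/report.txt"))         # True
print(policy.may_read("C:/Users/me/Projects/../secrets.txt"))     # False
print(policy.may_run(["powershell", "-Command", "Get-Process"]))  # False
```

A check like this only helps if it is enforced below the agent, by the operating system or a sandbox, rather than by the agent's own code. That is presumably why the work is being done at the OS level.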
The Market Implications
Analysts from Counterpoint Research project that RTX Spark devices will capture 15-20% of the premium PC market by 2027, primarily at the expense of high-end Intel Core Ultra and AMD Ryzen 9 mobile processors. If that projection holds, RTX Spark is not a niche product. It is a market-reshaping force.
The premium PC market, which consists of laptops priced above $1,500, is where the margins are fattest and where professional users with serious compute needs concentrate. RTX Spark winning 15-20% of that market is equivalent to a revolution in PC architecture. Intel and AMD cannot simply copy the approach. They would need to build entirely new chips from scratch, a process that takes years.
NVIDIA has a window where RTX Spark is the only architecture in the world optimized for local AI agents. That window will not last forever, but it will last long enough for NVIDIA to entrench itself as the standard-bearer for personal AI computing.
What This Means for Cloud Services
The most important implication of RTX Spark is what it means for cloud-based AI services. For the past three years, the entire AI industry has been built around a model where users send prompts to the cloud, AI models running in data centers process them, and results come back to the user’s device. That model made sense when your laptop could not run sophisticated models.
It no longer makes sense.
RTX Spark lets creators, AI developers and gamers render ultralarge 90GB+ 3D scenes, edit 12K 4:2:2 video, generate 4K AI videos, run 120B-parameter LLMs with up to 1 million tokens context using agents locally, and play AAA games at 1440p.
If you can do all of that locally, why would you want to send your data to a cloud service? Privacy becomes automatic. Latency disappears. You maintain complete control over your data. The economics of cloud-based AI become untenable.
Companies offering cloud AI services will need to differentiate themselves through quality, through training on specialized datasets, through features that require global coordination. But the assumption that all AI computation happens in the cloud is now obsolete.
The Future of the PC
NVIDIA is positioning RTX Spark as the beginning of a new era in personal computing. Jensen Huang said, “The PC is being reinvented. For forty years, you launched apps. Click. Type. With RTX Spark and Microsoft Windows, you ask and the PC does the work.”
That framing is not hyperbole. The paradigm shift from application-centric to agent-centric computing is as significant as the shift from command-line to graphical interfaces. Your next computer will not be something you control through deliberate commands. It will be something you collaborate with. You will describe what you want done. The agent will figure out how to do it, interacting with your applications, your files, and your digital life on your behalf.
That future is launching this fall.