Context Switching Is Different Now: How AI Changes the Multitasking Equation
The old advice was simple: stop multitasking. But AI agents have made parallel work possible in ways it wasn't before. Here's a framework for managing directed parallelism without losing focus.
For years, the productivity advice around context switching was unanimous: stop doing it. Batch your work. Protect your focus. Every task switch costs you 23 minutes of recovery time, so structure your calendar to minimize transitions.
That advice was correct. And for a lot of work, it still is.
But something shifted. If you are using AI tools in your daily workflow — coding assistants, writing agents, research tools, scheduling helpers — you have probably noticed that multitasking feels different than it used to. You can kick off a research task in one window, ask an agent to draft something in another, and continue your own focused work while those run in parallel. The old rules assumed that you were the one doing the switching. That assumption no longer holds.
The question is no longer whether to multitask. It is how to multitask well when you have AI handling execution alongside you.
The old model: single-threaded work
The traditional context-switching research, most notably Gloria Mark's work at UC Irvine, measured what happens when a single human brain switches between tasks. The findings were clear: each switch incurs a cognitive tax, attention takes over 20 minutes to fully recover, and fragmented schedules destroy deep work capacity even when the calendar appears to have plenty of free time.
This model assumed a single-threaded worker. One brain, one task at a time, and every deviation from that was a loss.
That model still applies to the work you do directly. Writing, designing, coding, thinking through a hard problem — these still require unbroken attention. Your brain has not gotten faster at context switching just because the tools around it have.
But the landscape around that single thread has changed.
The new model: directed parallelism
Here is what is actually different now: you can maintain a single focused thread of your own work while simultaneously running multiple AI-assisted workstreams that do not require your active attention.
We call this directed parallelism — the practice of intentionally running concurrent workstreams where you handle the thread that requires human judgment and AI handles the threads that require execution.
The distinction matters. Old-school multitasking meant rapidly switching your own attention between tasks, degrading performance on all of them. Directed parallelism means delegating execution to agents while keeping your attention on a single thread. You are not switching contexts. You are routing work.
This is closer to how a senior engineer or a good manager operates: they do not do everything themselves, but they maintain awareness of multiple workstreams and intervene at decision points. AI has made that operating model available to individuals, not just people with teams.
The cognitive routing framework
If directed parallelism is the goal, the practical question becomes: which tasks go on your thread, and which go on an AI thread? We find it useful to think about this as cognitive routing — deliberately deciding where each piece of work belongs based on what kind of attention it actually requires.
Work breaks down into three categories:
Focus work stays on your thread. This is the work where your specific judgment, creativity, or expertise is the bottleneck. Writing a strategy doc. Designing an architecture. Having a difficult conversation. Making a decision with incomplete information. No agent can do this for you, and attempting to split your attention while doing it still carries the traditional context-switching penalty.
Execution work goes to an AI thread. Research compilation, first drafts, data formatting, code generation, summarization, scheduling logistics — tasks where the inputs and desired outputs are well-defined. These can run in parallel with your focus work without costing you attention, as long as you set them up clearly before you start.
Review work happens in batch windows. The output from your AI threads needs evaluation. But reviewing three AI-generated drafts back-to-back is far less expensive than the context switches required to produce them yourself. You stay in evaluation mode rather than switching between creation modes.
The routing decision is the key skill. Getting it wrong in either direction is costly: routing focus work to AI produces mediocre output that still requires heavy rework, while keeping execution work on your own thread wastes the parallelism advantage entirely.
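The three-way routing decision above can be sketched as a small function. This is a minimal illustration, not a formal rubric: the two yes/no criteria and the category labels are assumptions distilled from the descriptions above.

```python
# A minimal sketch of the cognitive routing decision.
# The criteria are illustrative assumptions, not a formal rubric.

def route(needs_human_judgment: bool, io_well_defined: bool) -> str:
    """Decide which thread a piece of work belongs on."""
    if needs_human_judgment:
        return "focus"          # your thread: judgment or creativity is the bottleneck
    if io_well_defined:
        return "dispatch"       # AI thread: inputs and desired outputs are clear
    return "clarify first"      # not routable yet: define the output before dispatching

# Examples drawn from the categories above:
strategy_doc = route(needs_human_judgment=True, io_well_defined=False)   # "focus"
data_format = route(needs_human_judgment=False, io_well_defined=True)    # "dispatch"
```

The "clarify first" branch captures the failure mode described above: dispatching work whose desired output is fuzzy produces mediocre results that need heavy rework.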
The switchboard schedule
Traditional time-blocking tells you to protect focus blocks and batch meetings. That is still good advice, but it is incomplete for a workflow that includes AI agents. You also need to account for the setup and review phases of your parallel workstreams.
A switchboard schedule adds two new block types to the standard calendar:
Dispatch blocks are short windows (15 to 20 minutes) where you set up AI workstreams. You review what needs to happen, write clear prompts or briefs, and kick off parallel tasks. This is preparation work — it requires some attention, but it is lower intensity than deep focus work. A good dispatch block at 8:30 a.m. can have three or four AI threads running by the time you start your first focus block at 9:00.
Review blocks are windows where you evaluate and integrate AI output. You batch-process what your agents produced: approving drafts, correcting research, merging code, refining plans. This is evaluative work, and doing it in batches is significantly more efficient than reviewing each output as it arrives.
A day using the switchboard schedule might look like this:
8:30 – 8:50 → Dispatch. Set up research, draft requests, and background tasks for AI.
9:00 – 11:30 → Focus. Your deep work thread. AI threads run in parallel.
11:30 – 12:00 → Review. Process AI output from the morning's parallel threads.
12:00 – 1:00 → Meetings, lunch, or light work.
1:00 – 1:15 → Dispatch. Queue afternoon AI tasks based on morning progress.
1:15 – 3:30 → Focus. Second deep work block.
3:30 – 4:00 → Review. Process afternoon AI output.
4:00 – 5:00 → Meetings, communication, or a final review cycle.
The structure gives you roughly five hours of protected focus time — which is exceptional by most standards — while also running AI workstreams that might represent another three to four hours of execution work you did not have to do yourself. Your effective output per day goes up without increasing your cognitive load.
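The arithmetic behind the "roughly five hours" claim can be checked by expressing the sample day as data. The tuple layout and the `hours` helper are illustrative; the times and block types come from the schedule above.

```python
from datetime import datetime

# The sample day expressed as (start, end, block_type) tuples.
schedule = [
    ("08:30", "08:50", "dispatch"),
    ("09:00", "11:30", "focus"),
    ("11:30", "12:00", "review"),
    ("12:00", "13:00", "meetings"),
    ("13:00", "13:15", "dispatch"),
    ("13:15", "15:30", "focus"),
    ("15:30", "16:00", "review"),
    ("16:00", "17:00", "meetings"),
]

def hours(kind: str) -> float:
    """Total hours allocated to one block type."""
    total = 0.0
    for start, end, block in schedule:
        if block == kind:
            t0 = datetime.strptime(start, "%H:%M")
            t1 = datetime.strptime(end, "%H:%M")
            total += (t1 - t0).total_seconds() / 3600
    return total

print(hours("focus"))     # 4.75 hours of protected focus time
print(hours("dispatch"))  # about 0.58 hours: ~35 minutes of setup buys the parallelism
```

The two focus blocks sum to 4.75 hours, and the dispatch overhead is about 35 minutes, which is the trade at the heart of the switchboard schedule: a small, bounded setup cost in exchange for hours of parallel execution.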
The attention budget
Directed parallelism is not unlimited. Even with AI handling execution, you have a finite attention budget — the total cognitive capacity you can allocate across directing, monitoring, and reviewing parallel workstreams before you start losing quality on your focus thread.
For most people, this budget supports two to four concurrent AI workstreams alongside one focus thread. Beyond that, the overhead of dispatching and tracking starts to fragment your attention in exactly the way traditional multitasking does. You end up spending more time managing agents than doing the work that only you can do.
The ceiling varies by task complexity. If your focus thread is highly demanding (writing something original, solving a hard engineering problem), you might only have capacity for one or two AI threads alongside it. If your focus thread is moderate (processing email, organizing notes), you can sustain more parallelism.
Knowing your budget and staying within it is what separates directed parallelism from chaotic multitasking wearing a new label.
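One way to make the budget concrete is a simple ceiling check. The specific ceilings below are illustrative assumptions taken from the ranges in the text (one to two threads alongside demanding focus work, up to four alongside moderate work), not measured values.

```python
# Illustrative attention-budget check. Ceilings are assumptions
# drawn from the ranges described above, not empirical constants.

MAX_AI_THREADS = {
    "demanding": 2,   # original writing, hard engineering problems
    "moderate": 4,    # email processing, organizing notes
}

def within_budget(focus_intensity: str, active_ai_threads: int) -> bool:
    """True if the current thread count stays inside the attention budget."""
    return active_ai_threads <= MAX_AI_THREADS[focus_intensity]

within_budget("demanding", 2)   # True: at the ceiling, still sustainable
within_budget("demanding", 3)   # False: dispatch overhead starts fragmenting focus
```

The useful part is not the numbers but the habit: deciding your ceiling before the day starts, then refusing to dispatch past it.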
Where notes hold it together
The weak point in any parallel workflow is the handoff. When you dispatch work to an AI thread, you need to capture what you asked for and what the context was. When you review the output, you need to record decisions and next steps. When you resume your focus thread after a review block, you need to reload your state.
This is where note-taking becomes infrastructure rather than a nice-to-have. A lightweight running log of what is on each thread — human and AI — keeps the system coherent. Without it, you end up re-reading old prompts, forgetting what you already reviewed, and losing the efficiency gains that parallelism was supposed to provide.
The format does not need to be elaborate. A daily note with three sections works: what you are focused on, what is dispatched to AI, and what came back for review. Update it at each transition point. It becomes your switchboard — a single place that shows the state of all active threads.
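The three-section note described above could be generated mechanically. This is a hypothetical sketch: the section names follow the text, but the exact layout and the `daily_note` helper are assumptions.

```python
from datetime import date

def daily_note(focus, dispatched, for_review):
    """Render a three-section switchboard note as plain text.

    Sections follow the format described above: what you are focused on,
    what is dispatched to AI, and what came back for review.
    """
    def section(title, items):
        bullets = "\n".join(f"- {item}" for item in items) or "- (none)"
        return f"{title}\n{bullets}"

    return "\n\n".join([
        f"Switchboard: {date.today().isoformat()}",
        section("Focus", focus),
        section("Dispatched to AI", dispatched),
        section("Back for review", for_review),
    ])

print(daily_note(
    focus=["Draft strategy doc"],
    dispatched=["Competitor research", "Meeting-notes summary"],
    for_review=["First draft of blog post"],
))
```

Whether the note lives in a script, a notes app, or a paper notebook matters less than updating it at each transition point, so that reloading state after a review block is a glance rather than an archaeology project.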
The balance
The real risk with AI-enabled parallelism is not that it does not work. It is that it works well enough to be seductive. The temptation is to keep adding threads, keep dispatching more work, and keep running more things in parallel because the AI makes it feel free.
It is not free. Every parallel thread consumes some fraction of your attention budget, even if you are not actively working on it. The awareness that something is running, that output is waiting, that a decision is pending — all of this creates background cognitive load. Too many threads and you end up in the same fragmented state that traditional multitasking produced, just with better tooling.
The goal is not maximum parallelism. It is the right ratio of focus depth to parallel breadth for the work you are doing on a given day. Some days are best spent single-threaded, going deep on one hard problem with no agents running. Other days are orchestration days where you dispatch a dozen tasks, review continuously, and synthesize results. Most days are somewhere in between.
The calendar should reflect which kind of day it is. The notes should track what is running. And the person at the center of it should maintain the one thing AI cannot provide: clear judgment about what deserves their direct attention and what does not.
FAQ
What is directed parallelism? Directed parallelism is the practice of running multiple AI-assisted workstreams concurrently while maintaining a single focused thread for work that requires your direct judgment and attention. Unlike traditional multitasking, you are not switching your own attention between tasks — you are delegating execution while staying focused.
How is this different from regular multitasking? Traditional multitasking involves rapidly switching one brain between multiple tasks, which degrades performance on all of them. Directed parallelism offloads execution-oriented tasks to AI agents while you maintain focus on a single thread. The cognitive switching cost is minimal because the parallel threads do not require your active attention until review time.
What is the switchboard schedule? A scheduling approach that adds dispatch blocks and review blocks to the traditional time-blocked calendar. Dispatch blocks are short windows for setting up AI workstreams. Review blocks are windows for batch-evaluating AI output. Focus blocks in between remain protected for deep human work.
How many AI workstreams can I run at once? Most people can sustain two to four concurrent AI threads alongside one focus thread before the overhead of managing them starts to fragment attention. The exact number depends on the complexity of your focus work and the independence of the AI tasks.
What is an attention budget? Your attention budget is the total cognitive capacity available for directing, monitoring, and reviewing parallel workstreams in a given day. Exceeding it turns directed parallelism into chaotic multitasking. Staying within it means you gain execution throughput without sacrificing focus quality.
How do notes support parallel workflows? A lightweight daily log tracking your focus thread, dispatched AI tasks, and pending reviews acts as a personal switchboard. It prevents information loss at handoff points and makes it faster to reload context when switching between dispatch, focus, and review blocks.