AI as a Junior Designer - An Experiment

I Ran an AI Design Experiment. Here’s What Actually Happened.

UX Design

Feb 26, 2026

Teacher helping students on laptops

The demos look impressive. An AI agent reads a design system, interprets a prompt, and produces a screen in minutes. If you’ve been in any design or engineering conversation lately, you’ve probably seen one. You may have also felt the implicit message underneath it: “Do we still need a designer for this?”

I decided to find out. Not with a toy problem on a blank canvas, but with a real feature, a real mature design system, and a real production design-to-code system. Here’s what I learned.

The Experiment

I used a community-built MCP (Model Context Protocol) server—not the official Figma-released version—that allows two-way communication between Claude Code and Figma, including direct read/write access to an actual component library. This matters because the official tooling doesn't yet support this in a way I've found time-saving and accurate; the community tool is closer to what everyone hopes the future looks like.
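For context, wiring a community MCP server into Claude Code is a small configuration step. The sketch below shows what a project-scoped `.mcp.json` entry generally looks like; the server name, package name, and environment variable here are placeholders for illustration, not the exact tool I used.

```json
{
  "mcpServers": {
    "figma-bridge": {
      "command": "npx",
      "args": ["-y", "community-figma-mcp"],
      "env": {
        "FIGMA_API_KEY": "<your-figma-personal-access-token>"
      }
    }
  }
}
```

Once registered, the MCP server exposes its Figma read/write tools to Claude Code, which is what makes the direct component-library access in this experiment possible.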

The task: design a complex, multi-state management screen for a production product (details withheld because it's an active project we're building!). The AI had access to:

  • The full component library

  • A fully annotated example of a related feature in the same workflow, designed with the same system and the same component library

  • A detailed prompt that I’d estimate was more thorough than I’d give a junior designer

The prompt took 18 minutes to write. The AI ran for 37 minutes. I had to monitor it and approve access four times during that window.

I built the same feature myself in about 45 minutes across two sessions.

What It Produced vs. What I Produced

Where AI did okay

  • It correctly placed actual components from the library in simple cases—a quick validation test before the main prompt confirmed it could grab a component, use the right properties, and place it on the canvas. And it worked perfectly!

  • For the main task, it got the high-level structure right: the right tabs, a bulk actions menu, confirmation dialogs with appropriate warning language, and developer annotations. For someone looking at the output without context, it looks reasonable.

Where it fell short

This is where it gets instructive.

  • Deviated significantly from established design patterns. Despite having direct access to the component library and multiple worked examples, it introduced inconsistent UI patterns—both within its own output and against the established product. Inconsistency in a product used by time-constrained users isn’t an aesthetic problem; it’s a cognitive-load problem.

  • Used a pattern we'd specifically tested and retired. This one stung. We'd done the research, found that users consistently missed a specific UI pattern, and replaced it with something better. The AI, given examples showing the correct pattern, still produced the wrong one. It has no access to the "why" behind design decisions — only the artifacts.

    I never thought to include that decision in my prompt. Why would I? It was made over a year ago. It's so embedded in how I work that it doesn't feel like a rule anymore — it just feels obvious. That's exactly how 30 years of experience works. It becomes subconscious.

    Yes, you can create instruction files for AI to reference. But there's no document on earth that contains 15 years in a high school classroom, a PhD in instructional technology, and 15 years of EdTech UX work. And even if there were — this AI ignored instructions I gave it in the prompt I actually wrote. I'm not optimistic about its compliance with a 400-page knowledge base.

  • Recreated components instead of using them. For anything outside of pre-built components, it built from scratch rather than grabbing a component and adapting it. Worse, it did the same for elements that had existing components: several dropdown menus have a dedicated component that appeared in the worked examples I provided, and it still built them from scratch—without even matching the styling of the existing dropdown component. This breaks Code Connect, the system that converts our Figma designs directly to production-ready code snippets. We've even set up instructions so AI can handle detached components (often cards, tables, or tab panels that need widely varying content inside); those rely on layer names matching the original component so it can be recognized as detached. Every recreated element means manual engineering work downstream.

    Experienced designers know when to detach a component, and any junior designer I work with would have been encouraged to stop and ask me if they weren't sure whether to detach and modify or to create something new.

  • Ignored design system variables. The prompt explicitly asked for spacing and gap values from our variable system. It used hardcoded numbers instead. In a design-to-code workflow, this slows everything down.

  • Missed requirements from the prompt. A filter specified in the prompt didn’t appear in the output. Content that was supposed to be separated by scope was mixed together on one view.

  • Produced incomplete developer annotations. Several were incorrect, and key engineering questions were left unanswered. This was one of the areas where I'd hoped to save time: annotation often takes longer than the design itself, since our robust component system lets us assemble designs like Legos.

  • Only delivered one state. I delivered the full flow: multiple tab states, permission-based variations, all action menus. Developers can build from my files without guessing. The AI’s output covered the happy path of only one tab.

The Real Cost: Not Time. Rework.

My original estimate was that the time roughly broke even. That was wrong; the rework changed the math.

Because the AI recreated components instead of using them, and used hardcoded values instead of variables, fixing the output doesn’t mean editing a few frames. It means rebuilding the underlying structure—the same work I’d have done from scratch, but now with someone else’s decisions to untangle first. The estimate: 20 minutes of prompt work plus at least 45 minutes to fix it. That’s longer than it took me to build the original.

And this wasn’t a junior designer’s output I was improving. This was the AI’s best effort with direct library access, detailed instructions, and worked examples. In any other context, this would be the conditions for success.

Why the Demos Don’t Tell the Whole Story

Most AI design demos are run on simple interfaces with simple design systems that aren't asking for complex decisions like when to detach an instance. A login screen. A card layout. A dashboard with generic components. There’s nothing wrong with that—those are legitimate use cases. But they don’t reflect the conditions at a company with:

  • A mature, complex component library built over multiple years

  • Custom Code Connect that ties specific component instances directly to production code

  • Established patterns built through user research and testing

  • Design decisions that carry institutional memory about why certain approaches were rejected

At that level of maturity, the AI isn’t speeding up a design process—it’s bypassing the knowledge that makes the design worth building. The more sophisticated your system, the bigger the gap between what AI produces and what your product actually needs.

Nielsen Norman Group’s (NNG’s) most recent status report confirms this: AI design tools are improving, but they remain most useful for narrowly scoped tasks, and are not ready to replace designers on complex, production-grade work.

What This Means for AI in UX Work

None of this means AI design tooling isn’t worth watching. It absolutely is. The community MCP I tested did something genuinely new—two-way Figma communication with real component library access and even the ability to edit that design system—and that’s a meaningful step. I’ll keep testing every iteration that comes out.

The AI did surface one idea I hadn’t considered for representing a dual-state element. I would have used it differently than it did—it broke a filtering pattern in the process—but the kernel of the idea was worth seeing. That’s AI at its best in this workflow: ideation partner, not executor.

Here’s where I see the honest value right now:

  • Writing tasks: drafting annotation copy, generating research discussion guides, summarizing notes

  • Research planning: structuring plans, generating screener questions, creating realistic prototype data

  • Ideation volume: brainstorming when you need options fast

  • Skinning designer-created wireframes with existing design system components: potentially, once tooling can reliably place actual component instances (and know when to detach one for modification) rather than recreating them. We’re not there yet.

What it can’t do: understand your users’ mental models, know why your team made the decisions it made, carry the institutional knowledge that lives in a designer’s head, or deliver the full range of states and edge cases engineers need to build something real.

The Bottom Line

The tools are evolving. We need to keep up with them, test them periodically, and honestly evaluate where they fit. That’s not optional—it’s part of the job.

But “evolving” and “ready” are different things. This experiment didn’t fail because the AI was bad at design in general. It struggled because the task required things it genuinely can’t do yet: respecting institutional design decisions, navigating a mature and complex system, and delivering production-ready files rather than a single-state sketch.

This output wouldn’t have met the bar I’d set for a junior designer. And it cost more time to produce than doing the work myself.

When the tools can reliably use a real component library, respect established patterns, and deliver complete flows—I’ll be the first to integrate them into my workflow. I’m genuinely looking forward to that day.

Until then: keep testing, keep your expectations honest, and keep the strategy and design work human.
