Designing AI Assistance Without Breaking Trust

Introducing AI-assisted code generation in IBM Verify while preserving user control and accountability


Context

This case study focuses on introducing AI-assisted code generation into IBM Verify, an enterprise identity and access management platform.

Verify relies on Common Expression Language (CEL) to define identity logic such as attribute transformations, access policies, and payload handling. While CEL is powerful, it is also difficult to write and maintain — especially for administrators who are not CEL experts.
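To make this concrete, here is a hypothetical attribute transformation written in CEL. The attribute names (`user.givenName`, `user.surname`, `user.email`) are illustrative assumptions, not taken from a real Verify tenant:

```cel
// Hypothetical attribute mapping: build a display name from two
// directory attributes, falling back to the email address when
// the surname is missing.
has(user.surname) ? user.givenName + " " + user.surname : user.email
```

Even a small expression like this requires knowing CEL's macros, operators, and type rules — exactly the knowledge many administrators lack.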

 The product direction was to explore how natural language prompting could help users create CEL scripts more efficiently, while still operating within the constraints of an enterprise security system.

 

Why does this matter?

Introducing AI into identity and security workflows creates an inherent tension:

  • Speed: AI can significantly reduce the time it takes to write complex logic
  • Trust: Identity systems demand correctness, transparency, and user accountability

In this context, a small mistake can break authentication flows or introduce security risk.
Admins are ultimately responsible for the logic they deploy — which makes blind automation unacceptable.

 

My Role — What did I actually do?

I was the product designer on this initiative, responsible for shaping the end-to-end user experience of AI-assisted CEL generation. My responsibilities included:

  • Translating the roadmap goal into a clear, user-centered design problem
  • Defining how AI should (and should not) participate in a security-critical workflow
  • Partnering closely with xITB and engineering to align on scope, guardrails, and feasibility 

How I framed and worked through the problem

Before moving into solutions, I focused on aligning on the right problem and reducing risk early. I work iteratively — observing real constraints, reflecting on tradeoffs, and making focused decisions — while using clear alignment tools to keep the team moving in the same direction. This approach helped us navigate ambiguity early and move into execution with confidence.


Hills
We used clear, human-focused problem statements to align on what actually needed to change before committing to solutions.

Playbacks
Regular check-ins helped us reflect on progress, validate assumptions, and ensure we were solving the right problem — not just shipping features.

Sponsor users
Close collaboration with domain experts grounded design decisions in real identity and security constraints.

Understanding the current reality (As-Is)

Before exploring solutions, I focused on understanding how administrators currently create custom attributes and CEL logic in Verify — and where friction, delay, and risk actually occur.



This journey captures a common scenario for identity administrators today. While the goal is straightforward — creating a custom attribute — the path to get there is fragmented and cognitively demanding.

Admins often:

  • Know what they want to achieve, but not how to express it in CEL
  • Rely on searching documentation, internal chat, or colleagues for help
  • Iterate through trial and error before arriving at a working solution

Even simple changes can take significant time, introducing frustration and unnecessary dependency on others.

“The cognitive load and the time taken to create a custom attribute has left me very frustrated. I feel pressure to learn CEL but there’s very little resources to do so and I don’t have the time! I have familiarity with Java and wish there was a way I could’ve simply used that. It would be really useful if something could’ve helped me create a custom attribute faster without having to run around looking for code.” — IBM Verify User

The Hill (Problem Alignment)

Based on these insights, we aligned on a clear Hill to guide the work:
Help identity administrators create correct CEL logic faster, without increasing cognitive load or introducing security risk.

 

This Hill intentionally focused on:
  • Reducing friction at the moment of creation
  • Supporting users without replacing their responsibility
  • Improving confidence, not just speed

With this alignment in place, we could explore solutions knowing exactly what problem we were solving — and what we were not.

Exploring the solution space (Options & Inspiration)

With a clear Hill in place, I explored how AI-assisted creation is handled in other code and logic-heavy tools — not to copy patterns, but to understand where assistance is most effective and where it breaks trust.

The goal at this stage was not to design a solution, but to understand the range of possible approaches, their tradeoffs, and where they might fail in a security-critical context. I looked across a mix of internal patterns, adjacent products, and developer-facing tools to understand how others support complex logic creation without removing user control.

 


What we learned

Across these tools, three recurring forms of assistance stood out:

  • Generation: helping users get started or translate intent into code
  • Explanation: making generated or existing logic understandable
  • Fixing: supporting correction, refinement, and iteration

Notably, tools that leaned too heavily on generation without explanation or control often felt unpredictable or unsafe — especially in more complex or high-stakes workflows.

 

Early design exploration — Low-fidelity concepts

With a clearer understanding of how AI assistance could support creation — generate, explain, and fix — I moved into early design exploration to test how these ideas might translate into real workflows inside Verify.

I explored multiple low-fidelity concepts to test how AI assistance could fit into the CEL authoring workflow. Each concept intentionally varied how and when AI appeared, how much context it exposed, and how much control users retained. At this stage, the goal was not polish, but to explore interaction models, validate assumptions, and surface tradeoffs early. 

The goal was not to design a single solution, but to understand which interaction patterns felt supportive versus risky in a security-critical environment.

 


Across these explorations, a clear pattern emerged: users needed more than fast generation. Confidence came from understanding and the ability to correct. This insight led to a more holistic approach where AI supports users through generation, explanation, and fixing — not just output.

User Feedback That Shaped the Direction

Before converging on a final direction, I validated early concepts with identity administrators to understand whether the proposed AI assistance felt useful, trustworthy, and appropriate for a security-critical workflow. The goal was not to test UI polish, but to validate value and viability of the interaction model.

Research objectives:

  • Gauge how users (or user proxies) respond to early solution concepts
  • Assess the perceived value of multiple low-fidelity solution concepts with Verify users or user proxies (i.e. internal IBM SMEs)
  • Understand how the solution concepts fit into actual user workflows (i.e. how well would they meet users' needs?)
  • Identify design considerations that we may not have thought of already (i.e. what's missing from these ideas?)

Key validation signals

Across sessions, a few clear signals emerged that guided convergence:

  • Split-screen experiences increased confidence: Users preferred keeping generated CEL visible and editable alongside explanations.
  • Inline and chat-only prompts felt lightweight but limited: These approaches were fast, but often lacked sufficient context for complex logic.
  • Reusable code libraries were consistently valued: Users saw them as a way to learn, reuse, and reduce repeated effort.
  • Control and understanding mattered more than speed: Users were willing to trade instant generation for clarity and predictability.



How we decided on the final shape

Based on exploration and validation, the goal was not to pick a single interaction pattern, but to combine the strengths of multiple approaches while avoiding their weaknesses.

Rather than choosing between split screen or inline prompting, we intentionally designed a solution that blended both — supported by a reusable code library.

Informed by this research, our solution combines two concepts: the split screen (A) and the inline prompt (B).

  1. With a dual panel, the split screen offers clarity and space, allowing AI to complement the workflow rather than obstruct it.
  2. The inline prompt offers quick action for when the full chat experience isn't required.
  3. All participants also valued the library as a companion to generative AI; its addition extends the solution beyond single interactions by making generated logic reusable.

The To-Be Experience: Generate → Explain → Fix

The final experience introduces AI assistance into Verify through three complementary surfaces: a chat panel, an inline prompt, and a reusable code library.

Each surface serves a different moment of need, but all are designed around the same goal:
helping users create, understand, and refine CEL logic without losing control or trust. Rather than forcing a single interaction model, the experience adapts to user intent and task complexity.
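To make the Generate → Explain → Fix loop concrete, here is a hypothetical exchange over a single access-policy expression. The prompts, attribute names (`user.emailVerified`, `user.department`, `user.type`), and refinement are illustrative assumptions, not drawn from a real Verify configuration:

```cel
// Generate — from the prompt "allow access only to verified
// employees in HR or Finance":
user.emailVerified && user.department in ["HR", "Finance"]

// Explain — the assistant describes the expression in plain language:
// access is granted only when the user's email address is verified
// and their department is HR or Finance.

// Fix — the admin notices contractors slip through and asks for a
// refinement; the assistant adds an exclusion:
user.emailVerified && user.department in ["HR", "Finance"]
  && user.type != "contractor"
```

The expression stays visible and editable at every step, so the admin remains accountable for what is ultimately deployed.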

Chat panel — Deep assistance without losing visibility

The chat panel provides a dedicated space for more complex interactions, such as generating CEL logic, asking for explanations, or fixing existing expressions.

Key characteristics:

  • Lives alongside the CEL editor in a split-screen layout
  • Keeps generated code visible and editable at all times
  • Supports longer, multi-step interactions without interrupting the workflow

This surface is best suited for:

  • Complex logic creation
  • Understanding unfamiliar CEL expressions
  • Iterative refinement and debugging

The split-screen approach ensures AI complements the workflow instead of obscuring it.

Inline prompt — Fast, lightweight actions

The inline prompt offers a quicker, more lightweight way to access assistance when a full chat interaction isn’t required. Inline prompts allow users to move quickly while still benefiting from AI support.

Key characteristics:

  • Embedded directly in the CEL authoring context
  • Optimized for short, focused requests
  • Minimizes disruption to the user’s flow

This surface is best suited for:

  • Simple transformations
  • Small adjustments
  • Repetitive or well-understood tasks

Code library — Continuity beyond a single interaction

The code library supports reuse, learning, and consistency across teams.

Key characteristics:

  • Stores generated and manually created CEL snippets
  • Allows users to save, retrieve, and share logic
  • Reduces reliance on external documentation and ad-hoc copying

This surface is best suited for:

  • Making logic persistent
  • Encouraging reuse of known-good patterns
  • Supporting organizational standards

One system, one intention

While these surfaces differ in form, they are intentionally designed to work together.

  • The chat panel supports depth and understanding
  • The inline prompt supports speed and efficiency
  • The code library supports continuity and learning

Together, they create an experience where AI assists across moments of need — without forcing users into a single way of working.

How this resolves the original tension

This solution resolves the tension between speed and trust by treating AI as adaptive support rather than a single, authoritative feature. By distributing assistance across a chat panel, inline prompts, and a reusable code library, users can choose the level of help they need without losing visibility, control, or accountability. Logic is never hidden, understanding is always accessible, and correction is an expected part of the workflow. As a result, AI accelerates CEL creation while reinforcing confidence in what gets deployed — supporting faster outcomes without compromising the integrity required in identity and security systems.