Don't Frankenstein Your Flow: How to Integrate AI Into Research Workflows with Integrity & Intention

Most research teams think they're adopting AI. In reality, they're just Frankensteining their flows—quietly ruining their ROI, their reputation, and their research integrity. And it all comes due as mounting risk the moment regulation catches up.

Here's how it happens. An AI-curious researcher goes off and maps the frontier—a clever prompt here, a transcription tool there, a synthesis shortcut bolted onto the old process. It works, sort of. So another tool gets stitched on. Then another. Nobody steps back to ask whether the whole thing still holds together. What you end up with isn't a workflow. It's a monster assembled from mismatched parts, held together by manual handoffs, and quietly leaking quality at every seam.

That's the trap. And the data says almost everyone is in it.


I watched this happen for two and a half years

At Instacart, I spent the better part of two and a half years helping a research team adopt AI. We had the trainings. We had the "AI Fridays" where people shared what they were trying. We had genuine early adopters racing ahead.

What we didn't have was time, attention, and intention.

Everyone went a different direction. The pioneers built clever personal workflows and saw real gains. The skeptics opted out entirely. And in between, most people grabbed whatever tool was in front of them and bolted it onto the way they'd always worked. We were throwing spaghetti at the wall and calling it transformation. But spaghetti isn't a strategy. It's just a mess you have to clean up later.

And here's the part leaders miss: this was never a motivation problem. People wanted to do better. What they didn't have was the time and space to do it: room to step back, tinker, test, and rebuild, rather than figuring it out alone in the cracks between deadlines. You can't redesign a flow you're never given permission to stop and look at. Intention is the piece the HEARTS framework, introduced below, is built to solve. But good intentions without time and attention are just hollow promises and missed opportunities.

The thing nobody did — the thing that actually matters — was stop, look back at the whole flow, and intentionally re-architect it. Inspect what was working. Cut what wasn't. Decide and design, deliberately, where AI belongs as a co-pilot, where a human has to stay in the driver's seat, and where the human verification, safeguards, and stopping points need to live.

Without reflection, there was no shared knowledge and no collective upleveling, and the promised productivity gains stayed trapped with a handful of individuals. The proverbial rising tide that lifts all boats remained out of reach. Ours just sloshed around in swells of AI slop, tools, and burnout.


The data: bolting on doesn't pay off

This isn't just my story. It's the whole industry's. Nearly every research leader I speak to echoes the same lament.

Adoption is everywhere, but going nowhere. Roughly 80% of UX researchers now report using AI in their workflows (User Interviews, 2025), compared with about 21% of the general U.S. workforce (Pew Research Center, 2025). Researchers are racing into AI faster than almost anyone.

But racing in isn't the same as redesigning. Only about 21% of organizations have fundamentally redesigned a workflow around AI, and high performers are roughly 3x more likely than their peers to have done so (McKinsey, 2025). Almost everyone is choosing the cheap path. The problem: the cheap path doesn't scale.

Look at what bolting on actually produces:

  • By one widely cited estimate, 95% of enterprise generative-AI pilots have delivered no measurable return (MIT Project NANDA, 2025, via Fortune).

  • A separate study put the failure rate of AI projects above 80% — and traced the cause not to weak models, but to organizational gaps: unclear purpose, bad integration, no governance (RAND, 2024).

  • Even when AI does speed up a task, nearly 40% of those gains get clawed back by rework — humans fixing the hallucinations and edge cases the process was never built to catch (Workday, 2025).

There's a reason for this, and Jakob Nielsen put it cleanly: profitable AI is workflow redesign, not task optimization (Nielsen, 2026). Speed up one step in a 10-step process, and you don't transform anything—you just move the bottleneck downstream and hit it faster. Partial automation preserves the jam. A controlled field experiment out of INSEAD and Harvard makes the cost of getting this wrong vivid: startups taught to redesign their workflows around AI generated 90% more revenue than equally funded, equally skilled peers who used AI only to speed up isolated tasks (Kim, Kim & Koning, 2026).

The population there was startups, but the mechanism is universal. That's the gap between Frankensteining your flow and rebuilding and redesigning it with intention and integrity.
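The bottleneck argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the stage durations are made-up numbers for a hypothetical 10-step study, and the function names are assumptions, not part of any cited source. It assumes stages run one after another, so end-to-end time is simply the sum.

```python
# Illustrative numbers only: hours per stage for a hypothetical 10-step study.
stage_hours = [4, 6, 10, 3, 5, 8, 4, 6, 2, 2]


def total_hours(stages):
    """End-to-end time when stages run one after another."""
    return sum(stages)


def slowest_stage(stages):
    """Zero-based index of the stage that takes longest (the bottleneck)."""
    return max(range(len(stages)), key=lambda i: stages[i])


def speed_up(stages, index, factor):
    """Return a copy of the flow with one stage made `factor` times faster."""
    faster = list(stages)
    faster[index] = stages[index] / factor
    return faster


baseline = total_hours(stage_hours)
# "Bolt on" AI to the slowest stage only and make it 5x faster.
bolted_on = speed_up(stage_hours, index=2, factor=5)
saving = (baseline - total_hours(bolted_on)) / baseline

print(f"baseline: {baseline} h, bottleneck at stage {slowest_stage(stage_hours) + 1}")
print(f"bolted on: {total_hours(bolted_on)} h, bottleneck at stage {slowest_stage(bolted_on) + 1}")
print(f"end-to-end saving: {saving:.0%}")
```

With these made-up numbers, a 5x speedup on the slowest stage trims the study from 50 hours to 42, a 16% saving, and the bottleneck simply moves from stage 3 to stage 6. The stage got five times faster; the flow did not.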


The cost that actually keeps leaders up at night

Burnout, tech debt, skyrocketing token costs, hours of rework, and vanishing ROI — those are the practitioners' pain points. If you lead a research org, there's a sharper one: reputation erosion.

When you scale a Frankensteined flow across a team, you're not just risking a slow project. You're risking your reputation and your research integrity. Already, 91% of researchers worry about AI accuracy and hallucinations (User Interviews, 2025). And the more we trust these tools, the less we scrutinize them — the research shows higher confidence in AI correlates with less critical thinking, not more (Lee et al., 2025). Stitch that into a high-stakes study, ship the result to a stakeholder, and a product decision gets made on an insight nobody actually verified. Remind me again…what’s the point of research?

Then regulation catches up. The EU AI Act is already phasing in real obligations around auditability, transparency, and data governance. A research flow that was duct-taped together, with no traceability and no record of where the AI stopped and the human started, isn't just messy. It's a mountain of risk, with regulatory fines waiting in the wings. The teams that bolted AI on fastest will face the biggest challenges and costs.

This is the part I'll keep saying until it lands: in 2026, authenticity and transparency aren't just ethical positions. They're operational requirements.


The HEARTS Framework for evaluating AI-assisted UX research, by Kaleb Loosbrock

HEARTS is the connective tissue

So how do you adopt AI without building a monster?

You stop bolting on and start re-architecting—with a purpose-driven framework that holds the whole flow together. That's what I built HEARTS for. It's the reflection loop and the connective tissue: 6 lenses you run an AI-assisted workflow through so you can have confidence instead of conflict, and so the result is something you'd actually stake your reputation on.

  • H — Human-Led & Centered. The researcher is the pilot, not the passenger.

  • E — Experience-Focused. Design for the experience of everyone the research touches.

  • A — Amplification, Not Automation. AI amplifies your judgment; it doesn't replace it.

  • R — Rigorous & Responsible. Maintain integrity, minimize bias, maximize confidence.

  • T — Trustworthy & Transparent. Prove the lineage of every insight. Disclose by default.

  • S — Safe, Secure & Sustainable. Protect the people, the data, the practice, and the planet.

Each letter is a lens, and each comes with a question that forces a design decision instead of a default. Where does the human stay in control? Where does AI amplify versus quietly take over? Could someone audit this tomorrow and follow the trail? The full framework — definitions, the scorecard, and how it maps to NIST, the OECD, and the EU AI Act — lives at heykaleb.com/hearts-framework.

What matters here is what HEARTS does to a flow. It turns "we bolted on some AI" into a deliberate map of what the machine does, what the human owns, and where the checkpoints are. It's the difference between a stitched-together monster and an intentionally designed system built around integrity & empathy.


How to start today

You don't need a six-week project to begin. Before you launch your next AI-integrated study, score your workflow against each of the 6 lenses on a scale of 1 to 5, where 1 is high-risk and 5 is high-integrity. Be honest: a scorecard you game is just paperwork.

Then hold one line: a 1 or a 2 on R (Rigorous & Responsible), T (Trustworthy & Transparent), or S (Safe, Secure & Sustainable) means you redesign before you collect data, not after. Those are the three where the damage is hardest to undo.
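For teams that track studies in a script or notebook, the scoring rule above can be sketched as a small pre-study check. This is a minimal illustration, not the official HEARTS scorecard (which lives at the framework page and may be structured differently): the lens letters, the 1 to 5 scale, and the R/T/S rule come from this article, while every function and constant name here is a hypothetical one chosen for the example.

```python
# Hypothetical pre-study gate based on the rule described in the article.
# All names below are illustrative assumptions, not an official HEARTS API.
LENSES = ("H", "E", "A", "R", "T", "S")
GATED_LENSES = ("R", "T", "S")  # a 1 or a 2 on any of these means redesign first
REDESIGN_THRESHOLD = 2


def validate(scores):
    """Check that every lens has an integer score from 1 to 5."""
    missing = [lens for lens in LENSES if lens not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    for lens in LENSES:
        value = scores[lens]
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"score for {lens} must be an integer from 1 to 5")


def blocking_lenses(scores):
    """Return the gated lenses whose score calls for a redesign."""
    validate(scores)
    return [lens for lens in GATED_LENSES if scores[lens] <= REDESIGN_THRESHOLD]


def ready_to_collect(scores):
    """True only if no gated lens scored a 1 or a 2."""
    return not blocking_lenses(scores)


# Example scores for an imaginary workflow.
example = {"H": 4, "E": 3, "A": 4, "R": 2, "T": 5, "S": 3}
print(blocking_lenses(example))
print(ready_to_collect(example))
```

In this sketch a low score on H, E, or A does not block data collection, because the article's hard line applies only to R, T, and S. A team could reasonably choose to gate on more lenses.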

And keep your expectations grounded. AI's gains are real but jagged: knowledge workers move faster and produce better work inside the model's competence, and measurably worse outside it (Dell'Acqua et al., 2025). Rigorous, economy-wide estimates put real time savings at about 5% of work hours for people who use generative AI, not the magic numbers the vendor decks promise (St. Louis Fed, 2025). Knowing where that edge is, and designing for it, is exactly the work HEARTS forces you to do.

The payoff isn't just a cleaner process. It's a team that levels up together instead of a few individuals hoarding clever tricks. That's the rising tide. And it's the version of AI adoption that actually pays — because the returns belong to the teams that redesign, not the ones that bolt on.


An invitation

I'm publishing HEARTS so your team can use it — and pressure-test it. Take it into your next study. Run a workflow you already trust through the 6 lenses and see what it surfaces. Tell me where it holds and where it breaks—especially where it breaks.

Because our goal is never to adopt AI faster. It is to adopt it with intention and integrity — so you build something that lasts, instead of something you'll spend next year taking apart and redesigning.

And if this sounds familiar—if you're staring at a Frankensteined flow and want help turning that spaghetti into a strategy—that's the work I do. Helping to transform AI mandates into actual change, confidence, and realized ROI. No pitch, no pressure: let's talk.

Don't Frankenstein your flow. Redesign it with intention & integrity.


The HEARTS Framework, developed by Kaleb Loosbrock with the AIxUXR community. The full framework lives at heykaleb.com/hearts-framework.

A note on how this was made: this piece was human-led and AI-assisted. I wrote and own the thinking; AI helped me draft and pressure-check it, and every statistic here was verified against its primary source. That’s me practicing the T.


Sources

  • Dell'Acqua, F., McFowland, E., Mollick, E., Lifshitz-Assaf, H., Kellogg, K., Rajendran, S., Krayer, L., Candelon, F., & Lakhani, K. R. (2025). Navigating the jagged technological frontier: Field experimental evidence of the effects of AI on knowledge worker productivity and quality. Organization Science. https://doi.org/10.1287/orsc.2025.21838 — inside the frontier: ~25% faster, ~40% higher quality; outside it: −19 points accuracy (n=758).

  • European Parliament & Council of the European Union. (2024). Regulation (EU) 2024/1689 (Artificial Intelligence Act). Official Journal of the European Union. https://eur-lex.europa.eu/eli/reg/2024/1689/oj — phased obligations on auditability, transparency, and data governance.

  • Kim, H., Kim, D., & Koning, R. (2026). Mapping AI into production: A field experiment on firm performance [Working paper]. INSEAD/Harvard Business School. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6513481 — startups taught to redesign workflows around AI generated 90% more revenue (1.9x) than matched peers (n=515).

  • Lee, H.-P., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., & Wilson, N. (2025). The impact of generative AI on critical thinking. Proceedings of CHI 2025. ACM. https://doi.org/10.1145/3706598.3713778 — higher confidence in AI correlates with less critical thinking / more cognitive offloading (n=319).

  • McKinsey & Company. (2025). The state of AI: How organizations are rewiring to capture value. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai — ~21% of organizations have fundamentally redesigned workflows; high performers ~3x more likely to redesign than bolt on.

  • MIT Project NANDA (Challapally, A., Pease, C., Raskar, R., & Chari, P.). (2025, July). The GenAI divide: State of AI in business 2025. Massachusetts Institute of Technology. Coverage: Fortune (2025, Aug. 18). — 95% of enterprise generative-AI pilots show no measurable P&L return (52 interviews, 153 surveys, 300+ deployments).

  • Nielsen, J. (2026, June). Redesigning workflows for AI. https://jakobnielsenphd.substack.com/p/workflow-redesign — "profitable AI is workflow redesign, not task optimization"; partial automation preserves bottlenecks.

  • Pew Research Center. (2025, October 6). About 1 in 5 U.S. workers now use AI in their job, up since last year. https://www.pewresearch.org/short-reads/2025/10/06/about-1-in-5-us-workers-now-use-ai-in-their-job-up-since-last-year/ — 21% of U.S. workers use AI for some work (up from 16%).

  • Ryseff, J., De Bruhl, B. F., & Newberry, S. J. (2024). The root causes of failure for artificial intelligence projects and how they can succeed. RAND Corporation. https://www.rand.org/pubs/research_reports/RRA2680-1.html — >80% of AI projects fail; root cause is organizational, not model quality.

  • U.S. Federal Reserve Bank of St. Louis (Bick, A., Blandin, A., & Deming, D.). (2025, February). The impact of generative AI on work productivity. https://www.stlouisfed.org/on-the-economy/2025/feb/impact-generative-ai-work-productivity — generative-AI users saved ~5.4% of work hours (~2.2 hrs/week).

  • User Interviews. (2025). The state of user research 2025. https://www.userinterviews.com/state-of-user-research-report — 80% of UX researchers use AI (+24 pts YoY); 91% worry about accuracy/hallucinations (n=485).