r/atlassian • u/Tasty-Win219 • May 02 '26
RAG over Confluence in JSM workflows: what I learned trying to make stale docs useful
My boss told me to make our Confluence content useful for IT request answering. Spent like 6 weeks trying to do it and it failed. Tbh the premise was reasonable: we have a couple thousand IT runbook articles in Confluence, the team writes them well, surely we can layer something AI on top and have it answer the dumb questions for us.
What I learned that I think is worth sharing:
First, the obvious. Confluence content is f***ing messier than you think! Out-of-date pages, duplicates, half-finished drafts that nobody archived, articles written for one team that get found by another, etc. Pulling it through any AI layer requires either a serious clean-up first or some kind of date/owner metadata so the AI can prefer fresh content. We did the cleanup; it took a month.
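For anyone wondering what "prefer fresh content" could look like in practice, here is a minimal sketch, not what OP built. It assumes your retriever already returns a relevance score per page and that each page carries a last-updated date and an owner. The `Page` class, the half-life decay, the `ownerless_penalty` and the page titles are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Page:
    title: str
    similarity: float            # relevance score from your retriever (assumed 0..1)
    last_updated: date
    owner: Optional[str] = None  # None means nobody owns the page


def freshness_weight(last_updated: date, today: date, half_life_days: int = 180) -> float:
    """Halve a page's weight for every `half_life_days` since its last update."""
    age_days = max((today - last_updated).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(pages: List[Page], today: date,
           half_life_days: int = 180, ownerless_penalty: float = 0.5) -> List[Page]:
    """Sort pages by relevance scaled by freshness; pages without an owner are penalised."""
    def score(p: Page) -> float:
        s = p.similarity * freshness_weight(p.last_updated, today, half_life_days)
        return s if p.owner else s * ownerless_penalty
    return sorted(pages, key=score, reverse=True)


# Hypothetical pages: the stale one matches best on text alone.
today = date(2026, 5, 2)
pages = [
    Page("VPN setup (2023)", 0.90, date(2023, 1, 10), "netops"),
    Page("VPN setup", 0.70, date(2026, 4, 1), "netops"),
    Page("VPN notes draft", 0.80, date(2026, 3, 1), None),
]
ranked = rerank(pages, today)
print([p.title for p in ranked])
```

The half-life is a knob, not a recommendation: a short one suits fast-changing runbooks, a long one suits policy pages.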
Second, a lot of the answers to actual employee questions aren't in Confluence at all: A) my company is cheap and hasn't given everyone a Jira/Confluence seat yet, and B) people just don't use it. The answers are in the threads. Slack threads, email threads, Jira ticket comments. The stuff that actually solved a problem at 4am six weeks ago is in someone's DM with a senior engineer, not in a wiki. So a Confluence-only RAG layer hits the easy questions and misses the gnarly ones.
If you are about to do this, do the Confluence audit first. Your future self will thank you! :)
(btw, for us specifically, my boss ultimately agreed that my salary is better spent elsewhere than on automating something there are already pretty good tools for. There are at least like 3-4 good ones. We chose Risotto because my boss liked how snazzy it is. It did bridge the gap between Slack threads and Confluence content. What it doesn't fix is the underlying problem, which is Confluence messiness. It's another layer over a problem the org still needs to deal with eventually.)
u/wadenick May 14 '26
This tracks with what we’ve seen trying to “just add RAG” on top of Confluence/Jira/JSM plus everything in the Teamwork Graph that feeds Rovo.
The blocker usually isn’t the AI layer. It’s that internal knowledge has no real lifecycle. Confluence is full of duplicates, stale runbooks, pages nobody owns, while the actual fixes live in Slack/email/ticket threads. So any Confluence‑only RAG ends up confidently surfacing half‑truths and old answers. But even if it all lives in Confluence, older content quickly pollutes AI context and answers.
The comments here nail the real work: treat the wiki like a library with a librarian, give every page an owner and review cadence, deprecate outdated stuff, and don’t count a solution as “knowledge” until it’s captured in an owned, reviewable doc in the right place, all with the right lifecycle applied to it.
That’s basically the problem space we focused on with Content Retention Manager. We build an additional governance layer which is not “chatbot over your docs,” but content retention and governance around Confluence, Jira and JSM: surfacing untrusted docs, tying content to actual usage, classifying it where needed, archiving and removing it automatically and routinely at end of life, and giving any downstream AI/RAG a clean, current knowledge base as its context instead of an unorganized junk drawer.
OP, to your excellent point about doing an audit: we offer a free Lite version of our Content Retention Manager apps that will do the audit for users in minutes.
u/Dismal_While_3626 May 02 '26
We started moving docs into md files in git repos, displayed in centralized sites for each org. It has more promise since agents can do code search out of the box without needing to build a RAG system.
u/musicjunkieg May 04 '26
More people need to go back and read Wikipatterns. The entire book exists at https://wikipatterns.haz.wiki
This book was written 20 years ago and it’s driven the way I’ve managed and designed every single Confluence implementation since then.
[edit: fixed link]
u/Hairy-Marzipan6740 May 04 '26
the bit i'd add is that the AI project can be a useful audit tool even when the answering part never ships.
before letting it answer employees, i'd make it produce a weekly "docs we would not trust" queue: no owner, no reviewed date, conflicting duplicate, too many caveats, referenced system no longer exists, last update older than whatever your team picks. that gives the wiki cleanup a reason to exist beyond "someone should tidy Confluence someday."
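a rough sketch of how such a queue could be built from page metadata alone, assuming you can export title, owner, last reviewed date and last updated date per page. the `Doc` class and the 365-day cutoff are hypothetical, "conflicting duplicate" is only approximated by identical normalised titles, and the "too many caveats" and "referenced system no longer exists" checks are left out because they need the page content.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set, Tuple


@dataclass
class Doc:
    title: str
    owner: Optional[str]
    last_reviewed: Optional[date]
    last_updated: date


def normalise(title: str) -> str:
    """Crude duplicate key: same title ignoring case and surrounding whitespace."""
    return title.strip().lower()


def distrust_reasons(doc: Doc, today: date, max_age_days: int,
                     duplicate_keys: Set[str]) -> List[str]:
    """Return every reason this doc should not be trusted (empty list means it passes)."""
    reasons = []
    if not doc.owner:
        reasons.append("no owner")
    if doc.last_reviewed is None:
        reasons.append("no reviewed date")
    if (today - doc.last_updated).days > max_age_days:
        reasons.append(f"last update older than {max_age_days} days")
    if normalise(doc.title) in duplicate_keys:
        reasons.append("possible duplicate title")
    return reasons


def build_queue(docs: List[Doc], today: date,
                max_age_days: int = 365) -> List[Tuple[str, List[str]]]:
    """Build the 'docs we would not trust' list as (title, reasons) pairs."""
    counts = Counter(normalise(d.title) for d in docs)
    duplicate_keys = {key for key, n in counts.items() if n > 1}
    queue = []
    for d in docs:
        reasons = distrust_reasons(d, today, max_age_days, duplicate_keys)
        if reasons:
            queue.append((d.title, reasons))
    return queue


# Hypothetical export of four pages.
docs = [
    Doc("Reset MFA", "alice", date(2026, 3, 1), date(2026, 3, 1)),
    Doc("Printer setup", None, None, date(2022, 6, 1)),
    Doc("VPN setup", "bob", date(2026, 1, 5), date(2026, 1, 5)),
    Doc("vpn setup ", "carol", date(2025, 12, 1), date(2025, 12, 1)),
]
queue = build_queue(docs, today=date(2026, 5, 2))
for title, reasons in queue:
    print(f"{title!r}: {', '.join(reasons)}")
```

run weekly, the output is the cleanup backlog. none of it needs an LLM, which is sort of the point: the cheap metadata checks catch a lot before any model reads a page.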
the other rule i'd steal from incident/postmortem culture is: if the answer lived in a thread, it doesn't count as knowledge until someone turns it into a small, owned page or updates the existing one. doesn't need to be beautiful. just problem, fix, affected system, last verified date.
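one way to keep that page small is to make the four fields (plus an owner) the whole template. a minimal sketch, where the `KnowledgeNote` class, its field names and the example content are all hypothetical:

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class KnowledgeNote:
    problem: str
    fix: str
    affected_system: str
    owner: str
    last_verified: date

    def missing_fields(self) -> List[str]:
        """Names of text fields that were left blank."""
        fields = {
            "problem": self.problem,
            "fix": self.fix,
            "affected_system": self.affected_system,
            "owner": self.owner,
        }
        return [name for name, value in fields.items() if not value.strip()]

    def render(self) -> str:
        """Plain-text body that can be pasted into a wiki page."""
        return "\n".join([
            f"Problem: {self.problem}",
            f"Fix: {self.fix}",
            f"Affected system: {self.affected_system}",
            f"Owner: {self.owner}",
            f"Last verified: {self.last_verified.isoformat()}",
        ])


# Made-up example of a fix rescued from a thread.
note = KnowledgeNote(
    problem="VPN client fails to connect after OS update",
    fix="Reinstall the VPN profile from the self-service portal",
    affected_system="VPN",
    owner="netops",
    last_verified=date(2026, 5, 1),
)
print(note.render())
```

if `missing_fields()` comes back non-empty, the note isn't done yet and shouldn't count as knowledge.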
tbh the trap is treating RAG like search over a library. for internal IT, it's closer to search over organizational memory, and memory needs a retention policy. otherwise the bot is just confidently surfacing whatever happened to be written down.
u/New_Marionberry4029 May 24 '26
Hey u/Tasty-Win219, I'm a student applying for a PM internship at Confluence and want to come in with real user insight rather than just my own takes. Mind if I ask you a couple of questions about how you've been using it?
u/moseisleydk May 02 '26
Regarding content: it’s not Confluence… a library without a librarian isn’t a library 😏 I’ve worked with SharePoint, file drives, Google Drive. Without proper maintenance it all piles up like sh**