In this episode, Sarah O’Keefe and Alan Pringle explore how AI transforms content delivery from static documents into dynamic, consumer-driven experiences. However, the need for human-led governance is critical, and Sarah and Alan explore issues of accuracy, accountability, governance, and more. They challenge organizations to define AI success by its ability to deliver accurate, high-impact outcomes for the end user.
Sarah O’Keefe: The metrics that are being used to measure the success of AI are all wrong. We should be measuring the success of various AI efforts based on, “Are people getting what they need? Are they having a successful outcome with whatever it is that they’re trying to do?” The metric we actually seem to be using is, “What percentage of your workflow is using AI? How many people can we get rid of because we’re automating everything with AI?” It’s the wrong metric. The question is, how good are the outcomes?
Related links:
- Sarah O’Keefe: AI and content: Avoiding disaster
- Sarah O’Keefe: AI and accountability
- Alan Pringle: Structured content: a backbone for AI success
- Questions for Sarah and Alan? Register for our upcoming webinar, Ask Me Anything: AI in content ops.
Transcript:
This is a machine-generated transcript with edits.
Introduction with ambient background music
Christine Cuellar: From Scriptorium, this is Content Operations, a show that delivers industry-leading insights for global organizations.
Bill Swallow: In the end, you have a unified experience so that people aren’t relearning how to engage with your content in every context you produce it.
Sarah O’Keefe: Change is perceived as being risky; you have to convince me that making the change is less risky than not making the change.
Alan Pringle: And at some point, you are going to have tools, technology, and processes that no longer support your needs, so if you think about that ahead of time, you’re going to be much better off.
End of introduction
Alan Pringle: Hey everybody, I’m Alan Pringle, and today I’m here with Sarah O’Keefe, and we want to do something I’ve kind of dreaded, to be honest: a check-in on AI in the content space. I’m very ambivalent about this topic. Even two or three years in, there’s still a lot of hype, but some good things have also emerged.
We need to talk about it fairly realistically. So, Sarah, get ready. Let’s see if I can avoid cursing during this. I’ll try my best. Legitimately, there are some things we need to talk about, including the challenges, because I don’t think the content world is completely ready for a lot of what’s going on right now.
Sarah O’Keefe: You know that we have AI that can remove cursing from podcasts, so I feel like we’re good here.
AP: Well, also, it’s a challenge to me to behave in a PG-13 more family-friendly kind of way. So I’ll do my best.
SO: I have no idea what you’re talking about.
AP: Yeah. So let’s start with the good and where things are right now with the positives. What is AI doing well right now? And let’s kind of get beyond the summarization. I think we can say objectively right now, in general, AI does a very good job of summarizing existing content. But I think it’s doing a lot more beyond that, and we should touch on those things instead.
SO: The first thing I would point to is summarization, but specifically the use case of a chatbot backed by a large language model, an LLM. Now we’re talking about Claude, Gemini, ChatGPT, and all the rest of them, which give an end user a way of accessing information, an information access point, that is different from what we had previously.
In the olden days, you had a book, and you had to flip it open, look at a table of contents or maybe an index, and navigate to a page. Fine. Then along comes online content, and you can do full-text search, or you can go into an internet search, right? You type into the search bar, you get a bunch of results, you click, and no, that’s not quite it. You modify your search string, you search again, and you sort of navigate your way to where you’re trying to go. With the interactivity of the ChatGPT class of tools, what happens is that I ask it a question and it gives me an answer. And then I say, that’s not quite what I wanted. And I can zero in on exactly what I’m looking for and tell it: actually, make this easier. Or: I don’t understand the words you’re using; use simpler language. Give me more. Give me less. Give me a summary. Use this as a source. Do not use that as a source.
It’s a new way to access information. People love it. There is something psychologically helpful about a conversational search. Now, there’s obviously huge issues with this, particularly around people, you know, using chatbots as their therapists, which introduces all sorts of horrifying, horrifying ethical issues.
AP: Personifying them as a person on the other end. Right.
SO: But in the big picture, used well, it allows you to get to the information you’re looking for and get at it in the way that you want.
AP: There’s a control issue here. I don’t think the content consumer has ever had this level of control.
SO: Yeah, and as a content consumer, that speaks to me. That is helpful. We’re seeing increasing use of, I would say, guardrails. So, not just slam out the AI with a bunch of stuff, but rather we’ve put some guardrails around it, and there are various kinds of technologies you can employ there. And that has been very helpful. And then the third thing I would point to is generative AI and generating content: there’s a lot you can do in that sort of low-fidelity bucket. What I mean here is, I need an image for a presentation, but the background is the wrong color, so I can just swap it out. Now, I can do that with Photoshop. Well, some people can do that with Photoshop.
AP: Well, I was about to say, I don’t think you or I should be claiming we can do the Photoshop, because we kind of can’t.
SO: Right. Well, and that’s exactly it. So it’s lowered the bar, right? Because I can tell the AI to swap out the background, and it will. And it applies a mid-level Photoshop capability to this image. And now I have the image that I need with a dark background so that the white text shows up in my presentation, that kind of thing.
AP: Right. Yeah.
SO: We can do low-stakes synthetic audio. Take this podcast, which for the record we are recording with actual human beings. Let’s say that Alan curses extensively and we need to swap it out. Well, we could pretty easily generate some synthetic audio that sounds like him and that PG-ifies the original wording into something cleaner. It would be way funnier to just bleep it, so I don’t know why we would do this, but…
AP: Correct. Well, and it may come to that. The bottom line is, what you’re talking about here are things that have very low risk. This is more fun stuff. Compare that with doing some of what we’re talking about on content that describes how to use a medical device, for example. Not sure I want to go there. But for something low stakes, like a one-off presentation you’re giving where maybe some humor is involved, I totally think that’s an acceptable use because there’s no risk there.
SO: That’s really the key point because let’s say you’re writing content for a new medical device. Now you probably have a version one of said medical device, and you’re doing a version two. So, okay, fine. We take the version one content and we sort of, you know, say add color because that’s what we added, you know, in version two, and update all this stuff automatically.
But it then becomes very important to actually read that, look at that information, look at all the images, and make sure that everything is correct. And by the time you do that super carefully, you may have given back all the time that you saved when you basically made a copy and said, generate the new version. You have to be really careful with that, especially depending on what your stakes are in terms of regulatory or compliance requirements.
You can, of course, get away with using AI, as you said, for low-stakes stuff. Now, there’s a big risk you run there, and we’re seeing this in my favorite example of low-stakes content, which is video games. The video game industry has seen huge amounts of pushback against AI-generated game content, because it’s not fun. It’s not creative. It feels flat. It’s not art, and it’s not fun to play. And so it just becomes a slog. Again, same thing: did you use it for maybe some backgrounds here and there? Okay. Or did you use it to drive the story that you’re trying to establish? The enemies that you’re hypothetically fighting all have a certain sameness, or you’re stealthing your way around the map, and it turns out the AI-generated enemies are really dumb, in that once they turn their back, you can do literally anything and they won’t notice, because it was poorly designed.
AP: Right, yeah. And that’s true even in the film and entertainment industry. There’s been a tremendous amount of pushback for the very same reason. I recently read a review of a series of clips about history, I believe on YouTube, by a fairly well-known director I will not name.
SO: Mm-hmm.
AP: And some of the AI is frankly not done well. One reviewer basically said that when you look at the back of these AI-generated figures, like an AI-generated King George, the back of his head looks like a melted candle. This is not what we want here. If you’re focused on that sort of thing, you’re not paying attention to the message. But again, this is low-stakes content.
We have started getting into more of the content creator point of view. We’ve talked about the consumer and how AI gives them much more control and flexibility in how they receive information. But let’s talk about what that means for the people who have to create the information, because it’s a huge shift on multiple levels. This idea of creating, especially in the product content world, these lovely designed page-based PDFs and whatever else, and even webpages, I hate to say it, those days are gone, or should be at this point.
SO: Yeah, again, you know, we step back to books, and you write the content, it goes through like a manuscript process of some sort, and then it gets poured into a book. It gets printed on paper, which is about the least flexible thing you can imagine, right? Because I, as the book publisher, get to decide what font is on the page and what size.
And if you don’t like that font, well, maybe you can get your hands on a large-print edition. Maybe you can get your hands on a braille edition. Maybe. But the form factor of the content was determined by the publisher of the content, or technically the printer, through that physical book production process. PDF is not that different, in the sense that the content is bound into the PDF and it’s fixed.
Now, you get a little bit more control because you can zoom in. There are some things you can do in PDF, but ultimately it’s still more or less a page format determined by the author/publisher/gatekeeper.
So now we talk about the web and HTML. This is all pre-AI, right? HTML goes out there, and there’s actually a decent bit you can do in your browser. You can override the default font. You can override the default font size. You can say, I’m using dark mode or light mode or those kinds of things.
AP: Light mode, exactly.
SO: If you have an e-book reader, you can override the default font or font size.
AP: I need that font size jacked up, please. Thank you.
SO: We weren’t going to use that example. Right. Yeah. So you get a little bit more control, right? You have a little bit more control over the presentation. Now, let’s talk about what AI does to this, and particularly the large language models. Now, I, as the author, create a whole bunch of content, and I put it somewhere. And the content consumer says…
AP: I’ll use it.
SO: Tell me about this concept or tell me about this thing or give me information about whatever. And they get a response to that prompt, which is a paragraph or two of, you know, here’s what you need to know. And then they say, make it easier, make it simpler, write this at a fourth grade level, write this at an eighth grade level. I’m a PhD in microbiology. Give me more detail. Right. You can change the writing level. You can say make the font bigger, make the font smaller, give it to me in a PDF, show it to me in a spreadsheet.
AP: I’ve even seen someone create a podcast of this document and have two people talking about it, which was freaky, but you can do that.
SO: Right. So as the author and the content creator and the backend people, right, the content people, we’re accustomed to taking our content and packaging it in certain ways. Like, here’s a topic for you, or here’s a PDF, or here’s a book, or here’s a deliverable, right, a package of content. And although with structured authoring, when that came in, we let go of this idea that we, as the author, got to control the page presentation. That got automated into the system. So the person controlling the page presentation was the person who designed the publishing pipelines. But the publishing pipelines were designed on the backend by the authoring people. Now all of a sudden, we have no control over that end product. Just because I thought it should be a PDF or an HTML page, you can turn around and say, like you said, give it to me in a podcast, make me a video, show it to me in French, and the LLMs will do it.
AP: The publishing pipeline got moved over the fence basically to more of the content consumer side and they get to do what they want more or less. That’s where things are headed.
SO: So pre-AI, we talked about content as a service, right? We load up all the content in a database somewhere, and then you, as the end user of that content, or another machine, can reach over and say, give me some content out of there. But it was still a pretty discrete request: show me that topic, or show me that string. What is fundamentally different about AI and large language models processing that content is the degree to which you can mix and match and rework, reformat, translate, and transform that content to be presented to you, the end user, in the manner of your choice.
So as an author, I kind of hate this, right? Hey, you took my stuff and you mangled it and you presented it in Comic Sans, and how dare you? And that’s where we are. That authors get to create information, but they don’t get to control the manner and means of distribution or presentation or formatting or language of that information.
AP: On the flip side of that, and here I am going to look on the sunnier side of things, which never happens. This may be a pod person version of me. If you, as a content creator, are no longer on the hook for thinking about the publishing pipelines and all of that sort of thing, theoretically, that should free you up to create better content on the back end because you don’t have to think about all those things. Allegedly. I don’t know if it’s happening, but…
SO: It’s very hard as an author to let go of that end product, the target that you’re headed for. But fundamentally, there’s a bigger problem, which is that even if I write the world’s greatest explanation of how to do something, that world’s greatest explanation of how to do something is not being presented to the end user as the thing I wrote. It’s being presented after being run through the transformer, the LLM, the processing that the AI can do when they ask for it. So I could literally write how to do X. And the end user says, hey, tell me how to do X. They are not going to get that chunk of information that I wrote. They’re going to get something reprocessed.
Of course, now we ask the fundamental question, which is, is the reprocessed version going to be better or worse than what I wrote? And the answer is, it kind of depends on whether I am an above average writer with an above average understanding of what that end user wants, or whether I’m a below average writer with a below average understanding of what that end user wants.
AP: To me, whether my version as a content creator is better is almost irrelevant, because if the person receiving the information via the chatbot or whatever thinks that what they are getting is what they want, that’s all that really matters. The person on the receiving end of that information gets what they want and fine-tunes it to what they want. If they’re happy with it, then the content creator’s opinion is, I hate to say it, immaterial at this point.
SO: Yeah, I kind of hate this timeline because, you know, where does my voice go? And the answer is, it’s gone. But you’re right, of course. What is the purpose of the technical and product information that we work on? The purpose is to enable people to use a product successfully. So if shoving it through an AI results in an outcome where that person uses the product successfully, then we’re good.
AP: I don’t disagree.
SO: That’s the purpose of the kind of thing that we produce. I think, though, that looking at this, and this is where I see some of the big challenges going forward. First of all, we have to acknowledge that an enormous percentage of the technical content that’s out there is really bad. Like, terrible. Really, really bad, and might be improved by a little trip through a chatbot that’s gonna render it into actually grammatically correct English. That’s a thing.
AP: Harsh but fair.
SO: Yeah, I think you’re not the only one that’s going to have some bleeping issues in this podcast. But the problem that I see right now is that the metrics that are being used to measure the success of AI are all wrong. We should be measuring the success of various AI layers and chatbots and things based on, are people getting what they need?
AP: Yeah. Yeah.
SO: Are they having a successful outcome with whatever it is that they’re trying to do? Does the search, or the process of that conversation, whatever they’re doing, get them to the endpoint of, okay, I understand what I need to do, I’m good, and I walk away? The metric we actually seem to be using is, what percentage of your workflow is using AI? How many people can we get rid of because we’re automating everything with AI? It’s the wrong metric. The question is, how good are the outcomes?
AP: To me, with this idea of how much AI versus human effort, there’s a lack of, shall we say, human intelligence being applied here, because merely applying AI to something that is incorrect or bad is not going to magically fix it. That’s a huge disconnect for me when you’re talking about measuring outcomes.
Whatever you dump into your large language model, if it is fundamentally bad, as in outdated and incorrect, I am pretty sure merely applying AI to it is not going to fix those two pretty gaping holes. And, I don’t know what it is, people hear AI and they think there’s some magic involved. No, the underpinnings have to be good for that magic to be useful.
SO: And I think all of us have examples of asking the chatbot a question and getting answers that are just flat wrong. Or worse, they look plausible, they’re in the form of a plausible answer, but then you read it carefully and you’re like, this doesn’t actually say anything. It’s just word salad. Since a chatbot is effectively the average of the content database underlying it, that pretty much means the underlying content doesn’t say anything useful on this topic. So I think the place I go with this is the question of accountability.
AP: Yes.
SO: Who is legally responsible for the outcomes? Now, pretty clearly, if I or an organization produces a user guide that covers a specific product and there is wrong information in that user guide, the organization is responsible. I mean, it’s your document, you’re responsible. Okay, if I, as an end user, query a public-facing LLM and get the wrong answer for something, and then I proceed to use that in my life, whose fault is that?
Who is at fault? We saw this when GPS navigation first came out: people were following the map, and it would send them off a cliff, or it would send them into a construction area and they would drive off the side. Okay, whose fault is that? And the answer was always, well, it’s your fault, because look up from the map and don’t drive past the sign that says do not enter, construction zone, cliff ahead.
AP: Or one-way street. Right, yeah.
SO: But AI doesn’t come with, I mean, it comes with warning labels, right? But we don’t see them. We don’t process them. What we see is a conversation where we say, tell me more about that. And it tells you more about that. And it feels as though you’re talking to a human. And therefore, when you push on something and say, are you sure? And it says yes, because what’s the typical answer when somebody says, are you sure? It’s yes.
Is it actually sure? No, it’s not sentient. So if I query a public-facing LLM, and it reprocesses a bunch of content and tells me how to do a thing that is in direct contradiction to what the official user documentation says, whose fault is that? I think it’s mine, because I used the public-facing LLM. Now, what if the organization that makes the product puts up a chatbot, and I query the organization’s chatbot: how do I do X? Especially if that chatbot is your frontline tech support and you cannot get to a human; you have to go through the chatbot. I ask the chatbot a question, and it says, do it this way. And it happens to be wrong. Is the organization liable? I don’t know the answer. I think yes, but I’m not sure.
AP: The bottom line here, yeah, is governance. There has to be some human-AI interaction here. There have to be these guardrails that you mentioned earlier, and that’s where humans have to be involved.
SO: And the better the AI gets, the worse the problem is. If it’s accurate half the time, then my hackles are up; I know it’s going to be wrong. But if it’s accurate 80% of the time, psychologically, I just assume it’s accurate all the time. So the better they get, the worse the errors are, because we don’t expect them.
AP: That’s also dangerous. Yeah, right. Yeah.
SO: Occasionally, very, very occasionally, I see the exception. I had directions to go somewhere, and the directions were literally: put this address into Google Maps, but don’t do A, B, and C, because it’s wrong. The directions to get to this location are incorrect; do not follow them. Because these days, our assumption is that the mapping apps just work.
AP: And they’re right most of the time. But I think part of this governance angle is that we have to realize that AI is going to be wrong.
SO: They pretty much just do.
AP: And there are lots of reasons it can be wrong; we won’t get into all of them. So what are you going to do when it is wrong? How are you going to make sure it’s not wrong? Again, there’s this whole governance process that has to be in place. And again, I think this is where human intervention is going to be necessary, because I don’t think AI at this point has any business correcting itself in these matters. That seems suboptimal to me.
SO: Hmm. Yeah, I mean, hypothetically, you can tell it to check itself. And certainly there’s some people doing that type of work. I think for me, fundamentally, the takeaways are that, like any other tool, there’s some really useful productivity enhancements that we can and should be taking advantage of. To your point, there’s some really important governance work that needs to be done to ensure that your QA is appropriately scaled to the level of risk of your product. Medical device, very high. Silly gaming app, pretty low. Don’t really care. And we need to think about guardrails and what it means to inject the right kind of content and the various kinds of enablement tools that you can use to do that.
And finally, this issue of AI as a content customer, I think, is really, really tricky. From our point of view as content creators, it is a delivery mechanism, just like a PDF or a piece of HTML or anything else like that. And it’s a delivery mechanism that allows the end user to control how they access the content, which means we have to do way more work around the guardrails of what that means when they query the content and shape it to their own requirements.
AP: Yeah, so things have progressed in the past two years, most definitely, especially in the content space. We’ve seen a lot of improvements. But there are still some big-picture things we have to work out. And I think it’s going to be interesting in the next year or two to see what happens. You briefly mentioned there are some companies setting up systems that can do a decent job of checking up on themselves. That’s not where everything is right now, but I think the better these systems get, and the better the guardrails that get put in place, the more they can start to figure out: this is wrong, I need to fix it, or I need to update this with the latest information, let me go get it. So that is starting to happen more and more. I think it will become more a part of the LLM-to-chatbot process, but I don’t think we’re quite there yet. And I’m interested to see what happens next with that sort of scenario.
SO: It’s definitely gonna be interesting. That much I’m sure about.
AP: Yeah, I agree. So we managed to get through this without cursing, so that’s good. I think it turned out to be a more realistic conversation, and we kind of tuned out the hype, because that’s what makes me grit my teeth and sometimes yell at LinkedIn when I see certain promoted posts that I think are full of you-know-what. So anyway, I think we’ll wrap it up there. Sarah, do you have any final points you would like to sign off with?
SO: I think at the end of the day, when you try to contextualize what this AI thing is and what it means for us fundamentally, we can look at some of the other big-picture shifts that we’ve made. I’ve been known to pretty dismissively compare it to a spell checker, you know? You can use it and it’ll fix some stuff, but you’d better check, because it doesn’t know the difference between affect and effect, although some of the grammar checkers now maybe do.
So there’s that, but I think at the end of the day, if you are looking at content strategy and content operations at an enterprise level, you really do have to say, okay, where does AI fit into my strategy, and how can we employ it productively to do what we need to do inside this organization to produce, manage, and deliver the content that we’re working on?
AP: And I think we’re going to wrap up on that very good point. Thank you very much.
SO: Thank you.
Conclusion with ambient background music
CC: Thank you for listening to Content Operations by Scriptorium. For more information, visit Scriptorium.com or check the show notes for relevant links.
Questions for Sarah and Alan? Register for our Ask Me Anything: AI in content ops webinar!