Episode #12

Katherine Elkins on AI at a crossroads

Join the Global Lab conversation with Katherine Elkins on AI at a crossroads. In this episode, Katherine discusses as AI accelerates global change, critical decisions are shaping its development, governance, and impact. Who determines the values these systems reflect? 

Hosted by Stephen McCauley,  The Global School, WPI

AI at a Crossroads: Who Shapes the Future?

by Katherine Elkins & Stephen McCauley | Global lab podcasts

Guest Bio

Katherine Elkins is a leading expert on AI, ethics, and society, serving as a Principal Investigator for the U.S. AI Safety Institute and a recipient of the Notre Dame-IBM Tech Ethics Lab award. She is a member of Meta’s Open Innovation AI Research Community and a professor at Kenyon College, where she co-founded one of the first human-centered AI programs. Her work explores the societal impact of AI, from governance and safety to narrative and creativity.

Transcript

Scroll to read more

Stephen McCauley 

Hi. My name is Stephen McCauley. I’m a co-director at WPI’s global lab, and I’m here today with Katherine Elkins. Kate is a leading expert on AI ethics and society. She serves as a principal investigator for the US AI Safety Institute and is a recipient of the Notre Dame–IBM Tech Ethics Lab Award. She is a member of Meta’s Open Innovation AI Research Community, and she’s a professor at Kenyon College, where she co-founded one of the first human centered AI programs. Her work explores the societal impacts of AI from governance and safety to narrative and creativity. So welcome, Kate. I wanted to ask you to if you could just start out and give us a little bit of a flavor of what you do at Kenyon’s AI Lab. You’ve been doing so much interesting research and with your students, and I’d love to hear just a few examples, maybe of some of those projects to kind of set the stage of the work you’re doing there. 

 Katherine Elkins 

Yeah, absolutely. And thank you for having me, it’s such a pleasure to be here. So, lots happening right now in the world of AI. One of the things that we have been very focused on is the rise of autonomous agents, and these are bots, or GenAI models, that are able to go out on the internet and do things on their own. We like to think of AI as just a tool with a human in the loop, but we’re a little bit worried right now that we may be having the human fall out of the loop. So, one thing that we’ve been doing is a semantic ethical audit, and that is generating lots of responses to ethically fraught decisions and seeing what the distribution of answers by different models is, and this is really testing. Is there kind of an ethical center to these models? And they are all quite different our one of our early ethical audits showed that virtually all of the models were fairly authoritarian, very much following directions, which seems like a great thing until you imagine them in an autonomous weapon, maybe being given orders that aren’t the best or being used by a bad actor. So, we continue to do these ethical audits and update our audits, and more recently, we’ve been looking at things like, can we hack these AIs, can we hack their decision making using emotional framing or syntactic framing. And maybe one of the most complicated findings that’s a little bit worrisome is that as these models become more performant, they seem more and more human in terms of the kind of ethical reasoning they have, which is great for alignment, but we are also seeing them subject to the same kinds of biases or failures that humans tend to have in that they can be hacked using certain kinds of manipulation. 

Stephen McCauley 

So really important stuff that you’re looking into. So, thanks for your work, because we really need to be looking at these questions. You mentioned a little bit looking at that decision making, but can you take us a little more just into some practicalities of what one of the research projects might look at in this area that you’re doing. Maybe about some of the types of research question you’re asking there. 

 

Katherine Elkins 

Yeah. I mean, let me give you some other examples with agents. Maybe that might be more helpful here. We have a lot of different kinds of projects going on, so we’ve also been having students trying to use networks of agents for good. One example is a recent project seeing if we could use this network of agents to prioritize who gets medical care in a resource-poor environment. And another is trying to use a network of agents to help students with disabilities. And yet another project that we’re looking at is really this question of AI-human relationships. You may have seen in the news some really entertaining news stories about women falling in love with AI bots. We are used to thinking about men falling in love with the AI from some really famous movies, but we now have women who are doing this. And you know, we may have all kinds of judgments about what these relationships are, but in our opinion, there’s not enough research actually showing what they look like. So we’re taking huge data sets of human-AI interaction and trying to find patterns of kinds of behavior in them. For example, does the relationship change over time? Do we, if we graph it over time, do we see a different kind of relationship unfolding? We’re also looking at failures in the relationship, are there certain patterns where people become frustrated? Do they become angry? How do they treat the AI when those things happen? So failures in these relationships over time as well. So many, many exciting projects happening, but those are just a couple that we’re working on right now. 

 Stephen McCauley 

Yeah, great. Thanks for that, Kate. So when we think about generative AI particularly, I think on the surface, sometimes we think about its ability to generate, if we reduce it down to tokens or words, and so it can start to construct for us, and write for us, whether it’s in text or in images, but hearing you talk through some of the questions you’re looking at, for me, it really reveals that what GenAI is doing in its ability to work with huge corpuses of text and things that it can work with is it’s more about getting to the underlying patterns and structures in human thought and behavior, and to unpack those at layers that we’re often not even aware of, so it actually is helping us discover new things about human emotion and human decision making and ethics and things like that. 

 Katherine Elkins 

Yeah I think that’s right. You know, some people think that this is just recombining information that already exists, and to a certain extent, that’s true, but it’s not just a simple recombining like an archive or a database. Sometimes we think of it as a fuzzy database, but it’s not really storing all of this material it is sometimes generating, and there’s a real question about how creative it can be, and how much we can generate outside of the kind of thing that it’s seen. Many of us do believe that it seems to show evidence of levels of generalizing from the data that it’s seen, right? It’s not just recombining, it’s mapping this kind of underlying distribution, and there’s some evidence for that. For example, in some of the early image or actually, video generation, we see some physics there that these AI models have never been taught. We see elements of understanding aspects of the world that seem to have emerged from being fed all of this information. So I think, you know, some people don’t realize that it’s quite as performant as it is. And I think we can certainly, and we have, in the lab, looked inside to see patterns, we have, in the early days when there were fewer safeguards, we could actually look at, you know what, when the AI generated an image of women versus men executives, we could kind of see what the data seemed to show about those. So it really does give us a little bit of a lens onto our own culture. It’s trained on human data, and we can look inside and see patterns in this data. 

 Stephen McCauley 

And what do you mean specifically when you say that they show performant capabilities? 

 Katherine Elkins 

You know, we really are automating intelligence, and we are automating creativity, and these are things that we used to think of as distinctly human. So there is good reason for people to be concerned about it, and there is a lot of debate about just how intelligent and just how creative. But we do have artists who are winning awards using AI to generate art. We have studies that show that humans seem to identify the AI-generated poetry as more likely to have been written by a human. So we are seeing even these kinds of creative tasks benchmarks being passed. 

 Stephen McCauley 

Right, right. I’ve heard you talk about the simulations, I guess you could say, you’ve done with a court case and with having multiple agents play through scenarios in a court case. Could you tell us a little bit about that example? Because I think it’s really revealing in terms of how we’re building sets of values into these technologies. 

 Katherine Elkins 

Yeah, I think that’s absolutely right. So that project was thanks to an IBM Tech Ethics award, and we really started with this question of how well can GenAI predict human behavior? There’s a lot of debate about whether we should even be using AI to predict human behavior, but actually there’s a very long history, decades long, of using older forms of AI to predict who would be likely to reoffend in terms of a crime, and actually help people decide who gets sent back to jail with very real consequences. So one of the reasons we started with this project is the concern, not that people would be doing this professionally, but we know a lot of GenAI use is kind of behind the scenes. You know, it’s somebody in the office who just decides to check in and ask questions. So you know, what if somebody in the criminal justice system is actually using AI and having conversations to try to predict on an actual use case? In this case, we found GenAI actually does a poor, a very poor, job, and I can talk about the reasons for that, but we did boost behavior using agents and a network of agents. So here we seem to see that when we use language and debate and we have each agent with a set of rules, a persona—one is the prosecutor, one is the defense attorney, one is the judge—we can have a series of rounds in which each one presents a case, they respond to each other. We can actually have it update theory of mind for each person involved to see what they’re thinking. And how they’re reacting, we can see how the judge is responding to each of these debates, and we actually increased performance. So that kind of debate where we’re simulating different positions is one way to get much better performance on reasoning and these kinds of predictive capabilities. 

 Stephen McCauley 

Fascinating. And can you talk about the ways in which, either in that specific case or just in this kind of work of using agents to play through scenarios, the ways in which bias is either built into these interactions or what’s happening in the realm of the technologies to address the biases that we know are built into it. 

 Katherine Elkins 

Yeah, that’s a great question. I mean, our suspicion is actually that one of the reasons why just standard prompting with GenAI for these kinds of predictive behaviors does not work well, is because there’s been such an attempt to reduce bias that some of these models may actually be over-tuned to what we would call anti-bias, that there’s such a fear of using certain categories to predict behavior, that these models actually do worse than chance. So, you know, one of the very tricky things right now is there’s been a lot of attention about bias. In some of the earlier models, we could really look inside. They were what we would call white box. We can see how the algorithm is working, we can experiment, we can de-bias the algorithm, and this was especially true with some of the earlier models, like COMPAS, which was an algorithm that was used to decide, predict, whether somebody would reoffend. These are now black box models, even the people making them do not fully understand the behavior. There’s a lot of these very, very large models we haven’t mapped in terms of kind of unusual behavior. And then there’s the worry that the training data may be biased. So if it’s shown lots of, I’ll take the women versus male executives, when we used to be able to generate those kinds of images, you could see that the women were much less diverse, that they looked very angry and almost mean. They were not a woman that I would want to have as boss. And you can almost see the collective unconscious right in terms of what we think about women in these roles. Very illuminating as a social scientist, to actually be able to go in and see this. And now there have been many attempts to de-bias these huge neural networks, but they’re so large that it’s really hard to know and identify where that bias is. Now, one way is to change the training data, and so there has been much more of an attempt to curate the data, to not have biased examples. But of course, we need tons of data, so do we have tons of unbiased data? Another way has been to create synthetic data, so we can control that data a little bit more. But ultimately, what some of these models do is they are creating other ways downstream after the pre training, either to prevent us from looking inside, safeguard so that we can’t actually ask about or generate anything that might have bias. In some cases, we have seen that anti-bias, like Google actually had a model that was producing right the founding fathers as Black; that was a case where they were maybe over-tuning a bit. And then other types of prompts that we can’t see, that are behind the scenes so that we can’t ask things, or it won’t generate certain things. And then another process of what’s often called reinforcement learning that shapes how these act. So now the model is so complex, bias can be introduced in so many places. They’re stochastic, so every time you run it you might get something different: maybe 99 times it’s not biased, but there might be the 100th time that it would be and exploring this enormous space of probability is incredibly difficult, and so a lot of these downstream processes have been put in to prevent us from seeing that bias. Of course, the worry is, what if somebody takes one of these models, particularly one of these open models, take some of the safeguards off? Do we know how it’s going to perform? And that’s a real worry. 

 Stephen McCauley 

So by downstream measures, is that sort of a human or set of humans who are evaluating outcomes and deciding you know things about it, or is it a technical fix? 

 Katherine Elkins 

Well, the prompts could say what we’re allowed to ask if somebody asks if somebody has certain words in the prompt, you know, don’t answer, say, “I’m a large language model.” One thing that worried me in the beginning is people would say, “Oh, it can’t do that, it says it’s a large language model. It can’t do that.” Well, sometimes they really could do that. It’s the prompting, it’s the safeguards that are not allowing it to do that. So people sometimes have an imperfect idea of what these models can do. They actually can do it, but there are safeguards so it’s not doing it. But there is this process, there are a couple different types, but the most, the one that most people will be familiar with is reinforcement learning with human feedback. Why do we need this kind of training? Well, there’s no perfect answer for written text, right? Which- what is the “100” on the essay? We don’t have that. And so instead, the way they would do this is they would generate two different textual responses and have a human say that they liked one more than another. You might even find this sometimes: OpenAI’s now sometimes doing this even with their users to say, “okay, here are two options, which one do you like better?” That is a way for the human user to program the AI and use that training data after the fact to direct it and say this, this particular one is better than this other one. The problem with that is that we don’t always know what the AI is taking from that lesson, because we’re not saying why it’s better. So we’re just hoping that the AI can begin to infer our human preferences. But it’s not like we’re giving it direct lessons. And these are also ways that we can actually bring Western bias, different liberal, conservative values, whatever, as we’re saying that we like one more than another. 

 Stephen McCauley  

Interesting. So, yeah, there’s so much to think about. I know you’re a part of the US AI Safety Institute. Is work in that sort of a space playing a vital role in helping to keep attention on these critical questions around the values that we’re building into the machines. Do you see them playing an effective role as these technologies develop? 

 Katherine Elkins 

I think it’s going to be tricky. A lot has changed since the release of DeepSeek, and we are very much in a global arms race with the EU, China, and the US all competing with each other. China with DeepSeek, this new model, for those who aren’t familiar: extremely, extremely performant, quite small, doesn’t use a lot of energy, and was actually made by a very small team. And one of the reasons why the release of DeepSeek was such a shock for many is that in the US, there was this belief that we needed to keep scaling. As the models got bigger, they got better. And so there was the saying “scale is all you need.” And it was becoming more and more expensive, huge models taking a lot of energy, very expensive to train, and DeepSeek goes in the other direction. So it is really a major event, and it was done by a pretty small team with maybe less money than we have been spending in the US. What does all this mean? Everybody is in competition, and there really is a race, and we really have seen a shift in the conversation away from safety and towards security and even towards national security. There are sovereignty kind of questions, where many nation states are trying to have a sovereign AI where they develop their own AI. So unfortunately, I do see the conversation moving away from safety. We are waiting to see what happens with the US AI Safety Institute. The last time I spoke with folks there to talk about the plans, the plans were on track to continue looking at AI safety but of course, everything is changing quite rapidly. 

 Stephen McCauley 

You mentioned the race, and I hear that, and you mentioned around security and things, but I wonder if there’s more specific, a race toward what? I’m curious if there is a unified vision, at least maybe within the US or within different countries or regions about where, where this is going technologically. I sometimes get a sense that there is a particular vision of what they, the tech companies, are aiming for. And what is that vision? 

 Katherine Elkins 

You know, Sam Altman talks about developing AGI, you know, this super intelligence for humanity. And there is a stress, I think, on all the amazing things that this kind of superintelligent AI, an AI that’s smarter than us can bring: curing diseases, tackling climate change, and there are a whole group of folks who call themselves accelerationists, who just feel we need to bring it on as quickly as possible. There is a good intention behind some of that, obviously. I mean, we could really use these great advances, but I don’t think we’ve fully thought through what it will be like to create intelligence that is better than us, to automate intelligence and creativity. There are real questions about the future of work when we can have agents that will work 24 hours a day on our behalf without the need of a coffee break, sleep, or benefits like health care. So there are big questions here, and I don’t think that we have enough people thinking about that future and what it might look like. 

 Stephen McCauley 

That is really important, and that maybe brings us to a question about education. We’re here in an institute of higher education, and you work in the academic area as well. And of course, we’re all thinking about what this means for what our students will need to be well prepared for lifelong learning and success and professional contribution. Can you speak to maybe what you see as the fundamental things we should all be thinking about in terms of our priorities for educating young people at this time? 

 Katherine Elkins 

Many people do believe that AI poses an existential threat to higher ed, particularly here in the US, higher ed tends to be very expensive, so we do need to think about charging this kind of price, especially if many of the kinds of assignments that we are offering can be done using an AI. So I do think, I wish I could say it was just a fact of faculty learning a few AI tools and having one or two assignments in the classroom. I think if we’re looking a year ahead, that may be the case, but if we’re thinking five years ahead, we’re really talking about thinking deeply about what higher education means, what education means. What does it mean to be someone who is educated? And you know, it’s changing so rapidly that it’s very hard to keep up. One of the things that I try to do as an educator is give my students the intellectual framework: understanding the kind of futureproof theory and concepts behind technology but also the applications, all the different tools, all the different applications so that when they approach a problem that they’re trying to solve, they can actually do so creatively and use all of these tools in the toolboxes. But it’s also a question not just of asking the big questions, about asking what questions we need to ask. And that it is a constantly evolving landscape. 

 Stephen McCauley 

Okay, so that’s good. We can keep asking our students to fine tune the questions they’re asking, right? And I want to talk about your book a little bit, because I love the concept of your book, The Shapes of Story, and I wonder if you could just talk to us a little bit about that book. 

 Katherine Elkins 

Yeah, absolutely. So there were a couple people who started this field before me; Matthew Jockers, who’s now at Apple, wrote a book called The Best Seller code, and he wanted to see if we could use AI to predict a best seller, which, of course, nobody says you can. And in fact, they did use AI and discover a few trends and things that tended to be part of best sellers. Also, some other people who started at the University of Vermont Story Lab working in this field. And so this is looking at the emotional arc of stories. Unfortunately, it didn’t really take off for a variety of reasons, technical reasons. We still had a few methodological questions to figure out. And so that was part of the challenge when I took it up with my collaborator, John Chun. And I really took it up on a lark. You know, is this thing called emotional arc real? Can we use AI to uncover some pattern? And you know, people have been theorizing about the shapes of tragedy and of theater or storytelling for ages, and Kurt Vonnegut actually tried to work on this for a PhD, and he was told it was not academic enough. So we actually got a terrific fiction writer instead. So maybe it’s good that he didn’t become an academic, but he actually has a great lecture he can watch about Cinderella and this kind of shape of a story. 

 So what we do is we take something called natural language processing, and we extend the story time into this sequence of sentences, and we are mapping the sentiment as every sentence unfolds, almost like a stock market ticker going up and down. And then we use some signal processing to smooth that signal, and what we can find are peaks and valleys as these stories, well-told stories, go up and down. And I’d like to say, you know, it’s called the shapes of stories, instead of the shape, because there aren’t just a couple shapes, although there are overall some kind of trends we can see in the types of stories, but each story has a shape as unique as a fingerprint. On the other hand, if you step back and kind of squint and look from a distance, you can see different kinds of stories’ shapes. And for example, the best seller shape that Matt jockers identified is this W shape that descends and then rises and then descends and then rises again, and we see that kind of a shape over and over again. The thing that maybe might surprise people is when we look at these peaks and valleys in narrative, they are often the passages that literary critics tend to pick out. And so what I think that tells us is we have not paid enough attention to the way in which emotion might be a narrative engine for many, many stories, and that actually stories work on us on this emotional level. 

 Stephen McCauley 

Which does make sense, we are moved by stories, right? And you’re starting to discover more about how we’re moved. What does this say about the relationship between machines and humans? That the machines can tell us a little bit about what we actually are responding to? 

 Katherine Elkins 

I think, you know, here the shape of a story is in the language, such a subtle shift of language that is moving up and down for sentiment. And you know, one thing to keep in mind is all words have sentiment. This is not just words about emotion. So you know, if you have crash, bang, you know different kinds of words, you know, blood, all of these words are conveying emotion. Scenery might be described with a certain kind of language that might be still and calm or violent. So we’re really having this shape of this language that’s rising and falling. I think for humans, it’s so subtle and gradual that it might actually be hard to notice that that is what is happening, because it’s happening quite slowly, and it’s very, very subtle. I think really good storytellers… in fact, I was talking with a novelist the other night at dinner about this, and she said, absolutely she knows this that she has to create this shape. So I think very good storytellers tap into this on some level, even if they don’t realize it. But I think as readers, you know, often I ask my students before they use AI to surface the shape, I actually ask them to draw it, because this helps them see where they’ve noticed and where they haven’t. And I’ve had students also try to write stories, and then we map it to see, and we find that when they don’t know where the story is going, it often flat lines. So we can actually use this to help them begin to be attuned to this craft of storytelling and this way in which language can really shape our emotion. 

 Stephen McCauley 

It also kind of unfortunately speaks to the capacity of generative AI to be manipulative, right? Because it can learn to work with our emotions. 

 Katherine Elkins 

This is one of the things that I worry about the most. I did write a piece with John Chun, my collaborator, a couple years ago, debating with a colleague about whether AI will ever write stories. And I think at this point we can see that we definitely are having that ability. There’s been more work done with poetry because it’s shorter, and with screenwriting, we’ve done some of that work as well. You know, one of the hard things right now, for AI, in terms of generating an entire story is we can use AI to see that shape of the story, but keeping the AI focused on that subtle process of the ups and downs having a long form narrative is still a little bit of a challenge. On the other hand, I do think that the emotional EQ of AI in many of these models is much higher than a lot of people realize. And we see that not only because, you know, we have young people using it as a therapy bot, essentially, and you know, school teachers using it to write those difficult emails to the parents and all of this kind of emotional language, it already is very… has a very, very high EQ, and that we have seen evidence that some of these large language models do have the capacity to persuade and to deceive, and they’re also capturing all this human data. So there has been a lot of attention paid to not having all of our data be part of the training data. And there are good reasons for that, because there’s something called data leakage. So say it knows personal things about me and I’m talking to it, and then it’s used to train and then suddenly, out of nowhere, a little piece of personal info pops up in somebody else’s chat, right? That would be the real fear. So, you know, people don’t want that, but I don’t think we’re paying enough attention to the fact that all of these large companies running these models are essentially capturing all of this very personal emotional data. And it’s not that hard to construct kind of “big five” personality ideas about people’s personalities. It’s also not hard using linguistic data to notice certain kinds of disabilities, for example, ADHD and things like this. So I do think we really need to be thoughtful about the fact that we are essentially giving large companies a huge amount of personal data in a much more profound way than we ever have with the Internet and other conversations about privacy and data. 

 Stephen McCauley 

Great, I’m glad you addressed the question of privacy as it came out of the conversation about the emotions and the way that we share with that, so thanks for that. As a final question I wanted to raise, and I don’t know if I mentioned for the listeners benefit, that Kate is originally a scholar in the humanities and comparative literature, and you maintain, obviously, expertise in those areas, and I believe are a Proust scholar initially. So given your background and sort of stories and the big arc of our core kind of cultural narratives, I wonder if you could just speak to us about what is the story of AI that we are seeing unfold around us right now. Maybe where are we on the arc of this story? 

 Katherine Elkins 

This is a real question, because so much of our sci-fi has predicted catastrophic or dystopic kinds of visions with AI development, and there seem to be two kinds of narratives right now out there. There are people suggesting that AI will bring us a utopia, cure all kinds of things. No one will have to work. We can spend all of our time making art and enjoying ourselves and fostering human connections and being in nature, whatever we want to do, right? And that certainly could be a world. And then there are the possibility of mass unemployment and political unrest, and even potentially AI being used by a bad actor to do all kinds of terrible things. Some AI researchers are concerned about existential risk in AI potentially going rogue. So there are all kinds of visions on this other end too. Of course, the most mundane answer will be we imagine the best or imagine the worst. But probably the reality will be some complex combination of both, and that is the most likely. But I do think we should spend quite a bit of time thinking about our future and what it means to be human in a future which we’ve automated a lot of our intellectual tasks, some even of our creative tasks, how will we spend our time? What do we want that world to look like? And I do worry a little bit that we don’t have enough people thinking about that and working on that story. 

 Stephen McCauley 

Great, so that’s something for all of us to think about. Thanks, Kate. This has been really interesting and enjoyable, and I really appreciate your thoughts on this. 

 Katherine Elkins 

My pleasure. Thank you for having me.