Me, Myself, and AI Episode 907

Authoring Creativity With AI: Researcher Patrick Hebron


Topics

Artificial Intelligence and Business Strategy

The Artificial Intelligence and Business Strategy initiative explores the growing use of artificial intelligence in the business landscape. The exploration looks specifically at how AI is affecting the development and execution of strategy in organizations.

In collaboration with BCG

If you’ve played with Photoshop’s Generative Fill feature or worked in Nvidia’s Omniverse platform, you’ve touched tools that Patrick Hebron’s work has made possible.

Patrick, who double-majored in philosophy and film production, approaches creative pursuits with deep curiosity and the belief that if a “tool gets used in exactly the way that we anticipated, then we have really failed catastrophically.” He believes that emerging digital design tools will elevate human creativity, and he aims to develop technology solutions that will empower creative end users to continue to push boundaries.

On today’s episode of the Me, Myself, and AI podcast, Patrick describes some of the technical challenges in building generative AI solutions for creative pursuits, as well as their vast potential.

Subscribe to Me, Myself, and AI on Apple Podcasts or Spotify.

Transcript

Shervin Khodabandeh: How does the use of generative AI in creative fields translate to opportunities for the future? Find out on today’s episode.

Patrick Hebron: I’m Patrick Hebron, author of Machine Learning for Designers, and you’re listening to Me, Myself, and AI.

Sam Ransbotham: Welcome to Me, Myself, and AI, a podcast on artificial intelligence in business. Each episode, we introduce you to someone innovating with AI. I’m Sam Ransbotham, professor of analytics at Boston College. I’m also the AI and business strategy guest editor at MIT Sloan Management Review.

Shervin Khodabandeh: And I’m Shervin Khodabandeh, senior partner with BCG and one of the leaders of our AI business. Together, MIT SMR and BCG have been researching and publishing on AI since 2017, interviewing hundreds of practitioners and surveying thousands of companies on what it takes to build, deploy, and scale AI capabilities and really transform the way organizations operate.

Hi, everyone. Today Sam and I are excited to talk with Patrick Hebron. He’s the author of Machine Learning for Designers, and he’s held roles at Nvidia Omniverse, Stability AI, and Adobe. Patrick, thanks for taking the time to talk with us.

Patrick Hebron: Thanks for having me. It’s great to be here.

Sam Ransbotham: So I have to say right off the bat, I’m curious about why machine learning is different for designers. What does the “for designers” part of that mean?

Patrick Hebron: When I wrote the book, this was not an intersection that seemed sensible to most people. I’d been working on that intersection since the time of my master’s. I got into the idea that in design, there can be really challenging configuration problems, and machines can help to play a role in figuring out how to sort through lots of different permutations and come to an arrangement that might be useful to people.

So that was what was happening in my own work at the time. And then, as the technology was starting to advance quite a bit, it seemed to me that there were going to be some really big differences in how we thought about the production of software as a result of AI. Conventional software is always correct about mundane things like, say, two plus two, and machine learning, of course, enables you to do much more complex things, like identify faces in photos or a million [other] things, but it’s not always right about those things.

There’s an inherent imprecision to that. And this fact alone carries a huge implication when you’re designing software, thinking about how the user navigates through a process and particularly what happens when they hit a dead end or a misunderstanding.

O’Reilly approached me about writing that book, and I was really excited to tackle this subject and start to help designers to think about how this would transform their practice.

Sam Ransbotham: That is a fundamentally different approach because we’re used to software being very deterministic. We’re used to processes working the same way in production as they do when you test them in the lab, but they work differently when you introduce noise and fuzziness into the whole thing. So how do people deal with the fact that what they test and what they work on isn’t necessarily what happens when it goes into production?

Patrick Hebron: Yeah, it’s funny because I don’t want to liken machine learning models too much to a human, but I guess one thing we do have in common with them is this kind of imprecision. We’re capable of grand notions, but you can sort of never guarantee that what’s in someone else’s head is cohesive in exactly the same way that it is in yours, right?

Sam Ransbotham: Yeah, I find most people are not cohesive with what I think.

Patrick Hebron: Right. Same! So one thing is to remember that there is conventional software still around, and so having a backup plan or reverting to conventional functionality when a more complex AI system fails is one mitigation. Of course, there’s a challenge with that, which is that probably, if what your software is doing required AI in the first place, then the fallback may be difficult because the conventional software is not up to the job. But having the machine sort of present back to the user what it’s understood, I think, is very important so it doesn’t just sort of go off and act on a misconception.
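To make that mitigation concrete, here is a minimal sketch of the fallback pattern Patrick describes, in Python. The function names, confidence score, and threshold are illustrative assumptions, not any product's actual API.

    # Sketch of "revert to conventional functionality" when an AI component
    # fails or is unsure. All names and numbers here are hypothetical.

    CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; tune per application

    def handle_request(user_input, ai_model, conventional_handler):
        try:
            prediction, confidence = ai_model(user_input)
            if confidence >= CONFIDENCE_THRESHOLD:
                # Present the interpretation back to the user before acting,
                # so the system doesn't run off on a misconception.
                return {"result": prediction,
                        "note": f"Interpreted your request as {prediction!r}. Proceed?"}
        except Exception:
            pass  # model unavailable or errored; fall through to the backup path
        # Conventional, deterministic behavior as the backup plan.
        return {"result": conventional_handler(user_input),
                "note": "Handled with the standard (non-AI) workflow."}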

Another challenge is discoverability. We see this with, say, [Amazon’s] Alexa. There are all these features, but they’re hidden somewhere in there, and so how do you know what you are or are not able to do? This, I think, is, in certain ways, a regression from traditional software. Giant menu systems have been sort of the enemy of my career, I guess.

But at the same time, they do have a certain upside, which is, there’s kind of an obvious path to learning what the software can do, right? So you go find some particular feature that you need at the moment. You find it in this menu system, and it’s probably adjacent to some related features, and so this exposes you at least to seeing their names, and perhaps this’ll lead you to explore toward them.

You don’t necessarily have that with, say, an emergent feature set or ability to speak to your computer and ask for something.

Sam Ransbotham: That’s an interesting dimension. I hadn’t really thought about that, but as you were saying that, I was thinking back on my own past life, and there used to be a product called Microsoft FoxPro, which was one of the early database systems. And I, at the time, was super into knowing every possible thing that that piece of software could do.

And one of the things we did was, we opened up the executable and looked for signatures of commands that were in there even if they were not in the documentation. But that doesn’t exist anymore. The world you’re talking about here is very different. There is no executable to open. There is even often no documentation to go through. So this rapid evolution of everything at the same time seems really fascinating. I hadn’t thought about that.

Patrick Hebron: Yeah. And there’s another kind of devious point that comes with what you’re saying, which is that the software could work time and time and time again in relation to, say, some particular type of query. And then, the thousandth time, it doesn’t understand correctly. It completely goes in a different direction.

I think that’s just the nature of inductive learning: You never have sort of a strong guarantee that you have kind of seen all possible things. Like, by learning from experience, you know, we see, say, two cars, and now we have some sense of the range of car sizes, right? We see a million cars, and now we feel pretty confident that we have a real understanding, a mapping of the range of sizes that a car could be. But we really never have a guarantee that we’ve seen the largest or the smallest car. And this kind of fuzziness at the edges is a challenge but, at the same time, gives us everything else that we get with AI, so …
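Patrick's car example can be shown in a few lines of Python. The distribution of car lengths below is invented purely for illustration:

    # Sketch of the inductive-learning point: the observed range of "car
    # sizes" tightens as we see more examples, but no finite sample ever
    # guarantees we've seen the largest or smallest car.
    import random

    random.seed(0)

    def car_length():  # hypothetical population of car lengths, in meters
        return random.lognormvariate(1.5, 0.15)

    for n in [2, 1_000, 1_000_000]:
        sample = [car_length() for _ in range(n)]
        print(f"n={n:>9,}: observed range {min(sample):.2f}-{max(sample):.2f} m")
    # The observed range keeps widening with n; the true extremes stay unknown.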

Sam Ransbotham: Yeah, that’s pretty fascinating too. Again, I hadn’t really thought about how we go through that inductive process there.

Shervin Khodabandeh: Patrick, how did you become interested in these types of problems? Can you tell us a bit about your background?

Patrick Hebron: My path is actually probably a little bit unconventional. As an undergrad, I studied philosophy, particularly aesthetics and semiology, and then also did a second major in film production. For my film production work, I ended up making a narrative film, so that's not that connected to this work. For my philosophy work, I got particularly interested in the American philosopher Charles Peirce, a founder of semiotics, the theory of signs.

For that project, I talked about special effects essentially as an artistic medium, thinking about the implication of a medium where you have the ability to portray anything imaginable, as you do in painting, but with the apparent credibility of photography.

That combination is a really kind of interesting, powerful, perhaps even dangerous thing. At the time, I was imagining how this would come together through, essentially, an advancement in computer graphics. AI wasn’t really on my radar at that point, and also it was a long time from it reaching that kind of ability. But this was really interesting to me.

And so when I got out of undergrad, people said, “What kind of movie would you point me to that kind of goes along with what you’re saying?” I said, “Well, I don’t know that any has actually been made, really, that fits this description yet.”

So I started to try to make those movies, and pretty quickly this led me to feel that the tools that were out there for CG [computer graphics] production were not really set up for the things I was trying to do. As a kid, I had done a little bit of electronics experimentation and a little bit of programming, but not a ton. So it was then that I started to learn how to write software, basically just to build the design tools I needed to make the movies I was trying to make. Very quickly, I realized that my real interest was thinking about how design tools, and particularly AI in design tools, could work.

Shervin Khodabandeh: And that interest has paid off. Maybe tell us a bit about some of the projects you’ve worked on in the past.

Patrick Hebron: Sure. Shortly after writing the O’Reilly book, I was approached by Adobe. At the time, the vice presidents of research and design were talking, and they were anticipating what was going to come in the next couple of years with AI. They could see that there were many places in which AI could come in under the hood of Adobe products and transform the quality of a capability without necessarily changing the user experience in a particularly meaningful way.

An example of that might be something like Content-Aware Fill. Prior to the last couple of years, that feature was implemented using kind of a pattern extension algorithm, and so this would work pretty nicely if you were, say, trying to remove a beach ball from sand, because, of course, sand is a pattern that extends perfectly nicely, and so it works well.

But if for some reason you were trying to fill in a missing mouth region of a human face, then extending the cheeks is, of course, not going to give you what you want. Instead, using neural in-painting, you’d be able to do that much better because you’re learning what to paint there from a large statistical sample of how these different image features relate to one another. And so there, of course, you can get much better functionality.

But from a user perspective, this tool doesn’t need to operate in a very different way. That was the stuff that would be sort of easy for Adobe to integrate into its products. The things that will be more difficult are the things that we’re starting to see now that have no real direct predecessor because previous technologies just couldn’t approach them at all.
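For readers who want to try the neural in-painting idea Patrick contrasts with pattern extension, here is a sketch using the open-source diffusers library. This is an illustrative stand-in, not Adobe's implementation, and the file names are placeholders:

    # Sketch of neural in-painting with an open-source model (not Adobe's
    # implementation). Assumes the diffusers library and a GPU are available.
    import torch
    from diffusers import StableDiffusionInpaintPipeline
    from PIL import Image

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    image = Image.open("portrait.png").convert("RGB")   # source image
    mask = Image.open("mouth_mask.png").convert("RGB")  # white = region to fill

    # The model fills the masked region from learned statistics of how image
    # features relate, rather than extending surrounding pixels as a pattern.
    result = pipe(prompt="a photorealistic human face",
                  image=image, mask_image=mask).images[0]
    result.save("filled.png")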

Sam Ransbotham: Give us some examples there.

Patrick Hebron: Oh, sure. For example, being able to generate an image from text or do something like completely repose a human body.

Perhaps another area that I should touch on, in answer to your question, is latent space navigation. Within its mind, so to speak, a machine learning model produces a kind of interior representation of the variability in the things it has seen and learned from.

And so then that space can be navigated by linear traversal. In one part of that space, you might have things that look like sneakers, and then, fairly nearby, you might have things that look like work boots. And then, very far away, you might have things that look like, I don’t know, teddy bears. So navigating from one to the other is a process by which you can explore and discover what you’re looking for. This can be a really useful design mechanism because it’s like “Well, that’s not quite right, but I want something close to that.”

Being able to look at that and kind of move in that space, it really lowers the barrier to exploration and experimentation in design, where, traditionally, maybe you drew a sneaker and now you want to try out a work boot — you have to completely redo your entire drawing, and this is a very involved process.

This is a really powerful feature. But, of course, this way of thinking about designing something is without precedent, perhaps, and so thinking about what the interfaces for that should look like is a bit of a new design exercise.
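A minimal sketch of the linear traversal Patrick describes, assuming some trained generative model exposes encode and decode functions (hypothetical names):

    # Sketch of latent space navigation: walk a straight line between the
    # latent codes of two designs and decode each step. `encode`/`decode`
    # stand in for any trained generative model; they are assumptions here.
    import numpy as np

    def interpolate_designs(encode, decode, sneaker_img, work_boot_img, steps=8):
        z_a = encode(sneaker_img)    # latent vector for the sneaker
        z_b = encode(work_boot_img)  # latent vector for the work boot
        frames = []
        for t in np.linspace(0.0, 1.0, steps):
            z = (1 - t) * z_a + t * z_b  # linear traversal through latent space
            frames.append(decode(z))     # "close to that, but not quite that"
        return frames  # a sneaker gradually becoming a work boot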

Sam Ransbotham: That seems kind of interesting too, because if you go back to your Generative Fill [example], yeah, it’s pretty clear that “hey, I just want to fill this area,” and you could use a new algorithm to do that better or stronger, faster, quicker, through some ML math. But there’s not an analogy in the user interface for some of these other tools or some of these other ways of working.

So that seems complicated to get a user … you know, in the end, there’s a human sitting there. How do you let them know that they can twist a person, or they can move something in space, or they can make a shoe look like a work boot, or move them toward the teddy bear spectrum? How do you let them know that in an interface without overwhelming them?

Patrick Hebron: Yeah, it’s a great question. It’s funny because I have opinions about this which sort of go in two completely opposite directions. In one case …

Sam Ransbotham: Oh, good. That’s always fun, when there’s tension.

Patrick Hebron: Exactly. You need the tension, in design especially.

From one side, as I was alluding to a little bit just a moment ago, you want to use familiarity where you’ve got it. If there’s some trope from some related area, then why not help acclimate the user to that?

So in the case of something like latent navigation, this could look a lot like a map, right? These sort of destinations of sneaker and work boot and stuff: You could think about them as existing on a surface, and you sort of drive a little further west if you want to get to the beach or, similarly, drive a little further in this direction if you want to get to this type of shoe, right?

I think those kind of cues are really useful. At the same time, I think you have to be careful there because an artistic medium or a design medium is something where the properties of the medium itself are going to have a huge impact on the nature of the output.

Going back to, say, Clement Greenberg, the art theorist, who basically said you shouldn’t make a sculpture that should have been a painting, to kind of truncate that — I think, similarly, you don’t want to necessarily forever have people making art with AI in sort of the same mindset as they would with pre-AI Photoshop.

I think you want to try to engender some open-endedness. And, of course, the users are going to end up doing most of the work for you there because, generally speaking, I think at first what they will do is something very close to the previous paradigm, just like film editing tools really borrow from Steenbeck flatbed film editors; similarly, AI generation looks really close to what people were doing with the previous generation of tools.

And then they start to explore outward, and so that’s what you want to not get in the way of, is their ability to explore outward. To me, as a creator of tools, the most important thing is, we always, in a business setting, talk about use cases and user needs and user pain points, and we try to work out workflows that are well researched in terms of what a user is trying to do. But I always feel that if the tool gets used in exactly the way that we anticipated, then we have really failed catastrophically.

I think that the most interesting things that people do with software are things that are kind of at the margins of what they were supposed to be for. An example I like to give with that is people building things like 8-bit computers in Minecraft. It’s kind of crazy, from a practical perspective. I mean, obviously, this is one of the most inefficient ways you could possibly produce the simulation of a processor. But at the same time, it’s so great, right?

Sam Ransbotham: It’s fascinating.

Shervin Khodabandeh: Actually, can you go back and explain a little bit about that? I think we all have kids who pretty much understand what that is, but maybe explain the 8-bit processor in Minecraft, for people who don’t know.

Patrick Hebron: Yeah, absolutely. Minecraft is a low-fidelity-looking kind of — I guess you could say building-block, open-world game. In it, a user is able to place or remove blocks, and most of those are static — sort of meant to represent concrete and things like that. But there can also be water blocks and fluids and dynamic systems, which means you can move things around the space. As a result, you can simulate data flow, and so it’s actually possible to build a kind of simulation of how electrons would be moving through a chip and, therefore, essentially an emulation of a computer processor.

So these kinds of projects in an environment like that are, to me, kind of the best of an open-ended tool.
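The spirit of that project, computation built up from simple placed pieces, can be sketched outside the game: start from one primitive gate and compose an 8-bit adder from it. This illustrates the idea only; it is not how in-game machines are actually wired.

    # Build everything from one primitive (NAND, standing in for a simple
    # in-game block mechanism), then compose an 8-bit ripple-carry adder.
    def NAND(a, b): return 1 - (a & b)
    def NOT(a):     return NAND(a, a)
    def AND(a, b):  return NOT(NAND(a, b))
    def OR(a, b):   return NAND(NOT(a), NOT(b))
    def XOR(a, b):  return AND(OR(a, b), NAND(a, b))

    def full_adder(a, b, carry):
        s = XOR(XOR(a, b), carry)
        c = OR(AND(a, b), AND(carry, XOR(a, b)))
        return s, c

    def add8(x, y):  # add two 8-bit numbers, bit by bit, gate by gate
        carry, out = 0, 0
        for i in range(8):
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            out |= s << i
        return out  # e.g., add8(100, 55) == 155 (mod 256)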

Sam Ransbotham: A lot of what you come up against here is … I’m going to frame it in [terms of] an exploration/exploitation trade-off, where you want to make it easy to do these incremental improvements because that’s the “filling in the beach ball in the sand.” On the other hand, you want to also support the ability to do something crazy. And it seems like most of what you’ve talked about is in the space of visual design. Does this tension play out elsewhere? You know, we started off with machine learning to classify spam: yes or no, one or zero. Is this spam? Is this fraud? And now we’re talking about filling in mouths and faces. Where’s this design going? What else can we use these tools to design?

Patrick Hebron: We’ve seen, over the last two or so years, a real explosion in the applicability of AI to creative fields. Perhaps in our education system, we’ve come to see too stark a contrast between the arts and the sciences, or between design and engineering, or whatever you like.

I’ll use an art example because I think it’s easier to talk about here. If you’re trying to, say, draw a portrait of a human face, I think most artists would say that it would be a mistake to try to, say, draw the nose to full resolution and detail before moving on to the next feature of the face. Instead, it’s probably advisable to kind of plot out “OK, the nostrils will be approximately here and here, and the two eyes will be approximately here and here.” Now we sort of take a step back; we look at the whole picture: “OK, these look about right in relation to one another.” Now we start to drop into one of those features and add some detail, perhaps one of the eyes.

But now we go back to the big-picture view, and we realize that the eye now looks great, but it’s a little wrong in proportion to the nose, so we’ve got to go adjust that. And so we’re always kind of moving back and forth between these different considerations, and I think that’s very much true in software engineering or in scientific work — that we have to rejigger all of the pieces in relation to one another and then always return back to how this all fits together.

If you think about a technology that can reason from first principles, it’s not just, say, reading books that we’ve written about a disease. It can do trial and error from scratch and come up with solutions that way. Naturally, this would be groundbreaking in the sciences and perhaps lead us to all the things that we have blind spots about.

So process-wise, I think that there’s a lot to be learned from design tools and how we think about tools for engineering and for the sciences. I think we’re really on the precipice of AI playing a very, very transformative role in those fields as well. I’m particularly excited about this because it seems to me that, naturally and understandably, many people are concerned about the role AI is playing in the world. I think, particularly if you look at its embedding in consumer products today, a lot of that does feel very much like sort of a replacement of the human role.

This thing can write an essay for you. This thing can draw a picture for you. So it kind of seems like one in, one out. But it needn’t be a zero-sum game, and particularly when we think about things like pharmaceutical discovery or curing diseases, there’s no reason to not want more help, and I think this can be very much a positive-sum game, where we use this technology to help us in areas where we just can’t deal with the complexity of the possibility space.

We, of course, are, by design, the people who sort of point this toward something, right? So I think the motivation — the ethos, if you will — comes from us, but the AI can play a very meaningful role in how we get there.

Shervin Khodabandeh: So, what makes it hard? That sounds wonderful, but we all know it’s not easy. What makes it hard?

Patrick Hebron: Fair question. One thing that has actually been a real problem in the application to the sciences is the ability to simulate the system that we are acting on. If we look at something like reinforcement learning and, say, DeepMind’s application of it to games such as Go and chess, these games are very easily simulated. Their rules are not particularly complicated, and so, of course, the game can be fully simulated.

So then a reinforcement learning system is basically a learning system that operates similarly to how you might train a dog: If it takes the right action, you give it a reward, and if it takes the wrong action, you give it a punishment. In the case of a machine learning model, that’s a numeric reward as opposed to food, but it’s the same basic idea. These kinds of systems are able to navigate possibility spaces that are just astronomically large. The possibility space in Go is larger than the number of atoms in the universe. You just have to play out millions and millions of games inside of a computer.
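That reward-and-punishment loop can be shown on a toy problem far simpler than Go; every number below is illustrative:

    # Toy version of the loop Patrick describes: tabular Q-learning on a
    # 1-D walk where reaching the right end earns +1 (the "treat") and
    # each step costs a little (the "scolding").
    import random

    random.seed(1)
    N_STATES, ACTIONS = 6, (-1, +1)    # positions 0..5; move left or right
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration

    for _ in range(500):               # play many episodes, as in self-play
        s = 0
        while s != N_STATES - 1:
            a = random.choice(ACTIONS) if random.random() < eps \
                else max(ACTIONS, key=lambda act: Q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else -0.01  # numeric reward, not food
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                                  - Q[(s, a)])
            s = s2

    print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])
    # Learned policy: always move right (+1), toward the reward.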

Sam Ransbotham: I think about the large language models right now: Those things have read more than I will ever read. I mean, just fundamentally, they have ingested much more understanding of language than I will ever — than any single human ever will. Yet I outperform them in much of the writing. So there must be some missing link in there that we’re not quite hooked on, and maybe it’s that scaffolding and transfer that is a chunk of that.

Patrick Hebron: It’s true. Your point leads me to something that I find very, very interesting — something that I don’t know would have been in any way obvious to me if it weren’t for what has happened in the last couple of years of AI — which is that omniscience actually has some downsides.

Sam Ransbotham: I’m well aware of that.

Patrick Hebron: It turns out that reading everything can actually be sort of a way to be kind of unopinionated or not really have a perspective in the world. And so there’s this process that is used as part of the training of language models called RLHF, which stands for reinforcement learning from human feedback. And this is used for multiple purposes. One of them is to condition the model to speak in a more conversational way, as opposed to a kind of autocomplete way, where it ends the sentence for you, and that’s important from a user experience point of view.

But it also kind of helps to “perspectivize” the model — to understand what is a good answer. The way this process works is, basically, you sort of take an already largely trained model, you give it a prompt — some kind of query — you ask it to generate multiple responses, and then you get a human to give a score as to which answer they found preferable.

And so this can help to condition things like how wordy or chatty or friendly or unfriendly to be in the response, but it also has the effect of kind of singularizing the perspective of the model so that it’s not looking at everything from all angles, which is kind of equivalent to looking at it from no angle.
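The scoring step Patrick describes is commonly implemented as a pairwise preference loss on a reward model. Here is a sketch in PyTorch, where reward_model stands in for any network that maps a prompt and response to a scalar score:

    # Sketch of the human-preference step in RLHF: a reward model is trained
    # so the response the human preferred scores higher than the rejected one.
    import torch
    import torch.nn.functional as F

    def preference_loss(reward_model, prompt, preferred, rejected):
        r_good = reward_model(prompt, preferred)  # score for the chosen answer
        r_bad = reward_model(prompt, rejected)    # score for the other answer
        # Bradley-Terry style objective: push the preferred score above the other.
        return -F.logsigmoid(r_good - r_bad).mean()

    # The trained reward model then conditions the language model (via RL)
    # toward answers humans scored highly, "singularizing" its perspective.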

Sam Ransbotham: That’s an interesting analogy there. Let me transition, given your omniscience here. We have a segment where we try to ask some rapid-fire questions. We’re just looking for the first thing that comes off the top of your mind here. What do you think is the biggest opportunity for AI right now?

Patrick Hebron: Advancing science.

Sam Ransbotham: What’s the biggest misconception that people have?

Patrick Hebron: That there’s no future for humans.

Sam Ransbotham: What was the first career that you wanted?

Patrick Hebron: You know, when I was about 5, this was when virtual reality was kind of coming around for the first time and I was very excited about that. I thought it was very cool and would make kind of replicas of VR headsets out of aluminum foil — not functional ones, but just the design. So VR designer was really my first major career aspiration.

Sam Ransbotham: When is there too much artificial intelligence?

Patrick Hebron: That’s a great question. When we reach the point that we sort of don’t feel the motivation to try to make the world better ourselves. If we lose touch with that, then there’s kind of no point in AI being able to do it.

Sam Ransbotham: All right. So, what’s one thing you wish artificial intelligence could do for us now that it cannot?

Patrick Hebron: It’s funny because, in my own life, I feel pretty content about things. I don’t feel like “Oh, there’s some kind of missing capability,” so I get very excited about the technology, but, at the same time, it’s almost for the sake of just seeing what’s possible more than feeling like there’s something missing. But, perhaps, more tangibly …

Sam Ransbotham: Can you build an 8-bit computer in Minecraft?

Patrick Hebron: Exactly, exactly. It’s kind of an intellectual conquest, I guess. But, no — I mean, I think it’d be very cool to have home robots.

Sam Ransbotham: This is quite fascinating. I can’t believe all the topics we’ve covered. I think you’ve really opened my eyes to the potential for design. We’ve done a lot with design so far, but the potential is really pretty fascinating. Thanks for taking the time to talk with us today.

Patrick Hebron: Thanks so much for having me. It’s been really a pleasure. Such a fun conversation.

Shervin Khodabandeh: Thanks for listening. Next time, Sam and I close out Season 9 by speaking with Joelle Pineau, vice president of AI research at Meta. Please join us.

Allison Ryder: Thanks for listening to Me, Myself, and AI. We believe, like you, that the conversation about AI implementation doesn’t start and stop with this podcast. That’s why we’ve created a group on LinkedIn specifically for listeners like you. It’s called AI for Leaders, and if you join us, you can chat with show creators and hosts, ask your own questions, share your insights, and gain access to valuable resources about AI implementation from MIT SMR and BCG. You can access it by visiting mitsmr.com/AIforLeaders. We’ll put that link in the show notes, and we hope to see you there.

