Transcript of the AImpactful Vodcast
Branislava Lovre: Welcome to AImpactful. Today, we’ll talk about AI and product management. Our guest is Chris Butler, Staff Product Operations Manager at GitHub. Let’s start with the basics. What is most important when we are talking about AI and product management?
Chris Butler: I worked at Kayak for a while, right at the time when mobile usage was about to overtake desktop usage, and my specific role was to focus on mobile. I often think back to that time because we didn’t know much about how people would use Kayak, or other websites, on mobile. The iPhone had already come out, but we were still building a lot of this stuff.
One interesting revelation was that even though people were searching more on mobile sites and apps, they were not buying more. That told us mobile was entering earlier in the process: because the phone is ubiquitous and available everywhere, you don’t have to sit at your computer to research a trip.
You could just be sitting around with friends and say, “Well, what if we went to this place? How much would that cost?” and do it immediately. Usage patterns started to change: people were doing things earlier because they could do them wherever they wanted. But they were less likely to make purchasing decisions on the phone, because if you’re going to spend $1,000 on a flight, you want to be sure you got the right one, and doing that on a small screen, without all your tabs open, is hard. The heuristic that came out of that was to be available anywhere, which is the goal of the mobile site, but not to force everything onto mobile, because there are different levels of trust involved. Providing continuity from the mobile experience to the desktop experience, so people could complete the transaction there, became the more valuable thing.
I think the same thing applies to AI. We’re still learning how to include these types of machine learning models in products. PMs have to worry a lot about the data being used to train these models because of privacy and consent laws. For example, Illinois, Texas, and Washington have restrictions on facial recognition, meaning that even on a state-by-state basis inside the United States you have to have different privacy policies.
So you’ve got to ensure the data is collected correctly and used to train the right model. PMs need to get more involved in this, asking questions about the data without being experts in data science. Sometimes, when collecting that data, you may need to build the product that does data labeling and annotation. So, there are multiple users inside this world, not just the customer but also the data annotator.
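One way to picture the annotator as a user in their own right, as Chris describes, is a minimal labeling queue. This is a hypothetical sketch (all names are invented, not from any real product): the annotator labels items, and the training pipeline only consumes items where enough annotators agree.

```python
from dataclasses import dataclass, field

@dataclass
class LabelTask:
    item_id: str
    payload: str                      # the raw data to be labeled
    labels: list = field(default_factory=list)

class AnnotationQueue:
    """Tiny in-memory queue serving two user types: the annotator,
    who labels items, and the training pipeline, which consumes
    only items with enough matching labels."""

    def __init__(self, required_votes: int = 2):
        self.required_votes = required_votes
        self.tasks: dict[str, LabelTask] = {}

    def add(self, item_id: str, payload: str) -> None:
        self.tasks[item_id] = LabelTask(item_id, payload)

    def annotate(self, item_id: str, label: str) -> None:
        self.tasks[item_id].labels.append(label)

    def ready_for_training(self) -> list:
        # Release only items where enough annotators agree unanimously.
        out = []
        for task in self.tasks.values():
            if (len(task.labels) >= self.required_votes
                    and len(set(task.labels)) == 1):
                out.append(task)
        return out
```

Even in a toy like this, the agreement threshold is a product decision, which is exactly the kind of question a PM ends up owning without being a data scientist.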
Branislava Lovre: How do we build a good team for an AI project? Who should be part of it?
Chris Butler: A lot of people need to be involved. Think of the original Venn diagram for product management: three circles for business, technology, and design. Product managers are now creating a central place to exchange ideas between not just those three domains but many more. Legal, privacy, safety, security, and integrity are all important. At a startup, that may all be one person, but having someone who cares about those issues is essential.
The difference between qualitative user research and data science also matters. Data science helps you understand behavioral data and ask the right questions, while qualitative research with the people you’re building for provides the continuity between those two aspects. Strong partnerships between these roles are crucial. Not every team has its own user researchers; some will need to work with a centralized user research team or a centralized data science team.
Branislava Lovre: Regarding ethical implications, what’s the main message when implementing AI?
Chris Butler: When I was at Google, one of the last things I worked on was the efficiency and efficacy of TPU usage. TPUs, like GPUs, are in high demand, so if you waste time on them there’s an opportunity cost: you could have used that capacity to train the model again in a better way. There are a lot of questions a PM needs to be involved in. Do you want to maximize utilization? Is it valuable, in this case, to use all of your resources to train this one model? How often do you need to retrain it, based on things like drift happening out in the world?
It’s not a product management decision to pick the model architecture or the training methodology. But PMs are there to have that conversation with the experts: should we be spending as much money, energy, and water as we are, especially from a sustainability standpoint, to train and update these models?
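The retraining question Chris raises can be made concrete with a toy drift check. This is purely illustrative (the function name, threshold, and the idea of using a simple mean-shift test are assumptions, not anything Google uses): flag a model for retraining when a live feature’s mean drifts too far from what was seen at training time.

```python
import statistics

def needs_retraining(train_stats, live_values, threshold=2.0):
    """Hypothetical drift check: flag retraining when the live
    feature mean drifts more than `threshold` standard deviations
    away from the mean observed at training time."""
    mu, sigma = train_stats        # (mean, std dev) recorded at training
    live_mu = statistics.fmean(live_values)
    return abs(live_mu - mu) > threshold * sigma
```

In practice teams use richer statistics than a mean shift, but even this sketch shows the trade-off a PM weighs: a lower threshold catches drift sooner at the cost of more frequent, expensive retraining runs.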
Branislava Lovre: What’s the message for companies, both small and large, considering or struggling with AI implementation?
Chris Butler: We’re living in such an interconnected world that we need to think about these things as ecosystems. Ecosystems have been in and out of favor when we talk about how marketplaces work, but when we solve problems for customers, what they actually get is an ecosystem of different solutions. For really early companies, and this is regular product management advice, it’s about focusing on a particular niche, a particular problem, and solving it in a very error-free way for someone. There are lots of ways different machine learning technologies can help with that.
Being niche-focused ends up being more successful for startups, because growth then becomes a question of widening their customer segment over time or getting into different domains.
The machine learning fits inside that. In those cases you may not have the money to train a model from scratch, so the question becomes: can you solve the problem well enough with off-the-shelf tools, or open-source models that you fine-tune? Don’t over-invest in pure machine learning, because the community and the ecosystem you build are more valuable. If you’re just a wrapper for ChatGPT, ChatGPT is going to do that eventually; they are going to be every wrapper you could ever imagine.
So it’s the relationship with your customer, or the way you fit into a system, that becomes valuable. For very large companies, I think the issue is that they focus so heavily on return on investment that they are not very good at experimenting or exploring new places to go. The question becomes: how do you allow for rapid experimentation using these tools? That is all about learning, not about maximizing ARR. Almost every large company I’ve ever been part of has a really hard time doing a great job of experimentation. So: how do you allow the teams themselves to do good experimentation?
I also worked as an innovation consultant back when design thinking was really popular. We would be hired to generate ideas, but the continuity of those ideas into an actual product was never ensured, because the team themselves are the experts in the problems. Giving them the space to experiment is much better than creating a completely separate team. Of course, there should be research and development organizations discovering new technologies, but that is building solutions and then looking for problems they solve; that’s R&D. The people on the teams building the products are the ones who know the problems and the next place they need to go. They usually just need the space to experiment and start working on those things.
That’s what I would say to large companies: allow for that type of thing. It’s not just the R&D team off in the corner writing papers. It’s about how a product team gets the space, and honestly the budget, to experiment with new things without having to prove ROI before they do it.
Branislava Lovre: You have worked on different projects, including AI and machine learning. Can you share the challenges and lessons learned?
Chris Butler: When I worked at Facebook Reality Labs, I was on the Portal device, a video-calling device that came as a TV attachment and as a standalone. One of the things I worked on was whether we could use facial recognition to personalize the use of the device, which requires certain types of models: face matching and recognition models.
Interesting issues come up when we get hyper-technical about how we ask questions about a model. Working with my engineering and research team, I could specify things like the minimum brightness of the room in lumens, or the minimum number of pixels for a face to be recognizable. But what we really want to get at is: which everyday failure cases are acceptable and which are not? That gets us to something I learned about at Google, and have found very effective for these conversations, called design policies. With design policies, you look at the non-deterministic behavior and ask three things: what do we want the system to do; what counts as an acceptable failure, where the system is not totally correct but the user would still be okay with it being wrong; and what harms come out of it doing the wrong thing. Face and voice biometrics on the Portal device is an example of where we did that.
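The design-policy idea above can be sketched in code. This is a hypothetical illustration, not Google’s actual framework: each failure case is classified as intended behavior, acceptable failure, or harmful failure, so the team reviews concrete cases rather than raw thresholds.

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    INTENDED = "intended"                       # the behavior we want
    ACCEPTABLE_FAILURE = "acceptable_failure"   # wrong, but the user is still okay
    HARMFUL_FAILURE = "harmful_failure"         # wrong in a way that causes harm

@dataclass
class DesignPolicyCase:
    scenario: str
    outcome: Outcome
    rationale: str

# Invented examples for a face-recognition personalization feature.
POLICY = [
    DesignPolicyCase(
        "Recognizes an enrolled user in good lighting",
        Outcome.INTENDED,
        "Core personalization behavior"),
    DesignPolicyCase(
        "Fails to recognize a user in a dark room and falls back to a generic profile",
        Outcome.ACCEPTABLE_FAILURE,
        "User can sign in manually; no harm done"),
    DesignPolicyCase(
        "Misidentifies a guest as the owner and exposes private messages",
        Outcome.HARMFUL_FAILURE,
        "Privacy breach; must be designed out or gated"),
]

def harmful_cases(policy):
    """The cases the team must mitigate before launch."""
    return [c for c in policy if c.outcome is Outcome.HARMFUL_FAILURE]
```

The point of the structure is the conversation it forces: engineers can still tune lumens and pixels, but the team aligns first on which failures users will tolerate and which they never should.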
Another example: when I was at a company called Philosophie, we had a client project with Google and PwC around field service operations, the idea that a service engineer goes on-site to help fix something. In this case it was gas stations, but it could be any repair where a person goes on-site. A bunch of people interact to solve these problems: dispatchers, warehouse workers, field service reps, the customer, and others. We experimented with a lot of different machine learning systems there. When we talk about AI and machine learning, we’re really talking about many different things: image analysis, summarization, predicting what the problem is from a textual input, or even a conversational agent inside a group chat between the dispatcher, the field service engineer, and the warehouse worker. We tried all of those, and we did a lot of the research by checking these things with the people who were actually going to use them, to see what benefit they might get. Those are the two main examples I’d pull out, but I have plenty of other stories.
Before I was at the Google Core Machine Learning group, I worked at a company called IPsoft, which had Amelia, a conversational agent and a platform for applying conversational agents to enterprise use cases. Even the startup I did around restaurants fits here: if it were done today we would call it AI for restaurants; back then we called it business intelligence for restaurants. A lot of it was about how you do seating effectively under constraints, how you auto-generate text, and how you help the host and the manager do a better job. There are a lot of lessons in there about how you should build these things.
Branislava Lovre: How do you use AI in your daily work at GitHub?
Chris Butler: I do use generative systems, for one, to give me new ideas. I wrote an article about getting just enough confusion: I use card decks and similar tools to create random prompts or provocations for myself, to make sure I’m thinking about as many things as possible. If I ask ChatGPT for 20 ideas in a domain, 15 to 18 of them are boring ideas I’ve already thought of, but one or two are things I just hadn’t thought of yet, and they extend the way I think about the problem. Good teammates do that for you as well.
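The card-deck technique Chris describes can be sketched as a tiny prompt generator. The decks below are invented examples for illustration, not his actual cards: draw one card from each deck and combine them into a provocation.

```python
import random

# Hypothetical decks; real provocation decks have far richer cards.
LENSES = ["a first-time user", "a regulator", "a data annotator", "a competitor"]
TWISTS = ["with no internet access", "ten years from now",
          "at 100x scale", "if it fails silently"]

def provocation(topic: str, rng: random.Random) -> str:
    """Combine one card from each deck into a prompting question."""
    return f"How does {topic} look to {rng.choice(LENSES)} {rng.choice(TWISTS)}?"

rng = random.Random(7)  # seeded so the draws are reproducible
prompts = [provocation("our onboarding flow", rng) for _ in range(3)]
```

Asking an LLM for 20 ideas plays the same role as the decks: most draws are filler, but the mechanism reliably surfaces the one or two combinations you hadn’t considered.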
I also use it a lot for my science fiction writing; I’ve published a few stories. Sometimes I need help ideating a concept a bit more: what would be on this desk, what would exist in this world? I can ask it for 20 ideas and it gives me idea fodder, but I transform that into the written work myself.
The other thing I’ve been playing with is doing brain dumps into documents on a topic I’m thinking about, then using ChatGPT or a similar system to summarize them. What’s interesting is that if the summary boils down to the key points I actually wanted, I realize it may be a workable draft. There’s something interesting there, longer term, about simplifying the writing we do. We write way too much as product managers: I get humongous documents from people that no one ever reads. Something shorter and more targeted would be valuable.
As for the future of technical and non-technical people working together: today, a product person writes a PRD or specification, a designer reviews it and creates mockups, and an engineer spikes some architecture for it, maybe using code-creation tools, and afterwards creates an architecture diagram for a design review. The problem is that all four of those things, the PRD, the design mockups, the actual code, and the technical design document, fall out of sync.
So I wonder, with summarization and anything-to-anything translation capabilities, whether the PRD could automatically generate mockups that the designer then updates based on what they think is the right experience, which in turn updates the PRD; and whether the PRD and mockups could auto-generate boilerplate code as a starting point within the architecture. Then, as an engineer updates that code, it could output technical design documents for review and update the mockups and the PRD as well. It’s no longer a world of disconnected documents; they’re all updating each other.
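The “documents updating each other” idea could be modeled, purely as an illustration, as a dependency graph: editing one artifact marks everything derived from it stale until it is regenerated or reviewed. The artifact names and the propagation rule here are assumptions for the sketch.

```python
from collections import defaultdict

class ArtifactGraph:
    """Tracks which artifacts (PRD, mockups, code, design doc)
    are out of date relative to the artifacts they derive from."""

    def __init__(self):
        self.links = defaultdict(set)   # artifact -> artifacts derived from it
        self.stale = set()

    def link(self, source: str, derived: str) -> None:
        self.links[source].add(derived)

    def edit(self, artifact: str) -> None:
        # Editing an artifact freshens it but invalidates everything
        # downstream until those artifacts are regenerated or reviewed.
        self.stale.discard(artifact)
        todo = list(self.links[artifact])
        while todo:
            a = todo.pop()
            if a not in self.stale:
                self.stale.add(a)
                todo.extend(self.links[a])

g = ArtifactGraph()
g.link("PRD", "mockups")
g.link("mockups", "code")
g.link("code", "design_doc")
g.edit("PRD")   # mockups, code, and design_doc are now stale
```

A real system would regenerate the stale artifacts with generative models; the graph just makes explicit which documents have drifted and therefore where a conversation or regeneration is needed.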
If something doesn’t make sense or clashes, that’s a conversation people should then have. And any time there’s a comment, a to-do, or a PR, the idea of translating what came out of a code review into non-technical language is very interesting as well.
So I see this idea of more customized terminology, text, and understanding for people in each of their job roles. What are the implications for the trade-offs we need to make between, say, the customer obsession of the product role and the maintainability and long-term viability of the code from an engineering perspective? What are the inherent tensions between those things? If we don’t always identify them ourselves, we should try to get a system to help us identify these common problems. In those cases, we have a provocateur that enables better conversations between technical and non-technical people.
Branislava Lovre: Could you share any current plans or projects you are working on?
Chris Butler: Well, I can’t share much beyond what is already public about Copilot and things like that. But I am working on a project that is kind of an employee manual of the future: what if you worked in an office where there were agents that are either purely synthetic or modeled after real people as digital twins? How do documents become living things? It’s a speculative project about how we will team not only people, but pure AI agents and digital twins of people. If you can’t be at a meeting, can your digital twin be a stand-in? And what types of things would that twin be able to do as a stand-in, versus not?
Even just having your system automatically transcribe a meeting and collect action items for you is the beginning of something like that. But what if it could respond the way you might respond, with the caveat that it’s clearly a digital twin? That gets back to your question about ethics. One of the things I learned at Google came from a team called Moral Imaginations; I can send you a link to their latest paper. The idea is that we can’t make more ethical and moral decisions when we’re building things unless we understand the values we’re actually trying to make decisions around. The key component, which isn’t thought about enough on any product team, is: what are the values we’re trying to use when we build things, and whenever we make a decision, is it in service to those values or not?
Moral Imaginations does that through workshops: the team articulates its values in relation to the values stated at the company level, and then role-plays a futuristic scenario, an extreme version of the technology, to see whether it could have made more moral or ethical decisions in service of those values in that future. That kind of role play is key to actually being able to make those decisions. So the question becomes: for all of the agents that end up existing within these teams, how closely do they have to adhere to the values of the organization?
There are very interesting questions there that are hard to settle. Do you want everybody in an organization to have the same moral compass, or a variety of them, with some convergence or alignment on what acceptable decision-making looks like? I think a lot about decision making and strategy, which is why this subject interests me: you could take a digital twin of me and make a version that is more of one value and less of another.
You could make me more execution-oriented or more exploration-oriented. There are lots of interesting things you can do once you start tuning agents that way. How accurate they stay long term is a good question, because humans are constantly evolving: capturing everything I’ve said up to this point still won’t capture exactly what I’ll say next, though it might be within 90 percent acceptability or something like that. Those are questions we need to grapple with, and a lot of ethical and moral questions come out of them.
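Tuning a twin toward one value and away from another can be sketched as a weighted value profile that scores decisions. All the value names, weights, and options here are invented for illustration; nothing like this is claimed to exist at GitHub or Google.

```python
def decide(options, values):
    """Score each option against a weighted value profile and pick the best.
    `options` maps option name -> {value_name: how well it serves that value, 0..1}.
    `values`  maps value_name -> weight; the weights are the 'tuning' knobs."""
    def score(opt):
        return sum(values.get(v, 0.0) * s for v, s in options[opt].items())
    return max(options, key=score)

# Two hypothetical choices a team twin might weigh in on.
options = {
    "ship_now":  {"execution": 0.9, "exploration": 0.1},
    "prototype": {"execution": 0.2, "exploration": 0.9},
}

# The same twin, tuned two different ways.
execution_twin   = {"execution": 1.0, "exploration": 0.3}
exploration_twin = {"execution": 0.3, "exploration": 1.0}
```

The sketch also shows why organizational alignment matters: two twins with different weightings will disagree, and the team still has to decide how much divergence in decision-making is acceptable.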
Branislava Lovre: You have watched another episode of AImpactful. Thank you and see you next week.


