The 4 Generative AI Challenges You Can't Ignore (Guest: Quentin Reul)
Sep 01, 2024 | Season 3 | Episode 22
Andreas Welsch
In this episode, Quentin Reul (Founder & Chief AI Officer) and Andreas Welsch discuss the four key challenges of Generative AI that you can't ignore. Quentin shares his story about leveraging business data as a differentiator and provides valuable advice for listeners looking to overcome the most common issues surrounding Generative AI applications.
Key topics:
- Describe the impact of the evolving Generative AI landscape on business strategies
- Assess the “fitness” of different Large Language Models to solve particular customer needs
- Select technologies that can address the challenges of Generative AI
- Understand the role of data in the current regulatory landscape
Listen to the full episode to hear how you can:
- Concentrate on the problem rather than the technology
- Leverage your business data as a differentiator of your AI applications
- Drive business outcomes based on data-centric products and applications
*********** Disclaimer: Views are the participants’ own and do not represent those of any participant’s past, present, or future employers. Participation in this event is independent of any potential business relationship (past, present, or future) between the participants or between their employers.
Andreas Welsch:
Today we'll talk about the four generative AI challenges that you cannot ignore. And who better to talk about them than someone who's passionately sharing advice on them: Quentin Reul. Hey Quentin, thank you so much for joining.
Quentin Reul:
Hi, thank you for having me today.
Andreas Welsch:
Wonderful. Why don't you tell our audience a little bit about yourself, who you are, and what you do.
Quentin Reul:
Yeah, so my name is Quentin Reul. I'm originally from Belgium. I did my bachelor's degree and my PhD in the UK, and I've now been in Chicago in the US for 10 years. Outside of work, I do martial arts with my kids and my wife, and it's a lot of fun. From a work perspective, I worked at a Fortune 500 company before, and I'm currently helping companies on their journey into AI strategy by bringing in UX, but also the technology and deep technical knowledge.
Andreas Welsch:
That's awesome. Again, thank you so much for being on the show today. And I know you've been sharing a lot of your advice and experiences on social media lately. Should we play a little game to kick things off? Wonderful. Hang on. Let's see. This game is called In Your Own Words, and when I hit the buzzer, the wheels will start spinning. When they stop, you'll see a sentence, and I would love for you to answer with the first thing that comes to mind, and why. In Your Own Words. To make it a little more interesting, you only have 60 seconds for your answer. And folks, for those of you in the audience, you can participate too. Put your answer and why in the chat as well. Quentin, are you ready for What's the BUZZ? Okay, here we go. Almost. Here we go. It is a live show. If AI were a fruit, what would it be? 60 seconds on the clock, and go.
Quentin Reul:
I think it would be an apple, inasmuch as you have a lot of layers. It tastes good if you eat the skin, but if you remove the skin and the layers and you dig deeper into the apple, the more you know and the better you can leverage it.
Andreas Welsch:
Wonderful. And I think there's a lot there at the core as well that you need to know about, and how to get close to it.
Quentin Reul:
And it grows again if you use the seeds, so it's regenerative.
Andreas Welsch:
Haha, perfect! I love that answer. Excellent. And I see comments here from Ashwin. He's joining us from Hyderabad, and he said, yeah, something sweet. Exactly. I think that's where we're still at, in that phase where it's sweet and we're figuring out what else we can do with it. Can we make apple pie and apple sauce and apple cider and whatever else? So very versatile too. Okay. So with that out of the way, why don't we talk about the topic of the show: the four generative AI challenges that you need to know about. If you look at the history of the last, what, two and a half years, ever since ChatGPT has come on the map, I think there's been not only so much hype, but also already so much maturity, so much evolution, that I'm curious what you are seeing. How is generative AI evolving, and what is the impact on business strategies?
Quentin Reul:
Yeah, I think if we look at what happened from November 2022 to maybe July 2023, a lot of companies had, first and foremost, the fear of missing out on leveraging generative AI. They were looking at a lot of different use cases, what they could do with it, and what gap it was filling compared to more traditional machine learning. Out of the box, you could now use models that were pre-trained for complex problems, compared to before, where you needed to spend years creating the training sets with thousands of data points. But I think now there is a realization that the direction the big LLM providers like OpenAI or Anthropic are going in and the direction the companies are going in are not necessarily the same. OpenAI is promoting artificial general intelligence, and I think that for businesses, the value is more in narrow AI. They will gain value in content creation and marketing by leveraging AGI, but it's where they are very specific to their niche that the differentiation is really going to come. So I think we're going to start seeing a realization that not all the use cases you have can be fulfilled with large language models. If you want to predict what to put on your shelf tomorrow, an LLM is not going to help you. But if you are, as I say, trying to create content or trying to answer questions, it's there and it can help you.
Andreas Welsch:
See, that's the part where I'm always curious to see the two combined. We know that LLMs hallucinate, they have factual inaccuracies, they have some biases, and they are not even able to do certain tasks that narrow AI, or even predictive analytics and machine learning, can do. I feel that if you combine the two, if you create a prompt where you substitute some information and some variables, and you put in those data points that you have generated using other proven technologies and methods, then it's a good vehicle, right? You leverage the best of both worlds. What are you seeing there?
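The pattern Andreas describes, computing the numbers with a proven method and substituting them into the prompt, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not something shown in the episode: the moving-average forecast stands in for whatever predictive model a business already trusts, and the template wording, function names, and sample history are all hypothetical.

```python
from statistics import mean
from string import Template

# Hypothetical prompt template; the placeholders are filled with values
# computed outside the LLM, so the model only has to phrase them.
PROMPT = Template(
    "Write a short restocking note for store managers.\n"
    "Product: $product\n"
    "Forecast demand for next week: $forecast units\n"
    "Use the forecast exactly as given; do not estimate your own number."
)


def moving_average_forecast(history: list[int], window: int = 3) -> int:
    """Forecast the next period as the rounded mean of the last `window` values."""
    return round(mean(history[-window:]))


def build_prompt(product: str, history: list[int]) -> str:
    """Substitute the computed forecast into the prompt template."""
    forecast = moving_average_forecast(history)
    return PROMPT.substitute(product=product, forecast=forecast)


if __name__ == "__main__":
    # Illustrative weekly sales history (made-up numbers).
    print(build_prompt("apples", [100, 120, 110, 130]))
```

The resulting string would then be sent to whichever LLM is in use; the point of the design is that the figure in the text comes from the forecast function rather than from the model's own guess.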
Quentin Reul:
When I was working at a Fortune 500 company, we were working with a lot of different models, including GPT-4, and there were, to be honest, not a lot of use cases for which we needed something different from GPT-4, because they were generic enough. Now, once you get to the point that you just mentioned, there is what I tend to refer to as the cost to be wrong. The higher the cost to be wrong, the higher the risk you take on from hallucination. If your cost to be wrong is nil or zero, I call it creativity. And I think that's another thing that businesses need to think about: there are plenty of use cases where creativity is important and actually useful. I have the blank page syndrome. I wrote a PhD and it took me a long time, and if I had had generative AI, it would have been much faster. Whenever I try to write, I get stuck on the page and I want to rewrite everything. LLMs are very good for that problem. But if I try to make a judgment, let's say give a treatment to a patient or predict how to apply the law to a particular problem, I can't be wrong. I could cause someone to go to jail or I could cause someone's death. So I think that's where we need to look at the problem at a different scale. I think of what is being done with the EU AI Act, where they have different levels of risk for assessment. And more recently, I saw that the IEEE is actually following that same methodology in terms of accrediting whether your solution is going to be ethical. They look at it from the perspective of low risk, medium risk, high risk, and unacceptable risk. Things that are considered unacceptable risk are things like social scoring, where, for example, you would be in HR and you would use whether people are attending meetings or whether they're sending emails as the way to determine whether or not they are productive and whether they should stay in the company or not.
Andreas Welsch:
Now, I think those are really important aspects, and coming back to that challenge, right? It's first of all understanding what you can use the technology for, and what you can maybe use other technologies for that are better suited, that are maybe cheaper, that are maybe more cost-effective to operate and to build. Certainly everything has a trade-off, but again, you don't need to have a large language model for so many use cases. Now, folks, for those of you in the audience, if you have a question for Quentin or me, please put it in the chat. We'll take a look in a minute or two and pick up those questions.
Quentin Reul:
And I think to the last point that you made, Andreas, that not every problem is a Generative AI problem: I think that's a responsibility of technology leaders, to have a deep enough understanding of the technology to be able to advise the business and their partners that, yes, you could use Generative AI, or you could put an AI label on it, but that doesn't mean that you're going to have a better product for your customers. Because at the end of the day, as a business, how are you making money? What is your return on investment? It is about delighting your customers by providing a solution that is intuitive. If integrating AI makes the solution or the process that people are going through more complicated, you're not really helping your customers, and you're not necessarily going to get the return that you're looking for.
Andreas Welsch:
Now, you've mentioned you've worked at a Fortune 500 company. I did a bit of work in previous corporate roles with Fortune 500 companies as well. And I heard leaders say just recently, at the beginning of the year, "If it's not Generative AI, it's not AI, or we're not pursuing it." And I think that's misleading in so many ways. It's challenging. It's troublesome. To your point, you need to know when to use a large language model and when to use your logistic regression or other capabilities. I'm wondering, from your experience, how can you assess the fitness, if you will, of different LLMs and of different approaches when you want to solve a particular need, so you don't run into the trap of "if it's not Gen AI, we're not doing it"?
Quentin Reul:
Yeah, I think there's been a lot of work done in evaluation and benchmarks, and I think they're providing good insights as to whether or not one model is better than another. But we have to be very careful when looking at benchmarks: they're not designed for your narrow AI problem. They are designed for very generic problems. A coding benchmark looks at different languages, but it's probably not very extensive on SQL. It's also somewhat misleading, because we have seen some companies using part of the benchmark as their training set, and that causes overfitting: the model does what the benchmark expects because it was trained on it. So that's definitely one aspect. You can use a benchmark as a pre-selection for what you're going to look at, based on the type of problem that you have. But after that, you really have to test it. You have the Chatbot Arena, where you can put two different LLMs side by side with your prompt and see which one gives the best results. That's interesting if your problem is a creative one. But if you are a company, you really have to invest in a golden set. You probably don't need as many data points as you needed before. A good training set or a good golden set of 500 data points is probably going to be sufficient, and it gives you the ability to test the different models over time, not only the first time as you're building. A lot of these models are evolving very rapidly, and because they're offered as a SaaS model, the version you use today may be gone tomorrow, replaced by a new one. And there is an opacity problem, inasmuch as you don't know what training data is going in. So you don't know whether the level of fitness of your LLM from your previous assessment is going to hold when you do your next assessment.
So monitoring, and having that golden set for ongoing monitoring, is very important.
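The golden-set idea Quentin describes can be sketched as a small evaluation harness. This is a minimal sketch under stated assumptions: the three-item golden set, the substring-match scoring rule, and the two stand-in "model versions" are all hypothetical placeholders. A real golden set would hold a few hundred curated items, and the stand-in functions would be replaced by calls to an actual provider.

```python
from typing import Callable, Dict, List, Tuple

GoldenSet = List[Tuple[str, str]]  # (prompt, expected answer) pairs

# Hypothetical golden set, kept tiny for illustration.
GOLDEN_SET: GoldenSet = [
    ("What is the capital of Belgium?", "Brussels"),
    ("What is 2 + 2?", "4"),
    ("Which framework is named in this question: LangChain?", "LangChain"),
]


def score_model(model: Callable[[str], str], golden_set: GoldenSet) -> float:
    """Fraction of items whose expected answer appears in the model output."""
    hits = sum(
        1 for prompt, expected in golden_set
        if expected.lower() in model(prompt).lower()
    )
    return hits / len(golden_set)


def find_regressions(scores: Dict[str, float], baseline: Dict[str, float],
                     tolerance: float = 0.05) -> List[str]:
    """Names of models whose score fell more than `tolerance` below baseline."""
    return [name for name, score in scores.items()
            if name in baseline and baseline[name] - score > tolerance]


# Stand-in models (assumptions): replace with real API calls.
def model_v1(prompt: str) -> str:
    answers = {"capital": "Brussels is the capital.",
               "2 + 2": "The answer is 4.",
               "framework": "LangChain"}
    return next((a for key, a in answers.items() if key in prompt), "")


def model_v2(prompt: str) -> str:
    # Simulates a silent provider-side update that breaks one answer.
    return "I am not sure." if "2 + 2" in prompt else model_v1(prompt)


if __name__ == "__main__":
    baseline = {"assistant": score_model(model_v1, GOLDEN_SET)}
    current = {"assistant": score_model(model_v2, GOLDEN_SET)}
    print(find_regressions(current, baseline))
```

Re-running the same harness on a schedule, and whenever the provider announces a new version, is one way to do the ongoing monitoring mentioned above; substring matching is only a placeholder for whatever scoring rule fits the task.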
Andreas Welsch:
That's an important point that you're bringing up. And I feel that hasn't been getting as much attention lately as it probably was a year ago, right? As models change, even if your API stays the same, if the LLM underneath changes from GPT-3 to 3.5 to 4 to whatever is next, you need to do your regression testing again. You need to test whether the same prompts work. We know that between 3.5 and 4 there are differences, right? Differences in creativity and other aspects of these models underneath. So you need to build that into your plan as well, and you need to make sure that you have the resources and the budget to make all these changes, and, especially if vendors are deprecating models, that you can react fast enough to put in the new ones.
Quentin Reul:
And it's very much trial and error. At the end of the day, to give you an example, as I mentioned, I have the blank page syndrome, and I started creating an application to create blogs, because most of the solutions that exist can get you somewhere, but they were not giving me the result that I wanted. I used Gemini. I was writing my prompt, and my goal was to create blogs that were of a certain length, either in time or in number of words. And when you work with LLMs and it fails the first time, what do you do? You go back and you change your prompt, because it's the cheapest thing you can do. You can even tell the LLM to rewrite the prompt in a way that it would understand itself, so it provides more context and writes the information with the right level of instructions, and so forth. But you do that four or five times, and you realize that the prompt is not the problem. In this particular case, I did some digging, and what I realized after a while is that there was no training data about length, and therefore no understanding of it. It could predict the next token, but because there wasn't a notion of length as part of the training data, it was not able to follow my instruction to the end. I think that's where you really need to test. And frameworks like LangChain are very useful for that, because they provide you with an easy way to integrate with different providers quickly. About a year and a half ago, if you were on Azure, the only models that you had were the OpenAI models. That has changed now with Azure AI Studio, where you can have access to Llama and to other models through the Hugging Face integration. But before that, you were stuck with one thing. So at least now you have a bit more choice, and I think that something like LangChain is definitely the best thing to use at the PoC or early analysis stage. I wouldn't necessarily use that framework in production,
because it's very heavy and leverages a lot of other libraries. But in terms of doing that research aspect, what we used to call EDA in more traditional machine learning, if you do that with something like LangChain, then you're going to get faster results and you're going to be able to do that comparison much faster as well.
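The kind of side-by-side test Quentin describes, sending the same prompt to several providers and checking whether the output respects a target length, can be sketched without any framework. This is a minimal sketch under stated assumptions: the provider functions are hypothetical stand-ins that return fixed-length text, and the 10% tolerance is an arbitrary choice. In a real comparison each stand-in would wrap a call to an actual model, whether through LangChain or a provider's own SDK.

```python
from typing import Callable, Dict

Generator = Callable[[str], str]  # takes a prompt, returns generated text


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def meets_length(text: str, target_words: int, tolerance: float = 0.10) -> bool:
    """True if the word count is within `tolerance` (as a fraction) of the target."""
    return abs(word_count(text) - target_words) <= tolerance * target_words


def compare_providers(providers: Dict[str, Generator], prompt: str,
                      target_words: int) -> Dict[str, Dict[str, object]]:
    """Run the same prompt through every provider and report length compliance."""
    report: Dict[str, Dict[str, object]] = {}
    for name, generate in providers.items():
        text = generate(prompt)
        report[name] = {
            "words": word_count(text),
            "within_target": meets_length(text, target_words),
        }
    return report


# Hypothetical stand-ins for real model calls.
def provider_a(prompt: str) -> str:
    return " ".join(["word"] * 480)


def provider_b(prompt: str) -> str:
    return " ".join(["word"] * 150)


if __name__ == "__main__":
    providers = {"provider_a": provider_a, "provider_b": provider_b}
    print(compare_providers(providers, "Write a 500-word blog post.", 500))
```

Measuring the output, rather than rewording the prompt again, makes it visible when a model consistently misses the length no matter how the instruction is phrased.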
Andreas Welsch:
Awesome. And just looking at the chat here, somebody is saying, "I asked it one time if it can count, smiley face. It can't. I'm sure it thought it could." We've seen those examples. It's a good reminder as well. Now, we've already talked about this notion of not giving in too much to the hype that everything should be Generative AI. We talked about the fact that there are some evaluation frameworks and methodologies. But what are some of the technologies that you're seeing that can address those challenges of Generative AI?
Quentin Reul:
I think LangChain is definitely one that can help you with that integration. I think it's so difficult today to stay up to date with all the tools. Every day I'm on TLDR and many other newsletters getting my news on what is happening in AI, and there's just no way you can keep up with all the tools. You have llamafile; you have a lot of things that are happening. We spoke about different aspects, but there is also completion as opposed to chat, and that behaves totally differently. For example, I was asking a normal LLM, I think it was a Llama model, to write a blog. I had given it the instruction, and the Llama model just gave me back the instruction. But then I used the Llama instruct model and provided the same prompt, and it created the blog that I wanted. So again, it's with this trial and error that you will find some of these aspects and be able to address them as you go forward.
Andreas Welsch:
I think that's a really important point as well, right? Do your evaluation, especially now that there is so much choice and it seems like it only keeps getting bigger and more complex. Do your evaluation and see what types of tasks you have and which model performs well, or even better than others. There's one question from Jennifer here in the chat, and she says, "Hey, you mentioned that LangChain is heavy and not suitable in production. Can you share more about this?" Or maybe qualify what you meant when it comes to LangChain and similar tools.
Quentin Reul:
Yeah, so LangChain, because it was designed as a community project that tries to address all of the different LLMs, brings TensorFlow, it brings PyTorch. As you're creating an image, let's say a Docker image based on LangChain, it's going to be bloated. I seem to recall it was something like 6 GB, and my prompt and my code were maybe about 200 KB. That has an implication for how you are going to productionize it. A lot of the use cases that I see are still very much on demand. It's not that you have a lot of use cases where your LLM is going to run 24/7 and process content all the time. One way to minimize your cost and be able to scale is something like Lambda or serverless, depending on whatever infrastructure you are on. But there are limitations. For example, Lambda doesn't support Docker images above a certain size. So now you have to create layers on top, and you are adding complexity by using something like LangChain. It has got a lot of things that may be useful, but if you're using OpenAI, you don't need to have access to Oracle Cloud Infrastructure. You don't need access to Bedrock, because you're not going to use any of these components. Whereas if you were using the OpenAI or the Microsoft Azure API directly, it's going to be much leaner, because it's only going to have what is necessary to do the job.
Andreas Welsch:
Wonderful. Thank you for summarizing that. I think that was really tangible and some good advice there on the complexities that you inherit when you go down a certain path. You also mentioned that you don't have a continuous loop, that it's not always on, depending on the use cases that you see, especially in enterprise. To me, that comes back to data, to pipelines, these kinds of topics. And you alluded to that earlier. Yes, large language models help us get to production faster. We might not necessarily need to have a data scientist or be a data scientist to get something out there, at least something that's good enough. I'm curious, what are you seeing? What's the role of data when it comes to LLMs in the current landscape, especially looking at regulation? At the beginning, you talked about the EU AI Act and IEEE. What role does data play there?
Quentin Reul:
Yeah, I think it goes back to the goals of the companies as well, like OpenAI and AGI, as I mentioned earlier, versus narrow AI. If you take a large language model that includes a lot of information and you want to apply it to a narrow problem, like what you have in your business, trying to make it unlearn things that may be causing hallucination is not that easy. Now, if you take a smaller model, like a Phi model or something else, and you fine-tune it with your own data, which you've been gathering for many years, then you actually have that unfair advantage that you want to have against your competitors. If you're all working out of GPT-4, without any data that is local to your problem or to what you've built over the years, you're no different from the other companies that are using GPT-4. Now, if you're using your data and you're integrating it, either through few-shot learning or, more generally I think, through fine-tuning on a smaller model, that's the point at which you can differentiate. And today, from Gemini to H2O, there are so many platforms that are making it easier for people to fine-tune their models. The cost of fine-tuning is also much lower than what it used to be. Think about pre-Generative AI, when you needed to have hundreds, thousands, or hundreds of thousands of data points to train your model, while for fine-tuning you need maybe a thousand or two thousand. And that has already proven to show a remarkable difference from using a model out of the box. And to be honest, there are certain problems that an out-of-the-box LLM is never going to address, even though they are generation problems. If you think about relational databases or knowledge graphs and their query languages, yes, there's been some training of the models on SQL and the SQL syntax. But in a lot of these cases, that syntax then needs to map to the underlying data model, which is unknown.
So to really be efficient at creating queries that are going to work with your data, you need to fine-tune the model to your schemas, to your tables, or to whatever else you have used to express that data. Otherwise, you're only going to have garbage in and garbage out, and you're not going to get the answers that you're looking for.
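Quentin's point that valid SQL syntax still has to map to the underlying data model can be illustrated with a small check. To be clear about scope, this sketch does not fine-tune anything; it only shows why schema knowledge matters, by compiling candidate queries against a schema using Python's standard `sqlite3` module. The schema, table names, and the two example queries are hypothetical, and the "generated" queries are hard-coded stand-ins for model output.

```python
import sqlite3

# Hypothetical business schema; a real one would come from your own database.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""


def make_db(schema: str) -> sqlite3.Connection:
    """Create an empty in-memory database with the given schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    return conn


def maps_to_schema(conn: sqlite3.Connection, sql: str) -> bool:
    """True if SQLite can compile the query against the schema.

    EXPLAIN compiles the statement without running it, so unknown tables
    or columns are reported as errors.
    """
    try:
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False


# Stand-ins for model output: both are syntactically valid SQL.
GENERIC_QUERY = "SELECT client_name FROM clients"
SCHEMA_AWARE_QUERY = (
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.name"
)

if __name__ == "__main__":
    conn = make_db(SCHEMA)
    for query in (GENERIC_QUERY, SCHEMA_AWARE_QUERY):
        print(maps_to_schema(conn, query), query)
```

A check like this could also serve as one scoring rule in a golden set for text-to-SQL: a query that does not compile against your schema is wrong regardless of how plausible it looks.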
Andreas Welsch:
Awesome. I love that one, because I think that's such an important one in business, where there is so much data, and that's the key differentiator that makes your business unique, that makes your processes unique, that gives you the edge and has been doing that for a number of years. So bringing that into your LLM application makes the output more specific, or the summarization, or the recommendations if you will. Again, writing-assistant type things or tweaking things. I think that's really key. So thank you for highlighting that as our fourth challenge: you still need data if you want to continue differentiating and not do the things that everybody else is doing as well. Now, Quentin, we're getting close to the end of the show, and I was wondering if you can summarize the three key takeaways for our audience today.
Quentin Reul:
Yeah, the first one is a quote that an old colleague of mine used to have: fall in love with the problem, and I mean the customer problems, the narrow problems that you're trying to solve, not the solution, which in this case is Generative AI. It doesn't apply to all the problems that you may have. Two, as we spoke about, data is still your differentiator. You may have spent a lot of money building a data warehouse over time. It has probably turned into a data swamp by now, but you have that data, and you are able to take a small slice of it, put it back into your model, and fine-tune it to your problem. And the last one: with big data, being data-driven was already going to be the differentiator for companies, but I think with AI it is going to be the game changer. And I get confused when I hear about companies that were data companies becoming AI companies, because I'm pretty sure that in a few months they will have to change their labels back to data companies, because that's the true reason that's going to resonate with their customers a lot better than being AI companies.
Andreas Welsch:
Awesome. Wonderful. Thank you so much. And maybe one question: how can people who have joined today's session or listened to it connect with you and learn more about what you do, and maybe how you can help them?
Quentin Reul:
Yeah. I'm on LinkedIn, and I'm also on X. I have recently started a YouTube channel where I'm putting the recordings of different presentations that I've given, so I'm trying to curate the material as I go along. I have also written a few blogs about how to take a Generative AI problem from inception, using things like the jobs-to-be-done framework, all the way to how to charge for and monetize your solution, depending on how you are applying it. All of that is on my page, and you can reach out to me.
Andreas Welsch:
So folks do make use of that opportunity and reach out to Quentin for more advice and insights. Now, Quentin, again, thank you so much for joining us and for sharing your expertise with us. I think it was a great session. I learned a lot and hope you and the audience did as well.