Is ChatGPT Ready for the Enterprise?

February 15, 2023

Recently, we were visiting a client to demonstrate the latest version of our application when a participant asked, “Why aren’t we using ChatGPT?” It was a good and logical question, given the attention that ChatGPT and other AI-based solutions are receiving these days. While these tools, built using complex machine-learning components like large language models (LLMs) and neural networks, offer much promise, their implementation in today’s enterprise should be weighed carefully.

Rightfully so, ChatGPT and similar AI-powered solutions have created quite a buzz in the industry. It really is impressive what they currently do, and they offer much future promise. Since those of us in the technology world have been inundated with questions and remarkable tales about ChatGPT and similar tools, I took it upon myself to do a little experiment.

The Experiment

As a die-hard Cubs fan, I hopped over to the ChatGPT site and asked: “Which Cubs players have won MVPs?”

It provided a list of names that, on the surface, appeared correct. However, a few minutes spent on Google confirmed that one answer was factually wrong, as were some of the supporting facts about correctly identified players.

Impressively, a follow-up question, “Are there any others?”, produced another seemingly accurate list of results. ChatGPT remembered the context of my first query and answered appropriately. Despite this, further investigation confirmed that, once again, not all of the information returned was correct.

As this tiny sample shows, any organization needs to tread carefully when considering implementing ChatGPT and other AI-powered solutions in their current form. It’s quite possible that they will create more problems than they solve.

Here is a list of the top issues to consider before embarking on an AI-based search solution like ChatGPT.

Accuracy Issues

For all their potential, current implementations of these tools are haunted by one fact: they can return blatantly false information. As shown above, a sizable portion of the answers were wrong, especially in response to the follow-up question. Unfortunately, this is a common experience.

Further, there is no reference information returned with the result. This produces more questions than it does answers. What is the “source of truth” for the query response? What authoritative document states this information that can be referenced and verified?

Granted, when you perform a search on a traditional keyword search engine, you can sometimes get nefarious, outdated, or incorrect results. Still, these search engines are not selling the promise that they’re returning the single, definitive answer to your question. You are presented with a list to sift through, and you make the final decision on what is relevant to your particular needs.

While it’s entertaining to ask ChatGPT – “What is Hamlet’s famous spoken line and repeat it back to me in a pirate’s voice” – would you really want to base an important business decision on feedback that is often inaccurate and unverifiable? All it takes is being burnt by one wrong answer for your users to lose faith in the system.

Complexity and Expense

I like to joke with my clients that we can build any solution quickly, cheaply, and impressively but that they have to pick two of the three. With an AI-based solution like ChatGPT, you may only get to pick one. Implementing an AI solution is inherently complex, expensive, and time-consuming, and there’s no “point and click, out of the box” option. Relevant tasks to prepare AI for the enterprise include:

Designing and planning for both hardware and software;
Collecting relevant and accurate data to feed into the system;
Building relevant models and training them on your domain-specific knowledge;
Developing a user interface;
Testing and analyzing your results, then iterating, perhaps multiple times, to make improvements; and
Operationalizing the system into your existing infrastructure, including data integration, support, and monitoring.

Additionally, projects like these require developers with niche, advanced skills. It’s difficult enough finding experienced developers to implement basic keyword search solutions, let alone advanced AI logic. Those who can successfully build these AI-based solutions are few and far between, and in software development, the time of highly skilled developers comes at a significant cost.

Lack of Explainability

AI-based solutions like ChatGPT tend to be “black box” solutions: although they are powerful, the logic they use to return results is virtually impossible to explain to a user, if it’s even available.

With traditional search engines, the scoring algorithms to rank results are easier to understand. A developer can compare the scores between documents in the result set and quickly understand why one appears higher than the other. Most importantly, this process can be explained to the end user, and adjustments to the scoring can be made easily through search relevancy tuning.
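To make that contrast concrete, here is a minimal sketch of explainable scoring. It assumes a deliberately simplified TF-IDF weighting rather than the formula of any particular search engine, and the documents, query, and function names are all invented for illustration. The point is only that each result's score breaks down into per-term contributions that a developer can read and compare.

```python
import math
from collections import Counter

# Hypothetical three-document corpus, invented for illustration.
DOCS = {
    "policy": "travel expense policy for employee travel",
    "handbook": "employee handbook and benefits overview",
    "memo": "office closure memo",
}


def explain_scores(query, docs):
    """Return {doc_id: {term: contribution}} using a simplified TF-IDF weighting."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    breakdown = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        per_term = {}
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0 or counts[term] == 0:
                continue  # the term contributes nothing to this document
            idf = math.log(1 + n_docs / df)  # rarer terms weigh more
            per_term[term] = counts[term] * idf
        breakdown[doc_id] = per_term
    return breakdown


def rank(query, docs):
    """Sort documents by total score, highest first."""
    breakdown = explain_scores(query, docs)
    totals = {doc_id: sum(parts.values()) for doc_id, parts in breakdown.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in rank("employee travel", DOCS):
        print(doc_id, round(score, 3), explain_scores("employee travel", DOCS)[doc_id])
```

In this toy corpus, the "policy" document outranks the "handbook" document because it matches the rarer term "travel" twice, and the breakdown says exactly that. Relevancy tuning amounts to adjusting weights like these, which is why the outcome can be explained to an end user.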

Searching in the enterprise is a different paradigm than the impersonal world of Google, Amazon, and e-commerce search applications. Your users are employees, and you must ensure they are empowered to have productive search experiences. If users can’t intuitively understand why a particular result is showing up for their query, they’re more likely to question the tool’s accuracy. This is especially true for certain users, like librarians, legal assistants, or researchers, who have very specific search requirements and need to understand the logic of the search engine before they trust it.

User Experience and Readiness

The user experience for a tool like ChatGPT will be markedly different from that of a traditional search application. For starters, many of the rich features to which users have grown accustomed, such as faceting, hit highlighting, and phrase searching, are currently unavailable in ChatGPT.

Furthermore, consider whether your users are actually ready to leverage an AI-based solution. For example, how do they normally search? Are they entering one or two keywords, or are they advanced enough to ask natural language questions? If they’re accustomed to using keywords, a single-term query won’t produce markedly better results in an AI-based solution than in a traditional search engine.

Conclusion

Although the current version of ChatGPT may not deliver immediate value to your organization, it still has significant potential. We’re focusing our current research on a couple of areas in particular. Its capabilities around categorization and auto-summarization are very promising and could easily be leveraged in tandem with the more ubiquitous keyword search engines. Categorization lets you tag your content with key terms and provides rich metadata that powers functionality like facets. Meanwhile, auto-summarization creates short abstracts of your lengthy documents. These abstracts, properly indexed into your search engine, can serve as the basis for providing more accurate search results.
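As a rough sketch of how category tags turn into facets, the example below assumes the tags have already been produced by some upstream categorization step. The documents, tag values, and function names are hypothetical, and no AI service is called here.

```python
from collections import Counter

# Hypothetical documents; each "tags" list stands in for the output of
# a categorization step, which is assumed rather than shown.
DOCUMENTS = [
    {"id": 1, "title": "Travel policy", "tags": ["hr", "finance"]},
    {"id": 2, "title": "Benefits overview", "tags": ["hr"]},
    {"id": 3, "title": "Quarterly budget", "tags": ["finance"]},
]


def facet_counts(documents):
    """Count how many documents carry each tag (the numbers shown beside a facet)."""
    counts = Counter()
    for doc in documents:
        counts.update(set(doc["tags"]))  # set() guards against duplicate tags
    return dict(counts)


def filter_by_facet(documents, tag):
    """Narrow the result set to documents that carry the selected tag."""
    return [doc for doc in documents if tag in doc["tags"]]


if __name__ == "__main__":
    print(facet_counts(DOCUMENTS))
    print([doc["title"] for doc in filter_by_facet(DOCUMENTS, "finance")])
```

In a real deployment the search engine would compute these counts itself from the indexed metadata. The sketch only shows why consistent tags are the prerequisite for that feature.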

It’s perfectly acceptable to be both impressed by the promise of tools like ChatGPT and skeptical of how well their current offerings will meet your real-world search needs. If your organization is grappling with this decision, contact us, and we can help you navigate this exciting journey.
