Using AI to build a DIY customer sentiment analysis solution


Over the past 15 years, marketers have focused primarily on the quality of the structured data in their CRM and marketing automation platforms.

In my March article, I discussed why current data governance and management processes must be revisited. This time, I will maintain my focus on unstructured data by diving into the challenges of customer satisfaction surveys.

We’re all familiar with customer satisfaction surveys. For our purposes here, there’s no need to differentiate between an NPS (Net Promoter Score), five-point rating or similar customer experience metrics. Instead, we’re going to focus on the free text box, which is often the last part of these surveys.

How we got here

Whether you are directly responsible for customer surveys or you’re partnering with a customer success team, extracting actionable insights from free-text comments is a common challenge. 

There are several reasons for this. 

The platform-related challenge: Today’s survey tools range from form-only solutions like Google Forms and Typeform, to full survey solutions like SurveyMonkey, and full enterprise customer experience platforms like Qualtrics and Medallia.

Process-related challenges: There is a range of different survey approaches, resulting in different data formats and different roles responsible for analyzing survey data.

People-related challenges: These include prioritizing voice of customer (VOC) programs relative to other initiatives, properly analyzing unstructured comments, and relying on quantitative metrics that are simply easier to measure, such as year-over-year trends. 

In the automation era, we might also be guilty of over-surveying, which leads to even more data and a problem of volume and velocity.

‘Black box’ solutions and limitations

Many survey tools and customer experience platforms have embedded algorithms and natural language processing (NLP) capabilities to conduct some analysis. Think of NLP and corresponding algorithms as keyword matching on steroids. They use the surrounding context to help structure the unstructured free-text feedback into various sentiment categories, like positive, neutral or negative.
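To make the "keyword matching" baseline concrete, here is a minimal sketch of plain keyword counting, without the surrounding context that real NLP adds. The word lists and the function name `keyword_sentiment` are hypothetical and chosen only for illustration; no survey platform is known to work exactly this way.

```python
import re

# Hypothetical word lists, for illustration only; real tools use far richer models.
POSITIVE_WORDS = {"great", "love", "fast", "helpful", "easy"}
NEGATIVE_WORDS = {"slow", "broken", "late", "rude", "confusing"}


def keyword_sentiment(comment: str) -> str:
    """Label a comment by counting positive and negative keyword hits."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(keyword_sentiment("Checkout was easy and delivery was fast"))      # positive
print(keyword_sentiment("Support was rude and the order arrived late"))  # negative
print(keyword_sentiment("Not great"))                                    # positive
```

The last call shows the weakness of bare keyword matching: "Not great" is labeled positive because the negation is ignored. Handling that kind of context is what the "on steroids" part refers to.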

Some platforms also include human-in-the-loop feedback cycles that ask users to help categorize feedback, particularly if there is additional context in the free text box.

Because this functionality is often embedded in the survey tools, there are a few potential issues:

  • Trust gaps: You don’t know precisely how the sentiment analysis algorithm categorizes the free text.
  • Capacity hurdles: Team members must invest time to help with system categorization.
  • Implementation trade-offs: Teams have to use centralized tools to get analysis capabilities.
  • Cost: It’s another system to buy and manage and more integration work.

We can now chat with our data

Thanks to generative AI tools like ChatGPT and Gemini, today we all have a black box at our fingertips and can use it to test and validate sentiment categorizations. We can go beyond the positive, negative or neutral rating by probing deeper to understand the specifics.

If your team finds the cost of a full-featured customer survey and analysis platform too steep for its budget, you could even build a DIY solution.

Let’s examine how I put this approach to the test.

First, I created a set of synthetic customer data using ChatGPT to avoid any data privacy concerns. I prompted it to provide sample records for customers who engaged with an online retailer, including each customer’s overall satisfaction rating and open-text feedback.

A sample of synthetic customer data generated by ChatGPT.

I then uploaded the file into Google Sheets and clicked on the “Analyze this data” button, which puts Gemini’s AI model directly into Sheets.

Gemini's 'Analyze this data' button.

Within seconds, Gemini generated a top-line summary of qualitative results with broad-based trends to characterize the data.

Gemini analysis of the data.

Without leaving Google Sheets, the embedded Gemini chat asked me if I’d like to delve deeper.

Gemini's offer to delve deeper.

This inline capability demonstrates why the new DIY approach will be so impactful. It’s no longer about separate tools, as I literally chatted with my data to delve deeper with each prompt, like this:

Prompt for free text feedback analysis.
Summary of initial themes in the data.

I know what you’re thinking: Shouldn’t I verify this myself? 

Yes, but this top-level summary gave me a starting point. It’s no different from what I’d receive from an entry-level analyst (which I’d also want to verify). The critical difference was my ability to do this directly inline, without leaving my data sheet and analysis flow.

I then simulated a similar test. I uploaded the data directly to Gemini and prompted it to do a more complete sentiment analysis.

Observations from open feedback from Gemini.

If you’ve used some of the newer LLMs, you’ve seen how they show their thinking to users. In this case, that’s a critical dimension, as the visible reasoning showed me Gemini’s sentiment classification method.

Survey platforms with embedded analysis tools might link unstructured data feedback to the overall satisfaction score. I asked my LLM to determine the sentiment exclusively on the text without anchoring it to the overall rating.
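One way to keep the model from anchoring on the rating is to leave the rating out of what you send. The sketch below is not the prompt used in this test; the function name `build_text_only_prompt`, the prompt wording and the sample rows are all hypothetical, and it only assembles the prompt text without calling any AI service.

```python
def build_text_only_prompt(comments: list[str]) -> str:
    """Build a prompt that asks for sentiment from the comment text alone."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(comments, start=1))
    return (
        "Classify the sentiment of each customer comment as positive, neutral or negative.\n"
        "Base each label only on the comment text. No satisfaction ratings are provided.\n"
        "Give a one-sentence reason for each label.\n\n"
        f"{numbered}"
    )


# Hypothetical survey rows: (overall rating, free-text comment)
rows = [
    (5, "Delivery took two weeks longer than promised."),
    (2, "The support agent was friendly and fixed my issue."),
]

# Drop the rating column so the model cannot anchor on it.
prompt = build_text_only_prompt([comment for _rating, comment in rows])
print(prompt)
```

The resulting text can be pasted into a chat window or sent through whichever approved AI tool your organization uses.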

The LLM identified four specific examples in which the anchored sentiment differed from the direct sentiment and why. This could spark other customer service or follow-up actions, all within minutes and within my control.

Key findings from Gemini analysis.
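The comparison between anchored and direct sentiment can also be scripted once you have text-only labels. This is a minimal sketch under stated assumptions: a 1-5 rating scale, assumed cutoffs (4-5 positive, 3 neutral, 1-2 negative), and hypothetical records in which `text_sentiment` stands in for labels returned by an LLM. It is not the method Gemini used.

```python
def rating_to_sentiment(rating: int) -> str:
    """Map a 1-5 satisfaction rating to a sentiment label (assumed cutoffs)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def find_mismatches(records: list[dict]) -> list[dict]:
    """Return records whose text-only sentiment differs from the rating-anchored one."""
    mismatches = []
    for record in records:
        anchored = rating_to_sentiment(record["rating"])
        if anchored != record["text_sentiment"]:
            mismatches.append({**record, "anchored_sentiment": anchored})
    return mismatches


# Hypothetical records; "text_sentiment" stands in for labels returned by an LLM.
records = [
    {"id": 1, "rating": 5, "text_sentiment": "negative"},
    {"id": 2, "rating": 4, "text_sentiment": "positive"},
    {"id": 3, "rating": 2, "text_sentiment": "positive"},
]

for row in find_mismatches(records):
    print(row["id"], row["anchored_sentiment"], "vs", row["text_sentiment"])
```

Records 1 and 3 are flagged here. In practice, those flagged rows are the ones worth reading in full and routing to customer service for follow-up.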

Implications for marketers and the martech stack

If you’re currently using a full-featured survey and analysis platform, this exercise might raise new questions about what embedded algorithms do and how they do it. Alternatively, if you run surveys outside of a platform, you’re more capable than ever of separating the collection and analysis of data.

For SMBs, this exercise demonstrates a cost-effective, DIY approach to survey collection and analysis. These teams can leverage basic forms solutions (e.g., Google Forms) instead of full survey platforms.

Data privacy and confidentiality

It would be irresponsible to conclude this exercise without addressing one of the major reasons these approaches may be limited initially: the lack of clarity regarding data privacy and confidentiality guidelines.  

Most experts recommend that teams conduct this type of analysis only within the walled garden of their organization’s Team or Enterprise versions of ChatGPT, Microsoft Copilot or Google Gemini Workspace, because of those versions’ data retention and do-not-train-the-model settings.

Teams should consult their organization’s legal, compliance, and IT policies before placing sensitive data in an AI/LLM platform. Of course, many individuals may be experimenting already, which leads to the critical step of revisiting your organization’s data governance policies. The latest wave of “chat with CRM” connectors will accelerate these concerns and heighten the need to adjust compliance frameworks.

Flip your feedback process

Although it’s still early, I am encouraged by these initial tests of “chatting with data.” Like every other AI-infused trend, the technology has outpaced our ability to adjust processes and platform management. However, when analyzing free-text feedback becomes as easy as running quantitative summaries, my hope is that we can flip our feedback processes to focus more on what customers said rather than only how they rated us.

Contributing authors are invited to create content for MarTech and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. MarTech is owned by Semrush. Contributor was not asked to make any direct or indirect mentions of Semrush. The opinions they express are their own.


