In the ever-evolving world of software development, it is crucial to continuously refine our testing processes to ensure high product quality and timely delivery. In this blog, we delve into a case study showcasing how our Quality Assurance (QA) team improved chatbot solution delivery for a client by optimizing our testing strategy. By adopting automated solutions and leveraging AI technology, the QA team not only enhanced efficiency and maintained rigorous quality standards but also achieved significant reductions in testing effort.
Project Overview
The client at the center of this project is a leading provider of AI-powered chatbot solutions that local governments and residents use for efficient, effective communication and civic change. Its business focuses on rolling out chatbots across various counties and cities to handle resident complaints about common issues such as potholes, traffic signals, and broken pavements. This particular project was aimed at implementing and enhancing the client’s chatbot solutions by integrating AI features such as automated Web Chat, Text Chat, and Text Alerts.
Challenges Identified
The primary challenge was to streamline the deployment process and ensure consistent quality amidst the growing number of chatbots. Initially, the QA team relied on manual testing for new chatbot implementations, reserving automated testing for regression. However, as the project advanced and the volume of chatbots increased – along with the rollout of a new chatbot version – balancing timely delivery with maintaining high-quality results became increasingly difficult.
To address these issues, the SourceFuse QA team began identifying and analyzing potential bottlenecks. It found several areas requiring re-evaluation and strategy adjustments, including:
Test Data Setup Issues:
Difficulty in identifying realistic user queries for chatbot testing hampered the formalization of test data setup in the automation framework, which in turn affected new chatbot testing. Although the client provided a list of keywords related to specific issues, such as “pothole” and “mosquitoes,” the QA team struggled to generate diverse user queries because users often phrase the same request in many different ways, which complicated the creation of effective test data.
For instance, potential queries for the keyword “pothole” could include variations like “I want to report pothole issues,” “Pothole,” “Large pothole,” and “I need to report a large pothole on the street.” Because the chatbot needed to recognize this large number of variations, it had to be tested thoroughly to ensure it returned a valid response to each one.
To compile these user queries, the QA team conducted extensive research on the county portals, ran Google searches to review public complaints, and analyzed test data from previous chatbot configurations. This thorough process was essential for identifying user queries, but it was also labor-intensive and prone to manual error.
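To make the problem concrete, the sketch below shows the kind of keyword-to-query mapping the team had to assemble by hand before any automation could consume it. The keywords and phrasings are illustrative examples only, not the client's actual data.

```python
# Illustrative only: the kind of keyword-to-query mapping the team had to
# assemble by hand from portal research and earlier chatbot configurations.
# Keywords and phrasings below are examples, not the client's actual data.
TEST_QUERIES = {
    "pothole": [
        "I want to report pothole issues",
        "Pothole",
        "Large pothole",
        "I need to report a large pothole on the street",
    ],
    "mosquitoes": [
        "Mosquitoes breeding near my house",
        "How do I report a mosquito problem?",
    ],
}

def as_test_rows(queries: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Flatten the mapping into (keyword, user_query) rows for a test data sheet."""
    return [(kw, q) for kw, variations in queries.items() for q in variations]

for keyword, user_query in as_test_rows(TEST_QUERIES):
    print(f"{keyword}: {user_query}")
```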
Test Case Creation:
The QA team encountered challenges with manually creating hundreds of test cases for each onboarding user story, often with minimal variation between similar cases. Most chatbots share common feature implementations with only minor variations, such as chatbot titles, greeting messages, service request responses, query responses, color codes, message button functionalities, hyperlinks, and multi-language support. Test cases for verifying chatbot responses to the variety of user queries were similarly repetitive and required specific minor adjustments to meet different chatbot requirements.
Initially, the QA team manually adjusted or created these test cases to align with each of the new chatbot requirements. However, this method proved to be inefficient and time-consuming due to its repetitive nature.
Manual Execution of Similar Test Cases:
Repetitive testing practices led to inefficiencies, particularly because test cases were executed manually despite an automation framework being in place. For a typical onboarding ticket, the client provided 25 to 40 keywords, each generating about 10 to 15 associated queries. The QA team adhered to a conventional approach, manually certifying each chatbot by executing around 450-500 queries per new chatbot.
After successfully passing the manual test cases, they would then transition to preparing the regression suite for automation. This process was both time-consuming and inefficient, highlighting the need to optimize the automation framework to streamline testing efforts.
Automation Framework Issues:
The automation suite, used solely for regression testing, took increasingly long to execute and occasionally produced flaky results, necessitating further optimization.
Transforming the Approach to Testing
As part of their commitment to continuous improvement, the QA team concentrated on refining their testing strategy with the following objectives:
- Reduce Testing Efforts: Streamline both the manual and automated testing processes to minimize effort while maintaining high quality.
- Maintain Quality: Ensure rigorous quality standards are upheld despite the reduction in testing efforts.
- Enhance Efficiency and Effectiveness: Implement practices that boost the overall testing efficiency and effectiveness.
To achieve these objectives and address the identified challenges, the SourceFuse QA team developed a revamped testing approach.
Test Case Replicator for Similar Test Cases:
- The team implemented a test case replicator tool in Excel that, based on stored template values, created similar test cases for verifying various chatbot responses and other common features with a single click. This tool significantly reduced the manual effort involved in generating similar test cases; a simplified sketch of the underlying idea appears below.
- This video demonstrates how the test replicator tool, using simple variables and templates, efficiently generated similar test cases:
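For readers who prefer code to spreadsheets, here is a minimal Python sketch of the replicator concept: shared test case templates contain placeholders that are filled in from a per-chatbot configuration. The field names and sample values are hypothetical; the actual tool was built in Excel and is not reproduced here.

```python
from string import Template

# Hypothetical, simplified stand-in for the Excel-based replicator: shared test
# case templates contain placeholders, and per-chatbot values fill them in with
# a single call. Field names and sample values are illustrative, not client data.
TEMPLATES = [
    Template("Verify that the chatbot title is displayed as '$title'."),
    Template("Verify that the greeting message reads '$greeting'."),
    Template("Verify that a '$keyword' query receives the response '$response'."),
]

def replicate_test_cases(chatbot_config: dict) -> list[str]:
    """Generate a concrete test case list for one chatbot from the shared templates."""
    return [template.substitute(chatbot_config) for template in TEMPLATES]

example_bot = {
    "title": "Springfield 311 Assistant",
    "greeting": "Hi! How can I help you today?",
    "keyword": "pothole",
    "response": "Thanks, your pothole report has been created.",
}

for case in replicate_test_cases(example_bot):
    print(case)
```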
Improved Test Data:
- To resolve the primary issue related to test data and enhance the automation framework, the QA team conducted a thorough analysis of chatbot functionalities. Collaborating with the client, they compiled a list of essential keywords and leveraged ChatGPT to generate a diverse range of user queries. This approach captured the many ways end users might phrase a query to the chatbot and helped formalize a robust test data sheet (a sketch of this process follows the list below).
- This video illustrates how clients typically provided keywords like “Pothole,” “Missed Trash Pickup,” and “Dead Animal Pickup” for chatbot responses. The QA team then employed ChatGPT and Gemini AI to generate relevant test data based on these keyword combinations.
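The case study does not describe exactly how the AI tools were invoked; the following is a minimal Python sketch of how such query variations could be generated programmatically with the OpenAI client and written into a CSV test data sheet. The model name, prompt, and output file are assumptions for illustration, and the same idea applies to Gemini with that provider's client.

```python
import csv
from openai import OpenAI  # assumes the official OpenAI Python client is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_queries(keyword: str, count: int = 10) -> list[str]:
    """Ask the model for realistic resident phrasings of a service-request keyword."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                f"List {count} different ways a resident might ask a city chatbot "
                f"about '{keyword}'. Return one phrasing per line with no numbering."
            ),
        }],
    )
    text = completion.choices[0].message.content or ""
    return [line.strip() for line in text.splitlines() if line.strip()]

# Build a simple CSV test data sheet from the client-provided keywords.
keywords = ["Pothole", "Missed Trash Pickup", "Dead Animal Pickup"]
with open("chatbot_test_data.csv", "w", newline="") as sheet:
    writer = csv.writer(sheet)
    writer.writerow(["keyword", "user_query"])
    for keyword in keywords:
        for query in generate_queries(keyword):
            writer.writerow([keyword, query])
```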
Effective Utilization of Automation Suite:
The QA team enhanced its automation framework to use the improved test data sheets and the Test Case Replicator tool to test new chatbots directly via automation. The test data sheets were prepared with the help of ChatGPT during the development phase of the new bots. The enhanced framework used the Test Case Replicator tool to create test cases quickly and, with a few configuration changes, used these data sheets to test newly deployed bots immediately. Using the bot development period to prepare test cases and their data, and then running that data directly through automation, enhanced efficiency and reduced execution timelines.
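The framework itself is not shown in the case study, but a data-driven check of this kind could look like the pytest sketch below, which reads the generated data sheet and asserts that each phrasing maps to the expected keyword. The endpoint, payload, and response format are hypothetical.

```python
import csv

import pytest
import requests

# Hypothetical data-driven check: the real framework, endpoint, and response
# format are not described in the case study, so everything here is illustrative.
CHATBOT_URL = "https://example.org/chatbot/api/message"  # placeholder endpoint

def load_rows(path: str = "chatbot_test_data.csv") -> list[tuple[str, str]]:
    """Read (keyword, user_query) pairs from the generated test data sheet."""
    with open(path, newline="") as sheet:
        return [(row["keyword"], row["user_query"]) for row in csv.DictReader(sheet)]

@pytest.mark.parametrize("keyword,user_query", load_rows())
def test_chatbot_recognizes_query(keyword: str, user_query: str) -> None:
    """Every generated phrasing should map back to the expected service-request keyword."""
    reply = requests.post(CHATBOT_URL, json={"message": user_query}, timeout=30).json()
    assert reply.get("intent") == keyword.lower().replace(" ", "_")
```

In a setup like this, pointing the suite at a newly deployed bot would amount to swapping the endpoint and data sheet in configuration, mirroring the "few configuration changes" described above.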
- Parallel Execution: The QA team leveraged automation for repetitive tasks while conducting manual testing for more complex scenarios:
- Automation Testing: The test case replicator tool, along with the test data sheets, provided coverage of chatbot responses across the various utterances related to the finalized keywords, multi-lingual scenarios, and negative validations.
- Manual Testing: Limited to essential test cases, such as UI validations (chatbot logs, color codes, message avatars, audio jacks), happy path scenarios, and any complex cases.
Optimization of Automation Scripts:
- Code Refactoring: To address long execution times, the automation team refactored the code to make the scripts more robust and their results more reliable.
- Regression Trimming: Optimized the regression testing scope to eliminate redundant checks, thereby decreasing automation script runtime (see the sketch after this list).
- Jenkins Configuration: Configured Jenkins for optimized execution, thereby freeing up local resources for other tasks.
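The case study does not specify how the regression scope was trimmed; marker-based test selection, as in the pytest sketch below, is one common way to keep a lean default regression run while retaining lower-priority checks for on-demand execution.

```python
import pytest

# The case study does not say how the regression scope was trimmed; marker-based
# selection is one common approach and is shown here purely as an illustration.
# Markers would be registered in pytest.ini, and Jenkins could run the lean suite
# with `pytest -m regression`.

@pytest.mark.regression
def test_greeting_message():
    """Core check kept in the trimmed default regression run."""
    ...

@pytest.mark.extended
def test_message_button_color_codes():
    """Lower-risk UI check moved out of the default regression run."""
    ...
```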
Results Achieved
- Improved Delivery Metrics and Velocity:
The team streamlined the certification process for new onboarding user stories, reducing the verification time for a new chatbot from 3-4 business days to just 1.5-2 business days. This significantly accelerated the rollout of new chatbots, allowing the QA team to increase delivery from 3-4 chatbots per sprint to up to 6-8. Furthermore, the team’s ability to pick up and deliver user stories doubled, allowing for more efficient management of both new and legacy chatbot testing.
- Boosted Client Confidence:
Previously focused solely on legacy V1 work, the team struggled to gain traction on the new V2 chatbot version. However, by implementing the Test Case Replicator tool, utilizing test data templates, and refining the automation framework, they reduced manual effort by 50%. This enabled the team to take on additional testing of the new chatbot version (V2), ultimately fostering greater client confidence.
- Improved Query Resolution Metrics:
Through close collaboration with clients and by formalizing end-user queries from the outset, the team effectively minimized issues that arose from earlier vague chatbot questionnaires. This proactive approach not only reduced client concerns but also allowed the team to address user queries more swiftly.
- Improved Defect Detection and Resolution:
By adopting the redefined approach of early testing through the automation framework and by running manual and automated testing in parallel, the team significantly improved defect detection. The automation suite now identifies issues such as response inaccuracies within just 1-2 hours, a major improvement over the previous process, which took up to 2 days for detection and another day for resolution. This shift has markedly enhanced overall software quality.
- Time Savings and Resource Efficiency:
Refactoring the automation scripts reduced their overall runtime by 50%. In addition, with more test coverage handled through automation, testing resources were reallocated to focus on exploratory testing and complex scenarios, enhancing overall testing effectiveness.
- Enhanced Test Practices:
Improved QA processes for test data setup and test case management established more structured testing practices, leading to better overall testing outcomes.
Conclusion
The strategic overhaul of the testing approach throughout this project resulted in a notable reduction in testing efforts while maintaining high quality. By adopting the appropriate tools and optimizing their use, the project achieved significant gains in both efficiency and productivity. This case study underscores the importance of continually evaluating and adapting testing strategies to ensure ongoing improvements and uphold software quality.