Understanding the Evolution of AI Chatbots: A Guide based on learnings from the Jisc LearnWise Pilot

1. Introduction

Chatbots are not new. Many colleges and universities have experienced them over the past decade, often with mixed results. But while the use cases have broadly remained the same — providing instant, automated responses to common queries — the technology powering today’s chatbots is fundamentally different from that of earlier systems.

This guide explains what has changed, and why those changes matter. It focuses on the rise of large language models (LLMs) and how these models are now being applied in education to deliver smarter, more accurate, and more flexible chatbot assistants.

This report forms part of a wider series supporting the adoption of AI in education. For guidance on practical implementation, see Implementing LearnWise in Colleges and Universities: A Practical Guide based on learnings from the Jisc LearnWise Pilot.

2. From Scripted Responses to Contextual Understanding: How Chatbots Have Changed

Then: Scripted Chatbots with Limited Scope

Earlier generations of chatbots typically relied on rule-based systems and limited natural language processing. These bots used an intent recognition engine to match user queries to a predefined set of responses. For example, if a student typed “How do I reset my password?”, the chatbot would match that to an intent such as IT help: password reset and return a scripted response.

These systems could be effective for a narrow set of questions, but they lacked flexibility. If a user asked the same question in an unexpected way — “Can you help me with my login?” — the bot might fail to understand. Furthermore, updates and expansions required technical staff to add new intents and responses manually.

Now: Conversational AI Built on LLMs

Modern chatbots powered by large language models do not rely on pre-scripted paths. Instead, they can generate appropriate responses even for queries they haven’t seen before.

Rather than mapping queries to fixed categories, LLM-powered chatbots interpret meaning from the full context of a user’s question. This enables them to respond more naturally, handle a wider range of phrasings, and manage follow-up questions without losing context.

This flexibility is especially valuable in education, where students may ask the same question in many different ways, or combine multiple requests in a single message.

3. Key Technologies Behind Today’s Chatbots

What is a Large Language Model?

Large language models (LLMs) are AI systems trained on vast amounts of text data to predict the next word or token in a sequence. They are essentially predictive engines: given a string of words, they calculate the most likely continuation, one token at a time.

For example, if given the phrase “I need help with my…”, an LLM might predict tokens like “assignment,” “login,” or “course,” depending on context.

Flow diagram showing that the sentence "I need help with my..." can be completed in alternative ways, and that assignment has been selected as the most likely next word in this case.

Figure 1. A diagrammatic representation of how LLMs predict the next word in a sequence

This process is repeated many times per second, allowing the model to generate full, fluent sentences that often seem remarkably human.

How LLMs Use Attention

A key feature of modern LLMs is the attention mechanism, part of the transformer architecture that allows the model to consider the importance of each word in a sentence when making predictions.

Unlike earlier models that processed language word by word in a fixed order, attention allows LLMs to dynamically link words and phrases across long stretches of text. This gives the model a much more powerful and nuanced grasp of language structure, meaning, and intent.

Why LLMs Are Well-Suited for Chatbots

LLMs can adapt to a wide range of queries, including vague, informal, or multi-part questions. In educational contexts, this means:

Handling informal phrasing: “When’s the deadline for hand-in?”

Understanding context: “Where do I upload my assignment — the link’s not working.”

Managing follow-ups: “What if I miss it?”

Because LLMs generate responses based on patterns in language rather than predefined rules, they can deal with a broader range of user inputs — making them significantly more effective than older systems.

Making Chatbots Accurate and Contextual: RAG and Prompting

While LLMs are flexible, their general training does not include institution-specific information. To provide accurate responses in local contexts, they are paired with additional technologies.

Retrieval-Augmented Generation (RAG)

RAG allows chatbots to search a curated knowledge base (such as student handbooks, policy documents, or FAQs) and use that content as the basis for their answers. The process works like this:

A user asks a question.
The system retrieves relevant documents or snippets from the knowledge base.
The LLM generates a response using the retrieved content.

This approach ensures that the chatbot’s answers are grounded in authoritative, institution-approved sources, reducing the risk of errors.

Prompt Engineering and Guardrails

Chatbots can be configured using prompts — structured instructions that shape how the model responds. This might include:

Encouraging a helpful and concise tone

Including follow-up suggestions

Redirecting high-risk queries to human teams

Platforms like LearnWise allow institutions to define these behaviours in detail, and to specify different settings for different audiences (e.g. students vs staff).

Managing Hallucinations and Incorrect Answers

LLMs can sometimes generate confident-sounding but incorrect answers — a phenomenon known as hallucination.

To reduce this risk, educational chatbots typically include:

RAG grounding, so responses are based on factual data

Fallbacks or disclaimers when the chatbot is unsure

Analytics and monitoring tools to identify and address problematic responses

Escalation pathways that redirect users to appropriate support when needed

During the LearnWise pilot, participating institutions reported high accuracy and minimal hallucinations.

4. A Tale of Two Chatbots: Diverging Experiences from Jisc Pilots

To date, Jisc has run two AI pilots that focused on chatbots. In 2021, we launched a project in which a chatbot developed in-house was piloted with 5 colleges. This chatbot was built using Amazon Lex, and utilised the intent matching approach described above. In 2024, we launched a pilot of LearnWise, an AI assistant that harnessed LLMs as a key part of its architecture. The difference in performance of these two chatbots was staggering.

Our evaluation from the 2021 pilot found that the chatbot in question:

Rarely gave an appropriate answer to the question being asked

Often gave irrelevant or null responses

Was only effective for very simple queries

Demanded significant time from staff, who needed to continually review the content used to set up the chatbot

Tended to consume more staff time than it saved

Contrast this with LearnWise. In the 2024 pilot, institutions reported a markedly improved experience. LearnWise consistently delivered accurate, relevant responses across a range of student and staff queries. Unlike the 2021 pilot, where staff spent considerable time maintaining and troubleshooting the chatbot’s intent library, LearnWise’s LLM-based architecture enabled more flexible and natural interactions with minimal manual upkeep.

Participants highlighted several strengths: a straightforward and efficient setup process, and high-quality, context-aware responses. Crucially, LearnWise avoided many of the issues that plagued earlier systems — including irrelevant answers and poor handling of follow-up questions. Feedback from institutions noted that hallucinations were rare, and that responses were professional, accurate, and, where necessary, empathetic — particularly important for queries relating to wellbeing and safeguarding.

Staff appreciated the ability to customise the chatbot for different audiences and use cases, and several institutions viewed the assistant as a contributor to long-term strategic goals like improving retention and reducing workload.

Jisc’s own implementation of LearnWise on our Explore AI platform echoed these findings. The assistant has been well received by users, delivering clear, relevant answers and demonstrating strong reliability across a broad range of queries. Furthermore, our ExploreAI Bot has been straightforward to administer, requiring markedly less setup and upkeep work that the chatbot we piloted in 2021.

5. Summary and Next Steps

Chatbot technology has evolved. While earlier systems relied on rigid scripts and limited language processing, today’s AI chatbots are built on large language models capable of understanding, reasoning, and generating context-aware responses.

These improvements are not just theoretical. As shown in the LearnWise pilot, modern chatbots can provide fast, accurate, and relevant answers – which are grounded in trusted institutional data.

For guidance on how to deploy a chatbot in your institution, including integration, testing, and governance, see Implementing LearnWise in Colleges and Universities.

Find out more by visiting our Artificial Intelligence page to explore publications and resources, learn more about our communities and sign up for our AI Literacy training.

For regular updates from the team sign up to our mailing list.

Get in touch with the team directly at AI@jisc.ac.uk