conversational dynamics. Unlike prior work that relied on external dialogue corpora, we
construct a transition matrix from annotated ChatGPT responses themselves. By applying
Markov chains to track shifts in meaning, we provide a simple and reproducible way
to see how ChatGPT moves between definitions, examples, clarifications, and eventually
drifts off-topic. This approach fills a gap in the literature by modeling semantic drift with
probability, making the process transparent and scalable.
2.1. Markov Chains for Semantic Modeling. Markov chains are a simple yet powerful
way to model how things change over time. Imagine a system that moves from one
state to another, such as shifting from “explaining” to “giving an example,” based only on
where it is right now, not on the full history. This idea is called the Markov assumption:
the next move depends only on the current state. Formally, if $X_t$ represents the semantic
state at time $t$, then:
\[ P_{ij} = P(X_{t+1} = s_j \mid X_t = s_i, X_{t-1}, \ldots, X_0) = P(X_{t+1} = s_j \mid X_t = s_i). \]
The behavior of a Markov chain is governed by a transition matrix $P$, where each entry
$P_{ij}$ represents the probability of moving from state $s_i$ to state $s_j$. Each row of the matrix
sums to 1, because at every step the system must transition to some state (possibly the same one).[1],[2]
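The row-sum property and the way a transition matrix moves probability forward in time can be illustrated with a short Python sketch. The 3-state matrix below is a hypothetical toy example, not the matrix estimated in this paper, and the function names `is_row_stochastic` and `step_distribution` are chosen here for illustration only.

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative values only,
# NOT the matrix estimated from ChatGPT responses in this paper).
P_toy = np.array([
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7],
])

def is_row_stochastic(matrix, tol=1e-9):
    """Return True if all entries are non-negative and every row sums to 1."""
    matrix = np.asarray(matrix, dtype=float)
    rows_ok = np.allclose(matrix.sum(axis=1), 1.0, atol=tol)
    return bool(np.all(matrix >= 0) and rows_ok)

def step_distribution(pi0, matrix, n_steps):
    """Propagate a row-vector distribution forward: pi_n = pi_0 @ P^n."""
    pi0 = np.asarray(pi0, dtype=float)
    return pi0 @ np.linalg.matrix_power(np.asarray(matrix, dtype=float), n_steps)

# Start with certainty in the first state and look two steps ahead.
pi_2 = step_distribution([1.0, 0.0, 0.0], P_toy, 2)
```

Because each row of the toy matrix sums to 1, the propagated vector `pi_2` is again a probability distribution over the three states.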
In this paper, we use this method to study how ChatGPT’s responses shift between
different types of meaning, like “Definition,” “Example,” or “Repetition.” We treat each
of these as a state in the chain and simulate how ChatGPT moves between them over
time. By doing this in Excel, we uncover patterns in how ChatGPT stays on topic,
repeats itself, or drifts away. It’s a simple, transparent way to understand how these
models behave—no deep coding or AI expertise needed.
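The simulation in this paper is carried out in Excel. For readers who prefer a script, the same idea can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name `simulate_chain`, the three-state label list, and the matrix values are all hypothetical and serve only to show how a path is sampled one step at a time from the current state's row.

```python
import numpy as np

# Hypothetical labels and transition probabilities for illustration only;
# these are NOT the values estimated in this paper.
toy_labels = ["Definition", "Example", "Repetition"]
toy_matrix = np.array([
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
])

def simulate_chain(matrix, start, n_steps, seed=None):
    """Sample a path of state indices of length n_steps + 1.

    Each next state is drawn using only the row of the current state,
    which is exactly the Markov assumption.
    """
    matrix = np.asarray(matrix, dtype=float)
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        next_state = rng.choice(matrix.shape[0], p=matrix[current])
        path.append(int(next_state))
    return path

# Sample a 10-step path starting from "Definition" and map it to labels.
sample_path = simulate_chain(toy_matrix, start=0, n_steps=10, seed=42)
sample_labels = [toy_labels[i] for i in sample_path]
```

Fixing `seed` makes the sampled path reproducible, which mirrors the reproducibility goal stated above; different seeds give different plausible paths through the states.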
3. Methodology
We define six semantic states based on functional roles in ChatGPT responses:
\[ S = \{s_1, s_2, s_3, s_4, s_5, s_6\} = \{\text{Definition}, \text{Example}, \text{Meta-comment}, \text{Repetition}, \text{Clarification}, \text{Off-topic}\} \]
3.1. Transition Matrix Construction.
To understand how ChatGPT’s responses shift over time, we assigned each utterance to
one of six categories: Definition, Example, Meta-comment, Repetition, Clarification, and
Off-topic. We collected 50 responses from ChatGPT and annotated them according
to these roles. By looking at pairs of consecutive utterances, we could see how meaning
moved from one state to another. The probability of moving from state $s_i$ to state $s_j$ was
calculated using a simple formula:
\[ P_{ij} = \frac{N(s_i \to s_j)}{\sum_{k=1}^{6} N(s_i \to s_k)} \]
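The counting estimate above can be sketched in Python as follows. This is an illustrative sketch, not the authors' workflow (which uses Excel). It assumes each annotated response is stored as an ordered list of state labels, and the two example sequences are invented for demonstration rather than taken from the 50 annotated responses. The function name `estimate_transition_matrix` is likewise hypothetical.

```python
import numpy as np

STATES = ["Definition", "Example", "Meta-comment",
          "Repetition", "Clarification", "Off-topic"]
INDEX = {name: i for i, name in enumerate(STATES)}

def estimate_transition_matrix(sequences):
    """Count consecutive label pairs and normalize each row.

    Returns (counts, P), where counts[i, j] = N(s_i -> s_j) and
    P[i, j] = counts[i, j] / sum_k counts[i, k].
    Assumption: a state with no observed outgoing transitions keeps an
    all-zero row instead of being smoothed or made absorbing.
    """
    n = len(STATES)
    counts = np.zeros((n, n))
    for seq in sequences:
        ids = [INDEX[label] for label in seq]
        for i, j in zip(ids, ids[1:]):  # pairs of consecutive utterances
            counts[i, j] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row_totals,
                  out=np.zeros_like(counts), where=row_totals > 0)
    return counts, P

# Invented example annotations (NOT data from this study).
toy_sequences = [
    ["Definition", "Example", "Clarification", "Example"],
    ["Definition", "Definition", "Example", "Off-topic"],
]
toy_counts, toy_P = estimate_transition_matrix(toy_sequences)
```

In this toy input, Definition is followed by Example twice and by Definition once, so the estimated probability of moving from Definition to Example is 2/3. Rows for states that never appear as a starting point stay at zero, which signals that more annotated data would be needed before simulating from them.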
The resulting transition matrix, built directly from ChatGPT’s annotated responses, is
shown below: