About the Explorer

Mapping how AI word usage differs from human language.

Research Motivation

This project is an implementation of research conducted at the FSU NLP Lab. It is motivated by three simple observations:

  • AI systems exhibit a distinct language style;
  • Millions of people interact with this AI-generated language every day; and
  • Sustained exposure to this language may influence how humans write and speak.

One core goal of our research was to develop a fully automated procedure for identifying words that are systematically overused by large language models. Automation is crucial: it allows us to scale the analysis across many languages, domains, and model architectures without manual curation. The Explorer makes this procedure accessible and interactive. For a detailed explanation of the methodology, see the video below.

Visualising the Method

How we track word frequency using windowed prevalence across documents:
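To give a rough feel for the idea, here is a minimal Python sketch of one possible reading of windowed prevalence. It is an illustration under stated assumptions, not the lab's actual procedure: we assume each document is split into fixed-size, non-overlapping token windows, and that a word's prevalence is the share of all windows (pooled across documents) that contain the word at least once. The function names, the simple regex tokeniser, and the default window size are all hypothetical.

```python
import re
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens (assumed tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def split_into_windows(tokens: List[str], size: int) -> List[List[str]]:
    """Cut a token list into consecutive, non-overlapping windows of `size` tokens.

    The last window may be shorter than `size`.
    """
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def windowed_prevalence(docs: Iterable[str], word: str, size: int = 5) -> float:
    """Share of windows, pooled across all documents, that contain `word`.

    This definition is an assumption made for illustration only.
    """
    target = word.lower()
    total_windows = 0
    windows_with_word = 0
    for doc in docs:
        for window in split_into_windows(tokenize(doc), size):
            total_windows += 1
            if target in window:
                windows_with_word += 1
    # Avoid division by zero when there is no text at all.
    return windows_with_word / total_windows if total_windows else 0.0


# Usage: compare the same word in two small, made-up samples.
ai_like = ["We delve into the data and delve again today"]
human_like = ["We looked at the data and checked it again today"]
print(windowed_prevalence(ai_like, "delve"))
print(windowed_prevalence(human_like, "delve"))
```

Counting windows rather than raw occurrences means that a word repeated many times in one short passage is counted once per window, so a single repetitive document cannot dominate the estimate. Comparing the resulting values between AI-generated and human-written text is one simple way to flag candidate overused words.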

The “Why”

Our research goes beyond identifying which words are overused. It asks why large language models develop language styles that diverge so strongly from human baselines.

A substantial part of this divergence appears to be linked to Reinforcement Learning from Human Feedback (RLHF). However, many aspects remain underexplored, for example the role of annotator demographics, the influence of task framing, and the way different optimisation objectives shape lexical behaviour. Understanding these mechanisms is essential if we want to interpret AI-generated language not merely as output, but as a system that can actively shape human communication.

Papers

You can find a selection of our related work here:

Get in Touch

Contact: Tommie Juzek