The Problem: Traditional Fraud Detection Misses Language

Most fraud detection systems look at structured data — transaction amounts, IP addresses, device fingerprints, behavioral patterns. These signals are valuable. But they miss something fundamental: the text people write when they're lying is structurally different from the text they write when they're telling the truth.

An insurance adjuster reading a claim can sometimes feel something is off — a vagueness in the timeline, an oddly passive voice, a conspicuous absence of emotional detail. That feeling isn't intuition. It's the brain detecting statistical patterns in language that cognitive science has been formally studying since the 1990s.

The gap is enormous. Human judges correctly identify deception at rates barely above chance — roughly 54%, close to a coin flip. Automated systems trained on psycholinguistic features consistently outperform that baseline. The problem was never that deception signals don't exist. It's that humans can't track dozens of linguistic dimensions simultaneously while reading.

The Science Behind the Signals

Five foundational studies established the empirical basis for psycholinguistic deception detection.

Newman et al. (2003) published the first large-scale LIWC-based analysis of deceptive narratives, finding that liars use significantly fewer first-person singular pronouns ("I"), more negative emotion words, and fewer exclusive words (but, except, without) — which require cognitively demanding distinctions between what is and isn't true.

DePaulo et al. (2003) conducted a meta-analysis of 120 studies covering over 10,000 deceptive and truthful statements. The analysis confirmed that deceptive accounts are less detailed, less compelling, and less logically structured. Liars produce less information and provide fewer contextual embeddings — the "where I was," "who else was there," and "what happened right before" details that honest recall naturally includes.

Pennebaker (2011), through his work on LIWC (Linguistic Inquiry and Word Count), demonstrated that pronoun use functions as a reliable cognitive indicator of ownership, focus, and authenticity. High first-person singular use correlates with genuine subjective experience; deflection to second- or third-person ("one would think," "they say") correlates with psychological distance from the account.

Hancock et al. (2004) studied asynchronous text-based deception in computer-mediated communication and found that deceptive messages contained more words overall — liars over-compensate with volume — but fewer sensory and perceptual words, and more references to other people (deflection). They coined the term "linguistic displacement" for the pattern of liars centering their narratives on others rather than themselves.

Zhou et al. (2004) introduced automated deception detection frameworks based on syntactic complexity and quantity cues, finding that deceptive text scores lower on contextual diversity — the range of unique concepts introduced per unit of text — and higher on verbal redundancy.

Six Signal Categories

Across this body of research, six categories of psycholinguistic signals emerge consistently as discriminating between honest and deceptive text.

Signal 01

Pronoun Distancing

Reduced first-person singular usage ("I", "my", "me") and increased passive constructions or third-person framing. Liars psychologically detach from their own narrative.

Signal 02

Hedging & Qualifiers

Overuse of uncertainty markers ("I believe," "it seems," "perhaps," "approximately") when describing events the speaker should know precisely. Honest accounts of lived experience don't require this much qualification.

Signal 03

Emotional Leakage

Inappropriate emotional register — either absent where expected (flat affect in high-stakes claims) or overwrought where it doesn't fit. Genuine emotion is contextually calibrated; fabricated emotion tends to be formulaic or mismatched.

Signal 04

Cognitive Complexity

Reduced lexical diversity, simpler syntactic structures, and lower contextual embedding. Maintaining a lie is cognitively demanding — resources go toward tracking the story, not elaborating it.

Signal 05

Detail Specificity

Honest accounts include spontaneous contextual detail — peripheral people, sensory observations, temporal anchors. Fabricated accounts are often suspiciously clean, with precisely the details needed and none of the noise that real memory produces.

Signal 06

Negation Patterns

Elevated negation frequency ("it wasn't," "there was no," "I never") signals preemptive denial — the linguistic equivalent of someone protesting too much. Truthful accounts don't require extensive pre-emptive negation.

These signals don't operate in isolation. The discriminating power comes from scoring all six simultaneously across the full text — a task that's trivially easy for a machine and nearly impossible for a human analyst reading under time pressure.
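As a rough illustration of scoring several signals in one pass, the sketch below computes per-100-word rates for three of the categories. The word lists and the function name `signal_rates` are assumptions made for this example; they are not the lexicons used in the cited studies or in any production system.

```python
import re

# Illustrative word lists (assumptions); real lexicons such as LIWC are far larger.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
HEDGES = {"perhaps", "maybe", "approximately", "seems", "possibly", "believe"}
NEGATIONS = {"no", "not", "never", "none", "nothing"}

def tokenize(text):
    """Lowercase word tokens; straight apostrophes are kept so "wasn't" stays whole."""
    return re.findall(r"[a-z']+", text.lower())

def signal_rates(text):
    """Return occurrences per 100 words for three signal categories."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty input

    def rate(predicate):
        return 100.0 * sum(1 for t in tokens if predicate(t)) / total

    return {
        "first_person": rate(lambda t: t in FIRST_PERSON),
        "hedging": rate(lambda t: t in HEDGES),
        # Contracted negations ("wasn't", "didn't") count as negation too.
        "negation": rate(lambda t: t in NEGATIONS or t.endswith("n't")),
    }

rates = signal_rates("I never saw it. Perhaps it wasn't there.")
print(rates)
```

Simple counts like these are only a starting point: a rate means little until it is compared against what is typical for the same kind of document.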

Real-World Applications

Psycholinguistic deception detection is a triage tool, not a verdict-rendering system. It surfaces text that warrants closer human review. The applications are broad:

Insurance Claims

Fraudulent property and casualty claims consistently over-index on passive constructions, hedged timelines, and absent sensory detail. The Candor API scores the narrative portion of a claim and flags linguistic anomalies for adjuster review — not to deny claims automatically, but to focus human scrutiny where it belongs.

Legal Testimony & Statements

Witness statements, affidavits, and depositions contain recoverable psycholinguistic signals. Investigators use automated scoring to prioritize which statements deserve intensive follow-up, reducing the workload on human analysts without removing them from the loop.

Marketplace Review Integrity

Fake reviews exhibit pronounced linguistic fingerprints: over-positive emotional language, absence of specific product detail, high first-person use paradoxically combined with generic claims. Automated scoring at scale catches review fraud that human moderators miss.

Compliance Screening

Financial disclosures, regulatory filings, and HR investigation reports all benefit from automated deception signal scoring. A 95-page document can be scored in a fraction of the time a human read would take; a reviewer can then focus their attention on the flagged sections.

Building a Production API

Translating psycholinguistic theory into a reliable production system requires more than counting pronouns. Our approach at Candor involves three layers:

Feature extraction. We compute 40+ linguistic features per input: pronoun ratios, hedge density, negation frequency, exclusive word density, average syntactic depth, type-token ratio, contextual embedding score, and more. These map directly to the signal categories described above.
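To make a few of these features concrete, here is a minimal sketch of three of them: type-token ratio, exclusive word density, and average sentence length. The function name `extract_features` and the regex-based tokenization are assumptions for illustration. Features such as syntactic depth need a parser and are left out.

```python
import re

# Exclusive words named earlier in the article (but, except, without).
EXCLUSIVE = {"but", "except", "without"}

def extract_features(text):
    """Compute a small, illustrative feature dictionary for one input."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Naive sentence split on terminal punctuation; abbreviations will confuse it.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = len(tokens) or 1  # avoid division by zero on empty input
    return {
        # Unique tokens divided by total tokens: lower means more repetition.
        "type_token_ratio": len(set(tokens)) / n,
        "exclusive_density": sum(t in EXCLUSIVE for t in tokens) / n,
        "avg_sentence_length": len(tokens) / (len(sentences) or 1),
    }

features = extract_features("I left at noon. I saw no one but the guard.")
print(features)
```

Type-token ratio falls as a text gets longer, so comparing it across inputs of very different lengths would need some form of length normalization.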

Composite scoring. Features are weighted and aggregated into a single 0–100 deception score, with five sub-scores exposed in the API response for transparency. A score of 70 or above indicates high linguistic anomaly; 40–69 is elevated; below 40 is within normal range. The sub-scores let callers understand why a document scored the way it did.
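The aggregation step can be sketched as a weighted average followed by a band lookup. The sub-score names, the weights, and the choice to treat a score of exactly 70 as "high" are assumptions for this example. The actual weighting scheme is not described here.

```python
def composite_score(sub_scores, weights):
    """Weighted average of sub-scores (each 0-100), clamped to 0-100."""
    total_weight = sum(weights[name] for name in sub_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    raw = sum(sub_scores[name] * weights[name] for name in sub_scores) / total_weight
    return max(0.0, min(100.0, raw))

def band(score):
    """Map a composite score to the three bands described in the text."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "elevated"
    return "normal"

# Hypothetical sub-score names and weights, for illustration only.
sub_scores = {"pronoun": 80, "hedging": 60, "negation": 40}
weights = {"pronoun": 2.0, "hedging": 1.0, "negation": 1.0}

score = composite_score(sub_scores, weights)
print(score, band(score))
```

Keeping the sub-scores alongside the composite lets a reviewer see which signal drove a flag, rather than only the final number.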

Calibration. Raw model outputs are calibrated against human-labeled ground truth. Our current calibration set is derived from the LIAR dataset, supplemented with domain-specific labeled examples.
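The text does not say which calibration method is used. As one simple and common option, the sketch below uses histogram binning: raw scores are grouped into equal-width bins, and each bin is mapped to the fraction of labeled-deceptive examples that fell into it. The function name, the bin count, and the toy data are all assumptions.

```python
def fit_bin_calibrator(raw_scores, labels, n_bins=5):
    """Histogram binning over 0-100: each bin maps to the observed fraction
    of positive (labeled-deceptive) examples whose raw score fell in it."""
    width = 100.0 / n_bins
    counts = [0] * n_bins
    positives = [0] * n_bins
    for s, y in zip(raw_scores, labels):
        i = min(int(s / width), n_bins - 1)  # a score of 100 goes in the last bin
        counts[i] += 1
        positives[i] += y
    # Empty bins fall back to the bin midpoint as a probability (an assumption).
    rates = [
        positives[i] / counts[i] if counts[i] else (i + 0.5) / n_bins
        for i in range(n_bins)
    ]

    def calibrate(score):
        i = min(int(score / width), n_bins - 1)
        return rates[i]

    return calibrate

# Toy labeled data: raw model scores and 0/1 ground-truth labels (hypothetical).
raw = [10, 15, 30, 35, 50, 55, 70, 75, 90, 95]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

calibrate = fit_bin_calibrator(raw, labels, n_bins=5)
print(calibrate(32), calibrate(72))
```

With real data, each bin needs enough labeled examples for its rate to be stable, and the calibration set should be kept separate from the data used for evaluation.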

The API accepts raw text — no preprocessing required. Response time is under 800ms for inputs up to 10,000 characters. See the full API reference for request format, response schema, and authentication.
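For documents longer than the 10,000-character figure quoted above, a caller might split the text on the client side before scoring each piece. The sketch below shows one way to do that at sentence boundaries. Treating 10,000 characters as a per-request target is an assumption, and `chunk_text` is a hypothetical helper, not part of the API.

```python
import re

MAX_CHARS = 10_000  # input size quoted in the text for the latency figure

def chunk_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, breaking after sentence
    punctuation where possible; an over-long sentence is hard-split."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        while len(sentence) > max_chars:  # hard-split pathological sentences
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_text("First sentence. Second sentence. Third one.", max_chars=32)
print(chunks)
```

Splitting at sentence boundaries keeps each chunk readable on its own, which matters if flagged passages are later shown to a human reviewer.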

Validation: What the Numbers Actually Mean

848 LIAR samples evaluated
0.534 F1 score
54% Human baseline

We evaluated Candor against 848 samples from the LIAR dataset — a benchmark of political statements with expert veracity ratings. The dataset covers a range of deception types, from outright fabrication to misleading framing, making it a reasonably demanding test of generalization.

An F1 of 0.534 is the harmonic mean of the model's precision and recall. It is not the same metric as the roughly 54% accuracy reported for human judges, so the two figures are not directly comparable; numerically, they sit at about the same level. We treat this result as a floor, not a ceiling. The LIAR dataset contains short political claims, which are a harder case for psycholinguistic detection than the longer-form texts (insurance narratives, reviews, statements) where the signal categories have more text to work with.
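Because F1 and accuracy are computed differently, it can help to see both derived from the same confusion counts. The counts below are hypothetical, chosen only to make the arithmetic easy to follow. They are not Candor's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from binary confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall; it ignores true negatives.
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts on 100 statements, for illustration only.
m = classification_metrics(tp=30, fp=20, fn=30, tn=20)
print(m)
```

With these counts, precision is 0.6, recall is 0.5, F1 is about 0.545, and accuracy is 0.5: the same predictions give different numbers depending on the metric, which is why an F1 score and an accuracy baseline should be compared with care.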

We report this honestly because the alternative — cherry-picking favorable test conditions — is exactly what a deception detector shouldn't do.

"Automated systems can track linguistic dimensions humans cannot monitor simultaneously. The value isn't superhuman accuracy — it's consistent, tireless attention to features that human reviewers underweight." — Drawn from Hancock et al. (2004) on the asymmetric cognitive load of deception

The full validation methodology — including how we handle the gray zone of ambiguous statements, our precision/recall breakdown, and how we compare against the human baseline — is published on the validation page.

Try the API yourself

Paste any text into the live demo. See the score, the five sub-signals, and the flagged sentences — in real time, no account required.

References

  1. Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29(5), 665–675.
  2. DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., & Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129(1), 74–118.
  3. Pennebaker, J. W., Chung, C. K., Ireland, M., Gonzales, A., & Booth, R. J. (2011). The development and psychometric properties of LIWC2007. Austin, TX: LIWC.net.
  4. Hancock, J. T., Thom-Santelli, J., & Ritchie, T. (2004). Deception and design: The impact of communication technology on lying behavior. CHI 2004 Proceedings, 130–136.
  5. Zhou, L., Burgoon, J. K., Twitchell, D. P., Qin, T., & Nunamaker, J. F. (2004). A comparison of classification methods for predicting deception in computer-mediated communication. Journal of Management Information Systems, 20(4), 139–165.