AI-iness density pre-check to avoid over-correcting human-first text #93

@adelaidasofia

Description

Problem

When humanizer runs on text that's already mostly human-written (personal journals, rough drafts, meeting notes), it can over-correct. Fragments, first-person voice, and specific names get flagged by lower-tier rules even though the text isn't AI-generated. The result is unnecessary rewrites that strip voice from already-authentic writing.

Proposed solution

A pre-flight density check that counts Tier-1 AI tells per 100 words before applying any fixes, then selects a pass strength:

  • Low density (0-2 tells/100 words): Light mode. Only apply dead-giveaway patterns (Tier 1). The text is human-first; leave it alone except for obvious AI artifacts.
  • Medium density (3-5 tells/100 words): Mixed mode. Apply Tiers 1-2. Some AI assistance detected but the voice is mostly authentic.
  • High density (6+ tells/100 words): Full mode. Apply all tiers. This is AI-first text that needs comprehensive humanization.
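The check above can be sketched in a few lines of Python. This is a minimal illustration, not humanizer's actual implementation: the pattern list, `tier1_density`, and `select_mode` names are hypothetical, and the real Tier-1 rule set would come from the project's existing rules.

```python
import re

# Hypothetical stand-ins for Tier-1 "dead giveaway" patterns;
# the real rule set lives in humanizer's tier definitions.
TIER1_PATTERNS = [
    r"\bdelve\b",
    r"\bas an ai language model\b",
    r"\bit'?s important to note\b",
]

def tier1_density(text: str) -> float:
    """Count Tier-1 tells per 100 words."""
    words = len(text.split())
    if words == 0:
        return 0.0
    tells = sum(len(re.findall(p, text, re.IGNORECASE)) for p in TIER1_PATTERNS)
    return tells / words * 100

def select_mode(text: str) -> str:
    """Map density to a pass strength using the thresholds above."""
    d = tier1_density(text)
    if d < 3:
        return "light"   # Tier 1 only: leave human-first text alone
    if d < 6:
        return "mixed"   # Tiers 1-2
    return "full"        # all tiers: AI-first text
```

One design note: because density is normalized per 100 words, very short inputs can swing wildly on a single match, so a real implementation might want a minimum word count before trusting anything other than light mode.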

Why this matters

The current behavior is all-or-nothing: every rule runs on every text. This works great for fully AI-generated content but creates false positives on human-authored or lightly-assisted text. The density check makes humanizer safe to run on any input without worrying about destroying authentic voice.

I've been running this in my fork (adelaidasofia/humanizer) and it eliminates the "humanizer made my writing worse" failure mode on personal/journal content.

Happy to send a PR.
