Granular Control over Number Splitting/Merging #23
Description
Feature Enhancement Request for strings2:
Title: Granular Control over Number Splitting/Merging
Problem:
Currently, WithNumberSplitting(true) treats any transition to/from a digit as a hard word boundary. This forces outputs like 123test -> 123-test. Standard programming identifiers often treat numbers as suffixes (file1) or prefixes (2fa) without separation.
Proposed Solution:
Introduce a NumberMode configuration:
SplitAlways (current behavior): word123 -> word-123
MergeRecursive: Treat digits as compatible with both preceding and succeeding lowercase letters. 123test -> 123test, test123test -> test123test
TreatAsLowercase: Simply treat digits as if they were lowercase letters for boundary detection.
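To make the three modes concrete, here is a minimal, self-contained sketch of how boundary detection could differ between them. It is not the strings2 implementation: `NumberMode`, the constants, `classify`, `boundary` and `toKebab` are hypothetical names, acronym handling is left out, and the choice that MergeRecursive still splits between a digit and an uppercase letter is an assumption the proposal does not spell out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// NumberMode is a hypothetical option type; the names follow the proposal.
type NumberMode int

const (
	SplitAlways NumberMode = iota
	MergeRecursive
	TreatAsLowercase
)

type class int

const (
	lower class = iota
	upper
	digit
	other // delimiters and symbols
)

// classify maps a rune to a character class. In TreatAsLowercase mode
// digits are reported as lowercase letters.
func classify(r rune, mode NumberMode) class {
	switch {
	case unicode.IsDigit(r):
		if mode == TreatAsLowercase {
			return lower
		}
		return digit
	case unicode.IsUpper(r):
		return upper
	case unicode.IsLetter(r):
		return lower
	}
	return other
}

// boundary reports whether a new word starts between prev and cur.
func boundary(prev, cur class, mode NumberMode) bool {
	if prev == lower && cur == upper {
		return true // camelCase hump
	}
	if (prev == digit || cur == digit) && prev != cur {
		// Assumption: MergeRecursive merges digits only with lowercase letters.
		if mode == MergeRecursive && (prev == lower || cur == lower) {
			return false
		}
		return true
	}
	return false
}

// toKebab splits s into words and joins them with hyphens.
func toKebab(s string, mode NumberMode) string {
	var words []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			words = append(words, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	prev := other
	for _, r := range s {
		c := classify(r, mode)
		if c == other { // delimiters always end the current word
			flush()
			prev = other
			continue
		}
		if len(cur) > 0 && boundary(prev, c, mode) {
			flush()
		}
		cur = append(cur, r)
		prev = c
	}
	flush()
	return strings.Join(words, "-")
}

func main() {
	for _, m := range []NumberMode{SplitAlways, MergeRecursive, TreatAsLowercase} {
		fmt.Println(toKebab("User123ID", m), toKebab("123test", m))
	}
}
```

Under these assumptions MergeRecursive and TreatAsLowercase only diverge when a digit follows an uppercase letter, which may be a reason to collapse them into one mode.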
Example Prompt:
"Please add a NumberMode option to strings2. We need a mode that mimics typical variable naming where digits do not force a new word. For example, ToKebab('User123ID') should optionally result in user123-id (if '123' merges with 'User') or user-123-id (current). Specifically, we need to support 123test -> 123test (no hyphen) to match legacy behavior."
arran4/go-subcommand#166 (comment)
To prevent the split in StartWithDigit123 -> start-with-digit-123, strings2 needs to allow numbers to attach to the preceding word.
Feature Request: Configurable Number Handling.
Details: We need a mode where digits are treated as a continuation of the previous character class (specifically lowercase letters) rather than a new word boundary.
Example:
strings2.ToKebab("StartWithDigit123", strings2.WithNumberHandling(strings2.MergeWithPreceding)) should yield start-with-digit123.
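One way the option could be wired is the functional-options pattern. The sketch below is standalone and only handles camel-case input without delimiters. `NumberHandling`, `SplitNumbers`, `config`, `Option` and `startsWord` are hypothetical names, and this `ToKebab` is a toy stand-in rather than the strings2 function. The request does not say what happens when a letter follows a digit, so the sketch assumes the split is kept there.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// NumberHandling is a hypothetical setting for how digits join words.
type NumberHandling int

const (
	SplitNumbers       NumberHandling = iota // current behavior: digits form their own word
	MergeWithPreceding                       // digits continue the word they follow
)

type config struct{ numbers NumberHandling }

// Option mutates the conversion config (functional-options pattern).
type Option func(*config)

// WithNumberHandling selects how digits are attached to words.
func WithNumberHandling(h NumberHandling) Option {
	return func(c *config) { c.numbers = h }
}

// ToKebab is a toy camel-case to kebab-case converter.
func ToKebab(s string, opts ...Option) string {
	cfg := config{numbers: SplitNumbers}
	for _, o := range opts {
		o(&cfg)
	}
	var b strings.Builder
	var prev rune
	for i, r := range s {
		if i > 0 && startsWord(prev, r, cfg.numbers) {
			b.WriteByte('-')
		}
		b.WriteRune(unicode.ToLower(r))
		prev = r
	}
	return b.String()
}

// startsWord reports whether cur begins a new word after prev.
func startsWord(prev, cur rune, h NumberHandling) bool {
	switch {
	case unicode.IsUpper(cur):
		return unicode.IsLower(prev) || unicode.IsDigit(prev)
	case unicode.IsDigit(cur):
		// Only the default mode splits between a letter and a digit.
		return h == SplitNumbers && unicode.IsLetter(prev)
	case unicode.IsLower(cur):
		// Assumption: a letter after a digit still starts a new word.
		return unicode.IsDigit(prev)
	}
	return false
}

func main() {
	fmt.Println(ToKebab("StartWithDigit123"))
	fmt.Println(ToKebab("StartWithDigit123", WithNumberHandling(MergeWithPreceding)))
}
```

Keeping the default at the current splitting behavior means existing callers that pass no option see no change.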
arran4/go-subcommand#166 (comment)
To replace the manual SanitizeToIdentifier logic completely, strings2 would need a way to strictly filter the input characters.
Feature Request: Add a WithAllowedCharacters(predicate func(rune) bool) option.
Details: Currently, strings2 splits on delimiters but may preserve other symbols. We need an option to explicitly drop unwanted characters (like @, #, emojis) from the output.
Example Prompt: "Add a WithAllowedCharacters option to ToPascal (and others). If provided, any rune where predicate(r) is false should be discarded during tokenization. This allows users to generate strict identifiers by filtering out !unicode.IsLetter(r) && !unicode.IsDigit(r)."
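A rough sketch of the filtering step, assuming words are split on delimiters first and the predicate is applied to each word afterwards, so that dropping a delimiter does not glue two words together. `toPascal` and `isIdentRune` are hypothetical stand-ins, not strings2 functions; only `strings.FieldsFunc`, `strings.Map` and the `unicode` predicates are real standard-library calls.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isIdentRune is the predicate from the request: keep letters and digits.
func isIdentRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// toPascal is a toy PascalCase converter that discards every rune for
// which allowed returns false.
func toPascal(s string, allowed func(rune) bool) string {
	// Split on a fixed delimiter set before filtering.
	words := strings.FieldsFunc(s, func(r rune) bool {
		return r == ' ' || r == '_' || r == '-'
	})
	var b strings.Builder
	for _, w := range words {
		// strings.Map drops a rune when the mapping returns a negative value.
		kept := []rune(strings.Map(func(r rune) rune {
			if allowed(r) {
				return r
			}
			return -1
		}, w))
		if len(kept) == 0 {
			continue // the word consisted only of disallowed runes
		}
		b.WriteRune(unicode.ToUpper(kept[0]))
		b.WriteString(string(kept[1:]))
	}
	return b.String()
}

func main() {
	fmt.Println(toPascal("hello @world_#1 🎉", isIdentRune))
}
```

Whether filtering should run before or after delimiter splitting is a design choice the request leaves open; filtering first would merge words whenever the predicate rejects the delimiter itself.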