
Inline Iffy content moderation into Gumroad#4544

Open
slavingia wants to merge 38 commits into main from sahil/inline-iffy

Conversation

@slavingia
Contributor

Inline Iffy content moderation into Gumroad

What

Removes the external Iffy service dependency and runs all content moderation logic directly in the Gumroad Rails app. Previously, Gumroad sent product/post/profile content to Iffy via HTTP, then received webhook callbacks with moderation decisions. Now everything runs locally:

  • Three moderation strategies run in parallel: blocklist (word matching), OpenAI classifier (moderation API), and LLM prompt (GPT-4o-mini with adult content + spam presets, two-pass uncertainty filtering)
  • An orchestrator service extracts content, runs strategies, takes action (unpublish/flag/suspend), and auto-suspends users when flagged record count exceeds a configurable threshold
  • Async Sidekiq jobs on the low queue replace the old ingest-then-webhook round-trip
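
The parallel-strategy flow described above can be sketched in plain Ruby. This is an illustrative stand-in, not the PR's actual code; the class names, hash keys, and `run_strategies` helper are hypothetical (the real strategies live under app/services/content_moderation/strategies/):

```ruby
# Hypothetical stand-in for one moderation strategy; each returns a result hash.
class StubStrategy
  def initialize(name, flagged, reason = nil)
    @name = name
    @flagged = flagged
    @reason = reason
  end

  def perform
    { strategy: @name, flagged: @flagged, reason: @reason }
  end
end

# Run every strategy in its own thread and collect the results, mirroring
# the orchestrator shape described above.
def run_strategies(strategies)
  strategies.map { |s| Thread.new { s.perform } }.map(&:value)
end

results = run_strategies([
  StubStrategy.new("blocklist", false),
  StubStrategy.new("classifier", true, "adult content"),
  StubStrategy.new("prompt", false),
])
flagged = results.select { |r| r[:flagged] }
puts flagged.inspect
```

The orchestrator would then take action (unpublish/flag/suspend) based on which strategies flagged the record.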

Also removes: Iffy webhook controller/route, request_from_iffy? admin auth bypass, _from_iffy admin actions, dead contact_iffy_risk_analysis buyer fraud endpoint, and all related env vars/config.

Why

Iffy added a separate codebase, HTTP round-trips, webhook HMAC verification, and external failure modes — all for logic that's straightforward to run locally. Inlining it makes content moderation easier to debug, faster to iterate on, and eliminates an external dependency. The moderation strategies (blocklist, OpenAI classifier, LLM prompt) are implemented using the ruby-openai gem already in the Gemfile.

Configuration is via GlobalConfig keys (CONTENT_MODERATION_ENABLED, CONTENT_MODERATION_SUSPENSION_THRESHOLD, CONTENT_MODERATION_PERCENTAGE, CONTENT_MODERATION_BLOCKLIST, OPENAI_API_KEY) so behavior can be tuned without deploys.
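
A runnable sketch of how these keys might gate behavior. `GlobalConfig` is stubbed here with an in-memory store so the example is self-contained; the helper method names, defaults, and stored values are illustrative assumptions, not the PR's actual code:

```ruby
# Stub of the app's GlobalConfig reader, backed by a plain hash for this sketch.
module GlobalConfig
  STORE = {
    "CONTENT_MODERATION_ENABLED" => "true",
    "CONTENT_MODERATION_SUSPENSION_THRESHOLD" => "3",
    "CONTENT_MODERATION_PERCENTAGE" => "0.25",
  }.freeze

  def self.get(key, default = nil)
    STORE.fetch(key, default)
  end
end

def moderation_enabled?
  GlobalConfig.get("CONTENT_MODERATION_ENABLED", "false") == "true"
end

# Probabilistic sampling: only a configurable fraction of records is moderated.
def sampled_for_moderation?(roll = rand)
  roll < GlobalConfig.get("CONTENT_MODERATION_PERCENTAGE", "0").to_f
end

threshold = GlobalConfig.get("CONTENT_MODERATION_SUSPENSION_THRESHOLD", "5").to_i
puts [moderation_enabled?, threshold].inspect
```

Since the values live in the config store rather than in code, tuning the threshold or sampling percentage takes effect without a deploy.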


This PR was implemented with AI assistance using Claude Opus 4.6.

Prompts used:

  • "inline all iffy logic. we are going to delete iffy and just rely on gumroad code as it'll be much easier to debug and less confusing: https://github.com/antiwork/iffy/pulls is the codebase for iffy to analyze what it does. you can see the inputs/outputs from gumroad repo and inline it all. use gumroad-old for how things used to be done if knowing existing patterns helps."
  • "use new branch for this inline-iffy"
  • "create a PR"

@gumclaw
Contributor

gumclaw commented Apr 16, 2026

@claude review

@claude

claude Bot commented Apr 16, 2026

Claude encountered an error — View job


I'll analyze this and get back to you.

@gianfrancopiana
Member

@claude review

@claude

claude Bot commented Apr 16, 2026

Claude encountered an error — View job


I'll analyze this and get back to you.

Contributor

@gumclaw gumclaw left a comment


Code Review: Inline Iffy Content Moderation

Nice work inlining the Iffy service. The architecture is clean: three strategies running in parallel, configurable via GlobalConfig, and async via Sidekiq. The net deletion of ~1200 lines is great.

Issues to address before merging

1. Thread safety in ModerateRecordService#run_strategies
The three strategies run in Thread.new blocks. If one raises an exception, thread.value will re-raise it in the calling thread, but the other threads may still be running. Consider using a rescue inside each thread or Thread.new { strategy.perform rescue ... } to avoid orphaned threads on partial failures.
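
The suggested fix can be demonstrated in isolation: rescue inside each thread so a failing strategy yields a sentinel result instead of re-raising via `thread.value` while its siblings are still running. The lambdas below are simulated strategies, not the PR's code:

```ruby
# Simulated strategies: one succeeds, one raises, one succeeds.
strategies = [
  -> { sleep 0.01; :blocklist_ok },
  -> { raise "OpenAI timeout" },   # simulated strategy failure
  -> { sleep 0.01; :prompt_ok },
]

threads = strategies.map do |s|
  Thread.new do
    begin
      { ok: true, result: s.call }
    rescue StandardError => e
      { ok: false, error: e.message }  # contained; sibling threads keep running
    end
  end
end

results = threads.map(&:value)  # never raises now; every thread is collected
failures = results.reject { |r| r[:ok] }
puts failures.inspect
```

Without the per-thread rescue, `threads.map(&:value)` would re-raise "OpenAI timeout" in the caller and the other two threads would be orphaned mid-flight.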

2. PromptStrategy silently swallows all errors
Both evaluate_preset and passes_uncertainty_check? rescue StandardError. If OpenAI is down or returns garbage, the strategy silently returns "compliant." This means a broken API key or network issue = zero moderation from the LLM path. Consider at minimum logging at warn level in passes_uncertainty_check? (it currently has no logging).

3. Race condition in check_user_suspension_threshold

return if user.suspended?
# ...
user.suspend_for_tos_violation!(...) unless user.suspended?

Between the first check and the suspend call, another job could have already suspended the user. The unless user.suspended? inside the transaction helps, but user may be stale. Consider user.reload inside the transaction or using a DB-level lock.
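
The check-then-act race and its fix can be shown with a plain-Ruby stand-in (a `Mutex` here plays the role of `with_lock`, and the fresh read under the lock plays the role of `user.reload`; `FakeUser` is illustrative, not ActiveRecord):

```ruby
# Stand-in for the user record: the pre-check may be stale, but the state is
# re-checked under the lock before suspending.
class FakeUser
  attr_reader :suspended, :suspensions

  def initialize
    @suspended = false
    @suspensions = 0
    @lock = Mutex.new
  end

  def check_user_suspension_threshold
    return if @suspended          # cheap pre-check; may be stale
    @lock.synchronize do          # stands in for user.with_lock
      next if @suspended          # re-check fresh state under the lock
      @suspended = true           # stands in for suspend_for_tos_violation!
      @suspensions += 1
    end
  end
end

user = FakeUser.new
threads = 5.times.map { Thread.new { user.check_user_suspension_threshold } }
threads.each(&:join)
puts user.suspensions  # prints 1: suspended exactly once despite 5 racing jobs
```

Without the re-check inside the lock, two jobs that both pass the stale pre-check would each call the suspend path.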

4. queue_content_moderation_job condition looks inverted

def queue_content_moderation_job
  return unless is_unpublished_by_admin? && !saved_change_to_is_unpublished_by_admin?
  ContentModeration::ModerateProductJob.perform_async(id)
end

This only queues moderation when the product is already unpublished by admin and that flag did not just change. That seems like it only re-moderates already-flagged products. Is this intentional? The old queue_iffy_ingest_job_if_unpublished_by_admin had the same logic, so maybe this is correct for re-evaluation, but it means newly published products only get moderated via the purchase probability path (trigger_content_moderation in Purchase) or the after_save on rich_content/seller_profile_rich_text_section. Worth confirming the product creation/update flow triggers moderation too.

5. Missing after_save callback for Link (product) creation
Looking at the model changes, Link only has after_save :queue_content_moderation_job which has the condition above, and after_update :reset_content_moderated_flag. There's no callback for when a product is first published or updated. The old Iffy code had Iffy::Product::IngestJob triggered in more places. How does a brand new product get moderated? Only via the purchase probability path?

6. mark_product_compliant auto-publishes

def mark_product_compliant
  return if !record.is_unpublished_by_admin?
  record.update!(is_unpublished_by_admin: false)
  record.publish!
end

If a product was unpublished by a human admin for reasons unrelated to content moderation, the automated system could re-publish it. The old Iffy system had the same issue, but now is a good time to consider adding a moderation_unpublished flag vs generic is_unpublished_by_admin.

7. RedisKey.content_moderation_probability rename
The code references RedisKey.content_moderation_probability but I don't see where RedisKey is updated. The diff shows the change from iffy_moderation_probability to content_moderation_probability in redis_key.rb. Make sure any existing Redis keys are migrated or the default value (currently 0.001) works for new deploys.

Minor suggestions

  • ContentExtractor#extract_from_post calls Nokogiri::HTML(installment.message) twice. Cache the parsed doc.
  • BlocklistStrategy converts @text to lowercase in initialize but then uses /i flag in the regex. One or the other is sufficient.
  • The AUTHOR_NAME = "ContentModeration" constant is good. Consider also adding this to a shared module since the Sidekiq jobs and service both reference it indirectly.
  • Tests for the new services are missing from this PR. The old Iffy tests are deleted but no new content_moderation specs are added. This is a significant gap.

Overall

The design is solid and much simpler than the external Iffy round-trip. The parallel strategy execution is nice. Main concerns are around edge cases in the moderation lifecycle (when products get moderated, auto-publishing safety) and the missing test coverage.

@slavingia
Contributor Author

@claude review

@claude

claude Bot commented Apr 16, 2026

Claude encountered an error — View job


I'll analyze this and get back to you.

gumclaw and others added 2 commits April 16, 2026 17:22
- Add ContentModeration::ModerateRecordService.check class method that
  returns a CheckResult (passed/reasons) without taking actions
- Hook into Link#publish! to run moderation synchronously and block
  publishing if content is flagged
- Hook into Installment#publish! with the same pattern
- Remove probability-based purchase trigger (moderation now happens
  at publish time)
- VIP creators are exempt from publish-time moderation
- Keep async Sidekiq jobs for bulk re-scans
- Add comprehensive specs for publish-blocking behavior
Move the Link publish-time content moderation specs out of the existing
`describe "publish!"` block, which relied on a factory-created Stripe
merchant account. Those cassettes are missing/brittle for the new
examples and caused long Stripe-retrieve timeouts. The new context stubs
the merchant/shipping/transcode enforcement so the specs focus purely
on the publish-blocking behavior.

Also stub `vip_creator?` via `allow_any_instance_of(User)` so the VIP
exemption path is hit reliably regardless of how the record reloads
its `user` association during `save!`. The previously recorded VCR
cassettes for those examples are no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add suspended_by_content_moderation? check so Stripe-risk and manual
  admin suspensions are never overridden by content moderation
- Clear content_moderated flag after republish to prevent stale flag
  from causing future false republishes
- Add .visible scope to products query in user_flagged_record_count
  to exclude soft-deleted products
- Validate user is actually suspended/flagged before allowing appeal
  submission in Helper API
gumclaw added 2 commits April 16, 2026 21:43
1. Helper appeal tests: user must be flagged/suspended before
   create_appeal succeeds (new guard added in controller)
2. Compliant product test: mark_product_compliant now clears
   content_moderated flag, so expect not_to be_content_moderated
3. Unsuspension test: suspended_by_content_moderation? checks
   author_name, so use 'ContentModeration' not 'Admin'
The add_product_comment callback requires product_id unless
bulk: true is passed. The test setup doesn't have a product
context, so needs the bulk flag.
gumclaw added 2 commits April 16, 2026 22:29
…ation check

The content_moderation_check validation runs during save!, which
triggers image variant processing that creates tempfiles. These
tempfiles get garbage collected before ActiveStorage's after_commit
upload fires, causing ENOENT errors.

Fix: pass text_only: true for the synchronous publish validation.
Image moderation still runs in the async Sidekiq job path.
Previous fix stripped image_urls after extraction, but
extract_from_product still loaded display_asset_previews and
triggered image variant processing, causing tempfile ENOENT.

Now text_only is passed through to the ContentExtractor methods
so image URLs are never loaded in the first place.

Also updates link_spec stubs to match the new text_only: true
signature.
gumclaw added 3 commits April 18, 2026 10:06
1. mark_product_compliant: move update_flag after user-state guards to
   prevent orphaned records; add stale-flag clearing branch for manually
   re-published products (mirrors mark_post_compliant)
2. ModerateProductJob: use Link.where(deleted_at: nil, banned_at: nil)
   instead of Link.alive so CM-unpublished products can be re-moderated
3. link_spec: flip test expectation for published product description
   changes (now correctly expects job to be enqueued)
4. moderate_record_service_spec: update tos_violation_reason assertion
   to match actual reasoning string; update fraud/suspension tests to
   expect content_moderated flag preserved when guards prevent republish
Add previously_new_record? guard to the published-product edit path
in queue_content_moderation_job so the job doesn't fire when a product
is first created (only on subsequent edits to name/description).

All 58 content moderation specs pass locally.
@github-actions

👋 @slavingia — This PR has had changes requested for over 27h. Could you address the review feedback when you get a chance? Keeping the review cycle under 24h helps keep things moving.

This is an automated review-cycle reminder.

@github-actions

👋 @slavingia — This PR has had changes requested for over 51h. Could you address the review feedback when you get a chance? Keeping the review cycle under 24h helps keep things moving.

This is an automated review-cycle reminder.

Comment on lines 208 to +211
validate :one_coffee_per_user, on: :create, if: -> { native_type == Link::NATIVE_TYPE_COFFEE }
validate :quantity_enabled_state_is_allowed
validate :default_offer_code_must_be_valid
validate :content_moderation_check, if: -> { publishing? && !skip_content_moderation_check }


🔴 The admin/links_controller#publish action does not rescue ActiveRecord::RecordInvalid, so when a product fails the new content moderation check during an admin-initiated publish, the admin receives a generic 500 error message instead of the actual moderation failure reason. Fix by adding ActiveRecord::RecordInvalid to the rescue clause at line 52, mirroring the existing pattern in links_controller.rb:447.

Extended reasoning...

Bug: The PR adds validate :content_moderation_check, if: -> { publishing? && !skip_content_moderation_check } to Link. When blocklisted content is detected, the validator calls errors.add(:base, "Content moderation failed: ...") and returns without raising — but publish! calls save!, which raises ActiveRecord::RecordInvalid when any validation fails.

Code path: admin/links_controller.rb:49-60 calls @product.publish! inside a begin/rescue block that only catches Link::LinkInvalid and WithProductFilesInvalid (line 52). An ActiveRecord::RecordInvalid exception raised by the new CM validation falls through to the catch-all rescue => e block at line 54.

Why existing code does not prevent it: The non-admin links_controller.rb:447 already includes ActiveRecord::RecordInvalid in its rescue clause (rescue Link::LinkInvalid, ActiveRecord::RecordInvalid) and correctly renders the specific error message. The admin controller was not updated in this PR to match.

Impact: When an admin tries to publish a product with blocklisted content:

  1. ErrorNotifier.notify(e) is called, generating spurious error-tracking noise for an expected, user-actionable condition.
  2. The response returns the generic I18n.t(:error_500) message instead of the specific CM failure reason, giving the admin no actionable information about what content triggered moderation.

Fix: Change line 52 of app/controllers/admin/links_controller.rb from:
rescue Link::LinkInvalid, WithProductFilesInvalid
to:
rescue Link::LinkInvalid, WithProductFilesInvalid, ActiveRecord::RecordInvalid

Step-by-step proof:

  1. Admin visits a product in the admin panel and clicks Publish.
  2. admin/links_controller#publish calls @product.publish!.
  3. publish! sets self.publishing = true then calls save!.
  4. save! triggers all validations including content_moderation_check.
  5. The CM service finds blocklisted content, calls errors.add(:base, ...), and returns.
  6. save! sees non-empty errors and raises ActiveRecord::RecordInvalid.
  7. The rescue at line 52 does not match ActiveRecord::RecordInvalid, so execution falls to rescue => e.
  8. ErrorNotifier.notify(e) fires (false-positive alert).
  9. Admin receives { success: false, error_message: I18n.t(:error_500) } with no moderation details.

Comment thread app/models/rich_content.rb Outdated
validates :description, json: { schema: DESCRIPTION_JSON_SCHEMA, message: :invalid }

- after_update :reset_moderated_by_iffy_flag, if: -> { saved_change_to_description? && alive? }
+ after_update :reset_content_moderated_flag, if: -> { saved_change_to_description? && alive? }


🔴 When a seller edits their product's rich content blocks (ProseMirror JSON), the RichContent#after_update callback at line 59 fires reset_content_moderated_flag but never enqueues ContentModeration::ModerateProductJob. Since Link#queue_content_moderation_job only watches saved_change_to_name? and saved_change_to_description? on the Link record itself, rich content edits bypass all three moderation strategies entirely—even though ContentExtractor explicitly includes rich content text and images in its checks. Add an after_update callback on RichContent that calls ContentModeration::ModerateProductJob.perform_async(entity_id) when the entity is a Link and the description changed.

Extended reasoning...

The gap: RichContent#after_update (rich_content.rb:59) fires reset_content_moderated_flag whenever description changes on a live record. That method (lines 113–118) either clears the content_moderated bit flag or returns early—it never enqueues a background moderation job. The only place ContentModeration::ModerateProductJob gets enqueued for a product is Link#queue_content_moderation_job (link.rb:1431–1437), which fires only when the Link record itself saves a change to its own name or description columns.

Why existing guards don't help: Link#queue_content_moderation_job has no awareness of associated RichContent records. Editing the ProseMirror blocks touches only the rich_contents table; the links row is never dirtied, so saved_change_to_description? on Link is always false and the job is never scheduled.

The irony: ContentModeration::ContentExtractor#extract_from_product (content_extractor.rb:13) explicitly builds its text payload as "Name: … Description: … " + rich_content_text(product.alive_rich_contents) and gathers rich_content_file_image_urls and rich_content_embedded_image_urls (lines 55–65) for image checks. The infrastructure to moderate rich content is fully wired—the trigger is simply missing.

Concrete proof of exploit:

  1. Seller publishes a product. ModerateProductJob runs once on the Link name/description and the product passes moderation (content_moderated = false).
  2. Seller opens the product editor and edits a rich-content block to embed a blocklisted phrase or adult image URL.
  3. Rails saves the RichContent row. after_update :reset_content_moderated_flag fires; because content_moderated is already false, the guard !entity.content_moderated? short-circuits and the method returns immediately—no write, no job.
  4. No ModerateProductJob is ever enqueued. The seller's edited content is live and will never be inspected by BlocklistStrategy, ClassifierStrategy, or PromptStrategy.

Second scenario (admin-unpublished products): If a product was previously flagged (content_moderated = true) and then admin-unpublished (is_unpublished_by_admin? = true), reset_content_moderated_flag hits the entity.is_unpublished_by_admin? guard on line 115 and returns early without clearing the flag or queueing re-moderation. So even after the seller cleans up the violating content and re-edits, no job is scheduled to verify the fix.

Fix: Add an after_update callback to RichContent that enqueues ContentModeration::ModerateProductJob.perform_async(entity_id) when entity is a Link, saved_change_to_description? is true, and alive? is true—mirroring the pattern already used on Link and Installment.
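
A self-contained sketch of that proposed callback's wiring. The Rails/Sidekiq machinery is stubbed with plain Ruby so the guard logic is visible; `FakeRichContent` and `queue_product_moderation` are illustrative names, not the PR's code:

```ruby
# Stub of the async job: records enqueued IDs instead of hitting Sidekiq.
module ContentModeration
  class ModerateProductJob
    def self.perform_async(id)
      (@enqueued ||= []) << id
    end

    def self.enqueued
      @enqueued ||= []
    end
  end
end

class FakeRichContent
  def initialize(entity_type:, entity_id:, alive: true)
    @entity_type = entity_type
    @entity_id = entity_id
    @alive = alive
  end

  # Stand-in for the proposed after_update callback: enqueue product
  # re-moderation only for live Link-owned rich content whose description changed.
  def queue_product_moderation(description_changed:)
    return unless description_changed && @alive && @entity_type == "Link"
    ContentModeration::ModerateProductJob.perform_async(@entity_id)
  end
end

FakeRichContent.new(entity_type: "Link", entity_id: 42)
  .queue_product_moderation(description_changed: true)
FakeRichContent.new(entity_type: "Installment", entity_id: 7)
  .queue_product_moderation(description_changed: true)
puts ContentModeration::ModerateProductJob.enqueued.inspect  # [42]
```

Only the Link-owned edit enqueues a job, which is the behavior the missing callback would add.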

Comment on lines 535 to 558
return render json: { success: false, error_message: "An account does not exist with that email." }, status: :unprocessable_entity
end

iffy_url = Rails.env.production? ? "https://api.iffy.com/api/v1" : "http://localhost:3000/api/v1"

begin
response = HTTParty.get(
"#{iffy_url}/users?email=#{CGI.escape(params[:email])}",
headers: {
"Authorization" => "Bearer #{GlobalConfig.get("IFFY_API_KEY")}"
}
)

if !(response.success? && response.parsed_response["data"].present? && !response.parsed_response["data"].empty?)
error_message = response.parsed_response.is_a?(Hash) ? response.parsed_response["error"]&.[]("message") || "Failed to find user" : "Failed to find user"
return render json: { success: false, error_message: error_message }, status: :unprocessable_entity
end

user_data = response.parsed_response["data"].first
user_id = user_data["id"]

response = HTTParty.post(
"#{iffy_url}/users/#{user_id}/create_appeal",
headers: {
"Authorization" => "Bearer #{GlobalConfig.get("IFFY_API_KEY")}",
"Content-Type" => "application/json"
},
body: {
text: params[:reason]
}.to_json
)

if !(response.success? && response.parsed_response["data"].present? && !response.parsed_response["data"].empty?)
error_message = response.parsed_response.dig("error", "message") || "Failed to create appeal"
return render json: { success: false, error_message: }, status: :unprocessable_entity
end

appeal_data = response.parsed_response["data"]
appeal_id = appeal_data["id"]
appeal_url = appeal_data["appealUrl"]
if !user.suspended? && !user.flagged?
return render json: { success: false, error_message: "User is not suspended or flagged" }, status: :unprocessable_entity
end

render json: {
success: true,
id: appeal_id,
appeal_url: appeal_url
}
rescue HTTParty::Error, Net::OpenTimeout, Net::ReadTimeout, Timeout::Error, Errno::ECONNREFUSED, SocketError => e
ErrorNotifier.notify(e)
comment = user.comments.new(
content: "Appeal submitted: #{params[:reason]}",
author_name: ContentModeration::ModerateRecordService::AUTHOR_NAME,
comment_type: Comment::COMMENT_TYPE_NOTE
)

render json: {
success: false,
error_message: "Failed to create appeal"
}, status: :service_unavailable
unless comment.save
return render json: { success: false, error_message: comment.errors.full_messages.join(", ") }, status: :unprocessable_entity
end

render json: {
success: true,
id: comment.id,
appeal_url: nil
}
end
end


🔴 The create_appeal action stores user appeals as COMMENT_TYPE_NOTE records — the same generic type used by the create_note admin action — making appeals structurally indistinguishable from ordinary admin notes with no notification sent to admins. Suspended/flagged users can submit appeals that silently land in the flat notes list with no alert, no dedicated query path, and no review workflow to act on them.

Extended reasoning...

What the bug is and how it manifests

The new create_appeal action (lines 542–546) creates a Comment record with comment_type: Comment::COMMENT_TYPE_NOTE. This is exactly the same comment_type value ("note") used by the create_note action at line 437–442 for generic admin notes. The only structural difference is that appeal records use author_name: ContentModeration::ModerateRecordService::AUTHOR_NAME (i.e., the string "ContentModeration") instead of author_id: GUMROAD_ADMIN_ID. There is no dedicated COMMENT_TYPE_APPEAL constant, no separate scope, and no other field to distinguish an appeal record from any other admin note.

The specific code path that triggers it

When a suspended or flagged user submits an appeal, the controller creates a comment, saves it, and returns { success: true, id: comment.id, appeal_url: nil }. The response is valid JSON, the record persists, and from the caller's perspective everything succeeds. However, Comment#notify_seller_of_new_comment (line 100–107 of comment.rb) is guarded by return unless user_submitted?, which resolves to comment_type == COMMENT_TYPE_USER_SUBMITTED (line 109–111). Since appeal comments use COMMENT_TYPE_NOTE, the callback returns early and no notification fires. appeal_url is hardcoded to nil, so the response carries no external tracking reference either.

Why existing code doesn't prevent it

There is no COMMENT_TYPE_APPEAL constant defined anywhere in comment.rb. The existing comment-type constants (lines 8–21) are all status-change markers or admin note types; the COMMENT_TYPE_NOTE value is a catch-all for freeform admin annotations. No scope filters appeals out of the general note stream, and the admin UI (if it queries with_type_note) would surface both generic notes and user-submitted appeals as identical records, distinguishable only by reading the content prefix "Appeal submitted: " or the author_name field.

What the impact would be

Any user appeal submitted through this endpoint is effectively silent: no admin receives an alert, there is no queue or dashboard view that surfaces pending appeals separately, and the only way to discover the appeal is to manually scan the full notes history for the user. In a high-volume environment this means user appeals are operationally lost — even if the data is technically saved — because no review workflow is triggered. The returned appeal_url: nil confirms no external tracking reference is created.

Step-by-step proof

  1. A suspended user POSTs to create_appeal with a valid email and reason.
  2. The guard at line 538–540 passes because the user is suspended.
  3. user.comments.new(..., comment_type: Comment::COMMENT_TYPE_NOTE) is called at line 542–546.
  4. The comment saves successfully.
  5. after_commit :notify_seller_of_new_comment fires (line 42 of comment.rb), but immediately returns at return unless user_submitted? because COMMENT_TYPE_NOTE != COMMENT_TYPE_USER_SUBMITTED.
  6. The response is { success: true, id: ..., appeal_url: nil } — no error, no notification, no admin alert.
  7. An admin querying user notes sees the record mixed in with all other COMMENT_TYPE_NOTE admin notes, with no indication it requires action.

How to fix it

Add a dedicated COMMENT_TYPE_APPEAL = "appeal" constant to Comment and use it in create_appeal. Add a scope with_type_appeal for admin queries. Trigger an InternalNotificationWorker or equivalent after saving the appeal comment so the admin team is alerted to pending reviews. This preserves the design choice to store appeals locally while restoring the review-workflow semantics that Iffy's external system previously provided.
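
The shape of that fix, stubbed in plain Ruby. `FakeComment`, the in-memory `ALL` list, and the `alerts` array standing in for an internal notification job are all illustrative; only the `COMMENT_TYPE_APPEAL` constant and `with_type_appeal` scope names come from the suggestion above:

```ruby
class FakeComment
  COMMENT_TYPE_NOTE   = "note"
  COMMENT_TYPE_APPEAL = "appeal"   # proposed dedicated type

  ALL = []
  attr_reader :comment_type, :content

  def initialize(comment_type:, content:)
    @comment_type = comment_type
    @content = content
    ALL << self
  end

  # Proposed scope so admin tooling can surface pending appeals separately.
  def self.with_type_appeal
    ALL.select { |c| c.comment_type == COMMENT_TYPE_APPEAL }
  end
end

def create_appeal(reason, alerts)
  comment = FakeComment.new(
    comment_type: FakeComment::COMMENT_TYPE_APPEAL,
    content: "Appeal submitted: #{reason}"
  )
  alerts << comment.content  # stands in for enqueuing an internal notification
  comment
end

alerts = []
FakeComment.new(comment_type: FakeComment::COMMENT_TYPE_NOTE, content: "ordinary admin note")
create_appeal("Please review my suspension", alerts)
puts FakeComment.with_type_appeal.size  # prints 1
```

With a dedicated type, appeals no longer blend into the generic note stream, and the notification hook restores the review-workflow trigger the external system used to provide.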


@claude claude Bot left a comment


Additional findings (outside current diff — PR may have been updated during review):

  • 🔴 app/services/content_moderation/moderate_record_service.rb:215-234 — The vip_creator? guard in check_user_suspension_threshold is checked before acquiring the lock using the cached user object, but is not re-checked after user.reload inside the lock — only flagged_for_fraud? and suspended? are re-verified post-reload. If a user is promoted to VIP creator between when the service is instantiated and when the lock is acquired, user.reload will reflect the updated VIP status but the code will proceed to auto-suspend them anyway.

    Extended reasoning...

    What the bug is and how it manifests

    In check_user_suspension_threshold (lines 215–234 of moderate_record_service.rb), the method guards against suspending VIP creators with return if user.vip_creator? at line 216. However, this check runs on the cached user ActiveRecord object — the same object used to instantiate the service. The method then acquires a DB-level lock via user.with_lock do and immediately calls user.reload (line 222) to get the freshest state from the database. After the reload, the code only re-checks flagged_for_fraud? (line 223) and suspended? (line 224). The vip_creator? condition is never re-evaluated inside the lock.

    The specific code path that triggers it

    1. A Sidekiq job calls ModerateRecordService for a user who is not yet a VIP creator; the user AR object is loaded at job start.
    2. check_user_suspension_threshold begins. Line 216: user.vip_creator? → false (cached state) → does not return.
    3. The Gumroad support team promotes the user to VIP creator in the DB (e.g. via an admin action).
    4. The job reaches user.with_lock do and acquires the row-level lock.
    5. user.reload now returns the user with vip_creator: true.
    6. Only flagged_for_fraud? and suspended? are checked inside the lock — neither is true.
    7. user_flagged_record_count meets the threshold → suspend_for_tos_violation! fires and the VIP creator is suspended.

    Why existing code doesn't prevent it

    The pattern of re-checking guards inside a with_lock block is already established in this method (lines 223–224 mirror lines 217–218 for flagged_for_fraud? and suspended?). The developer correctly understood that stale pre-lock reads are unsafe and added post-reload guards for two of the three conditions. The third condition — vip_creator? — was simply missed.

    What the impact would be

    A legitimate VIP creator, correctly identified as such before the suspension job fires, could have their account auto-suspended by the content moderation system. Suspension disables their products, notifies them, and creates a support burden to reverse. VIP status exists precisely to exempt trusted creators from automated enforcement, so this defeat of the exemption undermines the intent of the feature flag.

    How to fix it

    Add next if user.vip_creator? inside the user.with_lock block, immediately after user.reload, alongside the existing flagged_for_fraud? and suspended? guards:

    user.with_lock do
      user.reload
      next if user.flagged_for_fraud?
      next if user.suspended?
      next if user.vip_creator?   # ← add this
    
      flagged_count = user_flagged_record_count
      ...
    end

    Step-by-step proof

    1. User Alice has 2 flagged products. Threshold is 3.
    2. Alice's third product is flagged → ModerateProductJob enqueued.
    3. At job start, Alice is loaded from DB: vip_creator = false.
    4. check_user_suspension_threshold runs; line 216 passes (false → no early return).
    5. An admin concurrently runs alice.update!(vip_creator: true) in the console.
    6. The job acquires the lock and calls user.reload → vip_creator = true in memory now.
    7. Lines 223–224 pass (not fraud, not suspended). No vip_creator? check follows.
    8. user_flagged_record_count returns 3 ≥ 3.
    9. suspend_for_tos_violation! fires — Alice is suspended despite being a VIP creator.
  • 🔴 app/services/content_moderation/moderate_record_service.rb:215-234 — In check_user_suspension_threshold, user.suspend_for_tos_violation! is called unconditionally without first checking user.can_suspend_for_tos_violation?. For verified users, the not_verified? before_transition guard in user.rb halts the state-machine transition and the bang method raises StateMachines::InvalidTransition, causing the Sidekiq job to fail and retry until dead. Meanwhile, the associated product or post is already unpublished by the earlier flag_product/flag_post step and has no recovery path. Fix: add next unless user.can_suspend_for_tos_violation? immediately before line 232, mirroring the existing guard in flag_profile (line 150).

    Extended reasoning...

    What the bug is and how it manifests

    check_user_suspension_threshold (line 215) calls user.suspend_for_tos_violation! (line 232) unconditionally. The User state machine defines before_transition any => %i[flagged_for_tos_violation suspended_for_tos_violation ...], do: :not_verified? (user.rb:314-315). not_verified? returns !verified, i.e., false for verified users. When a before_transition callback returns false and a bang state-machine method is used, state_machines raises StateMachines::InvalidTransition instead of silently returning false.

    The specific code path that triggers it

    A verified user (e.g., user.verified == true) whose recent gross sales are below the VIP threshold will pass all three early-return guards at lines 216-218 (vip_creator? checks sales volume, not verified status; flagged_for_fraud? and suspended? are false). Inside the with_lock block the same guards repeat at lines 223-224 but again do not cover verified status. Once flagged_count >= threshold, execution reaches line 232 and suspend_for_tos_violation! fires. The state machine's not_verified? guard returns false, causing the bang method to raise StateMachines::InvalidTransition.

    Why existing code doesn't prevent it

    The sister method flag_profile (line ~150) correctly uses next unless user.can_flag_for_tos_violation? before calling flag_for_tos_violation!. That guard consults the state machine — including the not_verified? before_transition — and skips verified users cleanly. check_user_suspension_threshold is missing the equivalent next unless user.can_suspend_for_tos_violation? guard entirely. The three existing return/next guards cover fraud-flagged and already-suspended states, but none covers verified status.

    Impact

    The unrescued StateMachines::InvalidTransition propagates out of the with_lock block and out of perform, causing the Sidekiq worker to retry the job (up to 3 times per the job's sidekiq_options retry: 3) before it is moved to the dead queue. More critically, flag_product or flag_post already ran successfully earlier in perform — those methods set is_unpublished_by_admin = true and persist the changes before check_user_suspension_threshold is reached. After the job dies, the product/post remains permanently CM-flagged and unpublished with no automated recovery path. The content creator (a verified user) loses their listing with no administrative action actually completed against their account.

    How to fix it

    Add a single guard line immediately before line 232:

    next unless user.can_suspend_for_tos_violation?
    user.suspend_for_tos_violation!(author_name: AUTHOR_NAME, content: comment_content, bulk: true)

    This mirrors the pattern already used correctly in flag_profile and ensures verified users are skipped without raising an exception.

    Step-by-step proof

    1. Verified creator with user.verified = true, gross sales below VIP threshold, 1 flagged record, CONTENT_MODERATION_SUSPENSION_THRESHOLD = 1.
    2. Moderator service calls flag_product → product is unpublished; user_flagged_record_count returns 1.
    3. check_user_suspension_threshold is called. user.vip_creator? → false (low sales). user.flagged_for_fraud? → false. user.suspended? → false. Threshold check: 1 >= 1, passes.
    4. user.suspend_for_tos_violation! is invoked. The state machine evaluates before_transition callbacks: not_verified? returns !true = false. The bang method raises StateMachines::InvalidTransition.
    5. Exception propagates out of with_lock, out of perform. Sidekiq retries 3×, then job is dead.
    6. Product remains unpublished (is_unpublished_by_admin = true); user account is untouched but creator has no recourse.

Comment on lines +63 to +85
  if all_reasoning.any?
    Result.new(status: "flagged", reasoning: all_reasoning)
  else
    Result.new(status: "compliant", reasoning: [])
  end
rescue StandardError => e
  Rails.logger.error("ContentModeration::PromptStrategy error: #{e.message}")
  raise
end

private
  def evaluate_preset(preset)
    messages = build_messages(preset[:rules], skip_images: preset[:skip_images])

    response = @client.chat(
      parameters: {
        model: MODEL,
        messages: messages,
        response_format: { type: "json_object" },
        temperature: 0.1,
      }
    )


🟡 The outer rescue in perform (lines 68–70) redundantly logs any error that has already been logged and re-raised by the inner rescues in evaluate_preset or passes_uncertainty_check?, producing duplicate log entries for every single OpenAI failure. Fix by removing the Rails.logger.error call from perform's outer rescue and letting it simply re-raise.

Extended reasoning...

What the bug is

evaluate_preset (lines 93–95) and passes_uncertainty_check? (lines 121–123) each have their own rescue StandardError => e block that logs a specific error message (e.g., "preset evaluation error") and then calls raise to re-propagate the exception. The outer rescue StandardError => e in perform (lines 68–70) then catches that same re-raised exception and logs a second, more generic message ("ContentModeration::PromptStrategy error") before raising again.

The specific code path

  1. evaluate_preset encounters an OpenAI timeout → logs "ContentModeration::PromptStrategy preset evaluation error: {msg}" → raise
  2. Control returns to perform's outer rescue → logs "ContentModeration::PromptStrategy error: {msg}" → raise
  3. Sidekiq catches and schedules a retry

The same message string appears twice in the log for every single API failure.

Why existing tests don't catch it

The spec at line 75 asserts have_received(:error).with("...preset evaluation error: API failure").at_least(:once). The outer rescue fires too (logger.error("...PromptStrategy error: API failure")), but the test only checks for the inner message, not that the outer one is absent. So CI stays green while production doubles every error log entry.

Step-by-step proof

Given the test stub in "logs and re-raises when the OpenAI request fails" (line 71):

  • client.chat raises StandardError, "API failure"
  • evaluate_preset rescue fires: Rails.logger.error("...preset evaluation error: API failure") + re-raise
  • perform rescue fires: Rails.logger.error("...PromptStrategy error: API failure") + re-raise
  • Result: two logger.error calls for one network failure

The same doubling occurs when passes_uncertainty_check? fails, as shown by the "logs and re-raises when the uncertainty check fails" test (line 54).

Impact

Duplicate log lines and inflated error counts in Sentry/alerting pipelines make it harder to triage how many distinct failures are actually occurring, and could trigger double the alert notifications.

Fix

Remove the Rails.logger.error call from perform's outer rescue block — the inner rescues already provide specific, actionable log messages. The outer rescue should simply raise to preserve the retry behaviour without adding noise.

Comment on lines +63 to +85
  if all_reasoning.any?
    Result.new(status: "flagged", reasoning: all_reasoning)
  else
    Result.new(status: "compliant", reasoning: [])
  end
rescue StandardError => e
  Rails.logger.error("ContentModeration::PromptStrategy error: #{e.message}")
  raise
end

private
  def evaluate_preset(preset)
    messages = build_messages(preset[:rules], skip_images: preset[:skip_images])

    response = @client.chat(
      parameters: {
        model: MODEL,
        messages: messages,
        response_format: { type: "json_object" },
        temperature: 0.1,
      }
    )


🟡 The nested rescue blocks in evaluate_preset and passes_uncertainty_check? both log and re-raise, causing any API failure to produce two log entries — one specific (from the method) and one generic (from perform's outer rescue). This is purely operational noise with no correctness impact, but should be cleaned up for log clarity.

Extended reasoning...

What the bug is and how it manifests

PromptStrategy has three rescue blocks that interact: one in evaluate_preset, one in passes_uncertainty_check?, and one in the perform method itself. The two private-method rescues each log a specific error message and then call raise (bare re-raise). Because perform calls both private methods, any exception they propagate is caught again by perform's own rescue, which logs a second, more generic message before re-raising.

The specific code path that triggers it

When @client.chat raises inside evaluate_preset, the rescue at the end of that method runs:

rescue StandardError => e
  Rails.logger.error("ContentModeration::PromptStrategy preset evaluation error: #{e.message}")
  raise   # re-raises the original exception

That re-raised exception propagates up to perform, which has:

rescue StandardError => e
  Rails.logger.error("ContentModeration::PromptStrategy error: #{e.message}")
  raise

Both rescues fire for the same exception, so the error message appears twice in the log with different prefixes.

Why existing code doesn't prevent it

Bare raise inside a rescue block re-raises the current exception unchanged, so Ruby's normal propagation rules cause it to bubble up and be caught by the next enclosing rescue in the call stack — which is the one in perform. There is no mechanism to mark an exception as "already logged," so the outer rescue has no way to know it should stay silent.

The test suite actually documents this by using .at_least(:once) on the logger expectation for the API failure case, which is a code smell indicating the author knew more than one log call would happen but didn't want a brittle exact-count assertion.

Impact

No data is lost, no user-facing behavior is incorrect, and no moderation decision is affected. The duplication only inflates log volume and can make incident investigation slightly noisier — two log lines per failure instead of one, with potentially confusing generic + specific pairing.

How to fix it

The cleanest approach is to match the pattern used by ClassifierStrategy, which has a single rescue in its perform method and no per-method rescues. For PromptStrategy, either:

  1. Remove the rescue blocks from evaluate_preset and passes_uncertainty_check? and let perform's rescue handle all logging (possibly adding more context to the message there), or
  2. Remove perform's rescue entirely and rely solely on the more-specific per-method messages.

Option 1 is simpler. The per-method messages add context ("preset evaluation" vs. "uncertainty check") that can instead be inferred from the stack trace.

Step-by-step proof

  1. Client calls PromptStrategy.new(text: "...").perform
  2. perform calls evaluate_preset(preset)
  3. Inside evaluate_preset, @client.chat raises StandardError, "API failure"
  4. evaluate_preset's rescue fires → Rails.logger.error("ContentModeration::PromptStrategy preset evaluation error: API failure") is called (log entry #1)
  5. raise re-raises the same exception
  6. Exception propagates back into perform
  7. perform's rescue fires → Rails.logger.error("ContentModeration::PromptStrategy error: API failure") is called (log entry #2)
  8. Two log lines appear for one API call failure
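As a standalone illustration of Option 1 (plain Ruby, not the actual strategy class): with perform's rescue as the single logging point and no per-method rescues, one failure produces exactly one log entry.

```ruby
require "logger"

# Sketch: only perform rescues and logs, so a single API failure yields a
# single log line while still re-raising for Sidekiq's retry logic.
class PromptStrategySketch
  def initialize(logger: Logger.new($stdout))
    @logger = logger
  end

  def perform
    evaluate_preset
  rescue StandardError => e
    @logger.error("ContentModeration::PromptStrategy error: #{e.message}")
    raise
  end

  private

    def evaluate_preset
      # No rescue here: the exception propagates to perform, the one
      # place that logs before re-raising.
      raise StandardError, "API failure"
    end
end
```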

Comment on lines +59 to +74
def run_ai_strategies(content)
  strategies = [
    ContentModeration::Strategies::ClassifierStrategy.new(text: content.text, image_urls: content.image_urls),
    ContentModeration::Strategies::PromptStrategy.new(text: content.text, image_urls: content.image_urls),
  ]

  threads = strategies.map do |strategy|
    Thread.new do
      # Silence Ruby's stderr dump on thread death; Thread#value re-raises for the caller.
      Thread.current.report_on_exception = false
      strategy.perform
    end
  end

  threads.map(&:value)
end

🔴 In check, the call to run_ai_strategies has no rescue clause, so any OpenAI network error (Faraday::TimeoutError, Faraday::ConnectionFailed, HTTP 5xx from the API) propagates through the ActiveRecord validation callback into save!/publish! as an unhandled StandardError, returning a 500 to the creator. Add a rescue StandardError around run_ai_strategies in check and return a passing CheckResult (fail-open) when AI is unavailable so that an OpenAI outage does not block all product and post publishes.

Extended reasoning...

The bug

ModerateRecordService#check runs two AI strategies concurrently via run_ai_strategies, then collects results with threads.map(&:value). The code comment on that call says explicitly: "Thread#value re-raises for the caller." Both ClassifierStrategy#perform and PromptStrategy#perform rescue every StandardError, log it, and then re-raise it:

rescue StandardError => e
  Rails.logger.error("ContentModeration::ClassifierStrategy error: #{e.message}")
  raise

This means any network-level exception from OpenAI (timeout, connection failure, 5xx response) surfaces as a live exception in the thread, and threads.map(&:value) re-raises it into check.

The propagation path

check has no rescue around run_ai_strategies:

ai_results = run_ai_strategies(content)   # no rescue here

Both Link and Installment declare:

validate :content_moderation_check, if: -> { publishing? && !skip_content_moderation_check }

ActiveRecord validation callbacks run inside valid?, which is called by save!. An exception raised inside a validate callback propagates straight through valid? -> save! -> publish! — it is not caught and converted to a validation error; it is a raw exception. Controllers rescue ActiveRecord::RecordInvalid to display form errors, but a raw Faraday::TimeoutError or OpenAI API error is not a RecordInvalid, so no controller rescue clause catches it. The request terminates with a 500.

Why existing code does not prevent this

The strategies intentionally re-raise after logging so that Sidekiq can retry the job when moderation runs asynchronously. But publish! is also called synchronously from web requests (product editor, post editor). The re-raise is appropriate for background jobs but catastrophic for synchronous web requests when OpenAI is down.

Impact

During any OpenAI service disruption — even a brief timeout — every attempt by any creator to publish a product or a post fails with a 500 error. There is no degraded mode; AI unavailability completely blocks publishing.

Step-by-step proof

  1. Creator clicks "Publish" in the product editor.
  2. Controller calls link.publish!.
  3. publish! calls save!, which runs validations including content_moderation_check.
  4. content_moderation_check calls ModerateRecordService.check(self, :product).
  5. check passes the blocklist check, then calls run_ai_strategies(content).
  6. run_ai_strategies spawns two threads; each calls strategy.perform.
  7. ClassifierStrategy#perform makes an HTTP call to OpenAI; OpenAI is timing out → Faraday::TimeoutError is raised.
  8. The strategy rescues it, logs it, and re-raises it.
  9. The thread dies with the exception.
  10. threads.map(&:value) re-raises Faraday::TimeoutError into check.
  11. check has no rescue → exception propagates out of the validation callback.
  12. save! does not catch it → propagates out of publish!.
  13. Controller has no rescue for Faraday::TimeoutError → 500 response to creator.

Fix

Wrap the AI strategy call in check with a rescue and fail open:

begin
  ai_results = run_ai_strategies(content)
rescue StandardError => e
  Rails.logger.error("ContentModeration: AI strategies failed, failing open: #{e.message}")
  return CheckResult.new(passed: true, reasons: [])
end

This ensures an OpenAI outage degrades gracefully (content passes moderation temporarily) rather than taking down publishing entirely. The blocklist check still runs and catches the most egregious content even when AI is unavailable.
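The degraded mode can be illustrated standalone (plain Ruby; CheckResult here is a local stand-in for the service's result object, not its actual definition):

```ruby
# Stand-in for the service's result object (assumed shape).
CheckResult = Struct.new(:passed, :reasons, keyword_init: true)

# Sketch of the suggested fail-open behavior: the blocklist still enforces,
# but an exception from the AI strategies is logged and treated as a pass.
def check_with_fail_open(blocklist_reasons, &run_ai_strategies)
  return CheckResult.new(passed: false, reasons: blocklist_reasons) if blocklist_reasons.any?

  ai_reasons = begin
    run_ai_strategies.call
  rescue StandardError => e
    warn "ContentModeration: AI strategies failed, failing open: #{e.message}"
    return CheckResult.new(passed: true, reasons: [])
  end
  CheckResult.new(passed: ai_reasons.empty?, reasons: ai_reasons)
end
```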

Comment on lines +76 to +92
def leave_admin_comment(reasons)
  return if user.blank?

  record_label = case entity_type
                 when :product then "Product ##{record.id} (#{record.name})"
                 when :post then "Post ##{record.id} (#{record.name})"
                 end

  content = "Content moderation blocked publish of #{record_label}: #{reasons.join("; ")}"
  user.comments.create!(
    author_name: AUTHOR_NAME,
    comment_type: Comment::COMMENT_TYPE_NOTE,
    content: content,
  )
rescue StandardError => e
  Rails.logger.error("ContentModeration failed to leave admin comment: #{e.message}")
end

🔴 The user.comments.create! call inside leave_admin_comment is executed within the same ActiveRecord transaction opened by save!, so when content moderation fails validation and save! rolls back the transaction, the admin note INSERT is silently discarded. The rescue StandardError guard only catches exceptions thrown by create! itself and cannot protect against a rollback imposed by an outer transaction. Wrapping the call in ApplicationRecord.transaction(requires_new: true) would not help either, since that only opens a SAVEPOINT and a released savepoint is still undone by the outer ROLLBACK; instead, move the write outside the transaction, e.g. by enqueuing a background job that creates the comment.

Extended reasoning...

What the bug is and how it manifests

ModerateRecordService#leave_admin_comment calls user.comments.create! to record an audit note whenever content moderation blocks a publish. The method rescues StandardError to suppress any exception from create! itself. However, it does not protect against its write being silently undone by an enclosing ActiveRecord transaction opened by a caller higher up the stack.

The specific code path that triggers it

  1. Link#publish! (or Installment#publish!) sets self.publishing = true then calls save!.
  2. save! opens an ActiveRecord transaction and runs all validations inside it, including validate :content_moderation_check.
  3. content_moderation_check calls ContentModeration::ModerateRecordService.check(self, :product).
  4. Inside check, when the blocklist or AI classifier flags the content, leave_admin_comment(reasons) is called — executing user.comments.create! inside the still-open save! transaction.
  5. check returns CheckResult.new(passed: false).
  6. content_moderation_check calls errors.add(:base, ...), causing validation to fail.
  7. save! raises ActiveRecord::RecordInvalid and issues ROLLBACK — which undoes every write made inside the transaction, including the user.comments.create! INSERT from step 4.

Why existing code doesn't prevent it

The rescue StandardError => e block in leave_admin_comment only intercepts exceptions raised by the create! call itself (e.g., a constraint violation). A transaction rollback is not an exception at the point of the create! call; the INSERT appears to succeed, and the rescue block is never entered. The comment row simply disappears when the outer transaction rolls back.

What the impact would be

Every time content moderation blocks a product or post publish, the admin audit note (COMMENT_TYPE_NOTE) intended to explain why the publish was rejected is silently lost. Admins reviewing flagged users will see no history of why their content was moderated, degrading the usefulness of the audit trail entirely.

Step-by-step proof

The rescue in leave_admin_comment is never triggered. The comment simply never exists in the database.

How to fix it

A nested transaction is not enough here: ApplicationRecord.transaction(requires_new: true) only issues a SAVEPOINT, and releasing a savepoint commits nothing; the outer ROLLBACK from save! still discards every write made under it. Move the write outside the SQL transaction instead, e.g. by enqueuing a background job (the Redis enqueue is not part of the MySQL transaction, so it survives the rollback; the job name below is a hypothetical sketch):

def leave_admin_comment(reasons)
  return if user.blank?
  # ... build content ...
  # Enqueue rather than INSERT directly: the Sidekiq push goes to Redis,
  # outside the open SQL transaction, so it survives the ROLLBACK issued
  # when save! raises ActiveRecord::RecordInvalid.
  ContentModeration::AdminCommentJob.perform_async(user.id, content)
rescue StandardError => e
  Rails.logger.error("ContentModeration failed to leave admin comment: #{e.message}")
end

The job then calls user.comments.create! in its own transaction, after the publish attempt's transaction has already rolled back.

Comment thread app/models/user.rb
Comment on lines 301 to 303

after_save :create_updated_stripe_apple_pay_domain, if: ->(user) { user.saved_change_to_username? }
after_save :delete_old_stripe_apple_pay_domain, if: ->(user) { user.saved_change_to_username? }

🔴 The PR removes the trigger_iffy_ingest after_save callback from User (which triggered Iffy::Profile::IngestJob on name/username/bio changes) with no replacement, leaving profile content entirely unmoderated. The new ContentModeration::ContentExtractor has no extract_from_profile method, ModerateRecordService#extract_content only handles :product and :post entity types, and there is no ModerateProfileJob equivalent. After this PR merges, a seller can change their display name, username, or bio to policy-violating content and it will never be flagged by the new inline moderation pipeline.

Extended reasoning...

What the bug is and how it manifests

The PR's goal is to replace the external Iffy service with an inline content moderation pipeline. For products and posts, the migration is complete: new ModerateProductJob and ModeratePostJob jobs wire into ModerateRecordService, which calls ContentExtractor#extract_from_product or ContentExtractor#extract_from_post. For user profiles, however, the migration was never completed — the old trigger was removed but no new trigger was added.

The specific code path that no longer exists

Before this PR, app/models/user.rb had:

after_save :trigger_iffy_ingest

def trigger_iffy_ingest
  return unless saved_change_to_name? ||
                saved_change_to_username? ||
                saved_change_to_bio?
  Iffy::Profile::IngestJob.perform_async(id)
end

This callback was removed in the PR. Iffy::Profile::IngestJob itself was also deleted. The PR adds no after_save callback on User to trigger profile moderation through the new pipeline.

Why existing code does not prevent it

ContentModeration::ContentExtractor only defines extract_from_product and extract_from_post — there is no extract_from_profile method. ModerateRecordService#extract_content switches on entity_type with only :product and :post cases; a :profile case is absent. There is no ModerateProfileJob in the PR. Even if someone manually enqueued a moderation job for a user profile, the extract_content method would return nil and the moderation would be a no-op.

Impact

Any seller can freely update their display name, username, or bio to contain policy-violating content — hate speech, spam keywords, explicit material — and the new content moderation system will never inspect or act on it. This is a full regression of profile-level moderation coverage.

Step-by-step proof

  1. A seller navigates to their profile settings and changes their bio to a string that would be caught by the blocklist or Iffy classifier (e.g., a known hateful slur or spam phrase).
  2. User#save is called. Previously, after_save :trigger_iffy_ingest would fire and enqueue Iffy::Profile::IngestJob.
  3. After this PR, no after_save callback on User calls any moderation job. The bio change is saved to the database without moderation.
  4. The Sidekiq queue never receives a profile moderation job. ModerateRecordService is never invoked for this user.
  5. The violating bio remains live on the seller's public profile indefinitely.

How to fix it

Add an after_save callback on User (and SellerProfileRichTextSection if applicable) that enqueues a new ModerateProfileJob when saved_change_to_name?, saved_change_to_username?, or saved_change_to_bio? is true. Implement ContentExtractor#extract_from_profile to pull the relevant text fields, and add a :profile branch in ModerateRecordService#extract_content.
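The suggested wiring could look like this (a sketch; ModerateProfileJob and extract_from_profile are the hypothetical names proposed above, not existing classes):

```ruby
# app/models/user.rb -- restore a moderation trigger for profile edits
after_save :trigger_profile_moderation,
           if: -> { saved_change_to_name? || saved_change_to_username? || saved_change_to_bio? }

def trigger_profile_moderation
  ContentModeration::ModerateProfileJob.perform_async(id)
end

# ModerateRecordService#extract_content then needs a :profile branch, e.g.
#   when :profile then extractor.extract_from_profile(record)
```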

Comment thread app/models/link.rb
Comment on lines +1423 to 1430
+def content_moderation_check
+  return if user&.vip_creator?
+
-def queue_iffy_ingest_job_if_unpublished_by_admin
-  return unless is_unpublished_by_admin? && !saved_change_to_is_unpublished_by_admin?
+  result = ContentModeration::ModerateRecordService.check(self, :product)
+  return if result.passed

-  Iffy::Product::IngestJob.perform_async(id)
+  errors.add(:base, "Content moderation failed: #{result.reasons.join("; ")}")
 end

🔴 The content_moderation_check validation leaks sensitive moderation internals to sellers: blocklist words ("Matched blocked word: [word]"), OpenAI classifier scores and thresholds ("OpenAI moderation flagged: sexual (score: 0.95, threshold: 0.8)"), and verbatim GPT-4o-mini reasoning are all passed through result.reasons.join and surfaced as the top-level error message returned to the client. This enables systematic enumeration of the blocklist and reverse-engineering of classifier thresholds; the fix is to return a generic message like "Your content was flagged for policy violations" while keeping the detailed reasons only in the internal admin comment that leave_admin_comment already creates.

Extended reasoning...

What the bug is and how it manifests

content_moderation_check in link.rb (line 1429) adds the error message Content moderation failed followed by result.reasons joined with semicolons. The links_controller.rb returns errors.full_messages[0] verbatim as a JSON API response body. Every string that any strategy places in its reasoning array is therefore forwarded directly to the seller browser.

The specific code path that triggers it

ModerateRecordService#check (moderate_record_service.rb lines 17-42) runs strategies in sequence: (1) BlocklistStrategy#perform returns reasons like "Matched blocked word: [exact word]". (2) If the blocklist passes, run_ai_strategies spawns threads for both ClassifierStrategy#perform, which returns "OpenAI moderation flagged: sexual (score: 0.952, threshold: 0.8)", and PromptStrategy#perform, which returns verbatim GPT-4o-mini explanation text. All of these reasoning strings flow into CheckResult#reasons, which is joined and embedded in the validation error. The leak is not limited to the blocklist path: the synchronous check method explicitly calls run_ai_strategies (lines 32-41) whenever the blocklist passes, so classifier scores and LLM reasoning reach the seller as well.

Why existing code does not prevent it

The leave_admin_comment helper (lines 76-92) already stores the detailed reasons in an internal admin note. That is the correct place for that information. But the same reasons array is simultaneously propagated to the seller-facing error message with no redaction.

What the impact would be

A seller can systematically probe the moderation system. For blocklist enumeration: submit products with candidate words one at a time; a Matched blocked word: [word] response confirms each entry in the blocklist, allowing construction of a complete map of prohibited terms and enabling evasion. For classifier threshold leaking: by varying content intensity and observing the returned score vs. threshold pairs such as sexual (score: 0.79, threshold: 0.8), an attacker can calibrate content to sit just below each category threshold. For AI reasoning leaking: verbatim GPT-4o-mini reasoning reveals exactly which framing the model objected to, enabling targeted rephrasing to evade detection.

How to fix it

Change line 1429 in link.rb (and the equivalent line in installment.rb) to return a generic message: errors.add(:base, "Your content was flagged for policy violations. Please review our terms of service."). The detailed reasons are already preserved for admins via leave_admin_comment. No internal observability is lost.

Step-by-step proof

  1. Seller creates a product with a title containing a blocklisted word and calls the publish endpoint.
  2. validate :content_moderation_check fires; BlocklistStrategy matches the word and returns reasoning: [Matched blocked word: [word]].
  3. ModerateRecordService#check returns CheckResult.new(passed: false, reasons: [Matched blocked word: [word]]).
  4. content_moderation_check executes errors.add(:base, Content moderation failed: Matched blocked word: [word]).
  5. links_controller returns {success: false, error: Content moderation failed: Matched blocked word: [word]} to the seller.
  6. Seller now knows [word] is on the blocklist, removes it, retries with a different candidate, iterating to enumerate all blocked terms. The same technique works against classifier scores by submitting content of systematically varying severity.
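The enumeration loop above is trivial to script. A hypothetical attacker-side sketch, where moderate stands in for submitting a candidate title to the publish endpoint and reading back the error text:

```ruby
# Hypothetical sketch: `moderate` is a callable standing in for one
# publish attempt; any "Matched blocked word" response confirms the word.
def enumerate_blocklist(candidate_words, moderate)
  candidate_words.select do |word|
    moderate.call(word).to_s.include?("Matched blocked word: #{word}")
  end
end
```

One publish attempt per candidate word is all it takes; a generic error message makes every probe return the same signal-free response.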

Comment on lines +65 to +73
threads = strategies.map do |strategy|
  Thread.new do
    # Silence Ruby's stderr dump on thread death; Thread#value re-raises for the caller.
    Thread.current.report_on_exception = false
    strategy.perform
  end
end

threads.map(&:value)

🔴 When ClassifierStrategy raises (e.g., on a Faraday timeout), threads.map(&:value) propagates the exception after evaluating only threads[0].value, leaving threads[1] (PromptStrategy) running as an orphaned background thread. The PromptStrategy thread may continue making up to 4 sequential OpenAI API calls (each with a 10-second timeout) before it naturally terminates, silently wasting API credits with no way to cancel it.

Extended reasoning...

The bug: orphaned PromptStrategy thread on ClassifierStrategy failure

run_ai_strategies (line 73) calls threads.map(&:value) to join both threads. Ruby evaluates Array#map left-to-right, so it calls threads[0].value first. Thread#value blocks until the thread finishes and then re-raises any exception the thread raised. If ClassifierStrategy raises (the perform method re-raises all StandardError subclasses including Faraday::TimeoutError), the exception propagates out of threads[0].value immediately—before threads[1].value is ever called.

Why the PromptStrategy thread is already running

Both threads are spawned together in strategies.map { |s| Thread.new { s.perform } } at lines 65–71, so PromptStrategy is already executing by the time threads.map(&:value) begins. When ClassifierStrategy raises and the calling thread's stack unwinds, nobody ever calls threads[1].value or threads[1].kill, so the PromptStrategy thread runs to completion (or timeout) entirely unobserved.
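A minimal, self-contained repro of the join-order behavior (sleep stands in for the slow OpenAI calls; RuntimeError for Faraday::TimeoutError):

```ruby
# Array#map joins left-to-right, so the first thread's exception propagates
# out of threads[0].value before threads[1] is ever joined, leaving it alive.
def orphan_left_behind?
  t1 = Thread.new do
    Thread.current.report_on_exception = false
    raise "classifier timeout" # stands in for Faraday::TimeoutError
  end
  t2 = Thread.new { sleep 0.5; :prompt_result } # stands in for PromptStrategy

  begin
    [t1, t2].map(&:value)
  rescue RuntimeError
    return t2.alive? # t2 was never joined, killed, or observed
  end
  false
end
```

Calling orphan_left_behind? returns true: the rescue fires while t2 is still sleeping, exactly the orphaned-thread window described above.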

Why existing code does not prevent this

The comment on lines 67–68 (Thread.current.report_on_exception = false) only suppresses the stderr dump; it does not kill the thread. There is no ensure block or rescue handler in run_ai_strategies that would kill remaining threads. The Sidekiq job that calls this service will retry on the raised exception, potentially starting yet another pair of threads while the first PromptStrategy thread is still running.

Impact

PromptStrategy makes up to 4 sequential OpenAI API calls: evaluate_preset for adult_content, passes_uncertainty_check?, evaluate_preset for spam, and passes_uncertainty_check? again (see prompt_strategy.rb). Each call has a 10-second timeout, so a single orphaned thread can waste up to 40 seconds of OpenAI API time and 4 billed API calls. Under a sustained OpenAI degradation (which is also the scenario most likely to cause Faraday timeouts in ClassifierStrategy), Sidekiq retries multiply the effect: each retry spawns a new orphaned PromptStrategy thread before the previous one has finished.

Step-by-step proof

  1. Sidekiq calls ModerateRecordService.check(product, :product).
  2. run_ai_strategies spawns thread_classifier and thread_prompt; both begin executing immediately.
  3. ClassifierStrategy makes an HTTP call to the classifier service; Faraday raises Faraday::TimeoutError.
  4. thread_classifier terminates with an exception stored internally.
  5. threads.map(&:value) evaluates threads[0].value → re-raises Faraday::TimeoutError; threads[1].value is never reached.
  6. The exception propagates up through check, out of the Sidekiq job, and Sidekiq schedules a retry.
  7. thread_prompt is still alive, calls OpenAI up to 4 times, and silently terminates 0–40 seconds later.
  8. Meanwhile, Sidekiq retries have already spawned additional orphaned PromptStrategy threads.

Fix

Wrap the join loop in an ensure that kills any thread that has not yet finished:

def run_ai_strategies(content)
  # ... build threads ...
  begin
    threads.map(&:value)
  ensure
    # Thread#stop? is also true for threads sleeping on IO (the common case
    # here: waiting on OpenAI), so test alive? to kill every unfinished thread.
    threads.each { |t| t.kill if t.alive? }
  end
end

Alternatively, collect results individually with rescue so all threads are joined before re-raising.
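A sketch of that alternative, assuming strategies expose a call-style interface (method names here are illustrative, not the PR's): every thread is fully joined before the first failure is re-raised, so no strategy thread can outlive the service call.

```ruby
# Join every thread before re-raising, so a fast failure in one strategy
# cannot orphan the others.
def run_all_then_raise(strategies)
  threads = strategies.map do |strategy|
    Thread.new do
      Thread.current.report_on_exception = false
      strategy.call
    end
  end

  # Thread#value blocks until each thread finishes; capture instead of raising.
  outcomes = threads.map do |t|
    begin
      [:ok, t.value]
    rescue StandardError => e
      [:error, e]
    end
  end

  failure = outcomes.find { |status, _| status == :error }
  raise failure.last if failure
  outcomes.map { |_, value| value }
end
```

The trade-off versus the ensure/kill approach: this version lets slow strategies finish (spending their API calls) but never leaks a thread; kill stops spending immediately but discards in-flight work.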

Comment thread .env.example
Comment on lines 137 to 143
CIRCLE_API_KEY=

# Iffy Configuration
IFFY_API_KEY=
IFFY_TOKEN=
IFFY_WEBHOOK_SECRET=

# Security/Encryption Configuration
MAILER_HEADERS_ENCRYPTION_KEY_V1=

🔴 The PR removes the Iffy webhook controller and its route but leaves behind a fully functional admin authentication bypass. Any request presenting the correct IFFY_TOKEN value can still skip require_admin! and directly invoke admin enforcement actions (suspend users, mark compliant, flag NSFW) on admin/users_controller.rb and admin/comments_controller.rb with no admin session required. Remove the skip_before_action :require_admin!, if: :request_from_iffy? lines from both controllers, delete request_from_iffy? from admin/base_controller.rb, remove the three dead action methods and their routes, and drop IFFY_TOKEN= from both env example files.

Extended reasoning...

What the bug is and how it manifests

The PR's stated goal includes removing the request_from_iffy? admin auth bypass along with the Iffy webhook controller. The webhook controller (app/controllers/api/internal/iffy/webhook_controller.rb) and its route were indeed deleted. However, the corresponding skip_before_action bypasses in the admin controllers—and the request_from_iffy? predicate itself—were never removed. This means the admin endpoints remain reachable without an admin session as long as the caller supplies the correct IFFY_TOKEN value.

The specific code path that triggers it

app/controllers/admin/base_controller.rb lines 88–89 still define request_from_iffy?, which reads GlobalConfig.get("IFFY_TOKEN") and compares it to the request's auth_token parameter. app/controllers/admin/users_controller.rb line 7 still carries skip_before_action :require_admin!, if: :request_from_iffy?, only: %i[suspend_for_fraud_from_iffy mark_compliant_from_iffy flag_for_explicit_nsfw_tos_violation_from_iffy], and the three corresponding action methods (lines 125, 160, 168) remain intact. app/controllers/admin/comments_controller.rb line 4 carries a broader skip_before_action :require_admin!, if: :request_from_iffy? with no only: guard. The routes for mark_compliant_from_iffy, suspend_for_fraud_from_iffy, and flag_for_explicit_nsfw_tos_violation_from_iffy are still registered in config/routes/admin.rb (lines 55, 57, 59).

Why existing code doesn't prevent it

require_admin! is the sole authentication gate for these admin controllers. Because both controllers unconditionally invoke skip_before_action when request_from_iffy? returns true, there is no fallback check. If IFFY_TOKEN is set in GlobalConfig (which .env.example and .env.production.example still document, since this PR removed IFFY_API_KEY and IFFY_WEBHOOK_SECRET but left IFFY_TOKEN=), any external party who knows or obtains the token value can bypass authentication entirely.

What the impact would be

An attacker with the IFFY_TOKEN value can: (1) suspend arbitrary users by POSTing to suspend_for_fraud_from_iffy; (2) mark suspended users compliant via mark_compliant_from_iffy; (3) set the NSFW TOS violation flag on any user via flag_for_explicit_nsfw_tos_violation_from_iffy; and (4) create admin comments on any resource via admin/comments#create. Before this PR, Iffy was the authorized caller of these endpoints. After this PR, the authorization token is an orphaned secret with no legitimate caller—meaning it is purely a backdoor.

How to fix it

Remove skip_before_action :require_admin!, if: :request_from_iffy? from admin/users_controller.rb (line 7) and admin/comments_controller.rb (line 4). Delete request_from_iffy? from admin/base_controller.rb (lines 88–89). Remove the three dead action methods (suspend_for_fraud_from_iffy, mark_compliant_from_iffy, flag_for_explicit_nsfw_tos_violation_from_iffy) from admin/users_controller.rb. Remove their routes from config/routes/admin.rb. Remove IFFY_TOKEN= from both .env.example and .env.production.example.

Step-by-step proof

  1. Attacker reads .env.production.example (still contains IFFY_TOKEN=) and determines the token value from the running environment (e.g., leaked config, insider access, or brute-force of a weak value).
  2. Attacker sends: POST /admin/users/123/suspend_for_fraud_from_iffy?auth_token=<IFFY_TOKEN>.
  3. Rails resolves this route (still registered in config/routes/admin.rb), enters Admin::UsersController, and evaluates before_action callbacks.
  4. request_from_iffy? in admin/base_controller.rb returns true because params[:auth_token] == GlobalConfig.get("IFFY_TOKEN").
  5. skip_before_action suppresses require_admin! for this action.
  6. suspend_for_fraud_from_iffy (still present at line 125) executes, suspending the target user—no admin session needed.

Comment thread spec/models/link_spec.rb
Comment on lines 375 to +400
end
end

describe "queue_iffy_ingest_job_if_unpublished_by_admin" do
let(:product) { create(:product) }
it "blocks publishing when ContentModeration::ModerateRecordService.check fails" do
allow(ContentModeration::ModerateRecordService).to receive(:check).with(product, :product, text_only: true).and_return(
ContentModeration::ModerateRecordService::CheckResult.new(passed: false, reasons: ["policy violation"])
)

it "enqueues an Iffy::Product::IngestJob when the product has changed and was already unpublished by admin" do
product.update!(is_unpublished_by_admin: true)
product.update!(description: "New description")
expect(Iffy::Product::IngestJob).to have_enqueued_sidekiq_job(product.id)
expect { product.publish! }.to raise_error(ActiveRecord::RecordInvalid)
expect(product.reload.purchase_disabled_at).not_to be(nil)
expect(product.errors.full_messages.to_sentence).to include("Content moderation failed: policy violation")
end

it "does not enqueue an Iffy::Product::IngestJob when the product is only unpublished by admin" do
expect do
product.unpublish!(is_unpublished_by_admin: true)
end.not_to change { Iffy::Product::IngestJob.jobs.size }
it "skips the content moderation check for VIP creators" do
allow_any_instance_of(User).to receive(:vip_creator?).and_return(true)
expect(ContentModeration::ModerateRecordService).not_to receive(:check)

expect { product.publish! }.to change { product.reload.purchase_disabled_at }.to(nil)
end

it "does not enqueue an Iffy::Product::IngestJob when the product is not unpublished by admin" do
expect do
product.update!(description: "New description")
end.not_to change { Iffy::Product::IngestJob.jobs.size }
it "publishes successfully when the content moderation check passes" do
allow(ContentModeration::ModerateRecordService).to receive(:check).with(product, :product, text_only: true).and_return(
ContentModeration::ModerateRecordService::CheckResult.new(passed: true, reasons: [])
)

expect { product.publish! }.to change { product.reload.purchase_disabled_at }.to(nil)
end

🔴 Four test stubs across spec/models/link_spec.rb (lines 378, 395) and spec/models/installment_spec.rb (lines 785, 804) pass text_only: true to .with() when stubbing ContentModeration::ModerateRecordService.check, but the actual method signature is def self.check(record, entity_type), with no text_only keyword, and the production call sites pass only two positional arguments. Because RSpec matches stub arguments strictly, these stubs never fire: the mismatched call raises a "received :check with unexpected arguments" MockExpectationError instead of returning the stubbed result, so the "blocks publishing" assertions (which expect ActiveRecord::RecordInvalid) fail. Fix: remove text_only: true from all four .with() matchers.

Extended reasoning...

What the bug is

Four RSpec stubs in the newly-added content-moderation tests pass a text_only: true keyword argument that does not exist in ContentModeration::ModerateRecordService.check's signature. The method is defined as:

# app/services/content_moderation/moderate_record_service.rb
def self.check(record, entity_type)
  new(record, entity_type).check
end

The stubbed call patterns in both spec files are:

# spec/models/link_spec.rb:378, 395
allow(ContentModeration::ModerateRecordService).to receive(:check)
  .with(product, :product, text_only: true).and_return(...)

# spec/models/installment_spec.rb:785, 804
allow(ContentModeration::ModerateRecordService).to receive(:check)
  .with(@installment, :post, text_only: true).and_return(...)

The specific code path that triggers it

RSpec's .with() matcher performs strict argument matching. When the production code calls ContentModeration::ModerateRecordService.check(self, :product) (two positional args, no keyword), RSpec compares that against the registered stub expectation (product, :product, text_only: true). The keyword argument causes the match to fail, so the stub never fires; because the method is stubbed but no registered argument pattern matches the call, RSpec raises a "received :check with unexpected arguments" error at the call site rather than returning the stubbed result.

Why existing code doesn't prevent it

The call sites in both models pass exactly two positional arguments:

# app/models/link.rb:1426
result = ContentModeration::ModerateRecordService.check(self, :product)

# app/models/installment.rb:950
result = ContentModeration::ModerateRecordService.check(self, :post)

There is no text_only: parameter anywhere in the method signature or call chain, so nothing catches the mismatch at the Ruby level; it only surfaces when RSpec compares the actual arguments against the .with() pattern at the stubbed call site.

Impact

When publish! reaches the real call ContentModeration::ModerateRecordService.check(self, :product), RSpec sees a stubbed method invoked with arguments that match no registered .with() pattern and raises a "received :check with unexpected arguments" MockExpectationError. The four tests that expect publish! to raise ActiveRecord::RecordInvalid therefore fail (the raised error is a mock error, not a validation error), which means the content-moderation publish guard is effectively untested.

How to fix it

Remove text_only: true from all four .with() matchers so they match the actual two-argument call:

# Before
.with(product, :product, text_only: true)
# After
.with(product, :product)

# Before
.with(@installment, :post, text_only: true)
# After
.with(@installment, :post)
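Independent of RSpec, the mismatch is visible at the plain-Ruby level: calling a two-positional-argument method the way the stubs describe it is itself an ArgumentError. A sketch (module name hypothetical; the signature mirrors def self.check(record, entity_type) quoted above):

```ruby
# Same shape as ModerateRecordService.check: two positionals, no keywords.
module FakeModerateRecordService
  def self.check(record, entity_type)
    [record, entity_type]
  end
end

def accepts_text_only_keyword?
  # In Ruby 3, a keyword passed to a method with no keyword params becomes a
  # positional Hash, making this a 3-for-2 ArgumentError.
  FakeModerateRecordService.check(:some_record, :product, text_only: true)
  true
rescue ArgumentError
  false
end
```

So even a direct call written like the stubs could never succeed against the real signature, which is a quick sanity check that the .with() matchers describe a call shape the production code cannot make.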

Step-by-step proof

  1. Test calls publish! on a product/installment whose moderation check is stubbed with the extra text_only: true keyword.
  2. publish! internally calls ContentModeration::ModerateRecordService.check(self, :product), i.e. two positional args.
  3. RSpec looks for a stub matching (product, :product, text_only: true), i.e. three args.
  4. The arguments differ, so no registered .with() pattern matches the call.
  5. RSpec raises "received :check with unexpected arguments" (a MockExpectationError) at the call site.
  6. expect { publish! }.to raise_error(ActiveRecord::RecordInvalid) does not match MockExpectationError, so the test fails; the same mismatch breaks all four stubs.
