feat: Phase 2A — graphile-presigned-url-plugin (requestUploadUrl, confirmUpload, downloadUrl) #895
pyramation wants to merge 3 commits into main
Conversation
… mutations + downloadUrl field

Phase 2A Step 2: Presigned URL plugin for PostGraphile v5

- `requestUploadUrl` mutation: presigned PUT URL generation with SHA-256 content-hash keys, bucket validation (RLS), MIME type enforcement, file size limits, deduplication
- `confirmUpload` mutation: S3 HEAD verification, content-type check, status transition (pending -> ready), upload_request tracking
- `downloadUrl` computed field: presigned GET URLs for private files, public URL prefix for public files
- StorageModuleCache: per-database LRU cache for storage module config (TTL-based)
- S3 signer wrapper: AWS SDK v3 presigned URL generation (PUT/GET/HEAD)
- PresignedUrlPreset factory for easy integration into ConstructivePreset

Follows existing patterns:
- extendSchema + grafast plans (same as PublicKeySignature)
- GraphQLObjectType_fields hook for downloadUrl (same as graphile-search)
- LRU cache with TTL (same as graphile-cache)
… detection

Uses `codec.extensions.tags.storageFiles` instead of checking for 6 specific column names. The storage module generator in constructive-db will set this tag on the generated files table via a smart comment.
```typescript
return withPgClient(pgSettings, async (pgClient: any) => {
  // --- Resolve storage module config ---
  const databaseId = await resolveDatabaseId(pgClient);
  if (!databaseId) {
    throw new Error('DATABASE_NOT_FOUND');
  }

  const storageConfig = await getStorageModuleConfig(pgClient, databaseId);
  if (!storageConfig) {
    throw new Error('STORAGE_MODULE_NOT_PROVISIONED');
  }

  // --- Look up the bucket (RLS enforced) ---
  const bucketResult = await pgClient.query(
    `SELECT id, type, is_public, owner_id, allowed_mime_types, max_file_size
       FROM ${storageConfig.bucketsQualifiedName}
      WHERE key = $1
      LIMIT 1`,
    [bucketKey],
  );

  if (bucketResult.rows.length === 0) {
    throw new Error('BUCKET_NOT_FOUND');
  }

  const bucket = bucketResult.rows[0];

  // --- Validate content type against bucket's allowed_mime_types ---
  if (bucket.allowed_mime_types && bucket.allowed_mime_types.length > 0) {
    const allowed = bucket.allowed_mime_types as string[];
    const isAllowed = allowed.some((pattern: string) => {
      if (pattern === '*/*') return true;
      if (pattern.endsWith('/*')) {
        const prefix = pattern.slice(0, -1);
        return contentType.startsWith(prefix);
      }
      return contentType === pattern;
    });
    if (!isAllowed) {
      throw new Error(`CONTENT_TYPE_NOT_ALLOWED: ${contentType} not in bucket allowed types`);
    }
  }

  // --- Validate size against bucket's max_file_size ---
  if (bucket.max_file_size && size > bucket.max_file_size) {
    throw new Error(`FILE_TOO_LARGE: exceeds bucket max of ${bucket.max_file_size} bytes`);
  }

  const s3Key = buildS3Key(contentHash);

  // --- Dedup check: look for existing file with same content_hash in this bucket ---
  const dedupResult = await pgClient.query(
    `SELECT id, status
       FROM ${storageConfig.filesQualifiedName}
      WHERE content_hash = $1
        AND bucket_id = $2
        AND status IN ('ready', 'processed')
      LIMIT 1`,
    [contentHash, bucket.id],
  );

  if (dedupResult.rows.length > 0) {
    const existingFile = dedupResult.rows[0];
    log.info(`Dedup hit: file ${existingFile.id} for hash ${contentHash}`);

    // Track the dedup request
    await pgClient.query(
      `INSERT INTO ${storageConfig.uploadRequestsQualifiedName}
         (file_id, bucket_id, key, content_type, content_hash, size, status, expires_at)
       VALUES ($1, $2, $3, $4, $5, $6, 'confirmed', NOW())`,
      [existingFile.id, bucket.id, s3Key, contentType, contentHash, size],
    );

    return {
      uploadUrl: null,
      fileId: existingFile.id,
      key: s3Key,
      deduplicated: true,
      expiresAt: null,
    };
  }

  // --- Create file record (status=pending) ---
  const fileResult = await pgClient.query(
    `INSERT INTO ${storageConfig.filesQualifiedName}
       (bucket_id, key, content_type, content_hash, size, filename, owner_id, is_public, status)
     VALUES ($1, $2, $3, $4, $5, $6, $7, $8, 'pending')
     RETURNING id`,
    [
      bucket.id,
      s3Key,
      contentType,
      contentHash,
      size,
      filename || null,
      bucket.owner_id,
      bucket.is_public,
    ],
  );

  const fileId = fileResult.rows[0].id;

  // --- Generate presigned PUT URL ---
  const uploadUrl = await generatePresignedPutUrl(
    s3,
    s3Key,
    contentType,
    size,
    urlExpirySeconds,
  );

  const expiresAt = new Date(Date.now() + urlExpirySeconds * 1000).toISOString();

  // --- Track the upload request ---
  await pgClient.query(
    `INSERT INTO ${storageConfig.uploadRequestsQualifiedName}
       (file_id, bucket_id, key, content_type, content_hash, size, status, expires_at)
     VALUES ($1, $2, $3, $4, $5, $6, 'issued', $7)`,
    [fileId, bucket.id, s3Key, contentType, contentHash, size, expiresAt],
  );

  return {
    uploadUrl,
    fileId,
    key: s3Key,
    deduplicated: false,
    expiresAt,
  };
});
```
🔴 Missing explicit transaction management for multi-step mutations
Both requestUploadUrl and confirmUpload perform multiple sequential write operations without wrapping them in an explicit BEGIN/COMMIT/ROLLBACK transaction. The existing PublicKeySignature plugin (graphile/graphile-settings/src/plugins/PublicKeySignature.ts:113-140) — which uses the identical withPgClient pattern — explicitly manages transactions because withPgClient does not automatically wrap callbacks in a transaction.
In confirmUpload, the file status is updated to 'ready' (line 387-391) before the upload request is updated to 'confirmed' (line 395-399). If the second UPDATE fails or the process crashes, the file is 'ready' but the upload_request remains 'issued' — a data inconsistency. Similarly in requestUploadUrl, if the file INSERT (line 262) succeeds but the upload_request INSERT (line 293) fails, an orphaned 'pending' file record exists with no tracking entry.
Expected pattern from PublicKeySignature:

```typescript
return withPgClient(pgSettings, async (pgClient: any) => {
  await pgClient.query('BEGIN');
  try {
    // ... all queries ...
    await pgClient.query('COMMIT');
    return result;
  } catch (err) {
    await pgClient.query('ROLLBACK');
    throw err;
  }
});
```

Prompt for agents
In graphile/graphile-presigned-url-plugin/src/plugin.ts, both mutation handlers need explicit transaction management matching the PublicKeySignature pattern.
For requestUploadUrl (line 179): Wrap the entire withPgClient callback body in a BEGIN/COMMIT/ROLLBACK try-catch block. After line 179 add: await pgClient.query('BEGIN'); then wrap the remaining logic (lines 180-306) in a try block that ends with await pgClient.query('COMMIT'); and a catch block with await pgClient.query('ROLLBACK'); throw err;
For confirmUpload (line 328): Same pattern. After line 328 add: await pgClient.query('BEGIN'); then wrap the remaining logic (lines 329-406) in a try block with COMMIT before returning and a catch block with ROLLBACK and re-throw.
This ensures that the multiple write operations (file INSERT + upload_request INSERT, or file UPDATE + upload_request UPDATE) are atomic.
```typescript
if (filename) {
  params.ResponseContentDisposition = `attachment; filename="${filename}"`;
}
```
🟡 Unsanitized filename in Content-Disposition header allows header value injection
User-provided filename is interpolated directly into the ResponseContentDisposition header value without sanitizing special characters. A filename containing " characters (e.g., evil"; malicious=true) produces a malformed Content-Disposition: attachment; filename="evil"; malicious=true" header in the S3 download response. The filename flows from user input (plugin.ts:156) → database storage (plugin.ts:273) → download URL generation (download-url-field.ts:81,101) → presigned URL header (s3-signer.ts:70).
Suggested change:

```typescript
// Before
if (filename) {
  params.ResponseContentDisposition = `attachment; filename="${filename}"`;
}

// After: strip quotes, backslashes, and CR/LF before interpolating
if (filename) {
  const sanitized = filename.replace(/["\\\r\n]/g, '_');
  params.ResponseContentDisposition = `attachment; filename="${sanitized}"`;
}
```
…hema_public.database

The server middleware already sets jwt.claims.database_id in pgSettings, so `jwt_private.current_database_id()` is a cheap function call vs. the previous approach of querying metaschema_public.database on every request.
Summary
New PostGraphile v5 plugin package (`graphile/graphile-presigned-url-plugin/`) implementing Phase 2A of the presigned URL upload pipeline. Adds three capabilities to the GraphQL schema:

- `requestUploadUrl` mutation — Accepts a client-computed SHA-256 content hash, validates bucket access (RLS), checks MIME type / file size against bucket config, deduplicates against existing files, creates a `pending` file record, generates a presigned PUT URL, and tracks the request in `upload_requests`.
- `confirmUpload` mutation — Verifies the file exists in S3 via HEAD, checks that the content-type matches, transitions file status `pending → ready`, and marks the upload_request as `confirmed`. Idempotent for already-confirmed files.
- `downloadUrl` computed field — Detected via the `@storageFiles` smart tag on the codec (table). Returns a public URL for public files and a presigned GET URL for private files.

Uses `extendSchema` + grafast `lambda`/`object` plans (same pattern as PublicKeySignature), the `GraphQLObjectType_fields` hook for the computed field (same pattern as graphile-search), and an LRU cache for per-database storage module config resolution.

Note: This package is scaffolded and builds successfully but is not yet wired into ConstructivePreset and has no integration tests yet. Those are planned as follow-up commits.

Updates since last revision

- `jwt_private.current_database_id()` for database resolution (plugin.ts): `resolveDatabaseId()` now calls `SELECT jwt_private.current_database_id()` — a cheap function reading the JWT context already set by the server middleware — instead of querying `metaschema_public.database WHERE db_name = current_database()` on every request. The storage module config LRU cache (storage-module-cache.ts) still runs its heavier metaschema join query only on cache miss.
- `@storageFiles` smart tag (download-url-field.ts): File table detection now checks `codec.extensions.tags.storageFiles` instead of heuristically matching 6 column names. The storage module generator in constructive-db emits this tag on the generated files table (see constructive-db#687).

Review & Testing Checklist for Human

- `jwt_private.current_database_id()` availability — `resolveDatabaseId()` assumes the server middleware sets `jwt.claims.database_id` in pgSettings. Verify this is always set for authenticated requests. If a request lacks this claim (e.g. misconfigured middleware), the plugin returns DATABASE_NOT_FOUND. Check graphile.ts:159 in the server.
- storage-module-cache.ts:89-91 builds qualified table names via string interpolation from metaschema rows. These are then interpolated into raw SQL in plugin.ts (lines 193, 232, 247, 263, 294, 342, 375, 388, 396). Values come from trusted metaschema, but verify this is acceptable vs. parameterized identifiers.
- `@storageFiles` smart tag cross-repo contract — download-url-field.ts reads `codec.extensions.tags.storageFiles`. This must match the `smart_tags.storageFiles` key emitted by `storage_module.sql` in constructive-db#687. If renamed in one repo, it must be renamed in both.
- In `requestUploadUrl`, a concurrent request with the same hash could cause a unique constraint violation. No explicit conflict handling (e.g., `ON CONFLICT`).
- jest.config.js exists with `--passWithNoTests`. Verify the plan for integration tests against MinIO before merge.

Suggested test plan:
- `requestUploadUrl` → verify a presigned PUT URL is returned and an `upload_requests` row is created
- `confirmUpload` → verify the file status transitions to `ready`
- verify the `downloadUrl` field appears (requires the `@storageFiles` tag on the table)
- `requestUploadUrl` again with the same content hash → verify a `deduplicated: true` response

Notes
- A companion constructive-db change adds the `upload_requests` table, `filename` column, and `@storageFiles` smart tag to the storage module.
- Mutations run via `withPgClient(pgSettings, ...)` (not `null` like PublicKeySignature) because RLS enforcement requires the user's JWT context for bucket/file access checks.
- Some `as any` casts exist for Graphile scope types and AWS SDK type mismatches — these mirror patterns in existing plugins (graphile-search, graphile-upload-plugin) but are worth noting.
- The `downloadUrl` resolver is async for private files (calls `generatePresignedGetUrl` per row), which means N+1 S3 signing calls on list queries. Acceptable for v1 but may need batching later.
- S3 keys are derived from the content hash alone (`buildS3Key()` in plugin.ts:52-54). If multiple logical buckets map to the same physical S3 bucket, keys could collide. Intentional simplification for v1.

Link to Devin session: https://app.devin.ai/sessions/4c882ba2dfbf4045adf85fb83cde6f77
Requested by: @pyramation