Fix duplicate document error during database migration#157
Fix duplicate document error during database migration#157
Conversation
When batch inserting documents via createDocuments, a single duplicate causes the entire batch to fail. This catches DuplicateException and falls back to one-by-one insertion, skipping documents that already exist instead of failing the whole batch. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WalkthroughThis pull request introduces a fallback mechanism for batch document insertion in Appwrite. The primary change wraps an existing batch Estimated code review effort🎯 2 (Simple) | ⏱️ ~15 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/Migration/Unit/Destinations/AppwriteTest.php (2)
126-148: Assert that the happy path actually callscreateDocuments().Right now this only proves
createDocument()was not used. It would still pass ifcreateRecord()returned early before writing anything, so the batch call itself should be an explicit expectation.✅ Lock in the batch-success path
- $dbForDatabases->method('createDocuments') + $dbForDatabases->expects($this->once()) + ->method('createDocuments') ->willReturn(0);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/Migration/Unit/Destinations/AppwriteTest.php` around lines 126 - 148, The test testCreateRecordBatchSucceeds currently only asserts createDocument() is never called; add an explicit expectation on the $dbForDatabases mock that createDocuments() is invoked once (or with the expected arguments) when calling Appwrite::createRecord($row, true) to ensure the batch path is exercised; update the $dbForDatabases mock setup to expect createDocuments() (instead of only stubbing its return) and keep the createDocument() never() expectation, referencing the createRecord method on Appwrite and the createDocuments/createDocument mock methods.
82-121: This test doesn't prove the batch path was attempted or that fallback continues past a duplicate.Because
createDocuments()is only stubbed, a regression to “always insert one-by-one” would still pass. And with only two rows, abreak/early-return on the duplicate would also pass because there is no third row to verify continuation.✅ Tighten the duplicate-path test
- $dbForDatabases->method('createDocuments') + $dbForDatabases->expects($this->once()) + ->method('createDocuments') ->willThrowException(new DuplicateException('Document already exists')); // Fallback createDocument: first succeeds, second throws duplicate (skipped) $createDocumentCallCount = 0; $dbForDatabases->method('createDocument') @@ $row1 = new Row('row1', $table, ['field1' => 'value1']); $row2 = new Row('row2', $table, ['field1' => 'value2']); + $row3 = new Row('row3', $table, ['field1' => 'value3']); // Buffer row1 (not last) $result1 = $method->invoke($appwrite, $row1, false); $this->assertTrue($result1); - // Buffer row2 and flush (isLast=true) - should NOT throw - $result2 = $method->invoke($appwrite, $row2, true); + $result2 = $method->invoke($appwrite, $row2, false); $this->assertTrue($result2); + + // Flush on row3 - should continue after row2 duplicate + $result3 = $method->invoke($appwrite, $row3, true); + $this->assertTrue($result3); // Verify fallback was used - $this->assertEquals(2, $createDocumentCallCount); + $this->assertEquals(3, $createDocumentCallCount);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/Migration/Unit/Destinations/AppwriteTest.php` around lines 82 - 121, The test testCreateRecordHandlesDuplicateDocuments should ensure the batch path was attempted and that fallback continues after a duplicate: update the test to (1) assert createDocuments was invoked (or set the mock to expect a call) on the $dbForDatabases mock, and (2) exercise at least three rows (e.g., row1 buffered, row2 buffered, row3 flush with isLast=true) and adjust the createDocument callback to throw DuplicateException for only one of the rows so you can assert the fallback call count continues past the duplicate (e.g., assert createDocumentCallCount === 3). Ensure you reference the mocked methods createDocuments and createDocument and the Appwrite::createRecord invocation when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Migration/Destinations/Appwrite.php`:
- Around line 1072-1089: The per-row DuplicateException in the fallback loop
over $this->rowBuffer currently just swallows duplicates, causing the migration
to report them as successful; modify the inner catch (DuplicateException) block
inside the skipRelationshipsExistCheck(fn () =>
$dbForDatabases->createDocument(...)) loop to mark the corresponding Row
resource as STATUS_SKIPPED using the same update routine/logic used in the
explicit duplicate branch (i.e., the code that updates the Row status to
STATUS_SKIPPED when a duplicate is detected earlier), so the migration
cache/results accurately reflect skipped duplicates.
---
Nitpick comments:
In `@tests/Migration/Unit/Destinations/AppwriteTest.php`:
- Around line 126-148: The test testCreateRecordBatchSucceeds currently only
asserts createDocument() is never called; add an explicit expectation on the
$dbForDatabases mock that createDocuments() is invoked once (or with the
expected arguments) when calling Appwrite::createRecord($row, true) to ensure
the batch path is exercised; update the $dbForDatabases mock setup to expect
createDocuments() (instead of only stubbing its return) and keep the
createDocument() never() expectation, referencing the createRecord method on
Appwrite and the createDocuments/createDocument mock methods.
- Around line 82-121: The test testCreateRecordHandlesDuplicateDocuments should
ensure the batch path was attempted and that fallback continues after a
duplicate: update the test to (1) assert createDocuments was invoked (or set the
mock to expect a call) on the $dbForDatabases mock, and (2) exercise at least
three rows (e.g., row1 buffered, row2 buffered, row3 flush with isLast=true) and
adjust the createDocument callback to throw DuplicateException for only one of
the rows so you can assert the fallback call count continues past the duplicate
(e.g., assert createDocumentCallCount === 3). Ensure you reference the mocked
methods createDocuments and createDocument and the Appwrite::createRecord
invocation when making these changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8d34308c-6939-401d-8cd4-5d295d5289c8
📒 Files selected for processing (2)
src/Migration/Destinations/Appwrite.phptests/Migration/Unit/Destinations/AppwriteTest.php
| try { | ||
| $dbForDatabases->skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocuments( | ||
| $collectionId, | ||
| $this->rowBuffer | ||
| )); | ||
| } catch (DuplicateException) { | ||
| // Batch insert failed due to a duplicate document. | ||
| // Fall back to inserting one-by-one, skipping duplicates. | ||
| foreach ($this->rowBuffer as $row) { | ||
| try { | ||
| $dbForDatabases->skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocument( | ||
| $collectionId, | ||
| $row | ||
| )); | ||
| } catch (DuplicateException) { | ||
| // Document already exists, skip it | ||
| } | ||
| } |
There was a problem hiding this comment.
Fallback duplicates are reported as success, not skipped.
This loop only has buffered UtopiaDocuments, so when a duplicate is skipped here there is no way to update the matching Row resource back to STATUS_SKIPPED like the explicit duplicate branch at Lines 989-994. In practice, the migration cache/result will report these rows as successfully imported even though they were only ignored as pre-existing.
💡 Sketch of a fix
- /**
- * `@var` array<UtopiaDocument>
- */
- private array $rowBuffer = [];
+ /**
+ * `@var` array<array{resource: Row, document: UtopiaDocument}>
+ */
+ private array $rowBuffer = [];
- $this->rowBuffer[] = new UtopiaDocument(\array_merge([
- '$id' => $resource->getId(),
- '$permissions' => $resource->getPermissions(),
- ], $data));
+ $this->rowBuffer[] = [
+ 'resource' => $resource,
+ 'document' => new UtopiaDocument(\array_merge([
+ '$id' => $resource->getId(),
+ '$permissions' => $resource->getPermissions(),
+ ], $data)),
+ ];
- $dbForDatabases->skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocuments(
- $collectionId,
- $this->rowBuffer
- ));
+ $dbForDatabases->skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocuments(
+ $collectionId,
+ \array_column($this->rowBuffer, 'document')
+ ));
- foreach ($this->rowBuffer as $row) {
+ foreach ($this->rowBuffer as $entry) {
try {
$dbForDatabases->skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocument(
$collectionId,
- $row
+ $entry['document']
));
} catch (DuplicateException) {
- // Document already exists, skip it
+ $entry['resource']->setStatus(
+ Resource::STATUS_SKIPPED,
+ 'Row has already been created'
+ );
+ $this->cache->update($entry['resource']);
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Migration/Destinations/Appwrite.php` around lines 1072 - 1089, The
per-row DuplicateException in the fallback loop over $this->rowBuffer currently
just swallows duplicates, causing the migration to report them as successful;
modify the inner catch (DuplicateException) block inside the
skipRelationshipsExistCheck(fn () => $dbForDatabases->createDocument(...)) loop
to mark the corresponding Row resource as STATUS_SKIPPED using the same update
routine/logic used in the explicit duplicate branch (i.e., the code that updates
the Row status to STATUS_SKIPPED when a duplicate is detected earlier), so the
migration cache/results accurately reflect skipped duplicates.
Summary
createDocuments(), a single duplicate document causes the entire batch to fail with "Document already exists"catch (DuplicateException)that falls back to inserting documents one-by-one viacreateDocument(), skipping any that already existTest plan
pint) passesphpstan) passesFixes: https://appwrite.sentry.io/issues/7289696520/
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests