[enhance](load) exclude version-gap replicas from success counting in quorum success #60953

Open
sollhui wants to merge 1 commit into apache:master from sollhui:opt_quorum_success

@sollhui
Contributor

@sollhui sollhui commented Mar 2, 2026

Summary

With majority write (quorum success), BE does not distinguish replicas with
continuous versions from replicas with version gaps (lastFailedVersion >= 0).
This makes BE's quorum check inconsistent with FE's commit check, which
excludes version-gap replicas from success counting.

Bad Case

Consider 3 replicas on nodes 1, 2, 3 with load_required_replica_num = 2:

  1. First write: nodes 1,2 succeed, node 3 fails → overall success.
    Node 3 now has a version gap (lastFailedVersion >= 0).
  2. Second write: nodes 1,3 succeed, node 2 fails →
    • BE counts 2 successes (nodes 1,3), considers it quorum success.
    • FE commit only counts node 1 as success (node 3 has version gap),
      so successReplicaNum = 1 < 2, commit fails.
    • This wastes resources: BE has already returned success to the client,
      but FE then rejects the transaction.

The correct behavior for the second write:

  • nodes 1,3 succeed → should FAIL (node 3 has version gap, only node 1 counts)
  • nodes 1,2 succeed → should SUCCEED (both have continuous versions)
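
The mismatch in the bad case can be reproduced with a small, self-contained sketch. It is not the Doris implementation: `Replica`, `count_all_successes`, `count_continuous_successes`, and `reaches_quorum` are hypothetical names, and the version-gap state is reduced to a boolean.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified replica state; not a Doris type.
struct Replica {
    int64_t backend_id;
    bool has_version_gap; // stands in for lastFailedVersion >= 0
};

// Counting as the description attributes to BE before this change:
// every replica that finished the write counts.
int count_all_successes(const std::vector<Replica>& replicas,
                        const std::unordered_set<int64_t>& succeeded) {
    int n = 0;
    for (const Replica& r : replicas) {
        if (succeeded.count(r.backend_id) > 0) ++n;
    }
    return n;
}

// Gap-aware counting, matching what the description says FE's commit check
// does: a finished replica counts only if its versions are continuous.
int count_continuous_successes(const std::vector<Replica>& replicas,
                               const std::unordered_set<int64_t>& succeeded) {
    int n = 0;
    for (const Replica& r : replicas) {
        if (succeeded.count(r.backend_id) > 0 && !r.has_version_gap) ++n;
    }
    return n;
}

// True when enough replicas count as successful.
bool reaches_quorum(int success_count, int required_replica_num) {
    assert(required_replica_num > 0);
    return success_count >= required_replica_num;
}
```

With node 3 marked as having a gap, a write that finishes on nodes 1 and 3 yields 2 under the first count and 1 under the second, so only the second agrees with FE rejecting the commit when load_required_replica_num = 2.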

Solution

Pass per-tablet version-gap backend information from FE to BE via a new Thrift field,
map<tablet_id, list<backend_id>> tablet_version_gap_backends, in TOlapTablePartition.
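
One way to picture the field on the BE side: Thrift's C++ generator represents `map` as `std::map` and `list` as `std::vector`, so the parsed value could be converted once into a set-based index for cheap membership checks. This is a sketch only. The 64-bit id types, the `GapBackendsField` alias, and the `VersionGapIndex` class are assumptions and do not come from the PR.

```cpp
#include <cstdint>
#include <map>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Assumed shape of the parsed field: tablet_id -> backend ids with a version gap.
using GapBackendsField = std::map<int64_t, std::vector<int64_t>>;

// Hypothetical lookup helper built once per partition.
class VersionGapIndex {
public:
    explicit VersionGapIndex(const GapBackendsField& field) {
        for (const auto& entry : field) {
            _gaps[entry.first].insert(entry.second.begin(), entry.second.end());
        }
    }

    // True if the given backend holds a version-gap replica of the tablet.
    bool has_gap(int64_t tablet_id, int64_t backend_id) const {
        auto it = _gaps.find(tablet_id);
        return it != _gaps.end() && it->second.count(backend_id) > 0;
    }

private:
    std::unordered_map<int64_t, std::unordered_set<int64_t>> _gaps;
};
```

A tablet that is absent from the map is treated as having no version-gap replicas, which keeps the field optional for partitions where every replica is continuous.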

On the BE side, when counting successful replicas for majority write in both
VTabletWriter (V1) and VTabletWriterV2, exclude version-gap backends from
the finished_tablets_replica counter. This makes BE's quorum check consistent
with FE's commit check.
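
The BE-side counting described above might look roughly like the following. It is a simplified illustration, not code from vtablet_writer.cpp: `QuorumTracker`, `add_tablet`, `on_replica_finished`, and `quorum_success` are hypothetical, and only the finished_tablets_replica counter name is borrowed from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical tracker for per-tablet finished replicas under majority write.
class QuorumTracker {
public:
    using GapMap = std::unordered_map<int64_t, std::unordered_set<int64_t>>;

    QuorumTracker(int required_replica_num, GapMap gap_backends)
            : _required(required_replica_num), _gap_backends(std::move(gap_backends)) {
        assert(required_replica_num > 0);
    }

    // Register a tablet so it must reach quorum even if no replica finishes.
    void add_tablet(int64_t tablet_id) { _finished_tablets_replica[tablet_id]; }

    // Called when one replica of a tablet has finished writing.
    void on_replica_finished(int64_t tablet_id, int64_t backend_id) {
        auto it = _gap_backends.find(tablet_id);
        if (it != _gap_backends.end() && it->second.count(backend_id) > 0) {
            return; // version-gap replica: finished, but not counted
        }
        ++_finished_tablets_replica[tablet_id];
    }

    // Quorum holds only if every tracked tablet has enough counted replicas.
    bool quorum_success() const {
        for (const auto& entry : _finished_tablets_replica) {
            if (entry.second < _required) return false;
        }
        return true;
    }

private:
    int _required;
    GapMap _gap_backends;
    std::unordered_map<int64_t, int> _finished_tablets_replica;
};
```

In this sketch a version-gap replica that finishes leaves the counter unchanged, so the tablet reaches quorum only once enough continuous replicas have reported.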

Changes

  • Descriptors.thrift: Add tablet_version_gap_backends field to TOlapTablePartition
  • OlapTable.java: Add getPartitionVersionGapBackends() to compute gap backends per tablet
  • OlapTableSink.java: Populate the new field when building partition info
  • tablet_info.h/cpp: Parse and store gap backends from thrift
  • vtablet_writer.cpp: Exclude gap backends in _quorum_success
  • vtablet_writer_v2.cpp: Exclude gap backends in _quorum_success and _create_commit_info

@Thearas
Contributor

Thearas commented Mar 2, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@sollhui
Contributor Author

sollhui commented Mar 2, 2026

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.34% (1797/2265)
Line Coverage 64.75% (32181/49698)
Region Coverage 65.63% (16113/24550)
Branch Coverage 56.13% (8584/15294)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 59.09% (13/22) 🎉
Increment coverage report
Complete coverage report

@sollhui sollhui force-pushed the opt_quorum_success branch from 89f57c7 to 481f7e6 on March 2, 2026 10:54

@sollhui
Contributor Author

sollhui commented Mar 2, 2026

run buildall

@doris-robot

TPC-H: Total hot run time: 28849 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 481f7e66984b8d3eaccc764717b3d41da85abbac, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17730	4480	4369	4369
q2	
q3	10732	786	527	527
q4	4713	361	255	255
q5	8092	1218	1038	1038
q6	236	177	146	146
q7	828	854	678	678
q8	10518	1468	1315	1315
q9	6147	4780	4759	4759
q10	6864	1890	1634	1634
q11	476	270	259	259
q12	747	576	474	474
q13	17795	4230	3420	3420
q14	230	227	223	223
q15	987	797	784	784
q16	761	708	670	670
q17	754	905	436	436
q18	5981	5397	5230	5230
q19	1275	971	596	596
q20	508	489	396	396
q21	4588	1843	1403	1403
q22	339	275	237	237
Total cold run time: 100301 ms
Total hot run time: 28849 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4452	4360	4372	4360
q2	
q3	1757	2174	1706	1706
q4	846	1202	773	773
q5	4001	4314	4343	4314
q6	180	171	138	138
q7	1714	1586	1483	1483
q8	2409	2613	2526	2526
q9	7907	7417	7401	7401
q10	2613	2886	2427	2427
q11	524	436	417	417
q12	501	605	487	487
q13	4080	4520	3627	3627
q14	289	303	284	284
q15	823	799	806	799
q16	730	755	734	734
q17	1217	1556	1386	1386
q18	7064	6826	6700	6700
q19	899	912	861	861
q20	2098	2181	2074	2074
q21	3999	3441	3425	3425
q22	465	427	370	370
Total cold run time: 48568 ms
Total hot run time: 46292 ms

@doris-robot

TPC-DS: Total hot run time: 184100 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 481f7e66984b8d3eaccc764717b3d41da85abbac, data reload: false

query5	4779	643	504	504
query6	330	227	214	214
query7	4221	482	291	291
query8	353	257	242	242
query9	8715	2787	2752	2752
query10	516	391	336	336
query11	16957	17447	17101	17101
query12	202	127	128	127
query13	1290	485	371	371
query14	6731	3364	3446	3364
query14_1	2952	3033	2903	2903
query15	212	197	179	179
query16	1042	545	472	472
query17	2048	737	653	653
query18	2832	476	376	376
query19	217	221	194	194
query20	153	139	134	134
query21	232	143	126	126
query22	5615	5860	4907	4907
query23	17179	16730	16570	16570
query23_1	16809	16779	16631	16631
query24	7124	1601	1239	1239
query24_1	1234	1242	1237	1237
query25	568	477	430	430
query26	1234	269	201	201
query27	2657	476	285	285
query28	4464	1892	1924	1892
query29	823	570	470	470
query30	310	244	211	211
query31	874	730	649	649
query32	80	79	84	79
query33	503	330	277	277
query34	913	919	562	562
query35	625	676	598	598
query36	1074	1136	993	993
query37	136	102	88	88
query38	2978	2912	2879	2879
query39	900	894	840	840
query39_1	835	825	837	825
query40	233	150	137	137
query41	67	61	58	58
query42	106	102	103	102
query43	361	386	345	345
query44	
query45	201	191	183	183
query46	875	976	626	626
query47	2099	2102	2037	2037
query48	310	317	223	223
query49	636	473	378	378
query50	695	280	217	217
query51	4098	4078	4076	4076
query52	108	108	99	99
query53	291	341	281	281
query54	290	280	272	272
query55	92	91	87	87
query56	322	340	303	303
query57	1335	1325	1274	1274
query58	291	279	276	276
query59	2635	2665	2593	2593
query60	343	345	333	333
query61	154	153	151	151
query62	609	577	552	552
query63	319	278	280	278
query64	4864	1280	1010	1010
query65	
query66	1378	459	349	349
query67	16443	16359	16331	16331
query68	
query69	395	306	279	279
query70	1014	990	960	960
query71	336	311	298	298
query72	2786	2639	2377	2377
query73	544	556	316	316
query74	10020	9950	9781	9781
query75	2825	2736	2482	2482
query76	2304	1025	673	673
query77	371	380	307	307
query78	11127	11387	10620	10620
query79	1125	809	608	608
query80	1342	648	534	534
query81	559	285	250	250
query82	992	158	114	114
query83	340	259	245	245
query84	252	111	103	103
query85	899	481	440	440
query86	439	327	306	306
query87	3084	3105	2971	2971
query88	3579	2663	2644	2644
query89	421	371	346	346
query90	2011	176	170	170
query91	167	162	136	136
query92	79	76	74	74
query93	967	828	509	509
query94	632	316	301	301
query95	590	343	377	343
query96	636	515	233	233
query97	2473	2500	2394	2394
query98	231	221	216	216
query99	1029	991	883	883
Total cold run time: 254992 ms
Total hot run time: 184100 ms

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.34% (1797/2265)
Line Coverage 64.76% (32185/49698)
Region Coverage 65.64% (16115/24550)
Branch Coverage 56.11% (8582/15294)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 59.09% (13/22) 🎉
Increment coverage report
Complete coverage report

@doris-robot

BE UT Coverage Report

Increment line coverage 37.50% (15/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.57% (19635/37349)
Line Coverage 36.18% (183278/506555)
Region Coverage 32.49% (142239/437761)
Branch Coverage 33.43% (61673/184460)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 37.50% (15/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 51.12% (18695/36571)
Line Coverage 35.03% (176903/505002)
Region Coverage 30.98% (136899/441898)
Branch Coverage 32.23% (59625/185024)

@sollhui
Contributor Author

sollhui commented Mar 3, 2026

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.34% (1797/2265)
Line Coverage 64.76% (32185/49698)
Region Coverage 65.60% (16104/24550)
Branch Coverage 56.09% (8579/15294)

@doris-robot

TPC-H: Total hot run time: 29183 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 481f7e66984b8d3eaccc764717b3d41da85abbac, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17674	4704	4470	4470
q2	
q3	10747	841	530	530
q4	4740	364	266	266
q5	8043	1219	1035	1035
q6	192	180	147	147
q7	800	870	673	673
q8	10523	1666	1349	1349
q9	6793	5100	4514	4514
q10	6876	1880	1657	1657
q11	479	256	238	238
q12	748	567	477	477
q13	17789	4274	3434	3434
q14	229	233	211	211
q15	949	802	788	788
q16	759	723	684	684
q17	709	873	433	433
q18	6007	5416	5279	5279
q19	1125	980	852	852
q20	635	589	424	424
q21	4603	1972	1457	1457
q22	365	312	265	265
Total cold run time: 100785 ms
Total hot run time: 29183 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4622	4612	4624	4612
q2	
q3	1785	2208	1765	1765
q4	871	1212	793	793
q5	4074	4527	4421	4421
q6	191	186	144	144
q7	1762	1663	1586	1586
q8	2469	2757	2590	2590
q9	7514	7273	7328	7273
q10	2711	2890	2625	2625
q11	528	449	430	430
q12	510	581	451	451
q13	4022	4455	3628	3628
q14	293	313	289	289
q15	850	811	812	811
q16	737	804	737	737
q17	1237	1581	1329	1329
q18	7128	7008	6700	6700
q19	883	890	884	884
q20	2094	2179	2001	2001
q21	3980	3563	3378	3378
q22	478	422	388	388
Total cold run time: 48739 ms
Total hot run time: 46835 ms

@doris-robot

TPC-DS: Total hot run time: 183500 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 481f7e66984b8d3eaccc764717b3d41da85abbac, data reload: false

query5	4350	651	536	536
query6	336	215	207	207
query7	4206	482	277	277
query8	327	244	233	233
query9	8716	2807	2783	2783
query10	487	394	346	346
query11	17001	17733	17233	17233
query12	213	142	132	132
query13	1277	530	361	361
query14	6726	3320	2996	2996
query14_1	3007	2889	2885	2885
query15	237	205	187	187
query16	1156	499	452	452
query17	1476	733	640	640
query18	3286	476	358	358
query19	213	211	181	181
query20	160	141	139	139
query21	247	146	134	134
query22	5356	5053	4603	4603
query23	17215	16849	16553	16553
query23_1	16589	16685	16566	16566
query24	7248	1643	1220	1220
query24_1	1258	1255	1222	1222
query25	574	486	437	437
query26	1251	272	161	161
query27	2768	477	321	321
query28	4483	1927	1921	1921
query29	818	592	505	505
query30	316	247	213	213
query31	906	733	701	701
query32	79	71	72	71
query33	519	331	282	282
query34	900	915	571	571
query35	653	678	584	584
query36	1116	1115	988	988
query37	136	96	83	83
query38	3048	2911	2817	2817
query39	896	851	849	849
query39_1	841	832	854	832
query40	233	159	134	134
query41	64	60	58	58
query42	104	101	102	101
query43	371	382	349	349
query44	
query45	199	186	179	179
query46	873	984	601	601
query47	2119	2153	2067	2067
query48	311	307	232	232
query49	630	462	381	381
query50	695	283	214	214
query51	4144	4082	4077	4077
query52	106	112	99	99
query53	293	336	292	292
query54	301	282	256	256
query55	91	83	89	83
query56	310	310	317	310
query57	1377	1374	1270	1270
query58	283	290	275	275
query59	2601	2666	2485	2485
query60	371	336	330	330
query61	155	145	150	145
query62	636	594	572	572
query63	318	302	280	280
query64	4842	1254	989	989
query65	
query66	1376	453	351	351
query67	16469	16351	16254	16254
query68	
query69	387	321	294	294
query70	930	981	830	830
query71	339	316	319	316
query72	2861	2740	2435	2435
query73	547	555	330	330
query74	9978	9927	9759	9759
query75	2847	2749	2482	2482
query76	2285	1044	686	686
query77	362	415	315	315
query78	11176	11377	10629	10629
query79	1130	815	596	596
query80	1326	657	571	571
query81	561	285	264	264
query82	1000	157	118	118
query83	333	269	257	257
query84	269	126	105	105
query85	1003	571	440	440
query86	421	327	301	301
query87	3131	3111	3017	3017
query88	3611	2692	2677	2677
query89	425	376	343	343
query90	1955	179	177	177
query91	173	154	133	133
query92	78	77	74	74
query93	1078	853	511	511
query94	642	325	306	306
query95	577	404	313	313
query96	633	525	231	231
query97	2479	2482	2402	2402
query98	227	214	221	214
query99	997	996	902	902
Total cold run time: 254867 ms
Total hot run time: 183500 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 37.50% (15/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.59% (19642/37351)
Line Coverage 36.22% (183504/506572)
Region Coverage 32.52% (142352/437766)
Branch Coverage 33.46% (61722/184462)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 37.50% (15/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 51.14% (18702/36573)
Line Coverage 35.07% (177130/505019)
Region Coverage 31.00% (137004/441903)
Branch Coverage 32.25% (59674/185026)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 37.50% (15/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.86% (20064/36573)
Line Coverage 38.32% (193502/505019)
Region Coverage 34.63% (153052/441903)
Branch Coverage 35.41% (65511/185026)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 72.50% (29/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.97% (20834/36573)
Line Coverage 40.09% (202437/505019)
Region Coverage 36.69% (162132/441903)
Branch Coverage 37.32% (69058/185026)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 72.50% (29/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.23% (20930/36573)
Line Coverage 40.31% (203560/505019)
Region Coverage 36.94% (163232/441903)
Branch Coverage 37.62% (69605/185026)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 72.50% (29/40) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.15% (20900/36573)
Line Coverage 40.26% (203308/505019)
Region Coverage 36.91% (163110/441903)
Branch Coverage 37.56% (69489/185026)
