[fix](statistics) Avoid re-estimating pruned partition predicates in stats #63764

Open
foxtail463 wants to merge 1 commit into apache:master from foxtail463:fix/pruned-partition-filter-stats

Conversation

@foxtail463
Contributor

Problem Summary:

For OLAP scans, partition pruning can already reduce the scan row count to the selected partitions. However, the original partition predicate is intentionally kept in the filter until post-processing so MV rewrite can still match the original query predicate.

During CBO stats calculation, the filter estimator may therefore apply the same partition predicate again on top of the already-pruned scan row count, which underestimates the row count. For example, after pruning to one partition, the effect of `id = 1` is already reflected in the scan cardinality, but `computeFilter` still applies the selectivity of `id = 1` to it.
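
To make the underestimation concrete, here is a minimal arithmetic sketch. It is a toy model, not Doris code: the table size, the partition layout, and the `1 / NDV` selectivity rule for an equality predicate are all assumptions chosen for illustration.

```python
# Toy model: the numbers and the 1/NDV rule are illustrative assumptions,
# not values or formulas taken from Doris.

def equality_selectivity(ndv):
    """Assumed uniform-distribution selectivity for `col = const`."""
    return 1.0 / ndv

TABLE_ROWS = 10_000   # hypothetical table partitioned on id
NUM_PARTITIONS = 10   # hypothetical: one partition per id value
ID_NDV = 10           # hypothetical table-level distinct count of id

# Partition pruning for `id = 1` keeps one partition.
pruned_scan_rows = TABLE_ROWS / NUM_PARTITIONS

# The retained filter conjunct is estimated again on the pruned row count.
double_counted_rows = pruned_scan_rows * equality_selectivity(ID_NDV)

print(pruned_scan_rows)      # rows matching id = 1 in this model
print(double_counted_rows)   # 10x smaller: the predicate was applied twice
```

In this model every row of the surviving partition satisfies `id = 1`, so the pruned scan row count is already the right answer and the second multiplication only shrinks it.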

This change reuses the `PartitionPrunablePredicate` recorded on OLAP scans and skips the conjuncts it already covers during filter statistics estimation, while preserving the existing plan shape and post-processing behavior.
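
The skipping idea can be sketched as follows. This is an illustrative Python model rather than the actual patch: `estimate_filter_rows`, the string conjuncts, and the selectivity values are hypothetical stand-ins for the planner's expression objects and estimator.

```python
def estimate_filter_rows(scan_rows, conjuncts, pruned_conjuncts, selectivity_of):
    """Hypothetical helper: scale scan_rows by each conjunct not already pruned."""
    rows = scan_rows
    for conjunct in conjuncts:
        if conjunct in pruned_conjuncts:
            continue  # already reflected in the pruned scan row count
        rows *= selectivity_of[conjunct]
    return rows

# The filter keeps both conjuncts; only the estimation treats them differently.
conjuncts = ["id = 1", "status = 'open'"]
selectivity_of = {"id = 1": 0.1, "status = 'open'": 0.5}  # assumed values
scan_rows = 1000.0  # assumed row count after pruning to one partition

naive = estimate_filter_rows(scan_rows, conjuncts, frozenset(), selectivity_of)
skipping = estimate_filter_rows(
    scan_rows, conjuncts, frozenset({"id = 1"}), selectivity_of
)
print(naive, skipping)
```

The conjunct list itself is never modified, which mirrors the stated goal of leaving the plan shape untouched so that MV rewrite can still match the original predicate.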

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@foxtail463
Contributor Author

run buildall

@foxtail463 changed the title from "[fix](fe) Avoid re-estimating pruned partition predicates in stats" to "[fix](statistics) Avoid re-estimating pruned partition predicates in stats" on May 27, 2026
@foxtail463 force-pushed the fix/pruned-partition-filter-stats branch from 79c9f8e to 84aed36 on May 27, 2026 13:36
@foxtail463
Contributor Author

run buildall

@foxtail463 force-pushed the fix/pruned-partition-filter-stats branch from 84aed36 to ee2971a on May 28, 2026 10:23
@foxtail463
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31208 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit ee2971a54594227d99c4a186dfddfc9d940a33bd, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17726	4091	4022	4022
q2	q3	10746	1369	810	810
q4	4679	466	345	345
q5	7560	2229	2077	2077
q6	252	187	136	136
q7	992	779	655	655
q8	9361	1791	1657	1657
q9	6485	4929	4950	4929
q10	6419	2251	1867	1867
q11	440	278	244	244
q12	700	425	292	292
q13	18244	3389	2762	2762
q14	264	256	235	235
q15	q16	825	769	709	709
q17	945	979	981	979
q18	6800	5706	5566	5566
q19	1199	1282	1076	1076
q20	512	401	262	262
q21	5632	2615	2281	2281
q22	441	361	304	304
Total cold run time: 100222 ms
Total hot run time: 31208 ms
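
The result columns in Round 1 are unlabeled. The reported totals are consistent with reading each row as cold run, first hot run, second hot run, and the smaller of the two hot runs, all in milliseconds. That reading is an inference from the numbers, not something the report states. A short check, with the Round 1 rows copied in order:

```python
# (cold, hot1, hot2) in ms, copied from the Round 1 rows in order.
# The column meaning is an assumption inferred from the totals.
ROUND_1 = [
    (17726, 4091, 4022), (10746, 1369, 810), (4679, 466, 345),
    (7560, 2229, 2077), (252, 187, 136), (992, 779, 655),
    (9361, 1791, 1657), (6485, 4929, 4950), (6419, 2251, 1867),
    (440, 278, 244), (700, 425, 292), (18244, 3389, 2762),
    (264, 256, 235), (825, 769, 709), (945, 979, 981),
    (6800, 5706, 5566), (1199, 1282, 1076), (512, 401, 262),
    (5632, 2615, 2281), (441, 361, 304),
]

# Cold total: sum of the first timing column.
total_cold = sum(cold for cold, _, _ in ROUND_1)

# Hot total: sum of the faster of the two hot runs per row.
total_hot = sum(min(hot1, hot2) for _, hot1, hot2 in ROUND_1)

print(total_cold, total_hot)
```

Both sums match the totals reported for Round 1.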

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4329	4240	4283	4240
q2	q3	4535	4977	4367	4367
q4	2090	2169	1376	1376
q5	4398	4287	4292	4287
q6	236	218	216	216
q7	2139	1907	1708	1708
q8	2475	2158	2136	2136
q9	7992	7963	7904	7904
q10	4846	4788	4563	4563
q11	585	445	410	410
q12	733	780	540	540
q13	3197	3642	2944	2944
q14	300	294	277	277
q15	q16	714	735	669	669
q17	1356	1327	1389	1327
q18	7890	7277	6914	6914
q19	1131	1118	1088	1088
q20	2233	2222	1953	1953
q21	5247	4550	4467	4467
q22	527	473	413	413
Total cold run time: 56953 ms
Total hot run time: 51799 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172231 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ee2971a54594227d99c4a186dfddfc9d940a33bd, data reload: false

query5	4298	679	507	507
query6	340	220	197	197
query7	4234	581	310	310
query8	331	237	222	222
query9	8828	4050	4043	4043
query10	457	374	309	309
query11	5805	2549	2196	2196
query12	185	129	129	129
query13	1277	613	457	457
query14	6108	5415	5108	5108
query14_1	4451	4463	4417	4417
query15	211	208	185	185
query16	1019	470	430	430
query17	1123	738	577	577
query18	2503	479	351	351
query19	215	204	158	158
query20	136	133	131	131
query21	215	136	114	114
query22	13579	13548	13442	13442
query23	17258	16517	16253	16253
query23_1	16456	16271	16378	16271
query24	7694	1813	1297	1297
query24_1	1334	1326	1325	1325
query25	608	502	443	443
query26	1308	326	179	179
query27	2695	566	346	346
query28	4487	2016	1982	1982
query29	1019	651	515	515
query30	305	244	202	202
query31	1130	1088	962	962
query32	96	77	75	75
query33	549	367	311	311
query34	1183	1135	673	673
query35	793	830	704	704
query36	1458	1400	1257	1257
query37	159	111	93	93
query38	3248	3151	3065	3065
query39	939	924	903	903
query39_1	891	907	879	879
query40	231	151	130	130
query41	74	70	69	69
query42	111	117	113	113
query43	333	329	295	295
query44	
query45	218	208	201	201
query46	1098	1184	738	738
query47	2412	2418	2262	2262
query48	406	418	317	317
query49	669	515	407	407
query50	1014	345	267	267
query51	4340	4334	4342	4334
query52	111	105	99	99
query53	262	281	212	212
query54	329	280	281	280
query55	95	94	87	87
query56	314	329	320	320
query57	1431	1433	1356	1356
query58	309	290	283	283
query59	1586	1657	1501	1501
query60	336	344	325	325
query61	218	152	157	152
query62	703	644	584	584
query63	242	199	209	199
query64	2375	815	629	629
query65	
query66	1735	501	352	352
query67	29732	29770	29518	29518
query68	
query69	467	337	312	312
query70	1057	1076	977	977
query71	304	273	268	268
query72	3040	2697	2414	2414
query73	840	793	454	454
query74	5090	4939	4770	4770
query75	2677	2604	2270	2270
query76	2269	1148	777	777
query77	403	416	325	325
query78	12526	12402	11910	11910
query79	1508	1050	748	748
query80	1278	527	454	454
query81	515	281	239	239
query82	1036	158	125	125
query83	349	298	247	247
query84	264	145	109	109
query85	991	565	455	455
query86	468	319	309	309
query87	3431	3366	3232	3232
query88	3618	2736	2734	2734
query89	449	403	350	350
query90	1919	181	174	174
query91	180	172	145	145
query92	81	78	75	75
query93	1544	1520	929	929
query94	756	346	318	318
query95	686	456	350	350
query96	1018	771	344	344
query97	2767	2760	2608	2608
query98	235	227	225	225
query99	1174	1165	1039	1039
Total cold run time: 255285 ms
Total hot run time: 172231 ms
