Skip to content

Revert "[Fix](variance) Fix sample variance/stddev NaN res for single value"#64099

Open
linrrzqqq wants to merge 1 commit into
apache:masterfrom
linrrzqqq:revert-63605-fix-sample-single-val
Open

Revert "[Fix](variance) Fix sample variance/stddev NaN res for single value"#64099
linrrzqqq wants to merge 1 commit into
apache:masterfrom
linrrzqqq:revert-63605-fix-sample-single-val

Conversation

@linrrzqqq
Copy link
Copy Markdown
Collaborator

Reverts #63605

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: request changes. The PR reverts the previous behavior but introduces a data correctness regression for variance_samp/stddev_samp on singleton groups: BE writes 0.0 for count < 2 in common grouped/non-null paths, which is not the SQL NULL result expected for sample variance/stddev.

Critical checkpoint conclusions:

  • Goal/test: The goal appears to revert #63605 and update expected outputs, but the new outputs demonstrate incorrect singleton sample aggregate results as 0.0. Existing tests were updated to accept the regression rather than prove correct behavior.
  • Scope: The implementation change is small, but it changes core aggregate semantics.
  • Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, config, storage format, or FE-BE protocol compatibility concerns found.
  • Parallel paths: The issue affects grouped aggregate/window/intermediate aggregate-state paths where the result column is non-nullable or only the nested column is passed.
  • Test coverage/results: Regression outputs were modified, but they now encode the incorrect 0.0 singleton result; the removed one-row aggregate assertion also reduces direct coverage.
  • Observability/transactions/persistence: Not applicable.
  • Performance: No material performance issue found in this diff.

User focus: no additional user-provided review focus was specified.

Comment thread be/src/exprs/aggregate/aggregate_function_stddev.h
@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.92% (21050/39038)
Line Coverage 37.58% (200108/532436)
Region Coverage 33.66% (156907/466168)
Branch Coverage 34.62% (68659/198295)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.89% (27499/38254)
Line Coverage 55.42% (294353/531164)
Region Coverage 52.23% (245902/470790)
Branch Coverage 53.33% (106200/199144)

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 29062 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 043f3c5d561a2cd07094bc4707449348b1442efd, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17629	4001	4100	4001
q2	q3	10823	1411	815	815
q4	4697	481	343	343
q5	7527	880	620	620
q6	186	177	143	143
q7	767	849	630	630
q8	9473	1450	1595	1450
q9	6367	4512	4469	4469
q10	6836	1832	1528	1528
q11	441	269	250	250
q12	631	417	298	298
q13	18153	3281	2757	2757
q14	270	255	252	252
q15	q16	821	774	711	711
q17	1043	941	963	941
q18	6924	5603	5618	5603
q19	1358	1343	1131	1131
q20	500	380	262	262
q21	6105	2781	2535	2535
q22	457	376	323	323
Total cold run time: 101008 ms
Total hot run time: 29062 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4857	4836	4740	4740
q2	q3	5082	5208	4631	4631
q4	2098	2211	1396	1396
q5	4878	4893	4728	4728
q6	230	183	137	137
q7	1874	1729	1560	1560
q8	2476	2102	2002	2002
q9	7500	7411	7381	7381
q10	4736	4675	4208	4208
q11	533	387	367	367
q12	723	734	520	520
q13	2971	3369	2813	2813
q14	278	293	258	258
q15	q16	689	709	613	613
q17	1283	1248	1255	1248
q18	7308	6965	6904	6904
q19	1111	1104	1095	1095
q20	2216	2221	1936	1936
q21	5308	4605	4515	4515
q22	517	461	416	416
Total cold run time: 56668 ms
Total hot run time: 51468 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 169320 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 043f3c5d561a2cd07094bc4707449348b1442efd, data reload: false

query5	4328	658	490	490
query6	448	202	192	192
query7	4813	574	315	315
query8	378	217	240	217
query9	8745	4113	4096	4096
query10	457	313	267	267
query11	5925	2355	2162	2162
query12	147	103	104	103
query13	1263	584	428	428
query14	6419	5435	5169	5169
query14_1	4497	4542	4503	4503
query15	212	200	176	176
query16	1005	461	442	442
query17	1121	699	597	597
query18	2631	487	363	363
query19	208	187	151	151
query20	119	110	107	107
query21	230	141	121	121
query22	13633	13604	13513	13513
query23	17432	16478	16241	16241
query23_1	16292	16401	16308	16308
query24	7444	1783	1354	1354
query24_1	1329	1332	1309	1309
query25	556	461	387	387
query26	1402	318	168	168
query27	2710	591	361	361
query28	4401	2052	2004	2004
query29	1069	616	476	476
query30	313	228	197	197
query31	1116	1085	950	950
query32	114	61	59	59
query33	532	314	266	266
query34	1170	1112	674	674
query35	751	773	689	689
query36	1385	1436	1194	1194
query37	154	109	92	92
query38	3237	3146	3098	3098
query39	929	948	901	901
query39_1	889	879	873	873
query40	219	123	102	102
query41	65	63	65	63
query42	95	99	100	99
query43	334	339	293	293
query44	
query45	195	185	184	184
query46	1089	1206	742	742
query47	2342	2389	2193	2193
query48	421	425	299	299
query49	635	475	357	357
query50	955	342	259	259
query51	4325	4455	4339	4339
query52	87	90	79	79
query53	257	270	199	199
query54	279	226	224	224
query55	78	76	71	71
query56	255	234	228	228
query57	1420	1396	1315	1315
query58	244	221	212	212
query59	1592	1705	1450	1450
query60	285	256	237	237
query61	165	165	154	154
query62	708	667	591	591
query63	235	188	187	187
query64	2562	792	631	631
query65	
query66	1779	465	333	333
query67	29713	29624	28897	28897
query68	
query69	419	308	259	259
query70	968	968	938	938
query71	288	220	211	211
query72	2996	2973	2378	2378
query73	854	785	434	434
query74	5132	5000	4802	4802
query75	2670	2611	2255	2255
query76	2361	1170	811	811
query77	351	371	294	294
query78	12392	12267	11855	11855
query79	1324	1088	783	783
query80	537	489	400	400
query81	450	280	235	235
query82	236	155	127	127
query83	277	281	253	253
query84	262	143	116	116
query85	845	509	431	431
query86	329	304	287	287
query87	3403	3389	3211	3211
query88	3685	2770	2761	2761
query89	416	382	329	329
query90	2196	185	184	184
query91	181	167	138	138
query92	66	60	56	56
query93	1446	1401	881	881
query94	535	343	314	314
query95	686	381	349	349
query96	1043	795	359	359
query97	2706	2678	2581	2581
query98	214	207	216	207
query99	1170	1182	1035	1035
Total cold run time: 250991 ms
Total hot run time: 169320 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.93% (27517/38254)
Line Coverage 55.48% (294668/531164)
Region Coverage 52.33% (246381/470790)
Branch Coverage 53.39% (106326/199144)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants