Hive SQL( using TEZ as execution engine) not giving incorrect, or not expected results on empty partitions.
We got this issue on Hive version: 3.1.0.3.1.5.0-152 ( HDP Version 3.1.5.0-152)
To replicate the issue -
--Create external Table
1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';
-- Add partion
2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');
-- Execute following SQL's which returns no records.
3) select max( company ) , processdate from test_tbl group by processdate ;
4) select max(processdate ) from test_tbl ;
Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.
So as a solution, we started using "spark-sql" instead of "hive/ beeline"
We didn't find a solution with hive for above inconsistency, and following bug has been raised for same HIVE-25838
Comments
Post a Comment