We had been writing to a partitioned Hive table and realized that the data was being written into an unexpected sub-folder.
For example, consider the table definition below:
Create table T1 ( name string, address string) Partitioned by (process_date string) stored as parquet location '/mytable/a/b/c/org=employee';
While writing to the table, the HDFS path being written looks something like this:
/mytable/a/b/c/org=employee/process_date=20220812/org=employee
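The extra org=employee after the process_date partition appears because the table location contains the "=" character, which Hive uses in directory names (column=value) to identify partition columns. The org=employee segment of the location therefore seems to be treated as a partition directory and is repeated under process_date.
Redefining the table with a location that contains no "=" resolves the problem: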
Create table T1 ( name string, address string) Partitioned by (process_date string) stored as parquet location '/mytable/a/b/c/employee';
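To catch this before creating a table, a small check on the location string can help. The sketch below is only an illustration: the function name partition_like_segments is hypothetical and not part of Hive or any library, and it simply flags any path segment containing "=", on the assumption that such segments are the ones at risk of being read as partition directories.

```python
from typing import List


def partition_like_segments(location: str) -> List[str]:
    """Return the path segments of a table location that contain '='.

    Hypothetical helper: it assumes any segment containing '=' may be
    interpreted as a column=value partition directory.
    """
    segments = location.strip("/").split("/")
    return [seg for seg in segments if "=" in seg]


# Location from the original table definition: one risky segment is flagged.
print(partition_like_segments("/mytable/a/b/c/org=employee"))

# Location from the corrected table definition: nothing is flagged.
print(partition_like_segments("/mytable/a/b/c/employee"))
```

An empty result means the location has no segments that look like partition directories.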