We hit the following error while querying a Hive table using Trino -
Error -
- SQL Error [16777223]: Query failed (#20230505_155701_00194_n2vdp): ORC ACID file should have 6 columns, found 17
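This happened because the table being queried was a Hive managed (internal) table, which in the CDP (Cloudera) distribution is ACID compliant (transactional) by default. Trino therefore expected the ORC files behind the table to follow the ACID layout.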
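To confirm that a table is managed and transactional, its metadata can be inspected from Hive. The statements below are a minimal sketch; the database and table name `sales_db.orders` are hypothetical placeholders, not names from our environment.

```sql
-- Run in Hive (for example via Beeline).
-- sales_db.orders is a hypothetical name: use the table that fails in Trino.

-- In the output, look for "Table Type: MANAGED_TABLE" and,
-- under "Table Parameters", the entry "transactional  true".
DESCRIBE FORMATTED sales_db.orders;

-- Alternatively, read the single table property directly.
SHOW TBLPROPERTIES sales_db.orders("transactional");
```

If the table type is MANAGED_TABLE and `transactional` is `true`, the table is treated as an ACID table.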
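Now, in order for a Hive table to be ACID compliant -
- The underlying file format must be ORC,
- and the ORC file structure itself is different: the root column must be a struct with 6 nested columns, which wrap the actual row data together with the type of operation and the transaction details. As described in the ORC documentation linked below, these fields are operation, originalTransaction, bucket, rowId, currentTransaction and row (the row field holds the table's real columns).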
For more on ORC ACID internals, please take a look here
- https://orc.apache.org/docs/acid.html
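The problem in our case was that, although the Hive table was declared as internal (managed), the ORC files present at its location did not have the root struct with the 6 fields required for ACID compliance. As the error message shows, they were plain ORC files with 17 top-level columns. Hence, the SQL was failing from Trino.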
Solution-
So, we created another Hive external table on the same HDFS location as the Hive internal table and ran our SQL queries against that table instead.
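The workaround can be sketched roughly as follows. Every name here is an assumption made for illustration: the database, the table names, the columns, the HDFS path and the Trino catalog name `hive` are all hypothetical and need to be replaced with your own values.

```sql
-- Run in Hive. All names, columns and the path below are hypothetical.
-- The column list should mirror the columns actually stored in the ORC files
-- (17 in our case; only three are shown here).
CREATE EXTERNAL TABLE sales_db.orders_ext (
  order_id BIGINT,
  customer STRING,
  amount   DECIMAL(18,2)
)
STORED AS ORC
-- Same HDFS directory that the internal (managed) table points to.
LOCATION 'hdfs:///warehouse/tablespace/managed/hive/sales_db.db/orders';

-- Run in Trino, assuming the Hive catalog is named "hive".
SELECT * FROM hive.sales_db.orders_ext LIMIT 10;
```

Because the external table is not transactional, the files are read as plain ORC rather than checked against the ACID layout. Note that both tables now share the same files, so anything written through one is visible through the other.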