For a Structured Tabular Structure it is many a times required to transform Rows into Columns. This blog explains step by step process which can be executed as one SQL to achieve same.
Lets try to understand with help of below example: where -in , we want to implement / transform input Table into table structure mentioned as output.
INPUT_TABLE
topic | groupId | batchTimeMs | Partition | offset | Count |
---|---|---|---|---|---|
t1 | g001 | 1658173779 | 0 | 123 | 122 |
t1 | g001 | 1658173779 | 1 | 2231 | 100 |
t2 | g001 | 1658173779 | 0 | 12 | 11 |
rowkey:key | offset:0 | count:0 | offset:1 | count:1 |
---|---|---|---|---|
t1:g001:1658173779 | 123 | 122 | 2231 | 100 |
t2:g001:1658173779 | 12 | 11 | NULL | NULL |
FIRST STEP -
- Concat Topic, GroupID, and BatchTimeMS to create RowKey
- Create Columns - offsets:0, counts:0, offsets:1, counts:1. Such that Columns has value only when respective partition value matches with column name.
- SQL as below -
select concat_ws(':', topic,groupId,batchTimeMs) as rowkey,
case when partition='0' then offset else null end as `offsets:0`,
case when partition='0' then count else null end as `counts:0`,
case when partition='1' then offset else null end as `offsets:1`,
case when partition='1' then count else null end as `counts:1`
FROM INPUT_TABLE
rowkey | offset:0 | count:0 | offset:1 | count:1 |
---|---|---|---|---|
t1:g001:1658173779 | 123 | 122 | NULL | NULL |
t1:g001:1658173779 | NULL | NULL | 2231 | 100 |
t2:g001:1658173779 | 12 | 11 | NULL | NULL |
SECOND STEP-
- Bring in all values of a ROWKEY in to one row.
- SQL as below -
rowkey:key | offset:0 | count:0 | offset:1 | count:1 |
---|---|---|---|---|
t1:g001:1658173779 | [123, NULL] | [122, NULL] | [2231,NULL] | [100,NULL] |
t2:g001:1658173779 | [12] | [11] | [NULL] | [NULL] |
- Select only first value from Array of values.
- SQL as below resulting in final desired output -
rowkey:key | offset:0 | count:0 | offset:1 | count:1 |
---|---|---|---|---|
t1:g001:1658173779 | 123 | 122 | 2231 | 100 |
t2:g001:1658173779 | 12 | 11 | NULL | NULL |
Comments
Post a Comment