Say we have a JSON String like below -
{
"billingCountry":"US"
"orderItems":[
{
"itemId":1,
"product":"D1"
},
{
"itemId":2,
"product":"D2"
}
]
}
"billingCountry":"US"
"orderItems":[
{
"itemId":1,
"product":"D1"
},
{
"itemId":2,
"product":"D2"
}
]
}
And, our aim is to get output parsed like below -
itemId |
product |
1 |
D1 |
2 |
D2 |
- get_json_object(value, '$.orderItems.itemId') as itemId
- get_json_object(value, '$.orderItems.product') as product
Second, Above will result String value like "[1,2]". We want to convert it to Array as follows -
- split(regexp_extract(get_json_object(value, '$.orderItems.itemId'),'^\\["(.*)\\"]$',1),'","') as itemId
- split(regexp_extract(get_json_object(value, '$.orderItems.product'),'^\\["(.*)\\"]$',1),'","') as product
Third, We need to explode the multiple array columns to get final result. This can be done as follows-
- lateral view posexplode(product) tf as pos,val
Fourth, Now we can select individual elements of array using index / position "pos"
- product[pos] as product,
- itemId[pos] as itemId
Summary SQL -
select
product[pos] as product,
itemId[pos] as itemId
from(
SELECT
split(regexp_extract(get_json_object(value, '$.orderItems.product'),'^\\["(.*)\\"]$',1),'","') as product, split(regexp_extract(get_json_object(value, '$.orderItems.itemId'),'^\\["(.*)\\"]$',1),'","') as itemId
from test_json) T1
lateral view posexplode(product) tf as pos,val
Comments
Post a Comment