Description
string
parse_url(string urlString, string partToExtract [, string keyToExtract])
Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. For example, parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com'. Also a value of a particular key in QUERY can be extracted by providing the key as the third argument, for example, parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') returns 'v1'..
尖叫提示:
返回从URL中抽取指定部分的内容,参数url是URL字符串,而参数partToExtract是要抽取的部分,这个参数包含(HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO,
例如:parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') ='facebook.com',如果参数partToExtract值为QUERY则必须指定第三个参数key 如:parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') =‘v1’
parse_url(‘http://facebook.com/path/p1.php?query=1‘, ‘HOST’)返回’facebook.com’ ,
parse_url(‘http://facebook.com/path/p1.php?query=1‘, ‘PATH’)返回’/path/p1.php’ ,
parse_url(‘http://facebook.com/path/p1.php?query=1‘, ‘QUERY’)返回’query=1’
hive>
select
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k2') ,
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') ,
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY')
结果如下:
v2 v1 k1=v1&k2=v2
string
get_json_object(string json_string, string path)
Extracts json object from a json string based on json path specified, and returns json string of the extracted json object. It will return null if the input json string is invalid.
NOTE: The json path can only have the characters [0-9a-z_], i.e., no upper-case or special characters. Also, the keys *cannot start with numbers.
* This is due to restrictions on Hive column names..
从指定路径上的JSON字符串抽取出JSON对象,并返回这个对象的JSON格式,如果输入的JSON是非法的将返回NULL,注意此路径上JSON字符串只能由数字 字母 下划线组成且不能有大写字母和特殊字符,且key不能由数字开头,这是由于Hive对列名的限制
2.案例演示
1.get_Json_object
1.src_json表中数据格式:字段名json,如下
{"store":
{"fruit":\[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}],
"bicycle":{"price":19.95,"color":"red"}
"email":"amy@only_for_json_udf_test.net",
"owner":"amy"
2.代码演示
hive> SELECT get_json_object(src_json.json, '$.owner') FROM src_json;
hive> SELECT get_json_object(src_json.json, '$.store.fruit\[0]') FROM src_json;
{"weight":8,"type":"apple"}
hive> SELECT get_json_object(src_json.json, '$.non_exist_key') FROM src_json;