hive 学习笔记之七:内置函数,mybatis 防止 sql 注入原理
locate
log
log10
log2
lower
lpad
ltrim
map
map_keys
map_values
matchpath
max
min
minute
month
months_between
named_struct
negative
next_day
ngrams
noop
noopstreaming
noopwithmap
noopwithmapstreaming
not
ntile
nvl
or
parse_url
parse_url_tuple
percent_rank
percentile
percentile_approx
pi
pmod
posexplode
positive
pow
power
printf
radians
rand
rank
reflect
reflect2
regexp
regexp_extract
regexp_replace
repeat
reverse
rlike
round
row_number
rpad
rtrim
second
sentences
shiftleft
shiftright
shiftrightunsigned
sign
sin
size
sort_array
soundex
space
split
sqrt
stack
std
stddev
stddev_pop
stddev_samp
str_to_map
struct
substr
substring
sum
tan
to_date
to_unix_timestamp
to_utc_timestamp
translate
trim
trunc
ucase
unbase64
unhex
unix_timestamp
upper
var_pop
var_samp
variance
weekofyear
when
windowingtablefunction
xpath
xpath_boolean
xpath_double
xpath_float
xpath_int
xpath_long
xpath_number
xpath_short
xpath_string
year
|
~
Time taken: 0.003 seconds, Fetched: 216 row(s)
以 lower 函数为例,执行命令 describe function lower;即可查看 lower 函数的说明:
hive> describe function lower;
OK
lower(str) - Returns str with all characters changed to lowercase
Time taken: 0.005 seconds, Fetched: 1 row(s)
接下来从计算函数开始,体验常用函数;
先执行以下命令,使查询结果中带有字段名:
set hive.cli.print.header=true;
[](
)计算函数
加法+:
hive> select name, age, age+1 as add_value from student;
OK
name age add_value
tom 11 12
jerry 12 13
mike 13 14
john 14 15
mary 15 16
Time taken: 0.098 seconds, Fetched: 5 row(s)
减法(-)、乘法(*)、除法(/)的使用与加法类似,不再赘述了;
四舍五入 round:
hive> select round(1.1), round(1.6);
OK
_c0 _c1
1.0 2.0
Time taken: 0.028 seconds, Fetched: 1 row(s)
向上取整 ceil:
hive> select ceil(1.1);
OK
_c0
2
Time taken: 0.024 seconds, Fetched: 1 row(s)
向下取整 floor:
hive> select floor(1.1);
OK
_c0
1
Time taken: 0.024 seconds, Fetched: 1 row(s)
平方 pow,例如 pow(2,3)表示 2 的三次方,等于 8:
hive> select pow(2,3);
OK
_c0
8.0
Time taken: 0.027 seconds, Fetched: 1 row(s)
取模 pmod:
hive> select pmod(10,3);
OK
_c0
1
Time taken: 0.059 seconds, Fetched: 1 row(s)
[](
)字符函数
转小写 lower,转大写 upper:
hive> select lower(name), upper(name) from student;
OK
_c0 _c1
tom TOM
jerry JERRY
mike MIKE
john JOHN
mary MARY
Time taken: 0.051 seconds, Fetched: 5 row(s)
字符串长度 length:
hive> select name, length(name) from student;
OK
tom 3
jerry 5
mike 4
john 4
mary 4
Time taken: 0.322 seconds, Fetched: 5 row(s)
字符串拼接 concat:
hive> select concat("prefix_", name) from student;
OK
prefix_tom
prefix_jerry
prefix_mike
prefix_john
prefix_mary
Time taken: 0.106 seconds, Fetched: 5 row(s)
子串 substr,substr(xxx,2)表示从第二位开始到右边所有,substr(xxx,2,3)表示从第二位开始取三个字符:
hive> select substr("0123456",2);
OK
123456
Time taken: 0.067 seconds, Fetched: 1 row(s)
hive> select substr("0123456",2,3);
OK
123
Time taken: 0.08 seconds, Fetched: 1 row(s)
去掉前后空格 trim:
hive> select trim(" 123 ");
OK
123
Time taken: 0.065 seconds, Fetched: 1 row(s)
[](
)json 处理(get_json_object)
为了使用 json 处理的函数,先准备一些数据:
先创建表 t15,只有一个字段用于保存字符串:
create table t15(json_raw string)
row format delimited;
创建 t15.txt 文件,内容如下:
{"name":"tom","age":"10"}
{"name":"jerry","age":"11"}
加载数据到 t15 表:
load data
local inpath '/home/hadoop/temp/202010/25/015.txt'
into table t15;
使用 get_json_object 函数,解析 json_raw 字段,分别取出指定 name 和 age 属性:
select
get_json_object(json_raw, "$.name"),
get_json_object(json_raw, "$.age")
from t15;
得到结果:
hive> select
get_json_object(json_raw, "$.name"),
get_json_object(json_raw, "$.age")
from t15;
OK
tom 10
jerry 11
Time taken: 0.081 seconds, Fetched: 2 row(s)
[](
)日期
获取当前日期 current_date:
hive> select current_date();
OK
2020-11-02
Time taken: 0.052 seconds, Fetched: 1 row(s)
获取当前时间戳 current_timestamp:
hive> select current_timestamp();
OK
2020-11-02 10:07:58.967
Time taken: 0.049 seconds, Fetched: 1 row(s)
获取年份 year、月份 month、日期 day:
hive> select year(current_date()), month(current_date()), day(current_date());
OK
2020 11 2
Time taken: 0.054 seconds, Fetched: 1 row(s)
另外,year 和 current_timestamp 也能搭配使用:
hive> select year(current_timestamp()), month(current_timestamp()), day(current_timestamp());
OK
2020 11 2
Time taken: 0.042 seconds, Fetched: 1 row(s)
返回日期部分 to_date:
hive> select to_date(current_timestamp());
OK
2020-11-02
Time taken: 0.051 seconds, Fetched: 1 row(s)
[](
)条件函数
条件函数的作用和 java 中的 switch 类似,语法是 case X when XX then XXX else XXXX end;
示例如下,作用是判断 name 字段,如果等于 tom 就返回 tom_case,如果等于 jerry 就返回 jerry_case,其他情况都返回 other_case:
select name,
case name when 'tom' then 'tom_case'
when 'jerry' then 'jerry_case'
else 'other_case'
end
from student;
结果如下:
hive> select name,
case name when 'tom' then 'tom_case'
when 'jerry' then 'jerry_case'
else 'other_case'
end
from student;
OK
tom tom_case
jerry jerry_case
mike other_case
john other_case
mary other_case
Time taken: 0.08 seconds, Fetched: 5 row(s)
[](
)聚合函数
返回行数 count:
select count(*) from student;
触发 MR,结果如下:
评论