《Hadoop生态系统及开发 实训手册 实训12 HiveQL查询操作.docx》由会员分享,可在线阅读,更多相关《Hadoop生态系统及开发 实训手册 实训12 HiveQL查询操作.docx(7页珍藏版)》请在第一文库网上搜索。
1、实训5.4 HiveQL查询操作L实训I目的通过本实训,理解Hive的查询数据操作。2 .实训内容本实训主要是进行查询数据表的操作。3 .实训要求以小组为单元进行实训,每小组5人,小组自主协商选一位组长,由组长安 排和分配实训任务,要求对上一实训都操作正确。4 .准备知识(1) GROUP BY概SELECT coll ,co12 tcount( 1 ),sel_expr (聚合操作)FROM tableWHERE conditionMap 端执行GROUP BYcoll ,col2-Reduce 端执行HAVING-Reduce 端执行GROUPBY,即按照某区学段的值进行分施 者相同值放至
2、IFr起。本代码意思为:从表中读取数据,执行Where条件,以Coll列分组,把COIl 列的内容作为Key,其他列值作为VaIUe,上传到reduce,在reduce端执行聚合 操作和having过港5 .实训步骤(1)查询表查询那门编号为30的反工信息use test2;select * from emp where deptno=30;操作结果如图5-48所示。hive select * from emp where deptno=30; OK7499ALLENSALESMAN76981981-2-21600.30.37521WARDSALESMAN76981981-2-221250.0
3、500. 37654MARTINSALESMAN76981981-9-281250.1400. 37844TURNERSALESMAN76981981-9-281500.3Tilne taken: 0.493 seconds. Fetched: 4 row(s)图5-48查询指定部门编号的员工信息 查询姓名为SMITH的员工select * from emp where ename=SMITH; 操作结果如图5-49所示。hive select * from emp where ename,SMITH;OK7369 SMITH CLERK 792198-12-178, 2.0 NULLTime
4、 taken: 0.17 seconds, Fetched: 1 row(s)图5-49查询指定姓名的员工信息查询员工编号小于等于7766的员工select * from emp where empno select *from emp whereempno select *from emp wheresal between 100 and 150;Uix7521WARDSALESMAN76981981-2-221250.5,37654MARTINSALESMAN76981981-9-281250.1400.037844TURNERSALESMAN76981981-9-28150.00.037
5、876ADAMSCLERK 77881987-5-2311.2.NULL7934MILLERCLERK 77821982-1-23130.1,0NULLTimetaken: 0.126 seconds, Fetched: 5row(s)图5-51按工资范围查询查询前5条记录select * from emp limit 5;操作结果如图5-52所示。hive zselect *from emp limit5;U7369SMITHCLERK 7921980-12-1780.2.0NULL7499ALLENSALESMAN76981981-2-201600. 30.37521WARDSALESMA
6、N76981981-2-221250. 5,37566JONESMANAGER 78391981-4-22975.02.0NULL7654MARTINSALESMAN76981981-9-28125,140.3Timetaken: .17 seconds. Fetched: 5 row(s)图5-52按显示记录数量查询查询姓名为SCOTT或MARTIN的员工select * from emp where ename in (SCOTT,MARTIN); 操作寤果如图5-53所示。hive OK 76547788 Timeselect *from emp whereename in(,SC0,
7、MARTIN);MARTIN SALESMAN7698SCOTTANALYST 75661987-4taken: .19 seconds, Fetched: 21981-9-281250.0-1930.0 2,0row(s)1400.0NULL3图553多姓名查询查询有津贴的员工select * from emp where COmm is not null; 操作寤果如图5-54所示。hive select *from emp wherecomm is not null;Uix7369SMITHCLERK 792198-12-178,2.NULL7499ALLENSALESMAN769819
8、81-2-21600.0300.037521WARDSALESMAN76981981-2-221250.0500. 37566JONESMANAGER 78391981-4-22975.2.NULL7654MARTINSALESMAN76981981-9-281250. 1400. 37698BLAKEMANAGER 78391981-5-12850.03.0NULL7782CLARKMANAGER 78391981-6-92450.1.0NULL7788SCOTTANALYST 75661987-4-193.2.NULL7844TURNERSALESMAN76981981-9-28150.0
9、37876ADAMSCLERK 77881987-5-23110.020.0NULL79JAMESCLERK 76981981-12-395.3.0NULL792FORDANALYST 75661981-12-330.02,0NULL7934MILLERCLERK 77821982-1-23130.1.0NULLTimetaken: ,118 seconds, Fetched: 13 row(s)图5-54按有无薪资查询hive select max(sal),min(sal),avg(sal),sum(sal) from emp;Query ID = root_2O22111O113456_
10、f9O516el-adb2-4b60-b6df-a7bl5ff5bbc6Total jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec. reducers.bytes.per.reducer=In order to IiJnit the maximum number of reducers:set hive.exec . redu
11、cers.max=In order to set a constant number of reducers:set mapreduce.job.reduces=Starting Job = job l66847592915, Tracking URL = http;/master;8088/proxy/a pplicationl6680475920915Kill Command = optsoftwarehddo叩-3.3.4/bin/m叩red job -kill job l6684759209 1-05Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 222-11-1 11:35:05,064 Stage-I map = 0%, reduce = %222-11-10 11:35:10,184 Stage-I map = 10%, reduce = %, Cumulative CPU 1.41 se c2022-11-10 11:35:16,327 Stage-I map = 10%, reduce = 100%, Cumulative CPU 3.52 secMapRed