# 2020金象盃 大數據 實務能力競賽 > [name=Junter.wjs,] 編輯:JunterTW > [name=XiangOuO,] 作者:XiangOuO - [X] 版本:V 1.0.5 <font color="#f00">待更新</font> ==安裝教學== `VPN` `Putty` {%youtube zsERKtIFcco %} --- #### 第一題 > ==請問男性與女性的會員數分別是多少?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = GROUP data BY gender; data2 = FOREACH data1 GENERATE group as gender,COUNT(data); rmf answer01; STORE data2 INTO 'answer01' using PigStorage('\t'); fs -cat answer01/* ``` ``` 答案: f 137 m 182 ``` #### 第二題 > ==會員數量排行,顯示前五名的國家與會員數?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = GROUP data BY country; data2 = FOREACH data1 GENERATE group as country,COUNT(data) as num; data3 = limit (ORDER data2 BY num DESC) 5; rmf answer02; STORE data3 INTO 'answer02' using PigStorage('\t'); fs -cat answer02/* ``` ``` 答案: Canada 60 Germany 35 China 34 Taiwan 20 Poland 19 ``` #### 第三題 > ==年紀在20至30歲的會員人數共有多少位?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = FILTER data BY (age>=20) AND (age<=30); data2 = GROUP data1 ALL; data3 = FOREACH data2 GENERATE COUNT(data1) as num; rmf answer03; STORE data3 INTO 'answer03' using PigStorage('\t'); fs -cat answer03/* ``` ``` 答案: 212 ``` #### 第四題 > ==年紀最長的會員為幾歲?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = GROUP data BY age; data2 = FOREACH data1 GENERATE group as age; data3 = limit (ORDER data2 BY age DESC) 1; rmf answer04; STORE data3 INTO 'answer04' using PigStorage('\t'); fs -cat answer04/* ``` ``` 答案: 75 ``` #### 第五題 > ==熱門歌曲排行榜第7名歌曲名稱為何?== ``` sql= data = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray); data1 = FOREACH data GENERATE userid, artname, traname; data2 = GROUP data1 BY (artname, traname); data3 = FOREACH data2 GENERATE FLATTEN(group),COUNT(data1.userid) as num; data4 = order data3 by num desc,group::traname asc; data5 = limit data4 10; rmf answer05; STORE data5 INTO 'answer05' using PigStorage('\t'); fs -cat answer05/* ``` ``` 答案: Heartless ``` #### 第六題 > ==點播次數小於二的歌手有幾位?== ``` sql= data = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray); data1 = FOREACH data GENERATE userid,artname; data2 = DISTINCT data1; data3 = group data2 by artname; data4 = FOREACH data3 GENERATE FLATTEN(group),COUNT(data3.userid) as num; data5 = filter data4 by (num < 2); data6 = GROUP data5 All; data7 = foreach data6 generate COUNT(data5); rmf answer06; STORE data3 INTO 'answer06' using PigStorage('\t'); fs -cat answer06/* ``` ``` 答案: 8764 ``` #### 第七題 <font color="#f00">(先不要用還在測試)</font> > ==國名B開頭的各國熱門歌曲第一名的名稱分別是什麼?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray); data2 = JOIN data BY userid LEFT OUTER,data1 BY userid; data3 = DISTINCT data2; data4 = FOREACH data3 GENERATE data::country as country, data1::artname as artname, data1::userid as userid; data5 = FILTER data4 BY country MATCHES 'B[A-Za-z]*'; data6 = GROUP data5 BY country, artname; data7 = FOREACH data6 GENERATE FLATTEN(group),COUNT(data5.userid) as num; data8 = limit(ORDER data7 BY num DESC) 10; rmf answer07; STORE data8 INTO 'answer07' using PigStorage('\t'); fs -cat answer07/* ``` ``` 答案: ``` #### 第八題 > ==請問女性的會員數是多少?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = FILTER data BY (gender == 'f'); data2 = GROUP data1 BY gender; data3 = FOREACH data2 GENERATE group as gender,COUNT(data1) as num; rmf answer08; STORE data3 INTO 'answer08' using PigStorage('\t'); fs -cat answer08/* ``` ``` 答案: 137 ``` #### 第九題 > ==年紀最小的會員年紀為幾歲?== ``` sql= data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = FOREACH data GENERATE age; data2 = limit (ORDER data1 BY age ASC) 2; rmf answer09; STORE data2 INTO 'answer09' using PigStorage('\t'); fs -cat answer09/* ``` ``` 答案: 3 ``` #### 第十題 > ==會員數量排行,人數第2多的國家為何?== ``` sql= data = LOAD 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray); data1 = GROUP data BY country; data2 = FOREACH data1 GENERATE group as country,COUNT(data) as num; data3 = ORDER data2 BY num DESC; data4 = limit data3 2; rmf answer10; STORE data4 INTO 'answer10' using PigStorage('\t'); fs -cat answer10/* ``` ``` 答案: Germany ``` :::info :bulb: **教學**: 如果不知道各個功能意思,[<i class="fa fa-file-text"></i> 點我!教學範例](/@stustmis/bigdatainfo)。 ::: :::warning :loudspeaker: 程式碼如有錯誤!請自行修正或聯絡我們。 :email: c109181108@nkust.edu.tw / 4a990025@stust.edu.tw ::: ###### Tags: `語法` `教學` `sql` Copyright © 2020 Junter.wjs All Rights Reserve.