# 2020金象盃 大數據 實務能力競賽
> [name=Junter.wjs,] 編輯:JunterTW
> [name=XiangOuO,] 作者:XiangOuO
- [X] 版本:V 1.0.5 <font color="#f00">待更新</font>
==安裝教學== `VPN` `Putty`
{%youtube zsERKtIFcco %}
---
#### 第一題
> ==請問男性與女性的會員數分別是多少?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = GROUP data BY gender;
data2 = FOREACH data1 GENERATE group as gender,COUNT(data);
rmf answer01;
STORE data2 INTO 'answer01' using PigStorage('\t');
fs -cat answer01/*
```
```
答案:
f 137
m 182
```
#### 第二題
> ==會員數量排行,顯示前五名的國家與會員數?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = GROUP data BY country;
data2 = FOREACH data1 GENERATE group as country,COUNT(data) as num;
data3 = limit (ORDER data2 BY num DESC) 5;
rmf answer02;
STORE data3 INTO 'answer02' using PigStorage('\t');
fs -cat answer02/*
```
```
答案:
Canada 60
Germany 35
China 34
Taiwan 20
Poland 19
```
#### 第三題
> ==年紀在20至30歲的會員人數共有多少位?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = FILTER data BY (age>=20) AND (age<=30);
data2 = GROUP data1 ALL;
data3 = FOREACH data2 GENERATE COUNT(data1) as num;
rmf answer03;
STORE data3 INTO 'answer03' using PigStorage('\t');
fs -cat answer03/*
```
```
答案:
212
```
#### 第四題
> ==年紀最長的會員為幾歲?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = GROUP data BY age;
data2 = FOREACH data1 GENERATE group as age;
data3 = limit (ORDER data2 BY age DESC) 1;
rmf answer04;
STORE data3 INTO 'answer04' using PigStorage('\t');
fs -cat answer04/*
```
```
答案:
75
```
#### 第五題
> ==熱門歌曲排行榜第7名歌曲名稱為何?==
``` sql=
data = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray);
data1 = FOREACH data GENERATE userid, artname, traname;
data2 = GROUP data1 BY (artname, traname);
data3 = FOREACH data2 GENERATE FLATTEN(group),COUNT(data1.userid) as num;
data4 = order data3 by num desc,group::traname asc;
data5 = limit data4 10;
rmf answer05;
STORE data5 INTO 'answer05' using PigStorage('\t');
fs -cat answer05/*
```
```
答案:
Heartless
```
#### 第六題
> ==點播次數小於二的歌手有幾位?==
``` sql=
data = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray);
data1 = FOREACH data GENERATE userid,artname;
data2 = DISTINCT data1;
data3 = group data2 by artname;
data4 = FOREACH data3 GENERATE FLATTEN(group),COUNT(data3.userid) as num;
data5 = filter data4 by (num < 2);
data6 = GROUP data5 All;
data7 = foreach data6 generate COUNT(data5);
rmf answer06;
STORE data3 INTO 'answer06' using PigStorage('\t');
fs -cat answer06/*
```
```
答案:
8764
```
#### 第七題 <font color="#f00">(先不要用還在測試)</font>
> ==國名B開頭的各國熱門歌曲第一名的名稱分別是什麼?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = load 'userdemand' USING PigStorage ('\t') as (userid:chararray,time:chararray,artname:chararray,traname:chararray);
data2 = JOIN data BY userid LEFT OUTER,data1 BY userid;
data3 = DISTINCT data2;
data4 = FOREACH data3 GENERATE data::country as country, data1::artname as artname, data1::userid as userid;
data5 = FILTER data4 BY country MATCHES 'B[A-Za-z]*';
data6 = GROUP data5 BY country, artname;
data7 = FOREACH data6 GENERATE FLATTEN(group),COUNT(data5.userid) as num;
data8 = limit(ORDER data7 BY num DESC) 10;
rmf answer07;
STORE data8 INTO 'answer07' using PigStorage('\t');
fs -cat answer07/*
```
```
答案:
```
#### 第八題
> ==請問女性的會員數是多少?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = FILTER data BY (gender == 'f');
data2 = GROUP data1 BY gender;
data3 = FOREACH data2 GENERATE group as gender,COUNT(data1) as num;
rmf answer08;
STORE data3 INTO 'answer08' using PigStorage('\t');
fs -cat answer08/*
```
```
答案:
137
```
#### 第九題
> ==年紀最小的會員年紀為幾歲?==
``` sql=
data = load 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = FOREACH data GENERATE age;
data2 = limit (ORDER data1 BY age ASC) 2;
rmf answer09;
STORE data2 INTO 'answer09' using PigStorage('\t');
fs -cat answer09/*
```
```
答案:
3
```
#### 第十題
> ==會員數量排行,人數第2多的國家為何?==
``` sql=
data = LOAD 'userprofile' USING PigStorage ('\t') as (userid:chararray,gender:chararray,age:int,country:chararray,registered:chararray);
data1 = GROUP data BY country;
data2 = FOREACH data1 GENERATE group as country,COUNT(data) as num;
data3 = ORDER data2 BY num DESC;
data4 = limit data3 2;
rmf answer10;
STORE data4 INTO 'answer10' using PigStorage('\t');
fs -cat answer10/*
```
```
答案:
Germany
```
:::info
:bulb: **教學**: 如果不知道各個功能意思,[<i class="fa fa-file-text"></i> 點我!教學範例](/@stustmis/bigdatainfo)。
:::
:::warning
:loudspeaker: 程式碼如有錯誤!請自行修正或聯絡我們。
:email: c109181108@nkust.edu.tw / 4a990025@stust.edu.tw
:::
###### Tags: `語法` `教學` `sql`
Copyright © 2020 Junter.wjs All Rights Reserve.