---
tags: AT_DH_2022
---

# Shared notes - ATDH-2022-2 (HVEE.00.016)
Permalink: https://hackmd.io/@OGZFb2mRSA65ybyn4hSfkQ/BJ1DbApgs/edit

To share an image, right-click the image (in the bottom-right corner) -> copy image.
Then press CTRL+V (CMD+V) here in the text box. HackMD uploads the file and then shows the link in markdown format. You can write your name after the figure.

Figure 1. Example image
## 13.09.2022
```
# How many Christmas songs are there?
# Are Christmas songs more stable (do they change less)?
# Is Christmas getting longer?
# We should really know where the data come from.
# How many one-hit wonders are there - more of them recently?
# 1975-1985 jump - was it more decentralized? one-hit wonders?
```
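One possible way to approach the one-hit-wonder question is sketched below. The toy table and the `one_hit_artists` column name are made up for illustration, and it is assumed that the real `billboard_data` has one row per song with `artist` and `year` columns.

```r
library(dplyr)

# Hypothetical toy data standing in for billboard_data (one row per charting song)
toy <- tibble(
  artist = c("a", "a", "b", "c", "d", "d", "e"),
  year   = c(1970, 1971, 1970, 1980, 1980, 1981, 1980)
)

one_hit_by_year <- toy %>%
  group_by(artist) %>%
  filter(n() == 1) %>%   # keep artists that appear exactly once in the data
  ungroup() %>%
  count(year, name = "one_hit_artists")

one_hit_by_year
```

Plotting `one_hit_artists` against `year` would then show whether such artists have become more common.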
```
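# Count songs per primary artist (the part of the name before " featuring")
billboard_data %>%
  # str_extract() returns NA when there is no " featuring", so solo artists are grouped under NA
  mutate(primary = str_remove(str_extract(artist, "^(.*)( featuring)"), " featuring")) %>%
  group_by(primary) %>%
  summarise(n = n())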
```
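The extraction above gives `NA` for artists whose name has no " featuring" part. If solo artists should be counted too, a variant along these lines might work. This is only a sketch on made-up data, not the version used in class.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for billboard_data
toy <- tibble(
  artist = c("rihanna", "rihanna featuring jay-z", "eminem featuring rihanna", "madonna")
)

primary_counts <- toy %>%
  # Drop " featuring" and everything after it; names without it stay unchanged
  mutate(primary = str_remove(artist, " featuring.*$")) %>%
  group_by(primary) %>%
  summarise(n = n())

primary_counts
```

Here a song only counts towards the first-named artist, so "eminem featuring rihanna" is counted for eminem and not for rihanna.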
```
# For example first and last year for each artist
billboard_data %>%
  group_by(artist) %>%
  summarise(n = n(), firstyear = min(year), lastyear = max(year)) %>%
  arrange(desc(n)) %>%
  filter(lastyear > 2010)
```
```
# Chart positions over time for four selected artists (rank is negated so that #1 is at the top)
billboard_data %>%
  filter(artist %in% c("madonna", "elton john", "mariah carey", "stevie wonder")) %>%
  ggplot(aes(x = year, y = -rank)) +
  geom_point() +
  # stat_smooth(method = "loess") +
  facet_wrap(~artist)
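
# Rank over time for songs whose lyrics contain selected words
# Note: str_extract() keeps only the first match, and the pattern also matches inside longer words
billboard_data %>%
  filter(str_detect(lyrics, "baby|yeah|love|war|peace|kill|darling|phone|car")) %>%
  mutate(found = str_extract(lyrics, "baby|yeah|love|war|peace|kill|darling|phone|car")) %>%
  ggplot(aes(x = year, y = -rank)) +
  stat_smooth(method = "lm") +
  geom_point(size = 0.1) +
  facet_wrap(~found) +
  theme_classic()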

# Rank over time for songs whose lyrics do NOT contain "love"
billboard_data %>%
  filter(str_detect(lyrics, "love", negate = TRUE)) %>%
  # mutate(found = str_extract(lyrics, "baby|yeah|love|war|peace|kill|darling|phone|car")) %>%
  ggplot(aes(x = year, y = -rank)) +
  stat_smooth(method = "lm") +
  geom_point(size = 0.1) +
  # facet_wrap(~found) +
  theme_classic()

# The same plot for the four elements
billboard_data %>%
  filter(str_detect(lyrics, "wind|earth|fire|water")) %>%
  mutate(found = str_extract(lyrics, "wind|earth|fire|water")) %>%
  ggplot(aes(x = year, y = -rank)) +
  stat_smooth(method = "lm") +
  geom_point(size = 0.1) +
  facet_wrap(~found) +
  theme_classic()

# The same plot for songs that mention "christmas"
billboard_data %>%
  filter(str_detect(lyrics, "christmas")) %>%
  mutate(found = str_extract(lyrics, "christmas")) %>%
  ggplot(aes(x = year, y = -rank)) +
  stat_smooth(method = "lm") +
  geom_point(size = 0.1) +
  facet_wrap(~found) +
  theme_classic()
```
```
# Career spans of the 50 artists with the most entries; the label shows the number of entries
billboard_data %>%
  group_by(artist) %>%
  summarise(n = n(), firstyear = min(year), lastyear = max(year)) %>%
  arrange(desc(n)) %>%
  head(50) %>%
  arrange(desc(firstyear)) %>% # here we sort them by the first year before we set the factor levels
  mutate(artist = factor(artist, levels = unique(artist))) %>%
  ggplot(aes(y = artist)) +
  geom_segment(aes(yend = artist, x = firstyear, xend = lastyear), size = 3) +
  geom_text(aes(label = n, x = 1963)) +
  theme_minimal()
```
## 16.09.2022
If we want to compare words, we replace the rank axis with a word axis, so that the same words end up side by side. In principle the words can be ordered by rank (the `fct_reorder` function), but since each artist has its own ranking, the result is not entirely clean.
```
# Top words for 4 selected artists (top 25 words each)
artists_topwords %>%
  filter(n_rank < 26) %>%
  filter(artist %in% c("madonna", "rihanna", "aretha franklin", "britney spears")) %>%
  # filter(word == "love" | word == "you" | word == "baby" | word == "time") %>%
  ggplot(aes(x = word, y = artist, fill = word, label = word)) +
  geom_tile(alpha = 0.4) +
  geom_text() +
  guides(fill = "none") + # fill = FALSE is deprecated in newer ggplot2 versions
  coord_flip()

# Same plot, with the words ordered by n_rank (fct_reorder uses the median across artists)
artists_topwords %>%
  filter(n_rank < 26) %>%
  filter(artist %in% c("madonna", "rihanna", "aretha franklin", "britney spears")) %>%
  # filter(word == "love" | word == "you" | word == "baby" | word == "time") %>%
  ggplot(aes(x = fct_reorder(word, n_rank), y = artist, fill = word, label = word)) +
  geom_tile(alpha = 0.4) +
  geom_text() +
  guides(fill = "none") +
  coord_flip()
```
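A small sketch of why a single ordering cannot be clean for every artist. The toy table below is made up; only the column names `artist`, `word` and `n_rank` follow the code above.

```r
library(dplyr)
library(forcats)

# Hypothetical toy version of artists_topwords
toy_topwords <- tibble(
  artist = c("madonna", "madonna", "madonna", "rihanna", "rihanna", "rihanna"),
  word   = c("love", "you", "baby", "love", "you", "baby"),
  n_rank = c(1, 2, 3, 3, 1, 2)
)

# fct_reorder() summarises n_rank per word (median by default) and sorts the levels by it
word_order <- levels(fct_reorder(toy_topwords$word, toy_topwords$n_rank))
word_order
```

In this toy case "love" is madonna's top word, yet it comes second in the shared order, because the order is a compromise across both artists.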

Here the songs are ordered by the last location of the word, which makes for a cleaner plot.
```
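# Where in each song does "party" occur? Songs are sorted by the last (maximum) location.
billboard_tokens %>%
  filter(word == "party") %>%
  ggplot(aes(x = loc_perc, y = fct_reorder(song, loc_perc, .fun = max))) +
  geom_point()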
```
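What `.fun = max` does to the order can be checked on a tiny made-up table. It is assumed here that `loc_perc` is the relative position of the word in the lyrics; the song names and values are hypothetical.

```r
library(dplyr)
library(forcats)

# Hypothetical toy version of billboard_tokens, already filtered to one word
toy_tokens <- tibble(
  song     = c("song a", "song a", "song b", "song c", "song c"),
  loc_perc = c(10, 90, 40, 5, 60)
)

# Order songs by their largest loc_perc, i.e. by the last occurrence of the word
song_order <- levels(fct_reorder(toy_tokens$song, toy_tokens$loc_perc, .fun = max))
song_order
```

With the default (median) the songs would be sorted by their typical location instead, which tends to give a more jagged plot when a word occurs many times in a song.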
"Begin"

"Coming"
