---
title: |
  | Portfolio 3
  | **Do you mean what you say?**
  | Semantic ambiguity and emoji use
author:
- Sofie Bøjgaard Thomsen (202206226)
- Laura Sørine Voldgaard (202207128)
- Ulyana Maslouskaya (202000242)
- Daniella Varga (202204615)
output:
  pdf_document: default
  word_document: default
header-includes:
- \usepackage{fancyhdr}
- \usepackage{fvextra}
- \pagenumbering{gobble}
---
\pagestyle{fancy}
\fancyhead[HL]{Portfolio Assignment 3}
\fancyhead[HR]{202206226, 202000242, 202207128, 202204615}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines,commandchars=\\\{\}}
```{r setup, include=FALSE, warning=FALSE}
knitr::opts_chunk$set(echo = FALSE)
# setting working directory
knitr::opts_knit$set(root.dir = '/work/CogSci_Methods01/portfolio_assignment_03-soerinev/logfiles')
```
```{r, include=FALSE}
# loading packages
pacman::p_load(tidyverse)
library("moments")
library("dplyr")
library("readr")
library("pastecs")
library("WRS2")
library("stringr")
library("ggpubr")
library("dunn.test")
library("ggstatsplot")
library("png")
# clear environment.
rm(list=ls()) # clears global workspace.
```
## **Introduction** <br>
## **(Daniella Varga)** <br>
Messaging is one of the most widely used forms of communication today. However, because it is not a face-to-face conversation, we cannot read each other’s body language and facial expressions, so the tone of the message is missing. Using emojis appears to be a good way to overcome this problem. In addition to other reasons that research has identified for emoji use, such as communicating a self-image and community identity, emojis have been found to be largely used to reduce the semantic ambiguity of a message (Derks et al., 2008; Kaye et al., 2016; Tang and Hew, 2019, cited in Liu and Sun, 2020, p. 2). However, as a linguistic modality of their own, emojis can also convey meaning independently of the text. Since they express subtler emotional cues and richer semantics, emojis have the potential not only to contribute to but also to interfere with the message produced by plain text (Jibril and Abdullah, 2013; López and Cap, 2017; Ai et al., 2017, cited in Bai et al., 2019). <br>
A new issue in understanding arises when words and emojis contradict one another. In a real-life conversation, people often use a sarcastic tone to change the meaning of their words. We wondered whether the same effect can be achieved with emojis, and started investigating what happens when the meaning conveyed by plain text is incongruent with the one implied by emojis. Our research asks whether emojis alter the interpretation of our words in text messages, and we hypothesize that the usage of emojis influences how people interpret the meaning of messages.
## **Methods** <br>
## **(Sofie Bøjgaard Thomsen)** <br>
**Participants** <br>
The experiment was conducted with 15 participants ranging from 19 to 51 years of age (M_age = 25, SD_age = 7.5): 8 males and 7 females. On average, participants rated their frequency of emoji usage in everyday life at 3.9 on a scale from 1 to 5, which suggests that they were generally familiar with emojis before the test. <br>
**Materials/Stimuli** <br>
The experiment had three conditions (congruent, neutral and incongruent), and each participant was randomly assigned to one of them. The choice of emoji, which differs between conditions, is the independent variable (IV). Participants then rated the genuineness of the reply on a scale from 1 to 7, which is our dependent variable (DV).
```{r}
#Showing the picture of the conditions for the methods -> materials/Stimuli section
Picture <- readPNG("Conditions.png", native = TRUE)
plot(0:1,0:1,type = "n",ann = FALSE, axes = FALSE)
rasterImage(Picture,0,0,1,1)
```
**Procedure** <br>
We built the experiment in PsychoPy, where the participants typed in their name, age and gender before starting. On entering the experiment, participants were first instructed to imagine themselves in the following made-up scenario: <br>
“It’s 5 pm on a Friday night and you have to cancel the plans you made some days ago with your best friend to hang out and catch up with each other. The reply you get makes you wonder…” <br>
Afterwards, the participants were shown one of the three conditions. We made sure that each participant was tested in only one condition, making it a between-subjects design. <br>
Then, they were asked to rate the genuineness of the reply on a scale from 1 to 7, where 1 was not genuine at all and 7 was very genuine. The participants pressed the number on the keyboard as their answer. The outcome of this is an ordinal variable of discrete numbers (more on that in the analysis section). <br>
Finally, the participants were asked how often they use emojis when messaging on a scale of 1 to 5, where 1 was never and 5 was very often. Again, they pressed the number on the keyboard to answer. This question is not part of our dependent variable, but was used to describe our participants. <br> <br>
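
The random assignment to conditions described above can be sketched in R. This is only an illustration under our own assumptions, not the PsychoPy code that ran the experiment: the participant labels `p1` to `p15`, the seed and the equal group sizes of five are all hypothetical.

```r
# Hypothetical sketch: balanced random assignment of 15 participants
# to three between-subjects conditions (5 per condition is an assumption).
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

participants <- paste0("p", 1:15)  # made-up participant labels
conditions <- c("congruent", "neutral", "incongruent")

# Repeat each condition 5 times, then shuffle the order
assignment <- data.frame(
  ID = participants,
  condition = sample(rep(conditions, each = 5))
)

# Each participant appears in one row, so each sees exactly one condition
table(assignment$condition)
```

Shuffling a fixed vector of condition labels keeps the groups equal in size, whereas drawing a condition independently for each participant could leave the groups unbalanced. <br>
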
## **Analysis**
## **(Laura Sørine Voldgaard)** <br>
As part of preprocessing our data, we anonymized the log files, compiled them into one data frame and removed the unnecessary default punctuation from the columns used for inspection. <br>
Then we tested whether our data met the assumptions of ANOVA. <br>
First, we checked whether the ratings in each condition were normally distributed; the data in the congruent condition were not. <br>
```{r}
# load files
files <- list.files(path = "/work/CogSci_Methods01/portfolio_assignment_03-soerinev/logfiles",
                    pattern = "\\.csv$", full.names = TRUE)
```
```{r}
# anonymize data - I comment this chunk out after running it to be able to knit.
#data_out <- list()
# num_files <- length(files)
# rand_ids <- sample(seq(1,num_files,1))
# cnt_f <- 0
# for (f in files){
# cnt_f <- cnt_f + 1
# data_out[[f]] <- read_csv(file = f, col_names = TRUE)
# data_out[[f]]$ID <- paste(c("snew", rand_ids[cnt_f]), collapse = "")
# out_name <- paste(c('/work/CogSci_Methods01/portfolio_assignment_03-soerinev/logfiles', "/logfile_", unique(data_out[[f]]$ID[1]), ".csv"), collapse = "")
# write_csv(data_out[[f]], out_name, na = "NA")
# file.remove(f)
# }
```
```{r, message=FALSE}
# compile all the logfiles into one csv file
files <- list.files(path = getwd(), pattern = "logfile_snew.*\\.csv$", full.names = TRUE)
data <- map_dfr(files, read_csv)
```
```{r}
# removing punctuation from the emoji_use column in the data frame
data$emoji_use <- gsub(pattern = "\\'|\\[|\\]", replacement = "", as.character(data$emoji_use))
data$emoji_use = tolower(data$emoji_use)
# removing punctuation from the genuineness column in the data frame
data$genuineness <- gsub(pattern = "\\'|\\[|\\]", replacement = "", as.character(data$genuineness))
data$genuineness = tolower(data$genuineness)
#removing first column
data <- data[-1]
```
```{r}
# prepping data for tests
# changing the data type of columns
data$emoji_use <- as.numeric(data$emoji_use)
data$genuineness <- as.numeric(data$genuineness)
data$condition <- as.factor(data$condition)
data$ID <- as.factor(data$ID)
data$gender <- as.factor(data$gender)
```
```{r}
#dividing the data into 3 different data frames grouped by condition
data_congr <- filter(data, condition == "congruent")
data_incongr <- filter(data, condition == "incongruent")
data_neutral <- filter(data, condition == "neutral")
```
```{r, include=FALSE}
# Making a probability density histogram of the genuineness belief of neutral condition
ggplot(data_neutral, aes(x = genuineness)) +
geom_histogram(aes(y = ..density..), binwidth = 0.3) +
ggtitle("Probability Density of genuineness belief in neutral condition") +
stat_function(fun = dnorm,
args = list(mean = mean(data_neutral$genuineness, na.rm = TRUE),
sd = sd(data_neutral$genuineness, na.rm = TRUE)),
colour= "orange", size = 1) +
theme_classic()
#Making a QQ-plot of genuineness belief of neutral condition
qqnorm(data_neutral$genuineness)
qqline(data_neutral$genuineness)
#Performing the Shapiro-Wilk test of genuineness belief of neutral condition
shapiro.test(data_neutral$genuineness)
```
```{r, include=FALSE}
# Making a probability density histogram of the genuineness belief of congruent condition
ggplot(data_congr, aes(x = genuineness)) +
geom_histogram(aes(y = ..density..), binwidth = 0.3) +
ggtitle("Probability Density of genuineness belief in congruent condition") +
stat_function(fun = dnorm,
args = list(mean = mean(data_congr$genuineness, na.rm = TRUE),
sd = sd(data_congr$genuineness, na.rm = TRUE)),
colour= "orange", size = 1) +
theme_classic()
#Making a QQ-plot of genuineness belief of congruent condition
qqnorm(data_congr$genuineness)
qqline(data_congr$genuineness)
#Performing the Shapiro-Wilk test of genuineness belief of congruent condition
shapiro.test(data_congr$genuineness)
```
```{r, include=FALSE}
# Making a probability density histogram of the genuineness belief of incongruent condition
ggplot(data_incongr, aes(x = genuineness)) +
geom_histogram(aes(y = ..density..), binwidth = 0.3) +
ggtitle("Probability Density of genuineness belief in incongruent condition") +
stat_function(fun = dnorm,
args = list(mean = mean(data_incongr$genuineness, na.rm = TRUE),
sd = sd(data_incongr$genuineness, na.rm = TRUE)),
colour= "orange", size = 1) +
theme_classic()
#Making a QQ-plot of genuineness belief of incongruent condition
qqnorm(data_incongr$genuineness)
qqline(data_incongr$genuineness)
#Performing the Shapiro-Wilk test of genuineness belief of incongruent condition
shapiro.test(data_incongr$genuineness)
```
Because of this, we tried to transform the data from all conditions to make them normally distributed and meet the assumptions of ANOVA. However, our values were measured on a rating scale, which makes them discrete rather than continuous and thus not suited to transformation. <br>
If our data had met the assumption of a continuous dependent variable, the next step would be to check for homogeneity of variance. We ran this check anyway: Bartlett’s test gave a non-significant result, so the variances could be assumed to be equal (again, this would only be valid if we had designed our experiment to actually meet the assumptions of an ANOVA). <br>
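
To illustrate why a transformation does not help with rating data, the sketch below uses made-up ratings (not our data). A monotonic transformation such as the logarithm only relabels the scale points: the values remain discrete, ties remain ties, and the rank order is unchanged.

```r
# Made-up ratings on a 1-7 scale, for illustration only (not our data)
ratings <- c(7, 7, 6, 7, 5, 4, 3, 5, 4, 2)

log_ratings <- log(ratings)

# The number of distinct values is the same before and after transforming
length(unique(ratings))
length(unique(log_ratings))

# The ranks are identical as well, so rank-based tests are unaffected
identical(rank(ratings), rank(log_ratings))
```

The transformed values are still a handful of repeated points rather than a continuous variable, which is why we did not pursue this route. <br>
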
```{r, include=TRUE}
# Checking for homogeneity
### Bartlett's test checks whether the variances are equal across conditions, which is one of the assumptions of ANOVA.
bartlett.test(genuineness ~ condition, data=data)
```
To meet the requirements of this assignment, we proceeded to conduct a one-way ANOVA, since we had only one independent variable. If our experimental design had met the assumptions of ANOVA, the result could be interpreted as showing a difference between the means of our conditions, and therefore an “omnibus effect”. <br>
```{r}
# Performing one-way ANOVA
aov1 <- aov(genuineness ~ condition, data=data)
summary(aov1)
```
```{r, include=FALSE}
# Seeing the means for each category
model.tables(aov1, "means")
```
<br>
This led us to use a pairwise post-hoc comparison, in this case Tukey’s post-hoc test, to inspect which groups differed from each other. The results suggested a significant difference between all pairs of conditions, the largest being between the congruent and the incongruent condition. <br>
```{r}
# running Tukey's post hoc test
TUKEY <- TukeyHSD(aov1)
TUKEY
```
```{r, fig.align = 'center'}
# plot Tukey's post hoc test
plot(TUKEY, las = 1, col = "brown", cex.axis = 0.35)
```
<br>
Our data were neither normally distributed nor fit for transformation, so they did not meet the assumptions of ANOVA. Therefore, we used a non-parametric test, specifically the Kruskal-Wallis test, because it does not assume that the data are normally distributed or that the variances are equal, and it can be used for both continuous and ordinal-level dependent variables. <br>
```{r}
kruskal <- kruskal.test(genuineness ~ condition, data = data)
kruskal
```
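
As a rough illustration of what the Kruskal-Wallis test does, the sketch below computes the test statistic by hand from ranks and compares it with `kruskal.test()`. The data frame `toy` and its ratings are made up for this example and are not our data. The tie correction is included because ratings on a 7-point scale repeat often.

```r
# Made-up ratings for illustration only (not our data)
toy <- data.frame(
  genuineness = c(7, 6, 7, 5, 6,  4, 5, 3, 4, 5,  2, 1, 3, 2, 1),
  condition = factor(rep(c("congruent", "neutral", "incongruent"), each = 5))
)

n_total <- nrow(toy)
ranks <- rank(toy$genuineness)                   # mid-ranks for tied ratings
rank_sums <- tapply(ranks, toy$condition, sum)   # sum of ranks per condition
group_n <- tapply(ranks, toy$condition, length)  # participants per condition

# Uncorrected H statistic
h_raw <- 12 / (n_total * (n_total + 1)) * sum(rank_sums^2 / group_n) -
  3 * (n_total + 1)

# Correction for ties, needed because ratings repeat
tie_sizes <- table(toy$genuineness)
h_manual <- h_raw / (1 - sum(tie_sizes^3 - tie_sizes) / (n_total^3 - n_total))

# p-value from the chi-square distribution with (number of groups - 1) df
p_manual <- pchisq(h_manual, df = nlevels(toy$condition) - 1, lower.tail = FALSE)

# The built-in test should give the same statistic and p-value
kw <- kruskal.test(genuineness ~ condition, data = toy)
all.equal(h_manual, unname(kw$statistic))
all.equal(p_manual, unname(kw$p.value))
```

Because only the ranks enter the calculation, the test does not depend on the distances between scale points, which suits an ordinal rating. <br>
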
<br>
Since the Kruskal-Wallis test only told us that at least one of our conditions differed significantly from the others, but not which ones, we used the non-parametric equivalent of a pairwise post-hoc test, Dunn’s test, to specify which condition or conditions differed. We also used a Bonferroni correction to adjust the p-values for the number of tests, so that we could control the family-wise error rate. <br>
```{r}
# performing Dunn test to see which are significantly different
dunn.test(data$genuineness, data$condition, method = "bonferroni")
```
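
The Bonferroni correction itself is simple arithmetic, sketched below with hypothetical unadjusted p-values (made up for illustration, not taken from our output). The sketch shows only the correction, not Dunn’s test, and uses base R’s `p.adjust()`; we assume that `dunn.test()` applies the same rule to its pairwise p-values.

```r
# Hypothetical unadjusted p-values for the three pairwise comparisons
# (made up for illustration, not taken from our output)
p_raw <- c(congruent_incongruent = 0.001,
           congruent_neutral     = 0.030,
           incongruent_neutral   = 0.400)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
m <- length(p_raw)
p_manual <- pmin(p_raw * m, 1)

# Base R's p.adjust() applies the same rule
p_bonf <- p.adjust(p_raw, method = "bonferroni")

all.equal(unname(p_manual), unname(p_bonf))
```

With three comparisons, each p-value is tripled, so a comparison that looks significant on its own (here 0.030) may no longer be significant after the correction. <br>
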
## **Results** <br>
A significant difference in genuineness ratings across conditions was found with the Kruskal-Wallis test (chi-square = 12.05, df = 2, p = 0.002413), and the subsequent Dunn’s test with Bonferroni adjustment showed that the significant difference lay between the congruent and the incongruent condition (p = 0.0008). No significant difference was found between the neutral and incongruent conditions (p = 0.1947) or between the neutral and congruent conditions (p = 0.0772). The significant results from the Kruskal-Wallis test and Dunn’s test, as well as from the parametric ANOVA, allow us to reject the null hypothesis that the usage of emojis does not influence how people interpret the meaning of messages. <br>
```{r, warning=FALSE}
# visualizing
ggboxplot(data, x = "condition", y = "genuineness", color="condition") +
scale_x_discrete(limits = c("congruent", "neutral", "incongruent")) +
scale_y_continuous(breaks = 1:7, limits = c(1, 7)) +
stat_summary(fun = mean, geom = "point", shape = 23, colour = "Black") +
ggtitle("Visualization of the data") +
theme_minimal()
```
```{r}
ggline(data, x="condition", y= "genuineness",
add = "mean_se",
order = c("congruent", "neutral", "incongruent"),
ylab = "genuineness", xlab = "condition") +
geom_point() +
ggtitle("Visualization of the condition means")
```
## **Discussion** <br>
## **(Ulyana Maslouskaya)** <br>
In our experiment, participants had to assess the genuineness of the message, that is, whether the person replying actually means what they say. Due to communication noise and the many emotional, social, cultural, interpersonal and other factors that get in the way of communication, one can argue that it is impossible in principle for the perceiver to assess the intended meaning of a message. Moreover, our experimental design presented participants with a small excerpt of a conversation taken out of context. Thus, the only way a participant could relate to the stimulus was through their ability to empathize with an imagined scenario. Since we did not account for this, our findings have little explanatory power over the social and cultural implications of emoji use. The scope of our study and the small number of participants limited us in assessing the influence of the many factors that shape the meaning extracted from emojis, such as age, gender, interpersonal relationship, frequency and purpose of emoji use, personality type, degree of social media usage, cultural background, living environment etc. Future studies could involve more participants, potentially more personal engagement with the topic, and more consideration of the wider sociocultural context. Furthermore, due to the flaws in our experimental design, we violated several assumptions of ANOVA and parametric post-hoc tests, so further studies should also consider changing the type of dependent variable to enable an ANOVA and parametric post-hoc tests. <br>
In conclusion, the results from the Kruskal-Wallis test and Dunn’s test suggest a significant difference and support our hypothesis that the usage of emojis influences how people interpret the meaning of messages. <br> <br>
## **References** <br>
Bai, Q., Dan, Q., Mu, Z., & Yang, M. (2019). A Systematic Review of Emoji: Current Research and Future Perspectives. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.02221
$\linebreak$
$\linebreak$
Liu, S., & Sun, R. (2020). To Express or to End? Personality Traits Are Associated With the Reasons and Patterns for Using Emojis and Stickers. Frontiers in Psychology, 11. https://doi.org/10.3389/fpsyg.2020.01076