<style>
.spinner {
/* margin: 20px;
width: 100px; */
/* height: 100px; */
/* background: #f00; */
-webkit-animation-name: spin;
-webkit-animation-duration: 4000ms;
-webkit-animation-iteration-count: infinite;
-webkit-animation-timing-function: linear;
-moz-animation-name: spin;
-moz-animation-duration: 4000ms;
-moz-animation-iteration-count: infinite;
-moz-animation-timing-function: linear;
-ms-animation-name: spin;
-ms-animation-duration: 4000ms;
-ms-animation-iteration-count: infinite;
-ms-animation-timing-function: linear;
animation-name: spin;
animation-duration: 4000ms;
animation-iteration-count: infinite;
animation-timing-function: linear;
}
.spinner-hover:hover {
/* margin: 20px;
width: 100px; */
/* height: 100px; */
/* background: #f00; */
-webkit-animation-name: spin;
-webkit-animation-duration: 4000ms;
-webkit-animation-iteration-count: infinite;
-webkit-animation-timing-function: linear;
-moz-animation-name: spin;
-moz-animation-duration: 4000ms;
-moz-animation-iteration-count: infinite;
-moz-animation-timing-function: linear;
-ms-animation-name: spin;
-ms-animation-duration: 4000ms;
-ms-animation-iteration-count: infinite;
-ms-animation-timing-function: linear;
animation-name: spin;
animation-duration: 4000ms;
animation-iteration-count: infinite;
animation-timing-function: linear;
}
@-ms-keyframes spin {
from { -ms-transform: rotate(0deg); }
to { -ms-transform: rotate(360deg); }
}
@-moz-keyframes spin {
from { -moz-transform: rotate(0deg); }
to { -moz-transform: rotate(360deg); }
}
@-webkit-keyframes spin {
from { -webkit-transform: rotate(0deg); }
to { -webkit-transform: rotate(360deg); }
}
@keyframes spin {
from {
transform:rotate(0deg);
}
to {
transform:rotate(360deg);
}
}
</style>
<p class="spinner" style="color: indigo; background: linear-gradient(45deg, #ff0000, #ff7700, #ffcc00, #33cc33, #3366cc, #9900cc); text-align: center; font-family: Consolas, sans-serif; font-size: 120px; font-weight: bold; margin-bottom: 20px;">Wednesday Project Report</p>
<p style="background: linear-gradient(45deg, #ff0000, #ff7700, #ffcc00, #33cc33, #3366cc, #9900cc); -webkit-background-clip: text; color: transparent; text-align: center; font-family: consolas, sans-serif; font-size: 24px; font-weight: bold; margin-bottom: 20px;">final-edit<5>th</p>
<img class="spinner" style="width: 300px;" src="https://hackmd.io/_uploads/ByqAHX4Bp.png">
---
<p>Project:</p>
<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div>
<div>
<p>MANGA PREDICTION [Manga Plot Prediction and Generation]</p>
<div class="spinner" style="color: #f00; margin: 50px; background: #f00; width: 100px; height: 100px;"><div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">yya<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div><div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div><div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;"><div class="spinner" style="color: #fff; margin: 20px; background: green; width: 30px; height: 30px;">o</div></div></div></div>
</div>
---
<p>Research Motivation</p>
<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div>
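<p style="font-size: 30px;">Once upon a time, there was a naturally shy Taiwanese student named 蔡欣翰. He was a loner who, whenever he was by himself, lost himself in the world of manga, where life was full of enchanting fantasy and color. His family, however, was not well off, and financial pressure kept him from pursuing his interest freely.
One day, 蔡欣翰 looked deeply troubled, as if a storm were spreading through his heart. His teacher noticed the change in his mood and, to help him, sent an experienced school military instructor. The instructor learned about his situation and found his home bleak and his family in urgent financial need. Manga was his only comfort, but for financial reasons he had no money to spare for his hobby, reading manga. Yet he felt that if he could never see how a manga ends, his life would have been lived in vain. So we decided to help him through our research, so that he does not take his own life or become an at-risk youth.</p>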
---
<p>Research Objectives</p>
<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div>
1. Use various seq2seq models to predict and draw the continuation and ending of a manga
2. Fulfill the dream of a poor, unfortunate student
---
<p>Research Tools</p>
<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div>
1. Tensorflow/PyTorch
2. Colab
3. DiscordFS-SFTP ([https://github.com/TWTom041/DiscordFS-SFTP](https://github.com/TWTom041/DiscordFS-SFTP))
---
<p>Process and Methods</p>
<div class="spinner" style="color: #fff; margin: 20px; background: indigo; width: 50px; height: 50px;">o</div>
---
<h2 class="spinner-hover" style=" font-family: Papyrus, fantasy; font-style: normal; font-variant: normal; font-weight: 400; line-height: 20px;">
dataset
</h2>
<span class="spinner-hover" style="font-family: Papyrus, fantasy; font-size: 42px; font-style: normal; font-variant: normal; font-weight: 400; line-height: 20px;">
[Kaggle Japanese Manga Dataset](https://www.kaggle.com/datasets/chandlertimm/unified/data)
</span>
<img class="spinner-hover" src="https://hackmd.io/_uploads/SkGCi74r6.jpg" style="width: 600px;">
---
## related research

<img src="https://media.discordapp.net/attachments/1181782367694761994/1181782519100739715/image.png?ex=65824fbd&is=656fdabd&hm=854de47ad267b28ea5c6145865d8feeaa980cba0957529ca03d7a69b9c6d5bf7&=&format=webp&quality=lossless&width=1754&height=694">
---
<img src="https://media.discordapp.net/attachments/1181782367694761994/1181782414805192816/electronics-11-00764-g001.png?ex=65824fa4&is=656fdaa4&hm=0f5d9f68d21ef0cdb5792b989b2506c2131779187f40f65b354f5fc03c1c00cc&=&format=webp&quality=lossless&width=2074&height=1276">
---
## model

---
<style>
#title {
background-clip: border-box;
font-family: Consolas;
background: linear-gradient(45deg, #ff0000, #ff7700, #ffcc00, #33cc33, #3366cc, #9900cc);
}
#hahaha {
background-clip: border-box;
font-family: Consolas;
background: linear-gradient(135deg, #ff0000, #ff7700, #ffcc00, #33cc33, #3366cc, #9900cc);
}
</style>
<h2 id="title">
modifying the model
</h2>
<p id="hahaha" style="font-weight: 999; -webkit-text-stroke: 2px #ff7f00;background-clip: border-box;
background: linear-gradient(120deg, #7a42f4, #e34a33, #56b4e9, #feb24c, #4daf4a);">
1. Use a ViT to replace the encoder <br>
2. Upsample the output to turn it into an image <br>
3. Transfer learning
</p>
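
The first two modifications can be illustrated with a minimal, dependency-free sketch. It assumes a single-channel image stored as a nested list, and the names `patchify` and `upsample_nearest` are hypothetical helpers, not part of the project code. A real ViT would follow the patch split with a learned linear projection and position embeddings, and the project's upsampling step may well be learned rather than nearest-neighbour.

```python
from typing import List

Grid = List[List[float]]  # assumption: greyscale image as rows of pixel values


def patchify(image: Grid, patch: int) -> List[List[float]]:
    """Split an H x W grid into flattened, non-overlapping patch x patch tiles.

    Tiles are emitted in row-major order, which is how a ViT turns an image
    into a token sequence before the (omitted) linear projection.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling: repeat each value `factor` times per axis."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


# A 4 x 4 toy image becomes a sequence of four 2 x 2 patches (4 values each).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)

# A 2 x 2 decoder output grows back to 4 x 4.
restored = upsample_nearest([[1.0, 2.0], [3.0, 4.0]], factor=2)
```

The sequence length is `(H / patch) * (W / patch)`, so smaller patches give the encoder a longer sequence to attend over.
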

---
## brief inference workflow
1. Image to a sequence of features
2. Features to text
3. Text + features to image

Training:
1. Train a model that classifies whether an image is manga-style.
2. Apply RLHF to the generated sequence.
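
The three inference steps above can be sketched as a small data-flow skeleton. Everything here is an assumption made for illustration: the `MangaPipeline` class, its field names, and the toy stand-in functions are hypothetical, and the real stages would be trained models rather than these placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]     # assumption: greyscale pixel grid
Features = List[List[float]]  # assumption: one feature vector per image region


@dataclass
class MangaPipeline:
    """Chains the three inference steps; each stage is injected as a callable."""
    encode: Callable[[Image], Features]      # step 1: image -> sequence of features
    describe: Callable[[Features], str]      # step 2: features -> text
    draw: Callable[[str, Features], Image]   # step 3: text + features -> image

    def predict_next_page(self, page: Image) -> Image:
        features = self.encode(page)
        text = self.describe(features)
        return self.draw(text, features)


# Toy stand-ins that only record the call order and keep shapes consistent.
calls: List[str] = []


def toy_encode(page: Image) -> Features:
    calls.append("encode")
    return [list(row) for row in page]


def toy_describe(features: Features) -> str:
    calls.append("describe")
    return f"{len(features)} feature vectors"


def toy_draw(text: str, features: Features) -> Image:
    calls.append("draw")
    return [[0.0] * len(features[0]) for _ in features]


pipeline = MangaPipeline(toy_encode, toy_describe, toy_draw)
next_page = pipeline.predict_next_page([[0.1, 0.2], [0.3, 0.4]])
```

Passing the features to the last stage alongside the text mirrors step 3, where the image generator sees both the predicted text and the visual features.
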
---
## Current Progress
----
```scala
Sequential(
  (0): Conv2d(3, 128, kernel_size=(3, 3), stride=(1, 1))
  (1): BertForMaskedLM(
    (bert): BertModel(
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
    (cls): BertOnlyMLMHead(
      (predictions): BertLMPredictionHead(
        (transform): BertPredictionHeadTransform(
          (dense): Linear(in_features=768, out_features=768, bias=True)
          (transform_act_fn): GELUActivation()
          (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        )
        (decoder): Linear(in_features=768, out_features=30522, bias=True)
      )
    )
  )
)
```
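
To get a feel for the size of the model printed above, the following sketch adds up the trainable parameters of the layers that appear in the printout. It is plain arithmetic on the printed shapes, not a measurement of the actual model. It assumes the `Conv2d` has a bias (the printout does not say otherwise), counts each printed layer once, and ignores anything not shown, such as embeddings.

```python
def linear(n_in: int, n_out: int, bias: bool = True) -> int:
    """Weights plus optional bias of a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def layer_norm(width: int) -> int:
    """Scale and shift vectors of an elementwise-affine LayerNorm."""
    return 2 * width


def conv2d(c_in: int, c_out: int, kernel: int) -> int:
    """Square-kernel convolution with bias (assumed)."""
    return c_in * c_out * kernel * kernel + c_out


hidden, ffn, vocab = 768, 3072, 30522  # sizes read off the printout

# query, key, value and the attention output projection, plus one LayerNorm
attention = 4 * linear(hidden, hidden) + layer_norm(hidden)
# intermediate and output dense layers, plus one LayerNorm
feed_forward = linear(hidden, ffn) + linear(ffn, hidden) + layer_norm(hidden)

bert_layer = attention + feed_forward
encoder = 12 * bert_layer
mlm_head = linear(hidden, hidden) + layer_norm(hidden) + linear(hidden, vocab)
stem = conv2d(3, 128, 3)

total = stem + encoder + mlm_head
print(f"per layer: {bert_layer:,}  encoder: {encoder:,}  head: {mlm_head:,}  total: {total:,}")
```

Nearly all of the parameters sit in the twelve encoder layers and the vocabulary-sized decoder; the convolution in front is tiny by comparison.
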
---
<p class="spinner" style="font-size: 200px; color: pink;">
THANKS FOR LISTENING
</p>