
# Book_Paper Translations
###### tags: `book`

Translations of neural-network-related papers

LLM
---
- [Attention Is All You Need](https://hackmd.io/@shaoeChen/BkxGXkS96)
- [DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://hackmd.io/@shaoeChen/r1UWj4XYkx)
- [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://hackmd.io/@shaoeChen/BkjbSpWcye)
- [Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention(machine translation, unrevised)](https://hackmd.io/@shaoeChen/H1G6azXqye)
- [OUTRAGEOUSLY LARGE NEURAL NETWORKS: THE SPARSELY-GATED MIXTURE-OF-EXPERTS LAYER(machine translation, unrevised)](https://hackmd.io/@shaoeChen/rk_eqe4c1g)

GAN
---
- [DCGANs_Paper(translation)](https://hackmd.io/@shaoeChen/B1_b6g3WS)
- [WGAN_Paper(translation)](https://hackmd.io/@shaoeChen/ryT0HZtXr)
- [Improved Training of Wasserstein GANs_Paper(translation)](https://hackmd.io/@shaoeChen/H1fpco3rB)
- [Wasserstein GAN and the Kantorovich-Rubinstein Duality(translation)](https://hackmd.io/@shaoeChen/H1pT3o2Br)
- [A Wasserstein GAN model with the total variational regularization(translation)](https://hackmd.io/@shaoeChen/Sk5tnUByO)
- [Progressive Growing of GANs for Improved Quality, Stability, and Variation(translation)](https://hackmd.io/@shaoeChen/ryIH43v9n)
- [A Style-Based Generator Architecture for Generative Adversarial Networks(translation)](https://hackmd.io/@shaoeChen/r1DOGOSCp)
- [Analyzing and Improving the Image Quality of StyleGAN(translation)](https://hackmd.io/@shaoeChen/rJWBrzae0)
- [Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks(translation)](https://hackmd.io/@shaoeChen/BkMuLJNPR)
- [Image-to-Image Translation with Conditional Adversarial Networks(translation)](https://hackmd.io/@shaoeChen/HJ-UN4fO0)
- [Perceptual Losses for Real-Time Style Transfer and Super-Resolution(translation)](https://hackmd.io/@shaoeChen/r1fHVEzO0)

Stable Diffusion
---
- [High-Resolution Image Synthesis with Latent Diffusion Models](https://hackmd.io/@shaoeChen/HkPV-K4PJe)

RL
---
- [Ride-Hailing Order Dispatching at DiDi via Reinforcement Learning(1)(translation)](https://hackmd.io/@shaoeChen/r1T5dCzVO)
- [Ride-Hailing Order Dispatching at DiDi via Reinforcement Learning(2)(translation)](https://hackmd.io/@shaoeChen/r1Q6TzyHO)
- [Ride-Hailing Order Dispatching at DiDi via Reinforcement Learning(Appendix)(translation)](https://hackmd.io/@shaoeChen/r1YO_Rz8d)
- [A Deep Value-network Based Approach for Multi-Driver Order Dispatching(translation)](https://hackmd.io/@shaoeChen/S1sRFzzuc)
- [Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning(1)](https://hackmd.io/@shaoeChen/Hy48RPzwO)
- [Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning(2)](https://hackmd.io/@shaoeChen/HyHRHiD6K)
- [Learning Options in Reinforcement Learning(translation in progress)(abandoned)](https://hackmd.io/@shaoeChen/SkEUDfqJ9)
- [A Reinforcement Learning Environment For Job-Shop Scheduling(translation)](https://hackmd.io/@shaoeChen/S1UmWvfN9)
- [Actor-Critic Algorithms](https://hackmd.io/@shaoeChen/BJvQl5Zq5)
- [DQN]
- [Soft Actor-Critic]

CNN
---
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(I, II)](https://hackmd.io/@shaoeChen/rJvD_alOS)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(III, IV)](https://hackmd.io/@shaoeChen/B1gid86cB)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(V, VI)](https://hackmd.io/@shaoeChen/SyjI6W2zB)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(VII, VIII)](https://hackmd.io/@shaoeChen/SyGkzHge8)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(IX)](https://hackmd.io/@shaoeChen/ry4vJ7lG8)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(X)](https://hackmd.io/@shaoeChen/SyGIGnUM8)
- [Gradient-Based Learning Applied to Document Recognition_Paper(LeNet-5)(translation)(XI)](https://hackmd.io/@shaoeChen/ryu_wMKML)
- [ImageNet Classification with Deep Convolutional Neural Networks(AlexNet)(translation)](https://hackmd.io/@shaoeChen/SJK_0YJmI)
- [Very Deep Convolutional Networks for Large-Scale Image Recognition(VGG16)(translation)](https://hackmd.io/@shaoeChen/BJ2DMA7QU)
- [Going deeper with convolutions(Inception-v1)(translation)](https://hackmd.io/@shaoeChen/rkIGBzWEI)
- [Rethinking the Inception Architecture for Computer Vision(Inception-v2)](https://arxiv.org/abs/1512.00567)
- [Network In Network(translation)](https://hackmd.io/@shaoeChen/HJ19NfW4U)
- [Deep Residual Learning for Image Recognition(ResNet)(translation)](https://hackmd.io/@shaoeChen/Sy_e1mCEU)
- [Identity Mappings in Deep Residual Networks(translation)](https://hackmd.io/@shaoeChen/HkRA9oxLI)
- [CSPNET: A NEW BACKBONE THAT CAN ENHANCE LEARNING CAPABILITY OF CNN(translation)](https://hackmd.io/@shaoeChen/S1hSH4Dvj)

Visualization
---
- [Visualizing and Understanding Convolutional Networks(translation)\_wait](https://hackmd.io/@shaoeChen/BkJPNfWN8)

Object Detection
---
- [You Only Look Once: Unified, Real-Time Object Detection(YOLOv1)(translation)](https://hackmd.io/@shaoeChen/Hy6kUMWNI)
- [YOLO9000: Better, Faster, Stronger(YOLOv2)(translation)](https://hackmd.io/@shaoeChen/r1TTbG2OL)
- [YOLOv3: An Incremental Improvement(translation)](https://hackmd.io/@shaoeChen/ryHg904h9)
- [YOLOv3 implementation notes](https://hackmd.io/@shaoeChen/HkrRPNGas)
- [YOLOv4: Optimal Speed and Accuracy of Object Detection(translation)](https://hackmd.io/@shaoeChen/Skiym0XEi)
- [YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors]

Face Recognition
---
- [ArcFace: Additive Angular Margin Loss for Deep Face Recognition(translation)](https://hackmd.io/@shaoeChen/S1SZAZwg1g)

Semantic Segmentation
---
- [Fully Convolutional Networks for Semantic Segmentation(translation)](https://hackmd.io/@shaoeChen/BJB0NfZVL)

Knowledge Distillation
---
- [Distilling the knowledge in a neural network(translation)](https://hackmd.io/@shaoeChen/Hyn9Udkja)

ML
---
- [Efficient and Robust Automated Machine Learning](https://proceedings.neurips.cc/paper_files/paper/2015/file/11d0e6287202fced83f79975ec59a3a6-Paper.pdf)

Trick
---
- [Instance Normalization: The Missing Ingredient for Fast Stylization(translation)](https://hackmd.io/@shaoeChen/H1O6dP5lA)

Papers to read
---
- [Learning Confidence for Out-of-Distribution Detection in Neural Networks_wait]()
- [FGSM](https://arxiv.org/abs/1412.6572)
- [Basic iterative method](https://arxiv.org/abs/1607.02533)
- [L-BFGS](https://arxiv.org/abs/1312.6199)
- [Deepfool](https://arxiv.org/abs/1511.04599)
- [JSMA](https://arxiv.org/abs/1511.07528)
- [C&W](https://arxiv.org/abs/1608.04644)
- [Elastic net attack](https://arxiv.org/abs/1709.04114)
- [Spatially Transformed](https://arxiv.org/abs/1801.02612)
- [One Pixel Attack](https://arxiv.org/abs/1710.08864)
- [Object Detection Networks on Convolutional Feature Maps\_wait]
- [Understanding the difficulty of training deep feedforward neural networks(translation)\_wait](https://hackmd.io/@shaoeChen/SkpxEfZVL)
- [Residual Networks Behave Like Ensembles of Relatively Shallow Networks(translation)\_wait](https://hackmd.io/@shaoeChen/B1gbP9bLL)
- [Squeeze-and-Excitation Networks(SENet)(translation\_wait)](https://arxiv.org/pdf/1709.01507.pdf)
- [Selective Kernel Networks(SKNet)(translation)\_wait](https://arxiv.org/abs/1903.06586)
- [Mistral 7B](https://arxiv.org/pdf/2310.06825.pdf)
- [Mixtral of Experts](https://arxiv.org/pdf/2401.04088.pdf)
- [PhotoMaker](https://arxiv.org/pdf/2312.04461.pdf)
- [Stable Code 3B]()
- [Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model](https://arxiv.org/pdf/2401.09417.pdf)
- [Self-Rewarding Language Models](https://arxiv.org/abs/2401.10020)
- [I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models](https://arxiv.org/pdf/2312.16693.pdf)
- [TinyLlama: An Open-Source Small Language Model](https://arxiv.org/pdf/2401.02385.pdf)
- [MEDUSA: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads](https://arxiv.org/pdf/2401.10774.pdf)
- [FitNets: Hints for Thin Deep Nets]
- [Knowledge distillation for natural language processing.]
- [A survey on knowledge distillation]
- [Knowledge distillation: A survey of recent advances.]
- [Knowledge distillation for computer vision.]
- [ALOHA2](https://aloha-2.github.io/assets/aloha2.pdf)
- [YOLOv9](https://arxiv.org/pdf/2402.13616.pdf)
- [Stable Cascade](https://stability.ai/news/introducing-stable-cascade)
- [GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection](https://arxiv.org/abs/2403.03507)
- [Scaling Rectified Flow Transformers for High-Resolution Image Synthesis](https://stabilityai-public-packages.s3.us-west-2.amazonaws.com/Stable+Diffusion+3+Paper.pdf)
- [Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention](https://arxiv.org/pdf/2404.07143.pdf)
- [Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis](https://arxiv.org/abs/2404.13686)
- [Diffusion Models for Video Generation(article)](https://lilianweng.github.io/posts/2024-04-12-diffusion-video/)
- [What are Diffusion Models?(article)](https://lilianweng.github.io/posts/2021-07-11-diffusion-models/)
- [Attention as an RNN](https://arxiv.org/pdf/2405.13956)
- [YOLOv10](https://arxiv.org/pdf/2405.14458)
