Ping Wang

@pingw220

Joined on Apr 13, 2021

  • SALLOC Request for GPU
    salloc --partition=gpu-l40 --account=stf --mem=10G --gres=gpu:1 --cpus-per-task=1 --time=2:00:00
    Check if GPU is requested
    scontrol show job 24333466 | grep gpu
    Conda Reinstall
    rm -rf '/gscratch/scrubbed/andysu/miniconda3'
    bash Miniconda3-latest-Linux-x86_64.sh -p /gscratch/scrubbed/andysu/miniconda3
  • Introduction
    This article delves into the architecture and mechanics of decoder-only transformers, which are a crucial component of many large language models (LLMs). It highlights the structure, attention mechanisms, and embedding techniques that make these models effective for various natural language processing (NLP) tasks.
    Decoder-Only Transformer Architecture
    Overview
    Decoder-only transformers, unlike the traditional encoder-decoder structure, use only the decoder component to process and generate text. This architecture is particularly suited for tasks that involve sequential generation, such as text completion and language modeling.
    Structure
    The decoder-only transformer consists of multiple layers, each containing self-attention mechanisms and feed-forward neural networks.
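    A minimal numeric sketch (not from the article) of the masked, single-head scaled dot-product self-attention that gives decoder-only layers their autoregressive behavior: each position attends only to itself and earlier positions. The matrices, dimensions, and values in main are made up purely for illustration.

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    using Mat = std::vector<std::vector<double>>;

    // Causal scaled dot-product attention: position i only attends to positions j <= i.
    Mat causalSelfAttention(const Mat& Q, const Mat& K, const Mat& V) {
        int n = (int)Q.size();      // sequence length
        int dk = (int)Q[0].size();  // query/key dimension
        int dv = (int)V[0].size();  // value dimension
        Mat out(n, std::vector<double>(dv, 0.0));
        for (int i = 0; i < n; ++i) {
            // Scores against keys 0..i only; future positions are masked out.
            std::vector<double> score(i + 1, 0.0);
            double maxScore = -1e30;
            for (int j = 0; j <= i; ++j) {
                for (int d = 0; d < dk; ++d) score[j] += Q[i][d] * K[j][d];
                score[j] /= std::sqrt((double)dk);
                maxScore = std::max(maxScore, score[j]);
            }
            // Softmax over the unmasked scores (shifted by the max for numerical stability).
            double sum = 0.0;
            for (int j = 0; j <= i; ++j) { score[j] = std::exp(score[j] - maxScore); sum += score[j]; }
            // Output i is the attention-weighted sum of the value vectors.
            for (int j = 0; j <= i; ++j)
                for (int d = 0; d < dv; ++d) out[i][d] += (score[j] / sum) * V[j][d];
        }
        return out;
    }

    int main() {
        // Toy 3-token sequence with 2-dimensional Q/K/V; values are illustrative only.
        Mat Q = {{1, 0}, {0, 1}, {1, 1}};
        Mat K = Q;
        Mat V = {{1, 2}, {3, 4}, {5, 6}};
        for (const auto& row : causalSelfAttention(Q, K, V)) printf("%.3f %.3f\n", row[0], row[1]);
        return 0;
    }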
  • Introduction
    This article provides a visual and intuitive explanation of the transformer architecture, which has revolutionized natural language processing (NLP) by enabling efficient handling of sequential data through self-attention mechanisms. It covers the structure, mechanics, and key components such as the encoder, decoder, attention mechanisms, and embeddings.
    Transformer Architecture
    Overview
    The transformer architecture, introduced by Vaswani et al. in 2017, eliminates the need for recurrent layers by using self-attention mechanisms, allowing for parallel processing and better handling of long-range dependencies.
    Encoder-Decoder Structure
    The transformer model consists of an encoder and a decoder, each composed of multiple layers.
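    Because self-attention has no inherent notion of token order, the original transformer adds fixed sinusoidal positional encodings to the input embeddings: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A small sketch of that formula, with toy sizes chosen only for illustration:

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Sinusoidal positional encodings: even dimensions use sin, odd dimensions use cos,
    // with wavelengths that grow geometrically across the model dimension.
    std::vector<std::vector<double>> positionalEncoding(int maxLen, int dModel) {
        std::vector<std::vector<double>> pe(maxLen, std::vector<double>(dModel, 0.0));
        for (int pos = 0; pos < maxLen; ++pos) {
            for (int i = 0; i < dModel; i += 2) {
                double angle = pos / std::pow(10000.0, (double)i / dModel);
                pe[pos][i] = std::sin(angle);
                if (i + 1 < dModel) pe[pos][i + 1] = std::cos(angle);
            }
        }
        return pe;
    }

    int main() {
        // Toy sizes for illustration: 4 positions, model dimension 8.
        for (const auto& row : positionalEncoding(4, 8)) {
            for (double v : row) printf("%7.3f", v);
            printf("\n");
        }
        return 0;
    }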
  • Introduction
    This article offers a detailed explanation of transformers, a revolutionary architecture in natural language processing (NLP) that has significantly advanced the capabilities of large language models (LLMs). It covers the structure, mechanics, and key components of transformers, including the encoder, decoder, attention mechanisms, and embeddings.
    Transformer Architecture
    Overview
    Transformers, introduced by Vaswani et al. in 2017, have transformed NLP by enabling parallel processing and effectively capturing long-range dependencies in text. They utilize self-attention mechanisms to process input sequences more efficiently than traditional recurrent neural networks (RNNs).
    Encoder-Decoder Structure
    The transformer model consists of an encoder and a decoder, each composed of multiple layers.
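    Besides self-attention, each transformer layer contains a position-wise feed-forward sub-layer, FFN(x) = max(0, x W1 + b1) W2 + b2, applied to every position independently. The sketch below is an illustration only; the dimensions and weights are made up.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Position-wise feed-forward sub-layer: expand to a hidden dimension with ReLU,
    // then project back to the model dimension. Applied to each position separately.
    std::vector<double> feedForward(const std::vector<double>& x,
                                    const std::vector<std::vector<double>>& W1,
                                    const std::vector<double>& b1,
                                    const std::vector<std::vector<double>>& W2,
                                    const std::vector<double>& b2) {
        // Hidden layer with ReLU activation.
        std::vector<double> h(b1.size(), 0.0);
        for (size_t j = 0; j < h.size(); ++j) {
            for (size_t i = 0; i < x.size(); ++i) h[j] += x[i] * W1[i][j];
            h[j] = std::max(0.0, h[j] + b1[j]);
        }
        // Projection back to the model dimension.
        std::vector<double> y(b2.size(), 0.0);
        for (size_t j = 0; j < y.size(); ++j) {
            for (size_t i = 0; i < h.size(); ++i) y[j] += h[i] * W2[i][j];
            y[j] += b2[j];
        }
        return y;
    }

    int main() {
        // Toy dimensions (model dim 2, hidden dim 3); weights are made up for illustration.
        std::vector<double> x = {1.0, -2.0};
        std::vector<std::vector<double>> W1 = {{0.5, -0.1, 0.3}, {0.2, 0.4, -0.6}};
        std::vector<double> b1 = {0.0, 0.1, -0.1};
        std::vector<std::vector<double>> W2 = {{1.0, 0.0}, {0.0, 1.0}, {0.5, -0.5}};
        std::vector<double> b2 = {0.0, 0.0};
        std::vector<double> y = feedForward(x, W1, b1, W2, b2);
        printf("%.3f %.3f\n", y[0], y[1]);
        return 0;
    }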
  • Introduction
    This article provides a foundational overview of large language models (LLMs) and the transformer architecture, explaining their structure, mechanics, and key components such as the encoder, decoder, attention mechanisms, and embeddings.
    Large Language Models (LLMs)
    Definition and Purpose
    LLMs are advanced neural networks trained on massive datasets to understand and generate human language. They are designed to handle various natural language processing (NLP) tasks such as translation, summarization, and question answering.
    Evolution
    LLMs have evolved significantly, with the introduction of models like BERT, GPT, and T5, which leverage transformer architectures to achieve state-of-the-art performance in many NLP benchmarks.
  • Introduction
    This article provides a comprehensive visualization of the mechanics behind neural machine translation (NMT) models, focusing on sequence-to-sequence (Seq2Seq) architectures with attention mechanisms. It covers key components such as the encoder, decoder, attention mechanisms, and embeddings, which are crucial for understanding how these models process and translate text.
    Seq2Seq Model Architecture
    Encoder
    The encoder processes the input sequence and converts it into a fixed-length context vector that encapsulates the meaning of the entire sequence.
    Role: Encodes the source sentence into a set of vectors.
    Structure: Typically consists of recurrent neural networks (RNNs) such as LSTM or GRU layers.
    Output: Generates a context vector that summarizes the input sequence.
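    A minimal sketch of one attention step in a Seq2Seq decoder, assuming simple dot-product (Luong-style) scoring: the current decoder state scores every encoder hidden state, the scores are softmaxed, and the context vector is the weighted sum of the encoder states. The vectors in main are made up for illustration.

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    // One decoder step of dot-product attention over the encoder hidden states.
    std::vector<double> attentionContext(const std::vector<std::vector<double>>& encStates,
                                         const std::vector<double>& decState) {
        size_t n = encStates.size(), d = decState.size();
        // Score each encoder state against the current decoder state.
        std::vector<double> score(n, 0.0);
        double maxScore = -1e30;
        for (size_t t = 0; t < n; ++t) {
            for (size_t k = 0; k < d; ++k) score[t] += encStates[t][k] * decState[k];
            maxScore = std::max(maxScore, score[t]);
        }
        // Softmax the scores into attention weights.
        double sum = 0.0;
        for (size_t t = 0; t < n; ++t) { score[t] = std::exp(score[t] - maxScore); sum += score[t]; }
        // Context vector = attention-weighted sum of encoder states.
        std::vector<double> context(d, 0.0);
        for (size_t t = 0; t < n; ++t)
            for (size_t k = 0; k < d; ++k) context[k] += (score[t] / sum) * encStates[t][k];
        return context;
    }

    int main() {
        // Three encoder hidden states and one decoder state; values are illustrative only.
        std::vector<std::vector<double>> enc = {{1, 0}, {0, 1}, {1, 1}};
        std::vector<double> dec = {0.5, 0.5};
        std::vector<double> ctx = attentionContext(enc, dec);
        printf("context: %.3f %.3f\n", ctx[0], ctx[1]);
        return 0;
    }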
  • Introduction
    This article provides an overview of the Open Pre-trained Transformers (OPT) Library, focusing on its architecture, mechanics, and the role of transformers in language models. It covers the structural elements such as the encoder, decoder, attention mechanisms, and embeddings that are integral to OPT models.
    Understanding OPT
    Transformer Architecture
    The OPT library utilizes the transformer architecture, which has become a standard for building large language models due to its efficiency and performance in handling sequential data.
    Encoder-Decoder Framework
    The transformer model originally employs an encoder-decoder framework, though many modern implementations like OPT focus on specific parts depending on the task.
  • Introduction
    This article provides an in-depth look at the evolution of language models, particularly focusing on the structures, mechanics, and transformer architecture behind GPT (Generative Pre-trained Transformer) and GPT-2. It explores the encoder-decoder framework, attention mechanisms, and embeddings that form the backbone of these models.
    GPT Architecture
    Transformer Architecture
    GPT models utilize a transformer architecture that relies on self-attention mechanisms to process and generate text.
    Decoder-Only Architecture
    Unlike the original transformer, which uses both an encoder and a decoder, GPT models employ a decoder-only setup. This setup is designed for unidirectional text generation, making it highly effective for tasks that require predicting the next word in a sequence.
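    Next-word prediction turns into text generation through a simple loop: score every vocabulary token given the current prefix, append the chosen token (the argmax under greedy decoding), and feed the longer prefix back in. The sketch below shows only that loop; nextTokenLogits is a toy stand-in, not a real GPT model.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Stand-in for a trained decoder-only model: returns a score per vocabulary token
    // given the current prefix. Here it is a made-up rule used only to drive the loop.
    std::vector<double> nextTokenLogits(const std::vector<int>& prefix, int vocabSize) {
        std::vector<double> logits(vocabSize, 0.0);
        logits[(prefix.back() + 1) % vocabSize] = 1.0;  // toy rule: favor the "next" token id
        return logits;
    }

    int main() {
        const int vocabSize = 5, steps = 6;
        std::vector<int> tokens = {0};  // prompt: a single start token
        for (int s = 0; s < steps; ++s) {
            std::vector<double> logits = nextTokenLogits(tokens, vocabSize);
            // Greedy decoding: append the highest-scoring token and continue autoregressively.
            int next = (int)(std::max_element(logits.begin(), logits.end()) - logits.begin());
            tokens.push_back(next);
        }
        for (int t : tokens) printf("%d ", t);
        printf("\n");
        return 0;
    }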
  • Introduction
    This article explores the evolution and key components of Open Source Large Language Models (LLMs). It delves into the structural aspects, mechanics, and the role of transformers in LLMs, with a focus on the encoder, decoder, attention mechanisms, and embeddings.
    The Language Modeling Objective
    Language models are trained to predict the next word in a sequence, a process known as the language modeling objective. This foundational concept underpins the training and functioning of LLMs.
    Structure and Mechanics of LLMs
    Transformers
    Transformers have revolutionized the development of LLMs. They are built on an encoder-decoder architecture that processes sequences of text data.
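    The language modeling objective can be made concrete as an average next-token cross-entropy, -1/T * sum_t log p(w_t | w_<t). The sketch below uses made-up, already-softmaxed probabilities purely for illustration.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Average next-token cross-entropy: -1/T * sum_t log p(target_t | previous tokens).
    // Each row of `probs` is the model's (already normalized) distribution at step t.
    double languageModelingLoss(const std::vector<std::vector<double>>& probs,
                                const std::vector<int>& targets) {
        double loss = 0.0;
        for (size_t t = 0; t < targets.size(); ++t)
            loss += -std::log(probs[t][targets[t]]);
        return loss / (double)targets.size();
    }

    int main() {
        // Three steps over a 4-word vocabulary; probabilities are made up for illustration.
        std::vector<std::vector<double>> probs = {
            {0.7, 0.1, 0.1, 0.1},
            {0.2, 0.5, 0.2, 0.1},
            {0.1, 0.1, 0.1, 0.7},
        };
        std::vector<int> targets = {0, 1, 3};  // the words that actually came next
        printf("loss = %.4f\n", languageModelingLoss(probs, targets));
        // Perplexity is exp(loss).
        return 0;
    }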
  • Projects Courses MIT Deep Learning reference Linear Regression RNN Backpropagation LSTM
  • Tree
    Graph
    Topological Sort
    DAG: directed acyclic graph
    In-degree: the number of edges pointing into a node
    Repeatedly remove the nodes whose in-degree is 0; the order in which they are removed is a topological sort.
    Dijkstra
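    A minimal C++ sketch of that procedure (Kahn's algorithm): compute in-degrees, repeatedly pop vertices with in-degree 0, and the pop order is a topological order. The example graph in main is made up for illustration.

    #include <cstdio>
    #include <queue>
    #include <utility>
    #include <vector>

    // Kahn's algorithm: repeatedly remove vertices whose in-degree is 0;
    // the order of removal is a topological order of the DAG.
    std::vector<int> topoSort(int n, const std::vector<std::pair<int, int>>& edges) {
        std::vector<std::vector<int>> adj(n);
        std::vector<int> indeg(n, 0);
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            ++indeg[e.second];
        }
        std::queue<int> q;
        for (int v = 0; v < n; ++v)
            if (indeg[v] == 0) q.push(v);
        std::vector<int> order;
        while (!q.empty()) {
            int v = q.front(); q.pop();
            order.push_back(v);
            for (int u : adj[v])
                if (--indeg[u] == 0) q.push(u);  // u's last incoming edge was just removed
        }
        return order;  // fewer than n vertices here would mean the graph has a cycle
    }

    int main() {
        // Example DAG (made up): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
        std::vector<std::pair<int, int>> edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        for (int v : topoSort(4, edges)) printf("%d ", v);
        printf("\n");
        return 0;
    }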
  • Deep Learning DL & ML Learning Notes reference Tennis Court Classification Projects reference Handwritten Digit Recognition Project reference
  • Prefix Sum
    Practice Problems: CSES Static Range Sum Queries
    #include <iostream>
    #include <algorithm>
    #include <vector>
    #include <math.h>
    #include <string.h>
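    A sketch of the standard prefix-sum approach to CSES Static Range Sum Queries (not necessarily the note's original solution), assuming the usual input format: n and q, then the n array values, then q queries "a b" asking for the sum of elements a through b (1-indexed). 64-bit sums are used because the totals can exceed the int range.

    #include <iostream>
    #include <vector>

    int main() {
        std::ios::sync_with_stdio(false);
        std::cin.tie(nullptr);

        int n, q;
        std::cin >> n >> q;

        // prefix[i] = x_1 + ... + x_i, so the sum over [a, b] is prefix[b] - prefix[a-1].
        std::vector<long long> prefix(n + 1, 0);
        for (int i = 1; i <= n; ++i) {
            long long x;
            std::cin >> x;
            prefix[i] = prefix[i - 1] + x;
        }
        while (q--) {
            int a, b;
            std::cin >> a >> b;
            std::cout << prefix[b] - prefix[a - 1] << '\n';
        }
        return 0;
    }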
  • Introduction Data Structure Tree and Graph Divide and Conquer Dynamic Programming Problem Solving
  • Leetcode
    Leetcode 50
    Leetcode 704
    Leetcode 1095
    APCS
    Implementation problem - 物品堆疊 (Stacking Items) - APCS - by Peter Wang
    Implementation problem - 棒球遊戲 (Baseball Game) - APCS - by Peter Wang
    Data structure problem - 定時K彈 (Timed K-Bomb) - APCS - by Peter Wang
    Implementation problem - 數字龍捲風 (Number Tornado) - APCS - by Peter Wang
  • Resource
    Course Website
    Syllabus
    2024/1/3 Lecture
    Course materials: slides
    Reading: 2.1 ~ 3
    Database Management Systems (DBMSs)
  • Big O
    Data Structure
    vector
    list
    stack
    queue
    deque
    1670. Design Front Middle Back Queue
    heap
  • Prerequisite: Recursion
    Practice Problems: LeetCode 779, LeetCode 51
    Divide and Conquer
    Practice Problems: LeetCode 50: Pow(x, n) Solution
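    One common recursive approach to LeetCode 779 (K-th Symbol in Grammar), sketched here as an illustration rather than the note's own solution: the k-th symbol of row n is derived from the ceil(k/2)-th symbol of row n-1, unchanged when k is odd and flipped when k is even.

    class Solution {
    public:
        // Row n is built by replacing 0 -> 01 and 1 -> 10 in row n-1, so the k-th symbol
        // of row n comes from the (k+1)/2-th symbol of row n-1: equal to it when k is odd
        // (first child) and flipped when k is even (second child).
        int kthGrammar(int n, int k) {
            if (n == 1) return 0;
            int parent = kthGrammar(n - 1, (k + 1) / 2);
            return (k % 2 == 1) ? parent : 1 - parent;
        }
    };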
  • Classic basic DP
    Introduction
    Practice Problems: LeetCode 70, LeetCode 53, LeetCode 198, LeetCode 279
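    As an illustration of the first listed problem, LeetCode 70 (Climbing Stairs), a bottom-up DP with the recurrence dp[i] = dp[i-1] + dp[i-2] might look like the following sketch (not necessarily the note's original code).

    class Solution {
    public:
        // dp[i] = number of distinct ways to reach step i, taking 1 or 2 steps at a time.
        // Only the previous two values are needed, so they are kept in two variables.
        int climbStairs(int n) {
            if (n <= 2) return n;
            int prev2 = 1, prev1 = 2;        // ways to reach steps 1 and 2
            for (int i = 3; i <= n; ++i) {
                int cur = prev1 + prev2;     // dp[i] = dp[i-1] + dp[i-2]
                prev2 = prev1;
                prev1 = cur;
            }
            return prev1;
        }
    };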
  • LeetCode 50: Pow(x, n)
    Exponentiation by squaring
    class Solution {
    public:
        double myPow(double x, long long n) {
            if (n == 0 || x == 1) return 1;
            if (n < 0) {
                return 1.0 / myPow(x, -n);
            }
            // Exponentiation by squaring: x^n = (x^(n/2))^2, times x when n is odd.
            double half = myPow(x, n / 2);
            return (n % 2 == 0) ? half * half : half * half * x;
        }
    };