David Ho
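# Creating Tensors with TensorFlow and PyTorch

Before training a model, the data must be preprocessed and packed into variables that the model can accept. TensorFlow and PyTorch have very different rules for what counts as a valid model input.

## Creating an Array with NumPy

Before building variables that satisfy either platform, we can first use NumPy to create a basic one-dimensional array.

```python
import numpy as np

# 100 evenly spaced values from 0 to 99 (both endpoints included)
array = np.linspace(0, 99, 100)
print(array)
# array = [0., 1., 2., ..., 99.]
```

In the example above, NumPy's `linspace` function creates an array with 100 elements. The array's type is `numpy.ndarray`, a commonly used data type.

## Model Input Variables That Meet TensorFlow's Requirements

### Definition

The official TensorFlow documentation [describes](https://www.tensorflow.org/api_docs/python/tf/keras/Model?hl=zh-tw#fit) `.fit()`, the function used to train a model. Its requirements for the input variable are as follows:

```
Input data. It could be:
1. A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs).
2. A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs).
3. A dict mapping input names to the corresponding array/tensors, if the model has named inputs.
4. A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
5. A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).
6. A tf.keras.utils.experimental.DatasetCreator, which wraps a callable that takes a single argument of type tf.distribute.InputContext, and returns a tf.data.Dataset. DatasetCreator should be used when users prefer to specify the per-replica batching and sharding logic for the Dataset. See tf.keras.utils.experimental.DatasetCreator doc for more information.
A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given below. If using tf.distribute.experimental.ParameterServerStrategy, only DatasetCreator type is supported for x.
```

This passage tells us which variable types TensorFlow accepts:

1. An array or an array-like variable, or a list of arrays.
2. A `TensorFlow tensor` object, or a list of such objects.
3. A dictionary. When the model's inputs are named, the dictionary maps each input name to its corresponding data.
4. A `tf.data.Dataset` object that yields tuples of either `(inputs, targets)` or `(inputs, targets, sample_weights)`.
5. A generator object, or an object that inherits from `keras.utils.Sequence`, that returns results in the same tuple format.
6. A `tf.keras.utils.experimental.DatasetCreator` object, which wraps a callable that takes a `tf.distribute.InputContext` argument and returns a `tf.data.Dataset` object.

From the description above, we can see that TensorFlow accepts many different kinds of variables as model input. A `numpy.ndarray`, a `list`, or even a generator you define yourself can all serve as valid model input.

### Creating the Variables

Here we implement the first four kinds of variables listed above; the fifth is covered in the [data pipeline](/86N1lvF2SdaRkcnuuWeJTg) chapter.

1. Creating an array or array-like variable:

    Array-like objects are the simplest kind to create. Python's built-in list and tuple objects, as well as `numpy.ndarray` objects created with NumPy, all qualify, so we will not demonstrate them here.

2. Creating a `TensorFlow tensor` object[^1]:

    A `TensorFlow tensor` object can be created with the `tf.constant` function. A tensor created with `tf.constant` must be rectangular and hold a single data type (e.g. `float32` or `float64`): every entry along a given dimension must contain the same number of elements (for example, every row of a 2-D tensor has the same length). Here is an example:

    ```python
    import tensorflow as tf
    import numpy as np

    # Construct a scalar (i.e. 0-rank tensor)
    scalar = tf.constant(4)

    # Construct a 2-D tensor
    tensor = tf.constant([[0, 1, 2],
                          [1, 1, 2]])
    ```

    A tensor can also be converted from a `numpy.ndarray`:

    ```python
    # Construct an array by numpy
    array = np.linspace(0, 100, 100)
    tensor = tf.constant(array)
    ```

    When the data is not rectangular (rows of different lengths), a `tf.RaggedTensor` can be used instead; when most of the entries are zero, a `tf.sparse.SparseTensor` stores the data more compactly. Note that a `tf.sparse.SparseTensor` must be created by declaring its shape, the positions of its non-zero elements, and their values. To display one, convert it with the `tf.sparse.to_dense` function.

    ```python
    import tensorflow as tf

    # Construct a ragged list
    ragged_list = [[0, 1, 2, 3],
                   [4, 5],
                   [6, 7, 8],
                   [9]]

    # Construct a tf.RaggedTensor object
    ragged_tensor = tf.ragged.constant(ragged_list)

    # Construct a sparse tensor
    # Sparse tensors store values by index in a memory-efficient manner
    sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                           values=[1, 2],
                                           dense_shape=[3, 4])

    # We can convert sparse tensors to dense
    print(tf.sparse.to_dense(sparse_tensor))
    ```

3. Creating a dictionary object:

    Python's built-in dictionary is exactly the dictionary object TensorFlow accepts. Suppose the model has two named inputs (`input_A`, `input_B`) with corresponding labels (`label_A`, `label_B`). The dictionary passed as `x` maps input names to data; the labels are passed separately as `y`, shown here as a second dictionary under the assumption that the model's outputs are named `label_A` and `label_B`:

    ```python
    # input_A, input_B, label_A and label_B are assumed to be existing arrays or tensors
    inputs = {
        'input_A': input_A,
        'input_B': input_B,
    }
    labels = {
        'label_A': label_A,
        'label_B': label_B,
    }
    ```

4. Creating a `tf.data.Dataset` object[^2]:

    `tf.data.Dataset` is an API for building high-performance input pipelines. It lets you create a data stream from an existing dataset and combine it with preprocessing and other operations to form a more efficient pipeline.

    A `tf.data.Dataset` object can be used in many ways: it can be built from existing variables or objects, or it can load data directly from disk. The most commonly used functions are:

    * `tf.data.Dataset.from_tensor_slices` creates a `tf.data.Dataset` object from existing variables (arrays or lists).
    * `tf.data.TextLineDataset` loads data from text files on disk and creates a `tf.data.Dataset` object.
    * `tf.data.TFRecordDataset` loads data from `.tfrecords` files on disk and creates a `tf.data.Dataset` object.
    * `tf.data.Dataset.list_files` creates a `tf.data.Dataset` object of the files under a given path that match a pattern.
    * `tf.data.Dataset.from_generator` creates a `tf.data.Dataset` object from a user-defined generator.

    Here are examples of building `tf.data.Dataset` objects with these functions:

    ```python
    import tensorflow as tf

    # Construct a tf.data.Dataset from a tensor
    from_slices = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4])

    # Construct a tf.data.Dataset from a set of txt files
    txt_dataset = tf.data.TextLineDataset(["file1.txt", "file2.txt"])

    # Construct a tf.data.Dataset from a set of .tfrecords files
    tfrecord_dataset = tf.data.TFRecordDataset(["file1.tfrecords", "file2.tfrecords"])

    # Create a dataset of all files matching a pattern
    list_dataset = tf.data.Dataset.list_files("/path/*.txt")
    ```

    To create a `tf.data.Dataset` object from a user-defined generator, define the generator first. Here is an example:

    ```python
    import numpy as np
    import tensorflow as tf

    def random_generator():
        # Endless stream of 100-element samples drawn from a standard normal distribution
        while True:
            randn_dist = np.random.randn(100)
            yield randn_dist

    # Newer TensorFlow releases prefer
    # output_signature=tf.TensorSpec(shape=(100,), dtype=tf.float64)
    rand_tf_dataset = tf.data.Dataset.from_generator(random_generator,
                                                     output_types=tf.float64,
                                                     output_shapes=(100,))

    # Output sample: list(rand_tf_dataset.take(1))
    #[<tf.Tensor: shape=(100,), dtype=float64, numpy=
    #array([-0.80286473,  1.19648989,  0.8689962 , -1.25128701, -0.33173589,
    #        0.73953706, -0.77314326, -0.15734528,  1.04179809, -2.10082701,
    #        .......,
    #        1.02326549, -0.53087628, -0.64370571, -1.23626384,  1.0770091 ])>]
    ```

> Editor's note: the example above is already quite similar to the fifth kind of accepted variable described in the "Definition" section of this chapter.

## Model Input Variables That Meet PyTorch's Requirements

### Definition

Compared with TensorFlow, PyTorch has stricter rules for input variables. PyTorch requires that a model's input be a tensor, and that the tensor be moved to the same device as the model before it is passed in. A tensor can be created with the `torch.tensor` function or converted from a `numpy.ndarray`. In PyTorch, the available devices fall into two broad categories: CPU and GPU. Tensors and models are created on the CPU unless a device is specified, so the user is responsible for keeping track of which device each tensor and model lives on; a variable on a different device from the model cannot be fed directly into the model for training.

### Creating the Variables

In PyTorch there are two main ways to create a tensor: (1) converting an existing variable, and (2) using PyTorch's built-in functions.

1. Converting an existing variable into a tensor

    ```python
    import numpy as np
    import torch

    # Construct a numpy array
    origin = np.linspace(0, 100, 100)

    # Convert numpy array to torch tensor
    # Method 1: Use torch.tensor function (copies the data)
    tensor_1 = torch.tensor(origin)

    # Method 2: Use torch.from_numpy function (shares memory with `origin`)
    tensor_2 = torch.from_numpy(origin)
    ```

    In the example above, `tensor_1` and `tensor_2` hold the same values and data type, but they are not fully interchangeable: `torch.tensor` copies the data, whereas `torch.from_numpy` shares memory with the original array, so an in-place change to `origin` also shows up in `tensor_2`. Readers can try this for themselves.

2. Creating tensors with PyTorch's built-in functions

    PyTorch provides many functions[^3] that help users create tensors. Here we introduce the three simplest: `torch.ones`, `torch.zeros`, and `torch.eye`.

    * `torch.ones`: creates a tensor whose elements are all 1.

        ```python
        import torch

        one_tensor = torch.ones(10)
        #one_tensor: tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
        ```

    * `torch.zeros`: creates a tensor whose elements are all 0.

        ```python
        import torch

        zeros_tensor = torch.zeros(10)
        #zeros_tensor: tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
        ```

    * `torch.eye`: creates a two-dimensional tensor whose diagonal elements are 1 and whose off-diagonal elements are 0.

        ```python
        import torch

        identity_tensor = torch.eye(3)
        #identity_tensor: tensor([[1., 0., 0.],
        #                         [0., 1., 0.],
        #                         [0., 0., 1.]])
        ```

## Conclusion

These are some of the ways to create tensors in TensorFlow and PyTorch. For more detailed methods, see the related documentation.

[^1]: [TensorFlow's introductory guide to tensors](https://www.tensorflow.org/guide/tensor)
[^2]: [TensorFlow's tf.data.Dataset documentation page](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)
[^3]: [More PyTorch tensor-related functions](https://pytorch.org/docs/stable/torch.html)

###### tags: `Machine Learning` `Notebook` `技術隨筆` `機器學習` `Python` `TensorFlow` `PyTorch`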
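
As a supplement to the PyTorch section, the requirement that a tensor live on the same device as the model can be sketched as follows. This is only a minimal illustration, not part of the original note: the `torch.nn.Linear` layer is a hypothetical stand-in for a real model, and the names `device`, `model`, and `batch` are chosen here for the example.

```python
import torch

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny stand-in model (hypothetical, for illustration only), moved to the chosen device.
model = torch.nn.Linear(in_features=4, out_features=2).to(device)

# Tensors are created on the CPU by default, so move the batch explicitly.
batch = torch.ones(3, 4).to(device)

# Both the model and the batch are now on the same device, so the forward pass works.
output = model(batch)
print(output.shape)
print(output.device)
```

If the `.to(device)` call on `batch` were skipped while the model sat on a GPU, the forward pass would raise a device-mismatch error, which is the situation the section warns about.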
