Jiayong
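---
disqus: hackmd
---

Introduction to NoSQL Databases <br> WEEK_1 - Introducing NoSQL
====

###### tags: `IBM Data Engineering Professional Certificate`,`Reading Note`,`Coursera`,`Introduction to NoSQL Databases`

### Overview
>* Explains the differences between RDBMS and NoSQL and how to choose between them in different scenarios.
>* Explores the features and characteristics of NoSQL in depth.
>* Explains the differences between the ACID and BASE models and their performance advantages.

<br>

## Basics of NoSQL
### 1. Overview of NoSQL
* What is NoSQL?
    * NoSQL stands for "Not only SQL".
    * Its storage approach and technology differ from those of relational databases.
        * Non-relational
        * No formal rows and columns
    * It provides new ways to store and retrieve data.
    * It is well suited to handling big data.
    * Applications are easier to develop than with relational databases.
* History of NoSQL <br>![](https://i.imgur.com/HM8n3fO.png =700x)
    * RDBMSs were developed in response to data storage needs.
    * As data volumes exploded, internet companies had to store massive amounts of data and subsequently published white papers on easily scalable NoSQL technologies.
    * Open-source NoSQL technologies were developed one after another.
    * Cloud providers launched managed NoSQL services.

### 2. Characteristics of NoSQL Databases
* NoSQL Database Categories
    * Key-Value
    * Document
    * Column
    * Graph
* NoSQL Database Characteristics
    * They have their own open-source communities.
        * Most NoSQL databases are available as open source.
    * Open source serves as the basis of their business.
        * Most NoSQL vendors offer both a commercial edition and an open-source edition.
    * Each vendor has its own unique technology, but some traits are shared, such as:
        * Horizontal scaling.
        * Easier data sharing than with an RDBMS.
        * A unique key used for data sharding.
        * More development use cases than an RDBMS.
        * Easier development.
        * A good fit for agile development.
* Benefits of NoSQL Databases
    * Scalability <br>![](https://i.imgur.com/GcuVYs8.png =500x)
        * Scales horizontally from a single server to server clusters, server racks, and ultimately data centers.
    * Performance <br>![](https://i.imgur.com/hefF9Hz.png =400x)
        * Fast response times
        * High concurrency
    * Availability
        * A database cluster with multiple data replicas is more resilient than a single database.
    * Cloud Architecture
        * Deploying a database cluster in the cloud costs less and performs better than a traditional deployment.
    * Flexible Schema
        * A flexible, intuitive data model makes development lighter for developers.
        * With NoSQL's flexible schema, deploying a new application requires no downtime and no database locking.
    * Varied Data Structures
        * Fast data lookup with Key-Value.
        * Document storage.
        * Graph databases for related data.
    * Specialized Capabilities
        * Specialized indexing and querying features
        * Robust data replication
        * Modern HTTP API support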
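The idea of using a unique key for data sharding, listed above as a common trait, can be sketched as follows. This is a minimal illustration and not how any particular NoSQL product does it: `shard_for` and `ShardedStore` are hypothetical names, and each "shard" is just an in-memory dictionary standing in for a server.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a unique key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy horizontally scaled store: each dict stands in for one server."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=3)
for user_id in ("user:1", "user:2", "user:3", "user:4"):
    store.put(user_id, {"id": user_id})

print(store.get("user:1"))
print([len(shard) for shard in store.shards])  # how the keys were spread
```

Plain modulo placement moves most keys when the number of shards changes, which is why real systems often use schemes such as consistent hashing instead.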
### 3. NoSQL Database Categories - Key-Value
* Key-Value NoSQL Database Architecture <br>![](https://i.imgur.com/R69jfo8.png =300x)
    * Least complex
    * Represented as hashmap
    * Ideal for basic CRUD operations
    * Scale well
    * Shard easily
    * Not intended for complex queries
    * Atomic for single key operations only
    * Value blobs are opaque to database
        * Less flexible data indexing and querying
* Key-Value NoSQL Database Use Cases
    * Suitable Use Cases
        * Fast CRUD operations on non-interconnected data, such as:
            * User profiles or preference settings on a website.
            * Shopping cart data.
    * Unsuitable Use Cases
        * Data with many-to-many relationships, such as:
            * Social networks
            * Recommendation engines
        * Cases where data consistency must still be maintained during concurrent operations.
            * A database that provides ACID transactions is a better fit.
        * Apps that run queries based on the value as well as the key.
            * These are requests for document data.
* Key-Value NoSQL Database Examples
    * AWS DynamoDB
    * Oracle NoSQL Database
    * Redis
    * Riak
    * Memcached

### 4. NoSQL Database Categories - Document
* Document-Based NoSQL Database Architecture
    * Values are visible and can be queried.
    * Each piece of data is stored as a document.
        * The format is usually JSON or XML.
    * Each document offers a flexible schema.
    * Data can be searched and accessed by key and by value.
    * Data can be accessed with MapReduce.
    * Scales horizontally.
        * Sharded data can be stored across nodes.
* Document-Based NoSQL Database Use Cases
    * Suitable Use Cases
        * Event logging for apps and processes
            * Each event instance is represented by a new document.
        * Online blogs
            * Each user, post, comment, like, or action is represented by a document.
        * Operational datasets and metadata for web and mobile apps
            * Designed with the Internet in mind (JSON, RESTful APIs, unstructured data)
    * Unsuitable Use Cases
        * Scenarios that require ACID transactions
            * Document databases generally cannot handle transactions that span multiple documents.
            * Relational databases are better suited to tasks that need ACID transactions.
        * Data being forced into an aggregate-oriented design
            * The data needs to be normalized.
* Document-Based NoSQL Database Examples
    * IBM Cloudant
    * MongoDB
    * CouchDB
    * Terrastore
    * Couchbase
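The contrast between the two categories above (opaque value blobs in a key-value store versus visible, queryable values in a document store) can be sketched with plain Python structures. This is an illustrative model only; `kv_get`, `find`, and the sample records are assumptions made for the example and do not correspond to the API of any product listed.

```python
import json

# Key-value model: the store sees only a key and an opaque blob of bytes.
kv_store = {}
kv_store["cart:42"] = json.dumps({"user": "alice", "items": ["book", "pen"]}).encode("utf-8")


def kv_get(key):
    """Fetch by key only; the store cannot look inside the value."""
    return kv_store.get(key)


# Document model: each record is a structured document, and fields may differ
# from one document to the next (flexible schema).
documents = [
    {"_id": 1, "type": "post", "author": "alice", "title": "Hello"},
    {"_id": 2, "type": "comment", "author": "bob", "post_id": 1},
    {"_id": 3, "type": "post", "author": "bob", "title": "NoSQL notes", "tags": ["db"]},
]


def find(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == wanted for field, wanted in criteria.items())
    ]


cart_blob = kv_get("cart:42")  # bytes; the application must decode them itself
posts_by_bob = find(documents, type="post", author="bob")
print(type(cart_blob).__name__, [doc["_id"] for doc in posts_by_bob])
```

Finding "all carts that contain a pen" in the key-value model would mean fetching and decoding every value in the application, which is the kind of query the notes describe as a poor fit for that category.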
### 5. NoSQL Database Categories - Column
* Column-Based NoSQL Database Architecture
    * Derived from Google Bigtable.
    * Also known as Bigtable clones.
    * Data is stored in columns.
    * Column 'families' are several rows, with unique keys, belonging to one or more columns
        * Grouped in families as often accessed together
    * Rows in a column family are not required to share the same columns
        * Can share all, a subset, or none
        * Columns can be added to any number of rows, or not
* Suitable Use Cases
    * Large amounts of sparse data.
    * Column databases are well suited to being distributed across nodes.
    * Storing event logs and blogs.
    * Counters are a unique use case for column databases.
    * Columns can have a TTL parameter, making them useful for data with an expiration value.
* Unsuitable Use Cases
    * ACID transactions
        * Reads and writes are atomic only at the row level.
    * During early development, columns may need to be added or removed, which can increase cost and lengthen development time.
* Column-Based NoSQL Database Examples
    * Cassandra
    * Apache HBase

### 6. NoSQL Database Categories - Graph
* Graph NoSQL Database Architecture
    * Graph databases store data in nodes and the relationships between nodes in edges.
    * Distributing the data across multiple servers is difficult, and doing so hurts database performance.
    * Graph databases are suitable for ACID transactions.
* Suitable Use Cases
    * Highly related data.
    * Social networks.
    * Routing, spatial, and map apps
    * Recommendation engines
* Unsuitable Use Cases
    * Horizontal scaling
    * Updating all of the data, or the data in a subset of nodes.
* Graph NoSQL Database Examples
    * Neo4j
    * AWS Neptune

<br>

## Working with Distributed Data
### 1. ACID vs BASE
* ACID vs BASE consistency models <br>![](https://i.imgur.com/gy6dzwr.png =500x)
* ACID definition <br>![](https://i.imgur.com/sqOaEKM.png =500x)
    * Atomic: a transaction is indivisible
    * Consistent: consistency
    * Isolated: isolation
    * Durable: durability
* ACID consistency model
    * Used in relational databases.
    * Ensures the consistency of data transactions.
    * Mostly used in:
        * Financial systems
        * Data warehousing
* BASE definition <br>![](https://i.imgur.com/tfC5leO.png =500x)
    * Basically Available: the service stays basically available
    * Soft state: state may be out of sync for a period of time
    * Eventually consistent: although out of sync for a while, the end result converges to a consistent state
* BASE consistency model
    * Lower requirements for data consistency, real-time updates, and accuracy.
    * Flexible and scalable.
    * Mostly used by:
        * E-commerce companies
        * Social network platforms
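The "soft state" and "eventually consistent" parts of BASE can be sketched with a toy store whose replicas lag behind the primary. This is a simplified model under stated assumptions: `EventuallyConsistentStore` is a hypothetical class, replication is triggered by an explicit `sync()` call rather than a background process, and there are no failures or conflicting writes.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: writes hit the primary first and reach replicas later."""

    def __init__(self, num_replicas: int = 2) -> None:
        self.primary = {}
        self.replicas = [dict() for _ in range(num_replicas)]
        self.pending = deque()  # writes not yet applied to the replicas

    def write(self, key, value) -> None:
        # Acknowledge immediately (basically available); replicate later.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, index: int, key):
        # May return stale data while replication is pending (soft state).
        return self.replicas[index].get(key)

    def sync(self) -> None:
        # Apply pending writes in order so the replicas converge.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value


store = EventuallyConsistentStore()
store.write("likes:post1", 10)
stale = store.read_replica(0, "likes:post1")  # None: not replicated yet
store.sync()
fresh = store.read_replica(0, "likes:post1")  # 10: replicas have caught up
print(stale, fresh)
```

An ACID system would instead make the write visible everywhere, or nowhere, before acknowledging it, which is the trade-off the two models describe.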
### 2. Distributed Databases
* Concepts of distributed systems
    * Distributed database
        * Data is spread across different database servers according to its characteristics, and the servers are connected over a network.
        * The database is distributed across regions.
    * Fragmentation and replication
        * Follows the BASE model.
* Fragmentation <br>![](https://i.imgur.com/tFAjNhN.png =500x) <br>![](https://i.imgur.com/idFPVrM.png =500x) <br>![](https://i.imgur.com/iumYgpJ.png =500x)
    * Data is split into fragments that are stored across the databases.
* Replication <br>![](https://i.imgur.com/ZAFgi5o.png =500x) <br>![](https://i.imgur.com/NqERHLU.png =500x) <br>![](https://i.imgur.com/vuiH1Pd.png =500x) <br>![](https://i.imgur.com/X48fMzW.png =500x)
* Advantages of distributed systems
    * High reliability and resilience.
    * Improved performance.
    * Shorter data access times.
    * The database is easy to upgrade and scale out.

### 3. The CAP Theorem
* CAP Theorem <br>![](https://i.imgur.com/iy1qNw9.png =500x)
* Partition Tolerance
    * Partition
        * a lost or temporarily delayed connection between nodes.
    * Partition tolerance
        * the cluster must work despite network issues
    * Distributed systems cannot avoid partitions and must be partition tolerant.
    * Partition tolerance
        * basic feature of NoSQL
* NoSQL: CP or AP <br>![](https://i.imgur.com/BhYSsty.png =400x)

### 4. Challenges in Migrating from RDBMS to NoSQL Databases
* RDBMS or NoSQL <br>![](https://i.imgur.com/JKJ7oVr.png =500x)
* RDBMS to NoSQL: a mindset change
    * Data driven model to Query driven data model
        * RDBMS:
            * Starts from the data integrity, relations between entities.
        * NoSQL:
            * Starts from your queries, not from your data. Models based on the way the application interacts with the data.
    * Normalized to Denormalized data
        * NoSQL:
            * Think how data can be structured based on your queries.
        * RDBMS:
            * Start from your normalized data and then build the queries.
    * From ACID to BASE model
        * Availability vs Consistency
        * CAP Theorem
            * choose between availability and consistency
        * Availability, performance, geographical presence, high data volumes
    * NoSQL systems, by design, do not support transactions and joins (except in limited cases)

## Summary and Highlights
* The course provides a complete summary of the content, so it is recorded here.
* Basics of NoSQL

>* NoSQL means Not only SQL.
>* NoSQL databases have their roots in the open source community.
>* NoSQL database implementations are technically different from each other.
>* There are several benefits of adopting NoSQL databases including storing and retrieving session information, and event logging for apps.
>* The four main categories of NoSQL database are Key-Value, Document, Wide Column, and Graph.
>* Key-Value NoSQL databases are the least complex architecturally.
>* Document-based NoSQL databases use documents to make values visible for queries.
>* In document-based NoSQL databases, each piece of data is considered a document, which is typically stored in either JSON or XML format.
>* Column-based databases spawned from the architecture of Google’s Bigtable storage system.
>* The primary use cases for column-based NoSQL databases are event logging and blogs, counters, and data with expiration values.
>* Graph databases store information in entities (or nodes) and relationships (or edges).

* Working with Distributed Data

>* ACID stands for Atomicity, Consistency, Isolation, Durability.
>* BASE stands for Basic Availability, Soft-state, Eventual Consistency.
>* ACID and BASE are the consistency models used in relational and NoSQL databases.
>* Distributed databases are physically distributed across data sites by fragmenting and replicating the data.
>* Fragmentation enables an organization to store a large piece of data across all the servers of a distributed system by breaking the data into smaller pieces.
>* You can use the CAP Theorem to classify NoSQL databases.
>* Partition Tolerance is a basic feature of NoSQL databases.
>* NoSQL systems are not a de facto replacement of RDBMS.
>* RDBMS and NoSQL cater to different use cases, which means that your solution could use both RDBMS and NoSQL.
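The shift from normalized data to a denormalized, query-driven model, noted under "Challenges in Migrating from RDBMS to NoSQL Databases", can be sketched as follows. The `users` and `orders` records and the function names are made up for illustration; the sketch only shows the modeling idea, not a migration procedure.

```python
# Normalized (RDBMS-style): entities live in separate "tables" and are
# combined at query time, much like a join.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders = [
    {"order_id": 100, "user_id": 1, "item": "book"},
    {"order_id": 101, "user_id": 1, "item": "pen"},
    {"order_id": 102, "user_id": 2, "item": "lamp"},
]


def orders_for_user_normalized(user_id):
    """Start from the normalized data, then build the query on top of it."""
    user = users[user_id]
    items = [order["item"] for order in orders if order["user_id"] == user_id]
    return {"name": user["name"], "orders": items}


# Denormalized (NoSQL-style): start from the query "show a user with their
# orders" and store one document per user already shaped like the answer.
def build_user_documents():
    return {
        user_id: {
            "name": user["name"],
            "orders": [o["item"] for o in orders if o["user_id"] == user_id],
        }
        for user_id, user in users.items()
    }


user_docs = build_user_documents()


def orders_for_user_denormalized(user_id):
    """A single lookup by key; no join is needed at read time."""
    return user_docs[user_id]


print(orders_for_user_normalized(1))
print(orders_for_user_denormalized(1))
```

The denormalized form trades extra storage and more complicated updates for simpler and faster reads, which is consistent with the note that NoSQL systems generally avoid joins.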
