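# Programming Languages Ch1

## Reasons for Studying Concepts of Programming Languages
1. The languages a programmer knows limit the control structures, data structures, and abstractions they can use; learning a wider range of languages reduces these limitations.
2. Some language constructs can be simulated in languages that do not support them, but the simulation is usually
   - less elegant
   - more cumbersome
   - less safe
3. New languages become quicker to learn.
4. It gives a better understanding of how code runs and how bugs arise.

---

## Programming Domains
Each domain emphasizes different things.

:::info
### Scientific applications
1. Large numbers of floating-point computations
2. Simple data structures
3. Ex: Fortran
:::

:::info
### Business applications
1. Producing elaborate reports
2. Storing data
3. Decimal arithmetic
4. Ex: COBOL
:::

:::info
### Artificial intelligence
1. Manipulating symbols, consisting of names
2. Ex: LISP
:::

:::info
### Systems programming
1. Systems software includes
   - the operating system (including kernel, shell commands, GUI, and so on)
   - programming support tools (compilers, interpreters, libraries, and so on)
2. Systems software is used almost continuously, so it must be efficient.
3. A language for this domain needs
   - fast execution
   - low-level features, so that software interfaces to external devices can be written
4. Ex: C. UNIX is written almost entirely in C.
:::

:::info
### Web Software
1. markup: XHTML
2. scripting: PHP
3. general-purpose: Java
:::

---

## Language Evaluation Criteria

### Readability: how easily programs can be read and understood

:::warning
Maintenance is a major part of the software life cycle. Code that is easy to understand is easier to maintain, which lowers cost.
:::

:::success
### Simplicity
1. A manageable set of features and constructs
   When the author of a program uses constructs or syntax that the reader does not know, a readability problem arises.
2. Minimal feature multiplicity (having more than one way to express the same operation)
   For example, the different ways to add 1 to a variable in Java:
   ![image](https://hackmd.io/_uploads/Hys7flxJ0.png)
3. Minimal operator overloading (a single symbol with more than one meaning)
   Overloading can reduce the number of symbols a language needs, but it may hurt readability if users define their own overloadings carelessly.
4. Excessive simplicity
   Too much simplicity also hurts readability. Assembly language, for example, is very simple but hard to understand: it lacks constructs for expressing complex operations, so the structure of a program is hard to recognize.
:::

:::success
### Control statements
A common control statement: ```while```
Control statements let a program be read from top to bottom. When you trace the execution order, you do not have to jump from one statement to another nonadjacent statement.
:::

#### Data Types and Structures

:::warning
**Ex:** If a language has no Boolean type, it may have to use numbers to represent true and false:
![image](https://hackmd.io/_uploads/H1Lr8exJ0.png)
With a Boolean type, the meaning is clearer and harder to misread:
![image](https://hackmd.io/_uploads/H1POLgxJA.png)
:::

:::success
### Syntax Considerations
1. **Identifier forms: flexible composition**
   Identifiers that are too short hurt readability.
2. **Special words**
   Special words shape the appearance of a program, so they strongly affect its readability.
3. **Methods of forming compound statements**
   C uses braces to delimit compound statements.
4. **Form and meaning**
   Self-descriptive constructs, meaningful keywords
:::

### Writability: how easily the language can be used to create programs

:::warning
**The characteristics that affect a language's readability usually also affect its writability.**
:::

:::success
### Support for Abstraction
1. **The ability to define and use complex structures or operations**
   **Abstraction** presents something complex in a simplified form, so that its users do not need to understand the details behind it.
2. **Programming languages can support two distinct categories of abstraction:**
   - Process, **Ex:** using a subprogram to perform a sort
   - Data
:::

:::warning
### Data Abstraction
Data abstraction encapsulates data together with the operations on it. We can still manipulate the data, but the implementation behind it is hidden.
**Ex: A binary tree**
- Fortran 77: use integer arrays to implement it
- C++ and Java: use a class with two pointers (or references) and an integer
:::

:::success
### Expressivity
A language offers convenient ways to specify certain computations.
**Ex:** a counting loop is easier to write with ```for``` than with ```while```
:::

### Reliability: whether a program behaves according to its specification

:::warning
A program is said to be reliable if it performs to its specifications under all conditions.
:::

:::success
### Type Checking
1. Testing for type errors in a given program, either by the compiler or during program execution.
   - Run-time type checking is expensive.
   - Compile-time type checking is preferable.
   - The earlier an error is found, the better.
:::

:::success
### Exception Handling
Intercept run-time errors -> take corrective measures -> continue execution
:::

:::success
### Aliasing
The same memory location can be accessed through different names.
1. **Aliasing** is now widely regarded as a dangerous feature, because a change made through one name may go unnoticed by code that uses another.
2. Most languages allow some aliasing. **Ex:** two different pointers can point to the same address.
:::

:::success
### Readability and Writability
A language that does not support “natural” ways of expressing an algorithm will necessarily use “unnatural” approaches, and hence reliability is reduced.
A “natural” expression is one that is close to how we would ordinarily state the idea.
:::

### Cost: the total cost in the end

:::warning
- Training programmers to use the language
- Writing programs in the language
- Compiling programs
- Executing programs
- Language implementation system: availability of free compilers
- Reliability: poor reliability leads to high costs
- Maintaining programs
:::

### Others

:::success
### Portability
The ease with which programs can be moved from one implementation to another
:::

:::success
### Generality
Applicability to a wide range of applications
:::

:::success
### Well-definedness
The completeness and precision of the language's definition
:::

---

## Influences on Language Design

:::info
### Computer Architecture
Languages are developed around the prevalent computer architecture. **Ex:** the *von Neumann* architecture
:::

:::info
### Programming Methodologies
New programming methodologies (such as object-oriented programming) lead to new programming languages.
:::

---

## *Von Neumann*

:::warning
Most of the popular languages of the past 50 years were designed around the *von Neumann* architecture. They are called **imperative** languages.
:::

:::spoiler About imperative languages
1. Data and programs are stored in the same memory
2. Memory is separate from CPU
3. Instructions and data are transmitted from memory to CPU
4. Results of operations in the CPU must be moved back to memory
5. **Ex:** C, Java, Python, and C++
:::

:::info
### Motherboard
![圖片1](https://hackmd.io/_uploads/ryTh5zgJC.jpg)
:::

### The von Neumann Architecture
![image](https://hackmd.io/_uploads/HynkjfxJR.png)

### Main features of imperative languages

:::success
### Variables model memory cells
### Assignment statements model piping
:::

==Iteration executes quickly because instructions are stored in adjacent memory cells, so repeating a section of code only needs a simple branch instruction==

### How a program executes on a von Neumann computer
1. A machine code program executes on a von Neumann computer in a process called the **fetch-execute cycle**.
2. Each instruction to be executed must be moved from memory to the processor.
3. The address of the next instruction to be executed is kept in the program counter.

:::info
### Fetch-execute-cycle (on a von Neumann Architecture)
![image](https://hackmd.io/_uploads/Sy_eZQeJR.png)
:::

### Functional Language Programs Executed on a Von Neumann Machine
Computation in a functional language is done mainly by applying functions to given parameters.

:::info
### Programming in a functional language can be done without
1. the kind of variables used in imperative languages
2. assignment statements
3. iteration (recursion is used instead)
:::

Functional languages have many advantages, but they are unlikely to displace imperative languages until a non-von Neumann computer is designed that executes functional programs efficiently.

## Evolution of Programming Methodologies

### 1950s and early 1960s:
- Simple applications
- Concern about machine efficiency

### 1970s:
- Hardware costs fell
- Programmer costs rose
- Computers were used to solve more complex problems
- Structured programming
- Top-down design and stepwise refinement
- Type checking was still incomplete

### Late 1970s:
- A shift from procedure-oriented to data-oriented design
- Emphasis on data design, using abstract data types to solve problems
- Most language designs began to support data abstraction

### Middle 1980s: Object-oriented programming
- Data abstraction
  - encapsulates data objects
  - controls access to data
- Inheritance
  - increases the reuse of existing software and improves software productivity
- Dynamic method binding
  - allows more flexible use of inheritance
  - overloaded methods
  - overridden methods

---

## Language Categories

### Imperative

:::info
1. variables
2. assignment
3. iteration
### Ex: C, Pascal
:::

:::success
### Subcategories of Imperative Languages
### Visual Languages
1. Visual BASIC and Visual BASIC .NET
2. They usually provide a development environment with drag-and-drop generation of code segments, which makes them easy to pick up
3. Once called fourth-generation languages
4. Provide a simple way to build graphical user interfaces

![image](https://hackmd.io/_uploads/HJL8a4W1C.png)
### Scripting Languages
1. Ex: Perl, JavaScript, Ruby
2. They execute tasks within a special environment, using an interpreter rather than a compiler
3. Such environments include software applications, web pages, and even embedded systems in operating system shells and games.
4. They are usually short and interpreted from source code or bytecode.
:::

### Functional

:::info
Computation is done by applying functions to given parameters
### Ex: LISP, Scheme
:::

### Logic

:::info
A program consists of a set of declarations (rules) rather than assignment statements and control flow
### Ex: Prolog
:::

### Object-oriented

:::info
1. Data abstraction
2. Inheritance
3. Late binding
### Ex: Java, C++
:::

---

## How Programs Run

:::info
### In an imperative language
1. The algorithm is specified in detail
2. The specific order of execution of the instructions or statements must be included
:::

:::info
### In a rule-based language
1. Rules are specified in no particular order
2. The language implementation system must choose the order of execution
:::

---

## Markup/Programming Hybrid Languages
Markup languages are **not programming languages**. They are mainly used to specify the layout of information in Web documents.
**Ex: XHTML, XML**

---

## Benefits of Modular Design
1. Small modules are simpler and quicker to write
2. Modules can be reused, leading to faster development of subsequent programs.
3. Modules can be tested independently, which reduces debugging time

---

## Language Design Trade-offs

:::warning
The language evaluation criteria provide a framework for language design, but the criteria conflict with one another.
:::

:::info
### Reliability vs. cost of execution
**Java** checks every reference to an array element to make sure the index is within its legal range, which increases execution cost.
:::

:::info
### Readability vs. writability
**APL** provides many powerful operators, so a lot of computation can be written in a compact program, at the cost of poor readability.
:::

:::info
### Writability (flexibility) vs. reliability
**C++** pointers are powerful and flexible, but they are not very reliable.
:::

---

## Main Components of a Computer
### Internal Memory
Stores data and programs
### Processor
A collection of circuits that implements a set of primitive operations, or machine instructions, such as arithmetic and logic operations

---

## Machine Language of a Computer
1. The only language the computer can understand directly
2. Provides the most commonly needed primitive operations
3. High-level languages need system software (language implementation systems) to be translated into the corresponding machine language

---

## Operating System

:::warning
Provides higher-level primitives than the machine language
**Ex:**
1. System resource management
2. I/O operations
3. A file management system
4. Text or program editors
5. A variety of commonly used functions
:::

### Language Implementation Systems
Language implementation systems need many operating system facilities, so they interface with the OS to get their work done rather than using code that talks directly to the hardware.

---

## Implementation Methods
1. Compilation
   - Programs are translated into machine language, which can then be executed directly on the computer
   - Slow translation, fast execution
2. Pure Interpretation
   - Programs are interpreted by another program (the **interpreter**)
3. Hybrid Implementation Systems
   - A compromise between compilers and interpreters
   - Java: ```.java``` files are compiled into ```.class``` files (Java bytecode), which are then executed on the JVM

---

## Compilation

:::info
### Lexical analysis
Converts the characters of the source program into lexical units. Lexical units include identifiers (variable names, function names), special words, operators, and punctuation symbols.
### Syntax analysis
Transforms the lexical units into a parse tree, a tree structure used to check whether the syntax is correct.
### Intermediate code generation
Translates the source program into intermediate code. Intermediate code is lower-level than the source but more abstract than machine language, which makes later analysis and transformation easier.
### Semantic analysis
Performs static analysis of the source program, checking for errors such as type errors and undefined variables that are difficult to detect during syntax analysis.
### Code generation
Finally, machine code is generated.
:::

:::info
### Optimization
1. Optimization is usually done on the intermediate code, making the program smaller, faster, or both.
2. Some compilers cannot perform any significant optimization.
3. Optimization may
   - omit some of your code
   - change the order in which your code executes
   - introduce bugs that cannot be found by inspecting the source code, especially when processes must be synchronized

     ![image](https://hackmd.io/_uploads/rkqD9iGyA.png)
   - If a is in memory shared with another process, problems can arise, because the value of a may be changed by that process
   - Declaring a as a volatile variable avoids the problems caused by this kind of optimization
:::

:::info
### Symbol Table
1. Serves as a database for the compilation process.
2. It stores the type and attribute information of each user-defined name in the program
3. This information is placed in the table by the lexical and syntax analyzers and is used by the semantic analyzer and the code generator
:::

![image](https://hackmd.io/_uploads/Bkz86sfy0.png)

---

## User Program Supporting Code
Although machine language can be executed directly by the hardware, a user program still depends on other code to run, and most of those functions come from the OS.

---

## Linking Operation
1. Before machine code can be executed, it must be linked with the OS functions it needs
2. The linking operation connects the user program to the system functions by placing the addresses of the entry points of the system functions in the calls to them in the user program
   ![image](https://hackmd.io/_uploads/SJ0Rbnfy0.png)
3. Load module
   - the user and system code together

   ![image](https://hackmd.io/_uploads/r1yt4hzJA.png)
4. Linking and loading (linking)
   - collecting system functions and linking them to user programs
   - Accomplished by a *systems program* called a **linker**

---

## Libraries
Besides *systems programs*, user functions that reside in ==libraries== also need to be linked.
- The linker not only links a given program to system functions, it may also link it to other user functions

---

## Von Neumann Bottleneck
Program instructions can usually be executed much faster than the speed of the connection between memory and the processor. That connection therefore becomes a bottleneck and lowers performance.

---

## Interpreter
An interpreter is a software simulation of a machine whose fetch-execute cycle deals with high-level language statements rather than machine instructions.
This software simulation obviously provides a **virtual machine** for the language

:::success
### Advantage
Run-time errors can refer directly to the source code.
**Ex:** if an array index is out of range, the error message can easily indicate the source line
:::

:::success
### Disadvantage
1. Much slower than compiled programs (10 to 100 times slower)
   - Decoding high-level language statements is more complex than decoding machine instructions
   - Every statement must be decoded every time it is executed, no matter how many times it has run before
   - **statement decoding**, rather than the connection between the processor and memory, is the bottleneck of a pure interpreter.
2. Requires more space
   - In addition to the source program, the symbol table must be present during interpretation
   - The source program is stored in a form designed for easy access and modification rather than minimum size
:::

:::success
### Popularity
1. 1960s: used for some simple languages: APL, SNOBOL, and LISP
2. By the 1980s: rarely used for high-level languages
3. In recent years: interpretation has made a strong comeback with Web scripting languages (JavaScript, PHP)
:::

![image](https://hackmd.io/_uploads/rJwuU7myC.png)

---

## Hybrid Implementation Systems
1. A compromise between compilers and interpreters
2. A high-level language program is translated into intermediate code that is easy to interpret
3. Faster than pure interpretation
4. Ex: **Perl** programs
   - They are compiled before interpretation to detect errors, which also simplifies the interpreter
5. Ex: Initial implementations of **Java**
   - Java is first translated into byte code, which can run on any machine that has a byte code interpreter and the Java class library
6. Java bytecode example

![image](https://hackmd.io/_uploads/rk5f1gVyA.png)

---

## Java Virtual Machine
1. The JVM is a virtual computing machine, computing machine ≡ computer
2. Three notions of the JVM
   - specification
   - implementation
   - instance: an instance of the JVM is a process that executes a computer program compiled into Java bytecode.

### Java Runtime Environment
1. Oracle owns the Java trademark
2. Oracle distributes the HotSpot implementation of the Java Virtual Machine together with an implementation of the Java class library.
3. The JVM and the Java class library together are named the Java Runtime Environment (JRE)

### Java Class Library
1. The Java Class Library (JCL) is a set of dynamically loadable libraries that Java applications can call at run time
2. Because the Java platform does not depend on any specific operating system, applications cannot rely on platform-native libraries
3. Instead, the Java platform provides a comprehensive set of standard class libraries containing the functions commonly found in modern operating systems.

### Hybrid Implementation Process
![image](https://hackmd.io/_uploads/BJbiOl4yA.png)

### Just-in-Time (JIT) Implementation Systems
1. Programs are first translated into intermediate code, which is then compiled into machine code just in time and executed
2. The machine code of frequently executed code is kept in memory to speed up later runs
3. JIT systems are widely used for Java programs
4. .NET languages are implemented with a JIT system

---

## Preprocessors
Before a program is compiled, a preprocessor processes it first (e.g. C preprocessor: gcc -E)

### Preprocessor Instructions
1. Commonly used to specify that the code from another file is to be included ```#include "myLib.c"```
2. Another use is to define symbols ```#define max(A, B) ((A) > (B) ? (A): (B))```

---

## Programming Environments
1. A collection of tools used in software development, which may include
   - a file system
   - a text editor
   - a linker
   - a compiler
2. Or it may be a large collection of integrated tools, accessed through a uniform user interface

### Example
1. UNIX
2. Borland JBuilder
   - An integrated development environment for Java
3. Microsoft Visual Studio .NET
   - Used to program in C#, Visual BASIC .NET, JScript, J#, or C++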
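
---

## Sketch: the fetch-execute cycle as a tiny interpreter
The notes above describe the fetch-execute cycle (the program counter holds the address of the next instruction, and each instruction is moved from memory to the processor) and describe an interpreter as a software simulation of that cycle. The following is a minimal sketch of that idea in Python, not any real machine: the instruction names (```LOAD```, ```STORE```, ```ADDI```, ```ADD```, ```JLT```, ```HALT```), the single accumulator, and the names ```run``` and ```PROGRAM``` are all hypothetical and chosen only for illustration.

```python
def run(program, memory):
    """Simulate a fetch-execute cycle; return the number of instructions executed."""
    pc = 0        # program counter: index of the next instruction
    acc = 0       # a single accumulator register inside the "processor"
    executed = 0
    while True:
        instruction = program[pc]   # fetch the instruction the counter points to
        pc += 1                     # increment the program counter
        op, *args = instruction     # decode the instruction
        executed += 1
        if op == "LOAD":            # memory -> processor
            acc = memory[args[0]]
        elif op == "STORE":         # processor -> memory
            memory[args[0]] = acc
        elif op == "ADDI":          # add a constant to the accumulator
            acc += args[0]
        elif op == "ADD":           # add a memory cell to the accumulator
            acc += memory[args[0]]
        elif op == "JLT":           # branch to args[1] if acc < memory[args[0]]
            if acc < memory[args[0]]:
                pc = args[1]
        elif op == "HALT":
            return executed
        else:
            raise ValueError(f"unknown instruction: {op}")


# A counting loop: sum = 1 + 2 + ... + n, repeated with one simple branch.
PROGRAM = [
    ("LOAD", "i"),
    ("ADDI", 1),
    ("STORE", "i"),      # i = i + 1
    ("LOAD", "sum"),
    ("ADD", "i"),
    ("STORE", "sum"),    # sum = sum + i
    ("LOAD", "i"),
    ("JLT", "n", 0),     # if i < n, branch back to instruction 0
    ("HALT",),
]

memory = {"n": 5, "i": 0, "sum": 0}
executed = run(PROGRAM, memory)
print(memory["sum"], executed)  # prints: 15 41
```

Each variable is a named memory cell, each assignment is a ```LOAD```/```STORE``` pair moving data between memory and the processor, and the loop costs only one branch per iteration. The decoding work done by the ```if```/```elif``` chain on every step also hints at why statement decoding is the bottleneck of a pure interpreter.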