---
tags: HTML4 SPEC
---

# Section 3: On SGML and HTML [link](https://www.w3.org/TR/html401/intro/sgmltut.html)

#### Ref
- https://www.easyatm.com.tw/wiki/SGML

## Outline
- This section of the document introduces SGML and discusses its relationship to HTML.

## 3.1 Introduction to SGML
- SGML stands for Standard Generalized Markup Language.
- SGML is a system for defining markup languages.
- It is an international standard for defining the structure and content of electronic documents.
- Each markup language defined in SGML is called an SGML application. An SGML application is generally characterized by:
    - An SGML declaration.
    - A document type definition (DTD).
    - A specification that describes the semantics to be ascribed to the markup.
    - Document instances containing data (content) and markup.

## 3.2 SGML constructs used in HTML
### 3.2.1 Element
- Definition:
    - ***Element types*** represent structures or desired behavior.
    - Each element type declaration generally describes **three parts**:
        1. a start tag
        2. content
        3. an end tag
    - The HTML DTD indicates for each element type whether the start tag and end tag are required.

> **"Elements are not tags."** Some people refer to elements as tags (e.g., "the P tag"). Remember that the element is one thing, and the tag (be it start or end tag) is another.

### 3.2.2 Attributes
- Elements may have associated properties, called attributes.
- Example: the **id** attribute is set for an **H1** element:

```
<H1 id="section1">
This is an identified heading thanks to the id attribute
</H1>
```

### 3.2.3 Character references
- Character references are **numeric or symbolic names for characters** that may be included in an HTML document.
- They are useful for referring to rarely used characters.
- In the HTML specification, they begin with an ampersand (`&`) and end with a semicolon (`;`).
    - `&lt;` represents the `<` sign.

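
As a small illustration of the character references described above, the sketch below decodes the symbolic form `&lt;` and two numeric forms of the same character. It uses `html.unescape` from Python's standard library, which is only a convenient stand-in here and not part of SGML or the HTML 4 specification. The sample list is an assumption made for the example.

```python
import html

# Three ways to reference the "<" character (code point 60):
# a symbolic name, a decimal number, and a hexadecimal number.
samples = ["&lt;", "&#60;", "&#x3C;"]

# html.unescape replaces each character reference with the character it names.
decoded = [html.unescape(s) for s in samples]

for reference, character in zip(samples, decoded):
    print(f"{reference} -> {character}")
```

All three references should decode to the same single character, which shows that the symbolic and numeric forms are interchangeable ways of naming one character.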
### 3.2.4 Comments
- HTML comments have the following syntax:

```
<!-- this is a comment -->
<!-- and so is this one,
    which occupies more than one line -->
```

- White space is **not permitted** between the markup declaration open delimiter (`<!`) and the comment open delimiter (`--`), but is **permitted** between the comment close delimiter (`--`) and the markup declaration close delimiter (`>`).
- Text inside a comment has no special meaning (e.g., character references are not interpreted as such).

## 3.3 How to read the HTML DTD
### 3.3.1 DTD Comments
- In the DTD, comments are delimited by a pair of `--` marks:

```
<!ELEMENT PARAM - O EMPTY       -- named property value -->
```

### 3.3.2 Parameter entity definitions
- A parameter entity definition begins with the keyword `<!ENTITY %`, followed by the entity name, the quoted string the entity expands to, and finally a closing `>`.
- Example: `<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">`
- Elsewhere in the DTD, the entity is referenced as `%fontstyle;` and expands to the quoted string.

### 3.3.3 Element declarations
- The `<!ELEMENT` keyword begins a declaration and the `>` character ends it.

```
<!ELEMENT ... >
```

- Between these are specified:
    1. The element's **name**.
    2. Whether the element's **tags** are optional.
        - `- -` means that both the start tag and the end tag are required.
        - `- O` means that the end tag can be omitted.
        - `O O` means that both the start and end tags can be omitted.
    3. The element's **content**, if any.
        - The allowed content for an element is called its ***[content model](#Content-model-definitions)***. The content model for empty element types is declared using the keyword "EMPTY".
- Example:

```
<!ELEMENT UL - - (LI)+>
```

1. The element name is UL.
2. Both the start tag `<UL>` and the end tag `</UL>` are required.
3. The content model is declared to be "at least one LI element".

#### Content model definitions
- The content model describes **what may be contained** by an instance of an element type.

| content model | description |
| -------- | -------- |
| (...) | Delimits a group. |
| A | A must occur, one time only. |
| A+ | A must occur one or more times. |
| A? | A must occur zero or one time. |
| A* | A may occur zero or more times. |
| +(A) | A may occur. |
| -(A) | A must not occur. |
| A , B | Both A and B must occur, in that order. |
| A & B | Both A and B must occur, in any order. |
| A\|B | Either A or B must occur, but not both. |

- Example: `<!ELEMENT DL - - (DT|DD)+>`
    - The DL element must contain one or more DT or DD elements, in any order.
- A few HTML element types use an additional SGML feature to exclude elements from their content model. Excluded elements are preceded by a hyphen, for example: `<!ELEMENT A - - (%inline;)* -(A)>`. Here an A element must not contain another A element.

### 3.3.4 Attribute declarations
- The `<!ATTLIST` keyword begins the declaration of the attributes an element may take, and the `>` character ends it.

```
<!ATTLIST ... >
```

- Each attribute definition specifies:
    1. The name of the attribute.
    2. The type of the attribute's value, or an explicit set of possible values.
    3. Whether the default value of the attribute
        - is implicit (keyword "#IMPLIED"),
        - is always required (keyword "#REQUIRED"),
        - or is fixed to the given value (keyword "#FIXED").
- Example: the name attribute of the MAP element has type CDATA and is optional.

```
<!ATTLIST MAP name CDATA #IMPLIED>
```

- Some attribute definitions state an explicit default value instead. Here rowspan defaults to 1:

```
rowspan NUMBER 1 -- number of rows spanned by cell --
```

#### Boolean attributes
- Some attributes act as boolean variables (e.g., the selected attribute of the OPTION element). Their appearance in the start tag of an element implies that the value of the attribute is "true". Their absence implies a value of "false".
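
The presence/absence rule for boolean attributes can be sketched with Python's standard `html.parser` module. This is a rough illustration, not an SGML parser: the class `BooleanAttrCollector`, the helper `option_selected_flags`, and the sample OPTION markup are all hypothetical names and data chosen for this example.

```python
from html.parser import HTMLParser


class BooleanAttrCollector(HTMLParser):
    """Record, for every start tag, the set of attribute names it carries."""

    def __init__(self):
        super().__init__()
        self.seen = []  # list of (tag name, set of attribute names)

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names; an attribute written
        # without a value is reported with the value None.
        self.seen.append((tag, {name for name, _value in attrs}))


def option_selected_flags(markup):
    """Return one boolean per OPTION start tag: True if 'selected' appears."""
    parser = BooleanAttrCollector()
    parser.feed(markup)
    parser.close()
    return ["selected" in names for tag, names in parser.seen if tag == "option"]


# Hypothetical sample: the first OPTION carries the attribute, the second does not.
flags = option_selected_flags("<OPTION selected>one</OPTION><OPTION>two</OPTION>")
print(flags)
```

Only the presence of the attribute name is checked, never its value, which mirrors the rule that appearance means "true" and absence means "false".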