---
tags: HTML4 SPEC
---
# Section 3: On SGML and HTML
[link](https://www.w3.org/TR/html401/intro/sgmltut.html)
#### Ref
- https://www.easyatm.com.tw/wiki/SGML
## Outline
- This section introduces SGML and discusses its relationship to HTML.
## 3.1 Introduction to SGML
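- SGML stands for Standard Generalized Markup Language.
- SGML is a system for defining markup languages.
- It is an international standard for defining the structure and content of electronic documents.
- Each markup language defined in SGML is called an SGML application. An SGML application is generally characterized by:
    - An SGML declaration.
    - A document type definition (DTD).
    - A specification that describes the semantics to be ascribed to the markup.
    - Document instances containing data (content) and markup.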
## 3.2 SGML constructs used in HTML
### 3.2.1 Element
- Definition: a DTD declares ***element types***, which represent structures or desired behavior.
- Each element type declaration generally describes **three parts**:
1. a start tag
2. content
3. an end tag.
- The HTML DTD indicates for each element type whether the start tag and end tag are required.
> **"Elements are not tags."** Some people refer to elements as tags (e.g., "the P tag"). Remember that the element is one thing, and the tag (be it start or end tag) is another.
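
The distinction between an element and its tags can be made concrete with a small sketch. It uses Python's standard `html.parser` module, which follows modern HTML parsing rules rather than SGML, so it is only an illustration; the class name `TagLogger` is made up for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each start tag, run of content, and end tag as a separate event."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start tag", tag))

    def handle_endtag(self, tag):
        self.events.append(("end tag", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("content", data.strip()))


logger = TagLogger()
logger.feed("<P>Hello</P>")
logger.close()

# One P element is reported as three parts: start tag, content, end tag.
# The parser lowercases tag names.
print(logger.events)
```

The single P element shows up as three events, which is why "the P tag" is an imprecise way to refer to the element.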
### 3.2.2 Attributes
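- Elements may have associated properties, called attributes, which may have values. Attribute/value pairs appear before the final `>` of an element's start tag.
- Example: the **id** attribute is set for an **H1** element: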
```
<H1 id="section1">
This is an identified heading thanks to the id attribute
</H1>
```
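
As a rough illustration of how attribute/value pairs are read from a start tag, the sketch below again uses Python's standard `html.parser`. The class name `AttributeCollector` is hypothetical, and the parser lowercases element and attribute names.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect the attributes seen on each start tag, keyed by element name."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs taken from the start tag.
        self.found[tag] = dict(attrs)


collector = AttributeCollector()
collector.feed('<H1 id="section1">This is an identified heading</H1>')
collector.close()

print(collector.found)
```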
### 3.2.3 Character references
- Character references are **numeric or symbolic names for characters** that may be included in an HTML document.
- They are useful for referring to rarely used characters.
- They begin with a "&" sign and end with a semicolon (;).
- `&lt;` represents the < sign.
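
A quick way to see symbolic and numeric references side by side is Python's standard `html` module. This is only a sketch: `html.unescape` implements modern HTML rules, not the HTML 4 DTD, but the three spellings below are equivalent under both.

```python
import html

# Three references to the same character: symbolic, decimal, and hexadecimal.
named = html.unescape("&lt;")
decimal = html.unescape("&#60;")
hexadecimal = html.unescape("&#x3C;")
print(named, decimal, hexadecimal)

# Going the other way: replace the character with its symbolic reference.
escaped = html.escape("a < b")
print(escaped)
```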
### 3.2.4 Comments
- HTML comments have the following syntax:
```
<!-- this is a comment -->
<!-- and so is this one,
which occupies more than one line -->
```
- White space is **not permitted** between the markup declaration open delimiter ("`<!`") and the comment open delimiter ("`--`"), but is **permitted** between the comment close delimiter ("`--`") and the markup declaration close delimiter ("`>`").
- Information that appears inside a comment has no special meaning (e.g., character references are not interpreted as such).
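
The point that character references are not interpreted inside comments can be checked with a short sketch. It relies on Python's standard `html.parser`, which follows modern HTML rules rather than SGML; the class name `CommentCollector` is invented for this example.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Keep ordinary text and comment text in separate lists."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.comments = []

    def handle_data(self, data):
        self.text.append(data)

    def handle_comment(self, data):
        self.comments.append(data)


collector = CommentCollector()
collector.feed("<P>1 &lt; 2<!-- 1 &lt; 2 --></P>")
collector.close()

print(collector.text)      # the reference is decoded in ordinary content
print(collector.comments)  # the same characters are left untouched in the comment
```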
## 3.3 How to read the HTML DTD
### 3.3.1 DTD Comments
- In the DTD, comments are delimited by a pair of `--` marks:
```
<!ELEMENT PARAM - O EMPTY -- named property value -->
```
### 3.3.2 Parameter entity definitions
- A parameter entity definition begins with the keyword `<!ENTITY %`, followed by the entity name, the quoted string the entity expands to, and finally a closing `>`.
- Example: `<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">`
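
Parameter entities behave like macros: a reference of the form `%name;` is replaced by the quoted string. The sketch below mimics a single expansion pass. The `expand` helper and the `entities` table are hypothetical, and only `fontstyle` comes from the example above.

```python
import re

# Hypothetical entity table; only "fontstyle" is taken from the notes.
entities = {"fontstyle": "TT | I | B | BIG | SMALL"}


def expand(text, table):
    """Replace each %name; reference with its replacement string (one pass).

    References to names missing from the table are left unchanged.
    """
    return re.sub(
        r"%([A-Za-z][A-Za-z0-9.-]*);",
        lambda m: table.get(m.group(1), m.group(0)),
        text,
    )


expanded = expand("(%fontstyle;)", entities)
untouched = expand("(%inline;)*", entities)
print(expanded)
print(untouched)
```

A real SGML parser expands nested references repeatedly; this sketch does one pass only.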
### 3.3.3 Element declarations
- The `<!ELEMENT` keyword begins a declaration and the `>` character ends it.
```
<!ELEMENT ... >
```
- Between these are specified:
1. The element's **name**.
2. Whether the element's **tags** are optional.
    - `- -` means that both the start tag and the end tag are required.
    - `- O` means that the end tag can be omitted.
    - `O O` means that both the start and end tags can be omitted.
3. The element's **content**, if any.
    - The allowed content for an element is called its ***[content model](#Content-model-definitions)***. The content model for empty element types is declared using the keyword `EMPTY`.
- Example:
```
<!ELEMENT UL - - (LI)+>
```
1. element name: UL
2. both the start tag `<UL>` and the end tag `</UL>` are required.
3. The content model is declared to be "at least one LI element".
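
The three parts of an element declaration can be pulled apart mechanically. The following is a minimal sketch, assuming simple one-line declarations like the ones in these notes; `parse_element` and its regular expressions are illustrative and are not a full SGML parser.

```python
import re

ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\S+)\s+([-O])\s+([-O])\s+(.*?)\s*>", re.S)


def parse_element(declaration):
    """Split an element declaration into name, tag requirements, and content model."""
    # Drop DTD comments, which are delimited by a pair of "--" marks.
    without_comments = re.sub(r"--.*?--", "", declaration, flags=re.S)
    match = ELEMENT_RE.fullmatch(without_comments.strip())
    if match is None:
        raise ValueError("not a simple element declaration: " + declaration)
    name, start, end, content = match.groups()
    return {
        "name": name,
        "start_tag_required": start == "-",
        "end_tag_required": end == "-",
        "content": content,
    }


ul = parse_element("<!ELEMENT UL - - (LI)+>")
param = parse_element("<!ELEMENT PARAM - O EMPTY -- named property value -->")
print(ul)
print(param)
```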
#### Content model definitions
- The content model describes **what may be contained** by an instance of an element type.
| Content model | Description |
| -------- | -------- |
| ( ... ) | Delimits a group. |
| A | A must occur, one time only. |
| A+ | A must occur one or more times. |
| A? | A must occur zero or one time. |
| A* | A may occur zero or more times. |
| +(A) | A may occur. |
| -(A) | A must not occur. |
| A , B | Both A and B must occur, in that order. |
| A & B | Both A and B must occur, in any order. |
| A \| B | Either A or B must occur, but not both. |
- Example:
`<!ELEMENT DL - - (DT|DD)+>`
- The DL element must contain one or more DT or DD elements in any order.
- A few HTML element types use an additional SGML feature to exclude elements from their content model. Excluded elements are preceded by a hyphen. For example, the following declaration says that an A element must not contain other A elements:
`<!ELEMENT A - - (%inline;)* -(A)>`
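
Content models read much like regular expressions over child element names, so a simple model can be checked with Python's `re` module. In this sketch the patterns are translated by hand from `(DT|DD)+` and `(LI)+`; the `children_match` helper is hypothetical, and inclusions and exclusions such as `-(A)` are not handled.

```python
import re


def children_match(children, pattern):
    """Return True if the sequence of child element names fits the pattern.

    Each name is written with a trailing space so the regex sees whole names.
    """
    encoded = "".join(name + " " for name in children)
    return re.fullmatch(pattern, encoded) is not None


# Hand-translated patterns (an assumption of this sketch, not generated from the DTD).
DL_MODEL = r"(?:DT |DD )+"  # (DT|DD)+
UL_MODEL = r"(?:LI )+"      # (LI)+

print(children_match(["DT", "DD", "DD"], DL_MODEL))  # True
print(children_match([], DL_MODEL))                  # False
print(children_match(["LI", "DT"], UL_MODEL))        # False
```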
### 3.3.4 Attribute declarations
```
<!ATTLIST ... >
```
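- The `<!ATTLIST` keyword begins the declaration of the attributes that an element may take. It is followed by the element's name, a list of attribute definitions, and a closing `>`. Each attribute definition is a triplet that defines:
1. The name of an attribute.
2. The type of the attribute's value or an explicit set of possible values.
3. Whether the default value of the attribute is:
    - implicit (keyword `#IMPLIED`), in which case the default value must be supplied by the user agent;
    - always required (keyword `#REQUIRED`);
    - or fixed to the given value (keyword `#FIXED`).
- Some attribute definitions explicitly specify a default value instead of one of these keywords.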
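- Examples: the first declaration makes `name` an optional CDATA attribute of MAP; the second line, taken from a longer declaration, gives `rowspan` the type NUMBER and an explicit default value of 1.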
```
<!ATTLIST MAP name CDATA #IMPLIED>
```
```
rowspan NUMBER 1 -- number of rows spanned by cell --
```
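
How a declared default interacts with the value an author actually writes can be sketched as a small lookup. The `ATTRIBUTES` table and `effective_value` function are hypothetical stand-ins written by hand from the two examples above. For simplicity the table ignores which element each attribute belongs to.

```python
# Hand-written stand-in for the two attribute definitions shown above.
ATTRIBUTES = {
    "name": {"type": "CDATA", "default": "#IMPLIED"},
    "rowspan": {"type": "NUMBER", "default": "1"},
}


def effective_value(attribute, given):
    """Return the author's value, else the declared default, else None."""
    if attribute in given:
        return given[attribute]
    default = ATTRIBUTES[attribute]["default"]
    if default == "#REQUIRED":
        raise ValueError(attribute + " must be specified")
    if default == "#IMPLIED":
        # No value in the DTD: the user agent decides what to do.
        return None
    return default


print(effective_value("rowspan", {}))                # declared default
print(effective_value("rowspan", {"rowspan": "3"}))  # author's value wins
print(effective_value("name", {}))                   # implied, so None here
```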
#### Boolean attributes
- Some attributes play the role of boolean variables. Their appearance in the start tag of an element implies that the value of the attribute is "true". Their absence implies a value of "false".
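
The presence-means-true rule can be tried out with Python's standard `html.parser`, which reports a value-less attribute as a `(name, None)` pair. The example attribute, `selected` on OPTION, is taken from the HTML 4.01 specification rather than from these notes, and the class name `BooleanAttributeReader` is made up.

```python
from html.parser import HTMLParser


class BooleanAttributeReader(HTMLParser):
    """Record, for each OPTION start tag, whether 'selected' is present."""

    def __init__(self):
        super().__init__()
        self.selected = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            # Presence of the attribute means true, absence means false.
            self.selected.append("selected" in dict(attrs))


reader = BooleanAttributeReader()
reader.feed("<OPTION selected>one</OPTION><OPTION>two</OPTION>")
reader.close()

print(reader.selected)  # [True, False]
```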