---
tags: HTML4 SPEC
---
# Section 7: The global structure of an HTML document
[link](https://www.w3.org/TR/html401/struct/global.html)
## 7.1 The structure of an HTML document
- An HTML 4 document is composed of three parts:
1. a line containing HTML **version** information
2. a declarative **header** section (`<head>`)
3. a **body** (`<body>`), which contains the document's actual content
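A minimal sketch of the three parts in one document. The title and paragraph text are placeholders, and the comments only mark where each part begins:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<!-- 1. The line above is the HTML version information -->
<HTML>
<!-- 2. Declarative header section -->
<HEAD>
   <TITLE>My first HTML document</TITLE>
</HEAD>
<!-- 3. Body with the document's actual content -->
<BODY>
   <P>Hello world!
</BODY>
</HTML>
```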
## 7.2 HTML version information
- HTML 4.01 specifies **three** DTDs, so authors must include one of the following document type declarations in their documents.
1. The HTML 4.01 **Strict DTD**
- includes all elements and attributes that have not been deprecated or do not appear in frameset documents.
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
```
2. The HTML 4.01 **Transitional DTD**
- the strict DTD + deprecated elements and attributes.
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
```
3. The HTML 4.01 **Frameset DTD**
- transitional DTD + frames
```
<html>
<head>
...
</head>
<frameset>
...
</frameset>
</html>
```
#### Note: the `<frame>` and `<frameset>` elements are now deprecated. [[ref]](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/frame)
- If a frameset document is still needed, declare it with:
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
```
## 7.3 The `HTML` element
```
<!ENTITY % html.content "HEAD, BODY">
<!ELEMENT HTML O O (%html.content;) -- document root element -->
<!ATTLIST HTML
%i18n; -- lang, dir -->
```
- Start tag: optional, End tag: optional
- Attribute definitions:
- **lang** (language information)
- **dir** (text direction)
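Because both tags of `HTML` are optional (and, as the later sections note, so are those of `HEAD` and `BODY`), a document can omit them and still be a complete HTML 4.01 document. A minimal sketch, with placeholder title and text:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<TITLE>Omitted tags sketch</TITLE>
<P>The HTML, HEAD and BODY elements still exist here; the parser infers them.
```

To set **lang** or **dir** on the root element, the start tag has to be written out, for example `<HTML lang="en">`.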
## 7.4 The document head
### The `HEAD` element
```
<!-- %head.misc; defined earlier on as "SCRIPT|STYLE|META|LINK|OBJECT" -->
<!ENTITY % head.content "TITLE & BASE?">
<!ELEMENT HEAD O O (%head.content;) +(%head.misc;) -- document head -->
<!ATTLIST HEAD
%i18n; -- lang, dir --
profile %URI; #IMPLIED -- named dictionary of meta info --
>
```
- Start tag: optional, End tag: optional
- Attribute definitions:
- **profile** = *uri* [CT]
- specifies the location of one or more meta data profiles, separated by white space.
- Attributes defined elsewhere
- **lang** (language information)
- **dir** (text direction)
- The HEAD element contains **information about the current document**, such as its title, keywords that may be useful to search engines, and other data that is not considered document content.
### The `TITLE` element
```
<!-- The TITLE element is not considered part of the flow of text.
It should be displayed, for example as the page header or
window title. Exactly one title is required per document.
-->
<!ELEMENT TITLE - - (#PCDATA) -(%head.misc;) -- document title -->
<!ATTLIST TITLE %i18n>
```
- Start tag: required, End tag: required
- Attributes:
- **lang** (language information)
- **dir** (text direction)
- Every HTML document **must** have a TITLE element in the HEAD section.
- **The content of a page title is very important for search engine optimization (SEO)!** Search engine algorithms use the page title when deciding the order in which to list pages in search results. [[ref]](https://www.w3schools.com/tags/tag_title.asp)
- Authors should use the TITLE element to identify the contents of a document. Since users often consult documents out of context, authors should **provide context-rich titles**: instead of a title such as "Introduction", supply one such as "Introduction to Medieval Bee-Keeping".
- Here is a sample document title:
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>A study of population dynamics</TITLE>
... other head elements...
</HEAD>
<BODY>
... document body...
</BODY>
</HTML>
```
### The title attribute
- Attribute definitions: **title** = *text* [CS]
- [CS] -> case-sensitive
- It specifies extra information about an element.
- The title attribute may annotate **any number of elements**.
- Example:
```
...some text...
Here's a photo of
<A href="http://someplace.com/neatstuff.gif" title="Me scuba diving">
me scuba diving last summer
</A>
...some more text...
```
- The title attribute may be rendered by user agents in a variety of ways. For instance,
- Visual browsers frequently display the title as a "tool tip". [[ref]](https://www.w3schools.com/tags/att_global_title.asp)
- Audio user agents may speak the title information.
- The title attribute has an additional role when used with the LINK element to designate an external style sheet.
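A sketch of that additional role, assuming three hypothetical style sheet files (`base.css`, `default.css`, `large.css`). On a `LINK` to a style sheet, the presence of **title** and the value of **rel** together decide how the sheet is treated:

```html
<HEAD>
<TITLE>Style sheet titles sketch</TITLE>
<!-- No title: a persistent style sheet, always applied -->
<LINK rel="stylesheet" type="text/css" href="base.css">
<!-- Title with rel="stylesheet": the preferred style sheet -->
<LINK rel="stylesheet" type="text/css" href="default.css" title="Default">
<!-- Title with rel="alternate stylesheet": an alternate the user may select -->
<LINK rel="alternate stylesheet" type="text/css" href="large.css" title="Large print">
</HEAD>
```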
### Meta data
#### Specifying meta data
- Involves two steps:
1. **Declaring a property and a value for that property.** This may be done in two ways:
- From within a document, via the `META` element.
- From outside a document, by linking to meta data via the `LINK` element.
2. **Referring to a profile** where the property and its legal values are defined. To designate a profile, use the `profile` attribute of the `HEAD` element:
```
<head profile="https://www.w3resource.com/profiles.html">
```
- Note that since a profile is defined for the HEAD element, the same profile applies to all META and LINK elements in the document head.
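Putting both steps together might look like the sketch below. The profile URI, property names, and the `copyright.html` target are illustrative assumptions, not values defined by any real profile:

```html
<HEAD profile="http://www.example.org/profiles/core">
  <TITLE>How to complete Memorandum cover sheets</TITLE>
  <!-- Step 1a: properties declared inside the document with META -->
  <META name="author" content="John Doe">
  <META name="keywords" content="corporate,guidelines,cataloging">
  <!-- Step 1b: meta data kept outside the document, reached via LINK -->
  <LINK rel="copyright" href="copyright.html">
</HEAD>
```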
#### The `META` element
```
<!ELEMENT META - O EMPTY -- generic metainformation -->
<!ATTLIST META
%i18n; -- lang, dir, for use with content --
http-equiv NAME #IMPLIED -- HTTP response header name --
name NAME #IMPLIED -- metainformation name --
content CDATA #REQUIRED -- associated information --
scheme CDATA #IMPLIED -- select form of content --
>
```
- Start tag: required, End tag: forbidden
- Attribute definitions
For the following attributes, the permitted values and their interpretation are profile dependent:
- **name** = *name* [CS] (can be author-defined)
This attribute identifies a property name.
- **content** = *cdata* [CS]
This attribute specifies a property's value.
- **scheme** = *cdata* [CS]
This attribute names a scheme to be used to interpret the property's value (see the section on profiles for details).
- **http-equiv** = *name* [CI] (a way to keep browser and server information in sync)
This attribute may be used in place of the name attribute. HTTP servers use this attribute to gather information for HTTP response message headers.
- Attributes defined elsewhere
- **lang** (language information)
- **dir** (text direction)
Each META element specifies a property/value pair. The name attribute identifies the property and the content attribute specifies the property's value.
- Example: the following declaration sets a value for the Author property:
```
<META name="Author" content="Dave Raggett">
```
- Example: The **lang** attribute can be used with META to specify the language for the value of the content attribute, which enables speech synthesizers to apply language dependent pronunciation rules.
```
<META name="Author" lang="fr" content="Arnaud Le Hors">
```
##### [Note] The META element is a generic mechanism for specifying meta data. However, some HTML elements and attributes already handle certain pieces of meta data:
- the `TITLE` element
- the `ADDRESS` element
- the `INS` and `DEL` elements
- the **title** attribute
- the **cite** attribute.
##### [Note] When a property specified by a META element takes a value that is a **URI**, some authors prefer to specify the meta data via the LINK element. Thus, the following declaration:
```
<META name="DC.identifier"
content="http://www.ietf.org/rfc/rfc1866.txt">
```
###### might also be written:
```
<LINK rel="DC.identifier"
type="text/plain"
href="http://www.ietf.org/rfc/rfc1866.txt">
```
#### `META` and HTTP headers
- HTTP servers may use the property name specified by the http-equiv attribute to create an [RFC822]-style header in the HTTP response.
- [HTTP Header](https://zh.wikipedia.org/wiki/HTTP%E5%A4%B4%E5%AD%97%E6%AE%B5) -> information other than the content itself
- The following sample `META` declaration:
```
<META http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">
```
will result in the HTTP header:
```
Expires: Tue, 20 Aug 1996 14:25:27 GMT
```
- [http-equiv reference](https://www.w3schools.com/tags/att_meta_http_equiv.asp)
#### `META` and search engines
- A common use for `META` is to specify keywords that a search engine may use to improve the quality of search results.
- When several `META` elements provide language-dependent information about a document, search engines may filter on the lang attribute to display search results using the language preferences of the user. For example,
```
<!-- For speakers of US English -->
<META name="keywords" lang="en-us"
content="vacation, Greece, sunshine">
```
[name attribute reference](https://www.w3schools.com/tags/att_meta_name.asp)
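Filtering on **lang** only helps when the same property is supplied in several languages. A sketch with US English, British English, and French variants of one keyword list (the keyword values are illustrative):

```html
<!-- For speakers of US English -->
<META name="keywords" lang="en-us" content="vacation, Greece, sunshine">
<!-- For speakers of British English -->
<META name="keywords" lang="en" content="holiday, Greece, sunshine">
<!-- For speakers of French -->
<META name="keywords" lang="fr" content="vacances, Gr&egrave;ce, soleil">
```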
#### `META` and PICS
- The Platform for Internet Content Selection (PICS) is an infrastructure for associating labels (meta data) with Internet content.
- It also facilitates other uses for labels, including code signing, privacy, and intellectual property rights management.
```
<HEAD>
<META http-equiv="PICS-Label" content='
(PICS-1.1 "http://www.gcf.org/v2.5"
labels on "1994.11.05T08:15-0500"
until "1995.12.31T23:59-0000"
for "http://w3.org/PICS/Overview.html"
ratings (suds 0.5 density 0 color/hue 1))'>
<TITLE>... document title ...</TITLE>
</HEAD>
```
#### `META` and default information
The `META` element may be used to specify the default information for a document in the following instances:
* The default scripting language.
* The default style sheet language.
* The document character encoding.
The following example specifies the character encoding for a document as being ISO-8859-5:
`<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-5">`
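The other two defaults in the list can be declared in the same way. A sketch, assuming JavaScript and CSS as the chosen languages; the character encoding declaration is placed first so that it is seen as early as possible:

```html
<HEAD>
<!-- Document character encoding -->
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-5">
<TITLE>Document defaults sketch</TITLE>
<!-- Default scripting language -->
<META http-equiv="Content-Script-Type" content="text/javascript">
<!-- Default style sheet language -->
<META http-equiv="Content-Style-Type" content="text/css">
</HEAD>
```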
#### Meta data profiles
- The profile attribute of the HEAD specifies the location of a meta data profile.
- The value of the **profile** attribute is a URI.
- User agents may use this URI in two ways:
1. As a globally unique name.
- For instance, search engines could provide an interface for searching through catalogs of HTML documents, where these documents all use the same profile for representing catalog entries.
2. As a link.
- User agents may perform some activity based on the actual definitions within the profile (e.g., authorize the usage of the profile within the current HTML document)
- The **scheme** attribute allows authors to provide user agents more context for the correct interpretation of meta data.
- At times, such additional information may be critical, as when meta data may be specified in different formats.
- For example, an author might specify a date in the (ambiguous) format "10-9-97"; does this mean 9 October 1997 or 10 September 1997? The scheme attribute value "Month-Day-Year" would disambiguate this date value.
- At other times, the scheme attribute may provide helpful but non-critical information to user agents.
- For example, the following scheme declaration may help a user agent determine that the value of the "identifier" property is an ISBN code number:
- `<META scheme="ISBN" name="identifier" content="0-8230-2355-9">`
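The ambiguous date case could be written along the same lines. In this sketch the property name `date` is an assumption; only the scheme value "Month-Day-Year" comes from the text above:

```html
<!-- scheme tells the user agent to read 10-9-97 as October 9, 1997 -->
<META scheme="Month-Day-Year" name="date" content="10-9-97">
```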
## 7.5 The document body
### The `BODY` element
- The body of a document contains the document's content
```
<!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL) -- document body -->
<!ATTLIST BODY
%attrs; -- %coreattrs, %i18n, %events --
onload %Script; #IMPLIED -- the document has been loaded --
onunload %Script; #IMPLIED -- the document has been removed -->
```
- Start tag: optional, End tag: optional
- Since style sheets are now the preferred way to specify a document's presentation, the presentational attributes of BODY have been deprecated.
- Instead of presentational attributes such as `bgcolor`, `text`, `link`, `vlink` and `alink`, style sheets can set the same colors as follows:
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>A study of population dynamics</TITLE>
<STYLE type="text/css">
BODY { background: white; color: black}
A:link { color: red }
A:visited { color: maroon }
A:active { color: fuchsia }
</STYLE>
</HEAD>
<BODY>
... document body...
</BODY>
</HTML>
```
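For comparison, the deprecated attribute-based form that the style sheet replaces looks roughly like this. It is shown with the Transitional DTD, since the Strict DTD excludes deprecated attributes:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
 <TITLE>A study of population dynamics</TITLE>
</HEAD>
<!-- Deprecated: presentation is mixed into the markup -->
<BODY bgcolor="white" text="black"
  link="red" alink="fuchsia" vlink="maroon">
  ... document body...
</BODY>
</HTML>
```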
- Using external (linked) style sheets gives you the flexibility to change the presentation without revising the source HTML document:
```
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>A study of population dynamics</TITLE>
<LINK rel="stylesheet" type="text/css" href="smartstyle.css">
</HEAD>
<BODY>
... document body...
</BODY>
</HTML>
```
### Element identifiers: the **id** and **class** attributes
**id** = *name* [CS]
- This attribute assigns a name to an element. This name must **be unique in a document**.
```
<P id="myparagraph"> This is a uniquely named paragraph.</P>
```
#### The **id** attribute has several roles in HTML:
* As a style sheet selector.
* As a target anchor for hypertext links.
* As a means to reference a particular element from a script.
* As the name of a declared OBJECT element.
* For general purpose processing by user agents (e.g. for identifying fields when extracting data from HTML pages into a database, translating HTML documents into other formats, etc.).
**class** = *cdata-list* [CS]
- This attribute assigns **a class name** or **set of class names** to an element. Multiple class names must be separated by **white space** characters.
- Any number of elements may be assigned the same class name or names.
#### The **class** attribute has several roles in HTML:
* As a style sheet selector (when an author wishes to assign style information to a set of elements).
* For general purpose processing by user agents.
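A small sketch of both attributes at work. The names `intro`, `special` and `note` are made up for illustration; the style rules show **class** and **id** as selectors, and the link shows **id** as a target anchor:

```html
<HEAD>
<TITLE>id and class sketch</TITLE>
<STYLE type="text/css">
  P.special { color: green }        /* every P with class "special" */
  #intro    { font-style: italic }  /* the single element with id "intro" */
</STYLE>
</HEAD>
<BODY>
<P id="intro" class="special">An introductory paragraph.
<P class="special note">A paragraph carrying two class names.
<P><A href="#intro">Back to the introduction</A>
</BODY>
```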
### Block-level and inline elements
#### - Content model
Generally, block-level elements may contain inline elements and other block-level elements. Generally, inline elements may contain only data and other inline elements. Inherent in this structural distinction is the idea that block elements create "larger" structures than inline elements.
#### - Formatting
Generally, block-level elements begin on new lines, inline elements do not.
#### - Directionality
For technical reasons involving the [UNICODE] bidirectional **text algorithm**, block-level and inline elements differ in how they inherit directionality information.
* Style sheets provide the means to specify the rendering of arbitrary elements, including whether an element is rendered as block or inline. In some cases, such as an inline style for list elements, this may be appropriate, but generally speaking, authors are discouraged from overriding the conventional interpretation of HTML elements in this way.
* The **alteration** of the traditional presentation idioms for block level and inline elements also **has an impact on the bidirectional text algorithm**. See the section on the effect of style sheets on bidirectionality for more information.
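The inline style for list elements mentioned above might be sketched as follows. The class name `inline-nav` and the list items are assumptions; the rule is limited to one list so that other lists keep their conventional block rendering:

```html
<HEAD>
<TITLE>Inline list sketch</TITLE>
<STYLE type="text/css">
  /* Render the items of this one list side by side instead of as blocks */
  UL.inline-nav LI { display: inline }
</STYLE>
</HEAD>
<BODY>
<UL class="inline-nav">
  <LI>Home</LI>
  <LI>Products</LI>
  <LI>Contact</LI>
</UL>
</BODY>
```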
### Grouping elements: the DIV and SPAN elements
```
<!ELEMENT DIV - - (%flow;)* -- generic language/style container -->
<!ATTLIST DIV
%attrs; -- %coreattrs, %i18n, %events -->
<!ELEMENT SPAN - - (%inline;)* -- generic language/style container -->
<!ATTLIST SPAN
%attrs; -- %coreattrs, %i18n, %events -->
```
```
<!ENTITY % flow "%block; | %inline;">
```
- The `DIV` and `SPAN` elements offer a generic mechanism for adding structure to documents.
- These elements define content to be inline (`SPAN`) or block-level (`DIV`) but impose no other presentational idioms on the content.
- Thus, authors may use these elements with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes.
- Visual user agents generally place a line break before and after DIV elements, for instance:
`<P>aaaaaaaaa<DIV>bbbbbbbbb</DIV><DIV>ccccc<P>ccccc</DIV>`
which is typically rendered as:
```
aaaaaaaaa
bbbbbbbbb
ccccc
ccccc
```
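A sketch of `DIV` and `SPAN` used together with **class**, **id**, **lang** and a style sheet. All names and text here are illustrative assumptions:

```html
<HEAD>
<TITLE>DIV and SPAN sketch</TITLE>
<STYLE type="text/css">
  DIV.client        { border: solid thin }  /* block-level grouping */
  SPAN.client-title { font-weight: bold }   /* inline grouping */
</STYLE>
</HEAD>
<BODY>
<DIV class="client" id="client-example">
  <P><SPAN class="client-title">Client information:</SPAN>
  <P>Contact: <SPAN lang="fr">Jean Dupont</SPAN>
</DIV>
</BODY>
```

Neither element adds meaning of its own; the `DIV` groups the two paragraphs as one block, while each `SPAN` marks a run of text inside a paragraph.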
### Headings: The `H1`, `H2`, `H3`, `H4`, `H5`, `H6` elements
```
<!ENTITY % heading "H1|H2|H3|H4|H5|H6">
<!--
There are six levels of headings from H1 (the most important)
to H6 (the least important).
-->
<!ELEMENT (%heading;) - - (%inline;)* -- heading -->
<!ATTLIST (%heading;)
%attrs; -- %coreattrs, %i18n, %events -->
```
- Start tag: required, End tag: required
- A heading element briefly describes the topic of the section it introduces.
- Example: the markup below uses `DIV` elements to group each heading with the section it introduces:
```
<DIV class="section" id="forest-elephants" >
<H1>Forest elephants</H1>
<P>In this section, we discuss the lesser known forest elephants. ...this section continues...
<DIV class="subsection" id="forest-habitat" >
<H2>Habitat</H2>
<P>Forest elephants do not live in trees but among them. ...this subsection continues...
</DIV>
</DIV>
```
- This structure may be decorated with style information such as:
```
<HEAD>
<TITLE>... document title ...</TITLE>
<STYLE type="text/css">
DIV.section { text-align: justify; font-size: 12pt}
DIV.subsection { text-indent: 2em }
H1 { font-style: italic; color: green }
H2 { color: green }
</STYLE>
</HEAD>
```
### The `ADDRESS` element
```
<!ELEMENT ADDRESS - - (%inline;)* -- information on author -->
<!ATTLIST ADDRESS
%attrs; -- %coreattrs, %i18n, %events -->
```
- Start tag: required, End tag: required
- The `ADDRESS` element may be used to supply **contact information** for a document or a major part of a document such as a form. This element often appears at the beginning or end of a document.
- [usage ref](https://www.w3schools.com/tags/tag_address.asp)
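A minimal sketch of `ADDRESS` at the end of a document. The name, e-mail address and date are placeholders:

```html
<ADDRESS>
<A href="mailto:editor@example.org">Jane Doe</A>,
contact person for this document<BR>
Last revised: 24 December 1999
</ADDRESS>
```

The content model is `%inline;`, so the element holds text, links and line breaks but no block-level elements such as `P`.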
### How to find out from the DTD whether an element is block-level or inline
In the DTD (https://www.w3.org/TR/html401/sgml/dtd.html#flow), the `%block;` entity lists the block-level elements:
```
<!--================== HTML content models ===============================-->
<!--
HTML has two basic content models:
%inline; character level elements and text strings
%block; block-like elements e.g. paragraphs and lists
-->
<!ENTITY % block
"P | %heading; | %list; | %preformatted; | DL | DIV | NOSCRIPT |
BLOCKQUOTE | FORM | HR | TABLE | FIELDSET | ADDRESS">
<!ENTITY % flow "%block; | %inline;">
```