The HTML META tag has been part of the language since the days of HTML+, and has experienced only minor changes in updates of the specification. Even in HTML 4.01, the META tag has only six attributes, five of which are optional.
If you just want to know what a META tag looks like (and don't care about the more esoteric points of HTML), you should skip to the section on attributes.
In SGML terms, META is an optional, empty element that must appear in the context of a HEAD element. That means:
No HTML document is required to have a META element. (In fact, the only element required for all HTML documents is TITLE.) On the other hand, I recommend that all HTML documents include at least one of the most common META tags.
The META tag has a start tag, but not an end tag, so there is never a </meta> in HTML. A META tag contains its metadata inside the tag, not between tags. That information is contained in attributes, qualities of a tag identified by name/value pairs, where the name is a keyword (like "http-equiv") and the value is connected to the keyword with an = (equal sign).
META tags always appear in the HEAD element. That doesn't necessarily mean they have to appear between <head> and </head>, since the HEAD element exists in HTML documents even when they don't use HEAD tags. If the boundaries of the HEAD element aren't marked with HEAD tags, the META tag must appear before any of the regular BODY content.
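To illustrate the point about implied HEAD boundaries, here is a minimal sketch of an HTML 4.01 document that omits the optional HEAD and BODY tags. The title and paragraph text are made up for illustration; the META tag still belongs to the HEAD element because it appears before any body content.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- No <head> tag: TITLE and META are still part of the implied HEAD element -->
<title>Example page</title>
<meta name="Description" content="This is my home page!">
<!-- The first piece of body content implicitly closes HEAD and opens BODY -->
<p>Body content starts here, so no META tag may follow this point.</p>
```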
An HTML file may have as many META elements as it needs to encode its metainformation. Technically, it can even have multiple META elements with the same name or http-equiv, but this may cause problems for the user agents that read the metadata.
As mentioned above, the META tag is an empty element with one required attribute (content) and five optional ones (name, http-equiv, lang, dir, and scheme). As such, the simplest legal form of the tag looks like:
<meta content="this is an example">
Although it's syntactically valid, this tag is semantically meaningless, because it doesn't associate the content with a labeling attribute (name or http-equiv). Useful metadata is always a name/value pair. META tags with only one attribute aren't useful (except possibly when used to designate a strictly boolean value, as Cleartype does.)
The three most common (and most important) attributes of the META tag are valid HTML in all versions of HTML since HTML+.
The name attribute provides the "name" portion of the "name/value" metadata pair. It identifies a type of metadata (for example, a document's publication date or abstract), but doesn't assign a value to the name. (That's the job of content.) Most META tags for influencing search engines use NAME. For example:
<meta name="Description" content="This is my home page!">
The HTML specifications give name an SGML data type of NAME, meaning it can only contain letters, numbers, hyphens, underscores, and periods. It must always begin with a letter. Technically, the value of name is always case-sensitive, but no software applications are known to treat it as such.
The http-equiv attribute provides the "name" of a "name/value" pair, just like name does, with one important difference: by using the http-equiv attribute, an author gives permission to the web server holding the document to read and process that META tag, converting it into an HTTP header to be read by the software that requested the file from the server.
The http-equiv attribute is intended for use by authors who need to supplement or override the HTTP headers created by their web servers. In practice, few server administrators allow their users to override server settings that way, so http-equiv is seldom used as intended. Instead, http-equiv has been co-opted by software authors as a means of controlling web browsers. Most uses of the http-equiv attribute activate (or deactivate) special functions in select web browsers. For example:
<meta http-equiv="Refresh" content="13">
Like name, the http-equiv attribute can only contain letters, numbers, hyphens, underscores, and periods, and must always begin with a letter. Officially, http-equiv is case-sensitive, but nobody treats it as such.
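As a rough sketch of the intended server-side use of http-equiv, the tag below pairs a header name with a value, and the comment shows the HTTP response header that a cooperating server could generate from it. The date is an arbitrary illustration, and whether any particular server performs this conversion is an assumption, not a given.

```html
<!-- In the document: -->
<meta http-equiv="Expires" content="Tue, 20 Aug 1996 14:25:27 GMT">

<!-- The equivalent HTTP response header a server could send:
     Expires: Tue, 20 Aug 1996 14:25:27 GMT -->
```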
The content attribute is the only mandatory attribute for META tags, and the most important, because it contains the value of the label created by the name and/or http-equiv attributes. name and http-equiv declare which metadata is important, but content says what the metadata actually is.
The HTML specifications give content a "permitted value" of CDATA, meaning it's case-sensitive and can contain any character expressible in HTML, but characters outside HTML's normal ASCII character set need to be encoded with the same HTML entities that you use in normal BODY text.
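For instance, a content value that includes accented letters, an ampersand, or double quotes can be written with the usual entities. The names and text below are invented for illustration only.

```html
<!-- Accented characters written as named entities -->
<meta name="Author" content="Jos&eacute; Garc&iacute;a">

<!-- Ampersands and double quotes inside a quoted attribute value -->
<meta name="Description" content="Reviews of &quot;classic&quot; film &amp; television">
```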
The other three attributes of the META element are only legal in HTML 4.0 and its derivatives (HTML 4.01, ISO/IEC 15445, and XHTML). They are all optional.
The lang attribute designates the language of the META tag's content value. lang uses the RFC 1766 language codes (for major languages, they're two-letter abbreviations), and is only necessary when a META tag uses a language different from its home document's primary language.
The dir attribute specifies the directionality of the content value. dir has only two possible values: rtl for text meant to be read right-to-left, and ltr for text meant to be read left-to-right. This attribute is very rarely used, because it's only necessary when you need to override the directionality determined by the Unicode bidirectional algorithm and the default directionality you've specified at the document level. If you're writing in English or one of the other left-to-right languages, you probably don't need to use the dir attribute.
(In case you're wondering, HTML authors are supposed to identify the primary language and directionality of their HTML documents by including the lang and dir attributes on the opening tag of the html element itself. Relatively few people do this right now.)
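Putting lang and dir together, a sketch of an English-language page that carries descriptions in two other languages might look like the following. The French and Hebrew strings are illustrative renderings of "This is my home page", and the page title is made up.

```html
<!-- Primary language and directionality declared once, on the html element -->
<html lang="en" dir="ltr">
<head>
  <title>Example page</title>
  <!-- Inherits lang="en" and dir="ltr" from the html element -->
  <meta name="Description" content="This is my home page!">
  <!-- Different language, same directionality: only lang is needed -->
  <meta name="Description" lang="fr" content="Ceci est ma page d'accueil !">
  <!-- Right-to-left language: lang plus an explicit dir override -->
  <meta name="Description" lang="he" dir="rtl" content="זהו דף הבית שלי">
</head>
<body>
  <p>Page content.</p>
</body>
</html>
```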
The scheme attribute is used to identify a metadata scheme used for decoding the value of the content attribute. This attribute is a little tricky, because it can contain a simple self-explanation of how content is encoded, or it can be a label identifying a complete specification.
For example, if I were using the name value "date" to label when this web page was written, and using 2002-05-19 as the value of content, a self-explanatory scheme value might look like this:
<meta name="date" scheme="year-month-date" content="2002-05-19">
On the other hand, a labeling scheme might mention the ISO date standard, which is where I really got the format:
<meta name="date" scheme="ISO 8601" content="2002-05-19">
Using the scheme attribute brings up the whole issue of metadata profiles. Profiles are documents that explain the metadata system used to describe a document. In HTML, the location of a profile document (which may or may not be human-readable) can be defined two ways: by including a profile attribute in the opening tag of the HEAD element, or by using the scheme link relationship. Both methods use the URI (address) of the document to identify it.
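As a sketch of the first method, the profile attribute sits on the opening HEAD tag and holds the URI of the profile document. The URI below is a hypothetical placeholder, not a real profile.

```html
<!-- profile points at a (hypothetical) document describing the metadata in use -->
<head profile="http://www.example.org/profiles/core">
  <title>Example page</title>
  <meta name="date" scheme="ISO 8601" content="2002-05-19">
</head>
```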
In theory, that means every web page should point to another page explaining exactly which name, http-equiv, and scheme values it uses, and what they mean. In practice, only advanced users (more advanced than Websnob even) are doing so, because there are very few coherent metadata schemes available for use. Many sites (including this one) use combinations of multiple schemes, making it difficult to single out one as "the scheme" for a site.
Until this issue is cleared up (by the development of more detailed schemes, with reference URIs), you shouldn't worry about designating profiles. There are currently no software agents that utilize them, anyway.
XHTML (HTML re-implemented in XML) is slowly supplanting traditional HTML on the Web. META exists in all three versions of XHTML currently published (XHTML 1.0, 1.1, and 2.0). There are three major (but easily made) changes to how XHTML formats the META element. These apply to all versions of XHTML:
The first, and most important, is that XHTML elements are case-sensitive and lower-cased. In other words, you have to type meta, not META.
Just as important is the fact that XHTML closes empty elements with /> instead of >. All versions of XHTML close the element this way. The typical Description would look like this in XHTML 1.0:
<meta name="Description" content="This is my home page!"/>
Finally (and not as critical) is the addition of the xml:lang attribute, which identifies the language used in the content attribute. xml:lang uses the same language abbreviations as the lang attribute. META elements in XHTML 1.0 may have both lang and xml:lang attributes.
XHTML 1.1 goes one step further than XHTML 1.0, in that it completely drops the lang attribute in favor of using xml:lang exclusively.
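The difference between the two versions can be sketched with a French description (the text is illustrative). The space before /> is a common compatibility habit for older browsers rather than a requirement of XML.

```html
<!-- XHTML 1.0: lang and xml:lang may appear together, and should carry the same value -->
<meta name="Description" lang="fr" xml:lang="fr" content="Ceci est ma page d'accueil !" />

<!-- XHTML 1.1: xml:lang only -->
<meta name="Description" xml:lang="fr" content="Ceci est ma page d'accueil !" />
```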
XHTML 2.0 (still in development) includes three more major changes to the META element: two attributes (content and http-equiv) have been removed, and the element is no longer empty. Information that previously would have been included in the content attribute is instead included between the two meta tags. That means the typical Description looks like:
<meta name="Description">This is my home page!</meta>
At this point (November 2003), it remains unclear what the removal of the http-equiv attribute will mean for the META tags that use it. In at least one case (http-equiv="Refresh"), a new link relationship has been proposed as a substitute.