author Andrey A. Chernov <ache@FreeBSD.org> 2001-06-11 01:20:40 +0000
committer Andrey A. Chernov <ache@FreeBSD.org> 2001-06-11 01:20:40 +0000
commit f749b200c1ca3b47c8058de0866f0cf6f69859be
tree edd12ebb0622f027051220884a994db606114196 /en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml
parent e8fc3b162e0f24945430f8fe87bb85ae7f895dfc
ISO_* -> ISO* rename
Notes: svn path=/head/; revision=9587
Diffstat (limited to 'en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml')
-rw-r--r-- en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml 1556
1 files changed, 0 insertions, 1556 deletions
diff --git a/en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml b/en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml
deleted file mode 100644
index e6378df431..0000000000
--- a/en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml
+++ /dev/null
@@ -1,1556 +0,0 @@
-<!-- Copyright (c) 1998, 1999 Nik Clayton, All rights reserved.
-
- Redistribution and use in source (SGML DocBook) and 'compiled' forms
- (SGML, HTML, PDF, PostScript, RTF and so forth) with or without
- modification, are permitted provided that the following conditions
- are met:
-
- 1. Redistributions of source code (SGML DocBook) must retain the above
- copyright notice, this list of conditions and the following
- disclaimer as the first lines of this file unmodified.
-
- 2. Redistributions in compiled form (transformed to other DTDs,
- converted to PDF, PostScript, RTF and other formats) must reproduce
- the above copyright notice, this list of conditions and the
- following disclaimer in the documentation and/or other materials
- provided with the distribution.
-
- THIS DOCUMENTATION IS PROVIDED BY NIK CLAYTON "AS IS" AND ANY EXPRESS OR
- IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
- DISCLAIMED. IN NO EVENT SHALL NIK CLAYTON BE LIABLE FOR ANY DIRECT,
- INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
- SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
- STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
- ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE
- POSSIBILITY OF SUCH DAMAGE.
-
- $FreeBSD: doc/en_US.ISO_8859-1/books/fdp-primer/sgml-primer/chapter.sgml,v 1.16 2001/04/09 00:33:47 dd Exp $
--->
-
-<chapter id="sgml-primer">
- <title>SGML Primer</title>
-
- <para>The majority of FDP documentation is written in applications of
- SGML. This chapter explains exactly what that means, how to read
- and understand the source to the documentation, and the sort of SGML
- tricks you will see used in the documentation.</para>
-
- <para>Portions of this section were inspired by Mark Galassi's <ulink
- url="http://nis-www.lanl.gov/~rosalia/mydocs/docbook-intro/docbook-intro.html">Get Going With DocBook</ulink>.</para>
-
- <sect1>
- <title>Overview</title>
-
- <para>Way back when, electronic text was simple to deal with. Admittedly,
- you had to know which character set your document was written in (ASCII,
- EBCDIC, or one of a number of others) but that was about it. Text was
- text, and what you saw really was what you got. No frills, no
- formatting, no intelligence.</para>
-
- <para>Inevitably, this was not enough. Once you have text in a
- machine-usable format, you expect machines to be able to use it and
- manipulate it intelligently. You would like to indicate that certain
- phrases should be emphasised, or added to a glossary, or be hyperlinks.
- You might want filenames to be shown in a &ldquo;typewriter&rdquo; style
- font for viewing on screen, but as &ldquo;italics&rdquo; when printed,
- or any of a myriad of other options for presentation.</para>
-
- <para>It was once hoped that Artificial Intelligence (AI) would make this
- easy. Your computer would read in the document and automatically
- identify key phrases, filenames, text that the reader should type in,
- examples, and more. Unfortunately, real life has not happened quite
- like that, and our computers require some assistance before they can
- meaningfully process our text.</para>
-
- <para>More precisely, they need help identifying what is what. You or I
- can look at
-
- <blockquote>
- <para>To remove <filename>/tmp/foo</filename> use &man.rm.1;.</para>
-
- <screen>&prompt.user; <userinput>rm /tmp/foo</userinput></screen>
- </blockquote>
-
- and easily see which parts are filenames, which are commands to be typed
- in, which parts are references to manual pages, and so on. But the
- computer processing the document cannot. For this we need
- markup.</para>
-
- <para>&ldquo;Markup&rdquo; is commonly used to describe &ldquo;adding
- value&rdquo; or &ldquo;increasing cost&rdquo;. The term takes on both
- these meanings when applied to text. Markup is additional text included
- in the document, distinguished from the document's content in some way,
- so that programs that process the document can read the markup and use
- it when making decisions about the document. Editors can hide the
- markup from the user, so the user is not distracted by it.</para>
-
- <para>The extra information stored in the markup <emphasis>adds
- value</emphasis> to the document. Adding the markup to the document
- must typically be done by a person&mdash;after all, if computers could
- recognise the text sufficiently well to add the markup then there would
- be no need to add it in the first place. This <emphasis>increases the
- cost</emphasis> (i.e., the effort required) to create the
- document.</para>
-
- <para>The previous example is actually represented in this document like
- this;</para>
-
- <programlisting><![ CDATA [
-<para>To remove <filename>/tmp/foo</filename> use &man.rm.1;.</para>
-
-<screen>&prompt.user; <userinput>rm /tmp/foo</userinput></screen>]]></programlisting>
-
- <para>As you can see, the markup is clearly separate from the
- content.</para>
-
- <para>Obviously, if you are going to use markup you need to define what
- your markup means, and how it should be interpreted. You will need a
- markup language that you can follow when marking up your
- documents.</para>
-
- <para>Of course, one markup language might not be enough. A markup
- language for technical documentation has very different requirements
- than a markup language that was to be used for cookery recipes. This,
- in turn, would be very different from a markup language used to describe
- poetry. What you really need is a first language that you use to write
- these other markup languages. A <emphasis>meta markup
- language</emphasis>.</para>
-
- <para>This is exactly what the Standard Generalised Markup Language (SGML)
- is. Many markup languages have been written in SGML, including the two
- most used by the FDP, HTML and DocBook.</para>
-
- <para>Each language definition is more properly called a Document Type
- Definition (DTD). The DTD specifies the name of the elements that can
- be used, what order they appear in (and whether some markup can be used
- inside other markup) and related information. A DTD is sometimes
- referred to as an <emphasis>application</emphasis> of SGML.</para>
-
- <para id="sgml-primer-validating">A DTD is a <emphasis>complete</emphasis>
- specification of all the elements that are allowed to appear, the order
- in which they should appear, which elements are mandatory, which are
- optional, and so forth. This makes it possible to write an SGML
- <emphasis>parser</emphasis> which reads in both the DTD and a document
- which claims to conform to the DTD. The parser can then confirm whether
- or not all the elements required by the DTD are in the document in the
- right order, and whether there are any errors in the markup. This is
- normally referred to as <quote>validating the document</quote>.</para>
-
- <note>
- <para>This processing simply confirms that the choice of elements, their
- ordering, and so on, conforms to that listed in the DTD. It does
- <emphasis>not</emphasis> check that you have used
- <emphasis>appropriate</emphasis> markup for the content. If you were
- to try and mark up all the filenames in your document as function
- names, the parser would not flag this as an error (assuming, of
- course, that your DTD defines elements for filenames and functions,
- and that they are allowed to appear in the same place).</para>
- </note>
-
- <para>It is likely that most of your contributions to the Documentation
- Project will consist of content marked up in either HTML or DocBook,
- rather than alterations to the DTDs. For this reason this book will
- not touch on how to write a DTD.</para>
- </sect1>
-
- <sect1 id="sgml-primer-elements">
- <title>Elements, tags, and attributes</title>
-
- <para>All the DTDs written in SGML share certain characteristics. This is
- hardly surprising, as the philosophy behind SGML will inevitably show
- through. One of the most obvious manifestations of this philosophy is
- that of <emphasis>content</emphasis> and
- <emphasis>elements</emphasis>.</para>
-
- <para>Your documentation (whether it is a single web page, or a lengthy
- book) is considered to consist of content. This content is then divided
- (and further subdivided) into elements. The purpose of adding markup is
- to name and identify the boundaries of these elements for further
- processing.</para>
-
- <para>For example, consider a typical book. At the very top level, the
- book is itself an element. This &ldquo;book&rdquo; element obviously
- contains chapters, which can be considered to be elements in their own
- right. Each chapter will contain more elements, such as paragraphs,
- quotations, and footnotes. Each paragraph might contain further
- elements, identifying content that was direct speech, or the name of a
- character in the story.</para>
-
- <para>You might like to think of this as &ldquo;chunking&rdquo; content.
- At the very top level you have one chunk, the book. Look a little
- deeper, and you have more chunks, the individual chapters. These are
- chunked further into paragraphs, footnotes, character names, and so
- on.</para>
-
- <para>Notice how you can make this differentiation between different
- elements of the content without resorting to any SGML terms. It really
- is surprisingly straightforward. You could do this with a highlighter
- pen and a printout of the book, using different colours to indicate
- different chunks of content.</para>
-
- <para>Of course, we do not have an electronic highlighter pen, so we need
- some other way of indicating which element each piece of content belongs
- to. In languages written in SGML (HTML, DocBook, et al) this is done by
- means of <emphasis>tags</emphasis>.</para>
-
- <para>A tag is used to identify where a particular element starts, and
- where the element ends. <emphasis>The tag is not part of the element
- itself</emphasis>. Because each DTD was normally written to mark up
- specific types of information, each one will recognise different
- elements, and will therefore have different names for the tags.</para>
-
- <para>For an element called <replaceable>element-name</replaceable> the
- start tag will normally look like
- <literal>&lt;<replaceable>element-name</replaceable>&gt;</literal>. The
- corresponding closing tag for this element is
- <literal>&lt;/<replaceable>element-name</replaceable>&gt;</literal>.</para>
-
- <example>
- <title>Using an element (start and end tags)</title>
-
- <para>HTML has an element for indicating that the content enclosed by
- the element is a paragraph, called <literal>p</literal>. This
- element has both start and end tags.</para>
-
- <programlisting><![ CDATA [<p>This is a paragraph. It starts with the start tag for
- the 'p' element, and it will end with the end tag for the 'p'
- element.</p>
-
-<p>This is another paragraph. But this one is much shorter.</p>]]></programlisting>
- </example>
-
- <para>Not all elements require an end tag. Some elements have no content.
- For example, in HTML you can indicate that you want a horizontal line to
- appear in the document. Obviously, this line has no content, so just
- the start tag is required for this element.</para>
-
- <example>
- <title>Using an element (start tag only)</title>
-
- <para>HTML has an element for indicating a horizontal rule, called
- <literal>hr</literal>. This element does not wrap content, so only
- has a start tag.</para>
-
- <programlisting><![ CDATA [<p>This is a paragraph.</p>
-
-<hr>
-
-<p>This is another paragraph. A horizontal rule separates this
- from the previous paragraph.</p>]]></programlisting>
- </example>
-
- <para>If it is not obvious by now, elements can contain other elements.
- In the book example earlier, the book element contained all the chapter
- elements, which in turn contained all the paragraph elements, and so
- on.</para>
-
- <example>
- <title>Elements within elements; <sgmltag>em</sgmltag></title>
-
- <programlisting><![ CDATA [<p>This is a simple <em>paragraph</em> where some
- of the <em>words</em> have been <em>emphasised</em>.</p>]]></programlisting>
- </example>
-
- <para>The DTD will specify the rules detailing which elements can contain
- other elements, and exactly what they can contain.</para>
-
- <important>
- <para>People often confuse the terms tags and elements, and use the
- terms as if they were interchangeable. They are not.</para>
-
- <para>An element is a conceptual part of your document. An element has
- a defined start and end. The tags mark where the element starts and
- ends.</para>
-
- <para>When this document (or anyone else knowledgeable about SGML) refers
- to &ldquo;the &lt;p&gt; tag&rdquo; they mean the literal text
- consisting of the three characters <literal>&lt;</literal>,
- <literal>p</literal>, and <literal>&gt;</literal>. But the phrase
- &ldquo;the &lt;p&gt; element&rdquo; refers to the whole
- element.</para>
-
- <para>This distinction <emphasis>is</emphasis> very subtle. But keep it
- in mind.</para>
- </important>
-
- <para>Elements can have attributes. An attribute has a name and a value,
- and is used for adding extra information to the element. This might be
- information that indicates how the content should be rendered, or might
- be something that uniquely identifies that occurrence of the element, or
- it might be something else.</para>
-
- <para>An element's attributes are written <emphasis>inside</emphasis> the
- start tag for that element, and take the form
- <literal><replaceable>attribute-name</replaceable>="<replaceable>attribute-value</replaceable>"</literal>.</para>
-
- <para>In sufficiently recent versions of HTML, the <sgmltag>p</sgmltag>
- element has an attribute called <literal>align</literal>, which suggests
- an alignment (justification) for the paragraph to the program displaying
- the HTML.</para>
-
- <para>The <literal>align</literal> attribute can take one of four defined
- values, <literal>left</literal>, <literal>center</literal>,
- <literal>right</literal> and <literal>justify</literal>. If the
- attribute is not specified then the default is
- <literal>left</literal>.</para>
-
- <example>
- <title>Using an element with an attribute</title>
-
- <programlisting><![ CDATA [<p align="left">The inclusion of the align attribute
- on this paragraph was superfluous, since the default is left.</p>
-
-<p align="center">This may appear in the center.</p>]]></programlisting>
- </example>
-
- <para>Some attributes will only take specific values, such as
- <literal>left</literal> or <literal>justify</literal>. Others will
- allow you to enter anything you want. If you need to include quotes
- (<literal>"</literal>) within an attribute then use single quotes around
- the attribute value.</para>
-
- <example>
- <title>Single quotes around attributes</title>
-
- <programlisting><![ CDATA [<p align='right'>I'm on the right!</p>]]></programlisting>
- </example>
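-
- <para>The example above contains an apostrophe only in the content, not in
- the attribute value. As a hedged illustration of the rule itself, the
- following sketch places double quotes inside a value delimited by single
- quotes. The <literal>title</literal> attribute and its text are chosen
- purely for illustration.</para>
-
- <example>
- <title>Double quotes inside a single-quoted attribute value</title>
-
- <programlisting><![ CDATA [<p title='The "right" paragraph'>I'm on the right!</p>]]></programlisting>
- </example>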
-
- <para>Sometimes you do not need to use quotes around attribute values at
- all. However, the rules for doing this are subtle, and it is far
- simpler just to <emphasis>always</emphasis> quote your attribute
- values.</para>
-
- <sect2>
- <title>For you to do&hellip;</title>
-
- <para>In order to run the examples in this document you will need to
- install some software on your system and ensure that an environment
- variable is set correctly.</para>
-
- <procedure>
- <step>
- <para>Download and install <filename>textproc/docproj</filename>
- from the FreeBSD ports system. This is a
- <emphasis>meta-port</emphasis> that should download and install
- all of the programs and supporting files that are used by the
- Documentation Project.</para>
- </step>
-
- <step>
- <para>Add lines to your shell startup files to set
- <envar>SGML_CATALOG_FILES</envar>.</para>
-
- <example id="sgml-primer-envars">
- <title><filename>.profile</filename>, for &man.sh.1; and
- &man.bash.1; users</title>
-
- <programlisting>SGML_ROOT=/usr/local/share/sgml
-SGML_CATALOG_FILES=${SGML_ROOT}/jade/catalog
-SGML_CATALOG_FILES=${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
-SGML_CATALOG_FILES=${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
-SGML_CATALOG_FILES=${SGML_ROOT}/docbook/catalog:$SGML_CATALOG_FILES
-export SGML_CATALOG_FILES</programlisting>
- </example>
-
- <example>
- <title><filename>.login</filename>, for &man.csh.1; and
- &man.tcsh.1; users</title>
-
- <programlisting>setenv SGML_ROOT /usr/local/share/sgml
-setenv SGML_CATALOG_FILES ${SGML_ROOT}/jade/catalog
-setenv SGML_CATALOG_FILES ${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
-setenv SGML_CATALOG_FILES ${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
-setenv SGML_CATALOG_FILES ${SGML_ROOT}/docbook/catalog:$SGML_CATALOG_FILES</programlisting>
- </example>
-
- <para>Then either log out, and log back in again, or run those
- commands from the command line to set the variable values.</para>
- </step>
- </procedure>
-
- <procedure>
- <step>
- <para>Create <filename>example.sgml</filename>, and enter the
- following text;</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
- <head>
- <title>An example HTML file</title>
- </head>
-
- <body>
- <p>This is a paragraph containing some text.</p>
-
- <p>This paragraph contains some more text.</p>
-
- <p align="right">This paragraph might be right-justified.</p>
- </body>
-</html>]]></programlisting>
- </step>
-
- <step>
- <para>Try and validate this file using an SGML parser.</para>
-
- <para>Part of <filename>textproc/docproj</filename> is the
- &man.nsgmls.1; <link linkend="sgml-primer-validating">validating
- parser</link>. Normally, &man.nsgmls.1; reads in a document
- marked up according to an SGML DTD and returns a copy of the
- document's Element Structure Information Set (ESIS, but that is
- not important right now).</para>
-
- <para>However, when &man.nsgmls.1; is given the <option>-s</option>
- parameter, &man.nsgmls.1; will suppress its normal output, and
- just print error messages. This makes it a useful way to check to
- see if your document is valid or not.</para>
-
- <para>Use &man.nsgmls.1; to check that your document is
- valid;</para>
-
- <screen>&prompt.user; <userinput>nsgmls -s example.sgml</userinput></screen>
-
- <para>As you will see, &man.nsgmls.1; returns without displaying any
- output. This means that your document validated
- successfully.</para>
- </step>
-
- <step>
- <para>See what happens when required elements are omitted. Try
- removing the <sgmltag>title</sgmltag> and
- <sgmltag>/title</sgmltag> tags, and re-run the validation.</para>
-
- <screen>&prompt.user; <userinput>nsgmls -s example.sgml</userinput>
-nsgmls:example.sgml:5:4:E: character data is not allowed here
-nsgmls:example.sgml:6:8:E: end tag for "HEAD" which is not finished</screen>
-
- <para>The error output from &man.nsgmls.1; is organised into
- colon-separated groups, or columns.</para>
-
- <informaltable frame="none">
- <tgroup cols="2">
- <thead>
- <row>
- <entry>Column</entry>
- <entry>Meaning</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>1</entry>
- <entry>The name of the program generating the error. This
- will always be <literal>nsgmls</literal>.</entry>
- </row>
-
- <row>
- <entry>2</entry>
- <entry>The name of the file that contains the error.</entry>
- </row>
-
- <row>
- <entry>3</entry>
- <entry>Line number where the error appears.</entry>
- </row>
-
- <row>
- <entry>4</entry>
- <entry>Column number where the error appears.</entry>
- </row>
-
- <row>
- <entry>5</entry>
- <entry>A one-letter code indicating the nature of the
- message. <literal>I</literal> indicates an informational
- message, <literal>W</literal> is for warnings,
- <literal>E</literal> is for errors, and <literal>X</literal>
- is for cross-references<footnote>
- <para>It is not always the fifth column either.
- <command>nsgmls -sv</command> displays
- <literal>nsgmls:I: SP version "1.3"</literal>
- (depending on the installed version). As you can see,
- this is an informational message.</para>
- </footnote>. As you can see, these messages are
- errors.</entry>
- </row>
-
- <row>
- <entry>6</entry>
- <entry>The text of the error message.</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
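-
- <para>As a rough sketch of how these columns can be picked apart, the
- messages can be saved to a file and passed through
- <command>cut</command>. This assumes that every message has the
- six-column layout listed in the table, and that the
- <option>-f</option> option of &man.nsgmls.1; is used to redirect the
- messages to a file; the name <filename>errors.txt</filename> is
- arbitrary.</para>
-
- <screen>&prompt.user; <userinput>nsgmls -s -f errors.txt example.sgml</userinput>
-&prompt.user; <userinput>cut -d: -f2-5 errors.txt</userinput>
-example.sgml:5:4:E
-example.sgml:6:8:E</screen>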
-
- <para>Simply omitting the <sgmltag>title</sgmltag> tags has
- generated two different errors.</para>
-
- <para>The first error indicates that content (in this case,
- characters, rather than the start tag for an element) has occurred
- where the SGML parser was expecting something else. In this case,
- the parser was expecting to see one of the start tags for elements
- that are valid inside <sgmltag>head</sgmltag> (such as
- <sgmltag>title</sgmltag>).</para>
-
- <para>The second error is because <sgmltag>head</sgmltag> elements
- <emphasis>must</emphasis> contain a <sgmltag>title</sgmltag>
- element. Because it does not, &man.nsgmls.1; considers that the
- element has not been properly finished. However, the closing tag
- indicates that the element has been closed before it has been
- finished.</para>
- </step>
-
- <step>
- <para>Put the <literal>title</literal> element back in.</para>
- </step>
- </procedure>
- </sect2>
- </sect1>
-
- <sect1 id="sgml-primer-doctype-declaration">
- <title>The DOCTYPE declaration</title>
-
- <para>The beginning of each document that you write must specify the name
- of the DTD that the document conforms to. This is so that SGML parsers
- can determine the DTD and ensure that the document does conform to
- it.</para>
-
- <para>This information is generally expressed on one line, in the DOCTYPE
- declaration.</para>
-
- <para>A typical declaration for a document written to conform with version
- 4.0 of the HTML DTD looks like this;</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">]]></programlisting>
-
- <para>That line contains a number of different components.</para>
-
- <variablelist>
- <varlistentry>
- <term><literal>&lt;!</literal></term>
-
- <listitem>
- <para>Is the <emphasis>indicator</emphasis> that this is an SGML
- declaration. This line is declaring the document type.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><literal>DOCTYPE</literal></term>
-
- <listitem>
- <para>Shows that this is an SGML declaration for the document
- type.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><literal>html</literal></term>
-
- <listitem>
- <para>Names the first <link linkend="sgml-primer-elements">element</link> that
- will appear in the document.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><literal>PUBLIC "-//W3C//DTD HTML 4.0//EN"</literal></term>
-
- <listitem>
- <para>Lists the Formal Public Identifier (FPI)<indexterm>
- <primary>Formal Public Identifier</primary>
- </indexterm>
- for the DTD that this
- document conforms to. Your SGML parser will use this to find the
- correct DTD when processing this document.</para>
-
- <para><literal>PUBLIC</literal> is not a part of the FPI, but
- indicates to the SGML processor how to find the DTD referenced in
- the FPI. Other ways of telling the SGML parser how to find the
- DTD are shown <link
- linkend="sgml-primer-fpi-alternatives">later</link>.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><literal>&gt;</literal></term>
-
- <listitem>
- <para>Returns to the document.</para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <sect2>
- <title>Formal Public Identifiers (FPIs)<indexterm significance="preferred">
- <primary>Formal Public Identifier</primary>
- </indexterm>
-</title>
-
- <note>
- <para>You don't need to know this, but it's useful background, and
- might help you debug problems when your SGML processor can't locate
- the DTD you are using.</para>
- </note>
-
- <para>FPIs must follow a specific syntax. This syntax is as
- follows;</para>
-
- <programlisting>"<replaceable>Owner</replaceable>//<replaceable>Keyword</replaceable> <replaceable>Description</replaceable>//<replaceable>Language</replaceable>"</programlisting>
-
- <variablelist>
- <varlistentry>
- <term><replaceable>Owner</replaceable></term>
-
- <listitem>
- <para>This indicates the owner of the FPI.</para>
-
- <para>If this string starts with &ldquo;ISO&rdquo; then this is an
- ISO owned FPI. For example, the FPI <literal>"ISO
- 8879:1986//ENTITIES Greek Symbols//EN"</literal> lists
- <literal>ISO 8879:1986</literal> as being the owner for the set
- of entities for Greek symbols. ISO 8879:1986 is the ISO number
- for the SGML standard.</para>
-
- <para>Otherwise, this string will either look like
- <literal>-//<replaceable>Owner</replaceable></literal> or
- <literal>+//<replaceable>Owner</replaceable></literal> (notice
- the only difference is the leading <literal>+</literal> or
- <literal>-</literal>).</para>
-
- <para>If the string starts with <literal>-</literal> then the
- owner information is unregistered, with a <literal>+</literal>
- it identifies it as being registered.</para>
-
- <para>ISO 9070:1991 defines how registered names are generated; it
- might be derived from the number of an ISO publication, an ISBN
- code, or an organisation code assigned according to ISO 6523.
- In addition, a registration authority could be created in order
- to assign registered names. The ISO council delegated this to
- the American National Standards Institute (ANSI).</para>
-
- <para>Because the FreeBSD Project hasn't been registered, the
- owner string is <literal>-//FreeBSD</literal>. And as you can
- see, the W3C are not a registered owner either.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><replaceable>Keyword</replaceable></term>
-
- <listitem>
- <para>There are several keywords that indicate the type of
- information in the file. Some of the most common keywords are
- <literal>DTD</literal>, <literal>ELEMENTS</literal>,
- <literal>ENTITIES</literal>, and <literal>TEXT</literal>.
- <literal>DTD</literal> is used only for DTD files,
- <literal>ELEMENTS</literal> is usually used for DTD fragments
- that contain only entity or element declarations.
- <literal>TEXT</literal> is used for SGML content (text and
- tags).</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><replaceable>Description</replaceable></term>
-
- <listitem>
- <para>Any description you want to supply for the contents of this
- file. This may include version numbers or any short text that
- is meaningful to you and unique for the SGML system.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><replaceable>Language</replaceable></term>
-
- <listitem>
- <para>This is an ISO two-character code that identifies the native
- language for the file. <literal>EN</literal> is used for
- English.</para>
- </listitem>
- </varlistentry>
- </variablelist>
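-
- <para>As a worked illustration, the FPI from the DOCTYPE declaration
- shown earlier can be read against this syntax as follows.</para>
-
- <programlisting>"-//W3C//DTD HTML 4.0//EN"
-
-Owner:       -//W3C     (leading "-", so unregistered)
-Keyword:     DTD
-Description: HTML 4.0
-Language:    EN</programlisting>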
-
- <sect3>
- <title><filename>catalog</filename> files</title>
-
- <para>If you use the syntax above and try and process this document
- using an SGML processor, the processor will need to have some way of
- turning the FPI into the name of the file on your computer that
- contains the DTD.</para>
-
- <para>In order to do this it can use a catalog file. A catalog file
- (typically called <filename>catalog</filename>) contains lines that
- map FPIs to filenames. For example, if the catalog file contained
- the line;</para>
-
- <programlisting>PUBLIC "-//W3C//DTD HTML 4.0//EN" "4.0/strict.dtd"</programlisting>
-
- <para>The SGML processor would know to look up the DTD from
- <filename>strict.dtd</filename> in the <filename>4.0</filename>
- subdirectory of whichever directory held the
- <filename>catalog</filename> file that contained that line.</para>
-
- <para>Look at the contents of
- <filename>/usr/local/share/sgml/html/catalog</filename>. This is
- the catalog file for the HTML DTDs that will have been installed as
- part of the <filename>textproc/docproj</filename> port.</para>
- </sect3>
-
- <sect3>
- <title><envar>SGML_CATALOG_FILES</envar></title>
-
- <para>In order to locate a <filename>catalog</filename> file, your
- SGML processor will need to know where to look. Many of them
- feature command line parameters for specifying the path to one or
- more catalogs.</para>
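-
- <para>As an illustrative sketch, &man.nsgmls.1; accepts
- <option>-c</option> followed by the path to a catalog. The path
- below assumes the default location used by the
- <filename>textproc/docproj</filename> port, and that this one
- catalog is enough to resolve everything the document needs.</para>
-
- <screen>&prompt.user; <userinput>nsgmls -c /usr/local/share/sgml/html/catalog -s example.sgml</userinput></screen>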
-
- <para>In addition, you can set <envar>SGML_CATALOG_FILES</envar> to
- point to the files. This environment variable should consist of a
- colon-separated list of catalog files (including their full
- path).</para>
-
- <para>Typically, you will want to include the following files;</para>
-
- <itemizedlist>
- <listitem>
- <para><filename>/usr/local/share/sgml/docbook/catalog</filename></para>
- </listitem>
-
- <listitem>
- <para><filename>/usr/local/share/sgml/html/catalog</filename></para>
- </listitem>
-
- <listitem>
- <para><filename>/usr/local/share/sgml/iso8879/catalog</filename></para>
- </listitem>
-
- <listitem>
- <para><filename>/usr/local/share/sgml/jade/catalog</filename></para>
- </listitem>
- </itemizedlist>
-
- <para>You should <link linkend="sgml-primer-envars">already have done
- this</link>.</para>
- </sect3>
- </sect2>
-
- <sect2 id="sgml-primer-fpi-alternatives">
- <title>Alternatives to FPIs</title>
-
- <para>Instead of using an FPI to indicate the DTD that the document
- conforms to (and therefore, which file on the system contains the DTD)
- you can explicitly specify the name of the file.</para>
-
- <para>The syntax for this is slightly different:</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html SYSTEM "/path/to/file.dtd">]]></programlisting>
-
- <para>The <literal>SYSTEM</literal> keyword indicates that the SGML
- processor should locate the DTD in a system specific fashion. This
- typically (but not always) means the DTD will be provided as a
- filename.</para>
-
- <para>Using FPIs is preferred for reasons of portability. You don't
- want to have to ship a copy of the DTD around with your document, and
- if you used the <literal>SYSTEM</literal> identifier then everyone
- would need to keep their DTDs in the same place.</para>
- </sect2>
- </sect1>
-
- <sect1 id="sgml-primer-sgml-escape">
- <title>Escaping back to SGML</title>
-
- <para>Earlier in this primer I said that SGML is only used when writing a
- DTD. This is not strictly true. There is certain SGML syntax that you
- will want to be able to use within your documents. For example,
- comments can be included in your document, and will be ignored by the
- parser. Comments are entered using SGML syntax. Other uses for SGML
- syntax in your document will be shown later too.</para>
-
- <para>Obviously, you need some way of indicating to the SGML processor
- that what follows is not element content within the document, but
- SGML syntax that the parser should act upon.</para>
-
- <para>These sections are marked by <literal>&lt;! ... &gt;</literal> in
- your document. Everything between these delimiters is SGML syntax as
- you might find within a DTD.</para>
-
- <para>As you may just have realised, the <link
- linkend="sgml-primer-doctype-declaration">DOCTYPE declaration</link>
- is an example of SGML syntax that you need to include in your
- document&hellip;</para>
- </sect1>
-
- <sect1>
- <title>Comments</title>
-
- <para>Comments are an SGML construction, and are normally only valid
- inside a DTD. However, as <xref linkend="sgml-primer-sgml-escape">
- shows, it is possible to use SGML syntax within your document.</para>
-
- <para>The delimiter for SGML comments is the string
- &ldquo;<literal>--</literal>&rdquo;. The first occurrence of this string
- opens a comment, and the second closes it.</para>
-
- <example>
- <title>SGML generic comment</title>
-
- <programlisting>&lt;!-- test comment --></programlisting>
-
- <programlisting><![ CDATA [
-<!-- This is inside the comment -->
-
-<!-- This is another comment -->
-
-<!-- This is one way
- of doing multiline comments -->
-
-<!-- This is another way of --
- -- doing multiline comments -->]]></programlisting>
- </example>
-
- <![ %output.print; [
- <important>
- <title>Use 2 dashes</title>
-
- <para>There is a problem with producing the PostScript and PDF versions
- of this document. The above example probably shows just one hyphen
- symbol, <literal>-</literal> after the <literal>&lt;!</literal> and
- before the <literal>&gt;</literal>.</para>
-
- <para>You <emphasis>must</emphasis> use two <literal>-</literal>,
- <emphasis>not</emphasis> one. The PostScript and PDF versions have
- translated the two <literal>-</literal> in the original to a longer,
- more professional <emphasis>em-dash</emphasis>, and broken this
- example in the process.</para>
-
- <para>The HTML, plain text, and RTF versions of this document are not
- affected.</para>
- </important>
- ]]>
-
- <para>If you have used HTML before you may have been shown different rules
- for comments. In particular, you may think that the string
- <literal>&lt;!--</literal> opens a comment, and it is only closed by
- <literal>--&gt;</literal>.</para>
-
- <para>This is <emphasis>not</emphasis> the case. A lot of web browsers
- have broken HTML parsers, and will accept that as valid. However, the
- SGML parsers used by the Documentation Project are much stricter, and
- will reject documents that make that error.</para>
-
- <example>
- <title>Erroneous SGML comments</title>
-
- <programlisting><![ CDATA [
-<!-- This is in the comment --
-
- THIS IS OUTSIDE THE COMMENT!
-
- -- back inside the comment -->]]></programlisting>
-
- <para>The SGML parser will treat this as though it were actually:</para>
-
- <programlisting>&lt;!THIS IS OUTSIDE THE COMMENT&gt;</programlisting>
-
- <para>This is not valid SGML, and may give confusing error
- messages.</para>
-
- <programlisting><![ CDATA [<!--------------- This is a very bad idea --------------->]]></programlisting>
-
- <para>As the example suggests, <emphasis>do not</emphasis> write
- comments like that.</para>
-
- <programlisting><![ CDATA [<!--===================================================-->]]></programlisting>
-
- <para>That is a (slightly) better approach, but it is still potentially
- confusing to people new to SGML.</para>
- </example>
-
- <sect2>
- <title>For you to do&hellip;</title>
-
- <procedure>
- <step>
- <para>Add some comments to <filename>example.sgml</filename>, and
- check that the file still validates using &man.nsgmls.1;.</para>
- </step>
-
- <step>
- <para>Add some invalid comments to
- <filename>example.sgml</filename>, and see the error messages that
- &man.nsgmls.1; gives when it encounters an invalid comment.</para>
- </step>
- </procedure>
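-
- <para>As a reminder, a validation run might look like the following.
- This is a sketch that assumes your catalog files are already set up as
- described earlier; the <option>-s</option> option suppresses the
- normal output so that only error messages are shown.</para>
-
- <screen>&prompt.user; <userinput>nsgmls -s example.sgml</userinput></screen>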
- </sect2>
- </sect1>
-
- <sect1>
- <title>Entities</title>
-
- <para>Entities are a mechanism for assigning names to chunks of content.
- As an SGML parser processes your document, any entities it finds are
- replaced by the content of the entity.</para>
-
- <para>This is a good way to have re-usable, easily changeable chunks of
- content in your SGML documents. It is also the only way to include one
- marked up file inside another using SGML.</para>
-
- <para>There are two types of entities, which can be used in two different
- situations: <emphasis>general entities</emphasis> and
- <emphasis>parameter entities</emphasis>.</para>
-
- <sect2 id="sgml-primer-general-entities">
- <title>General Entities</title>
-
- <para>You cannot use general entities in an SGML context (although you
- define them in one). They can only be used in your document.
- Contrast this with <link
- linkend="sgml-primer-parameter-entities">parameter
- entities</link>.</para>
-
- <para>Each general entity has a name. When you want to reference a
- general entity (and therefore include whatever text it represents in
- your document), you write
- <literal>&amp;<replaceable>entity-name</replaceable>;</literal>. For
- example, suppose you had an entity called
- <literal>current.version</literal> which expanded to the current
- version number of your product. You could write:</para>
-
- <programlisting><![ CDATA [<para>The current version of our product is
- &current.version;.</para>]]></programlisting>
-
- <para>When the version number changes you can simply change the
- definition of the value of the general entity and reprocess your
- document.</para>
-
- <para>You can also use general entities to enter characters that you
- could not otherwise include in an SGML document. For example, &lt;
- and &amp; cannot normally appear in an SGML document. When the SGML
- parser sees the &lt; symbol it assumes that a tag (either a start tag
- or an end tag) is about to appear, and when it sees the &amp; symbol
- it assumes the next text will be the name of an entity.</para>
-
- <para>Fortunately, you can use the two general entities &amp;lt; and
- &amp;amp; whenever you need to include one or other of these
- characters.</para>
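-
- <para>As a short illustration (the sentence itself is invented for this
- example), a paragraph that mentions a tag and a company name could be
- written like this:</para>
-
- <programlisting><![ CDATA [<p>The &lt;p> tag starts a paragraph, and AT&amp;T
- contains an ampersand.</p>]]></programlisting>
-
- <para>After processing, the reader sees the literal text
- <literal>&lt;p></literal> and <literal>AT&amp;T</literal>.</para>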
-
- <para>A general entity can only be defined within an SGML context.
- Typically, this is done immediately after the DOCTYPE
- declaration.</para>
-
- <example>
- <title>Defining general entities</title>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!ENTITY current.version "3.0-RELEASE">
-<!ENTITY last.version "2.2.7-RELEASE">
-]>]]></programlisting>
-
- <para>Notice how the DOCTYPE declaration has been extended by adding a
- square bracket at the end of the first line. The two entities are
- then defined over the next two lines, before the square bracket is
- closed, and then the DOCTYPE declaration is closed.</para>
-
- <para>The square brackets are necessary to indicate that we are
- extending the DTD indicated by the DOCTYPE declaration.</para>
- </example>
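-
- <para>As a sketch of how the entities defined above might then be used
- (the paragraph text is invented for illustration), the body of the
- document can refer to both of them:</para>
-
- <programlisting><![ CDATA [<p>Version &current.version; replaces version &last.version;.</p>]]></programlisting>
-
- <para>With the definitions shown, the parser would process this as
- &ldquo;Version 3.0-RELEASE replaces version 2.2.7-RELEASE.&rdquo;</para>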
- </sect2>
-
- <sect2 id="sgml-primer-parameter-entities">
- <title>Parameter entities</title>
-
- <para>Like <link linkend="sgml-primer-general-entities">general
- entities</link>, parameter entities are used to assign names to
- reusable chunks of text. However, whereas general entities can only
- be used within your document, parameter entities can only be used
- within an <link linkend="sgml-primer-sgml-escape">SGML
- context</link>.</para>
-
- <para>Parameter entities are defined in a similar way to general
- entities. However, instead of using
- <literal>&amp;<replaceable>entity-name</replaceable>;</literal> to
- refer to them, use
- <literal>%<replaceable>entity-name</replaceable>;</literal><footnote>
- <para><emphasis>P</emphasis>arameter entities use the
- <emphasis>P</emphasis>ercent symbol.</para>
- </footnote>. The definition also includes the <literal>%</literal>
- between the <literal>ENTITY</literal> keyword and the name of the
- entity.</para>
-
- <example>
- <title>Defining parameter entities</title>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!ENTITY % param.some "some">
-<!ENTITY % param.text "text">
-<!ENTITY % param.new "%param.some more %param.text">
-
-<!-- %param.new now contains "some more text" -->
-]>]]></programlisting>
- </example>
-
- <para>This may not seem particularly useful. It will be.</para>
- </sect2>
-
- <sect2>
- <title>For you to do&hellip;</title>
-
- <procedure>
- <step>
- <para>Add a general entity to
- <filename>example.sgml</filename>.</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" [
-<!ENTITY version "1.1">
-]>
-
-<html>
- <head>
- <title>An example HTML file</title>
- </head>
-
- <!-- You might well have some comments in here as well -->
-
- <body>
- <p>This is a paragraph containing some text.</p>
-
- <p>This paragraph contains some more text.</p>
-
- <p align="right">This paragraph might be right-justified.</p>
-
- <p>The current version of this document is: &version;</p>
- </body>
-</html>]]></programlisting>
- </step>
-
- <step>
- <para>Validate the document using &man.nsgmls.1;.</para>
- </step>
-
- <step>
- <para>Load <filename>example.sgml</filename> into your web browser
- (you may need to copy it to <filename>example.html</filename>
- before your browser recognises it as an HTML document).</para>
-
- <para>Unless your browser is very advanced, you won't see the entity
- reference <literal>&amp;version;</literal> replaced with the
- version number. Most web browsers have very simplistic parsers
- which do not handle proper SGML<footnote>
- <para>This is a shame. Imagine all the problems and hacks (such
- as Server Side Includes) that could be avoided if they
- did.</para>
- </footnote>.</para>
- </step>
-
- <step>
- <para>The solution is to <emphasis>normalise</emphasis> your
- document using an SGML normaliser. The normaliser reads in valid
- SGML and outputs equally valid SGML which has been transformed in
- some way. One of the ways in which the normaliser transforms the
- SGML is to expand all the entity references in the document,
- replacing the entities with the text that they represent.</para>
-
- <para>You can use &man.sgmlnorm.1; to do this.</para>
-
- <screen>&prompt.user; <userinput>sgmlnorm example.sgml > example.html</userinput></screen>
-
- <para>You should find a normalised (i.e., entity references
- expanded) copy of your document in
- <filename>example.html</filename>, ready to load into your web
- browser.</para>
- </step>
-
- <step>
- <para>If you look at the output from &man.sgmlnorm.1; you will see
- that it does not include a DOCTYPE declaration at the start. To
- include this you need to use the <option>-d</option>
- option:</para>
-
- <screen>&prompt.user; <userinput>sgmlnorm -d example.sgml > example.html</userinput></screen>
- </step>
- </procedure>
- </sect2>
- </sect1>
-
- <sect1>
- <title>Using entities to include files</title>
-
- <para>Entities (both <link
- linkend="sgml-primer-general-entities">general</link> and <link
- linkend="sgml-primer-parameter-entities">parameter</link>) are
- particularly useful when used to include one file inside another.</para>
-
- <sect2 id="sgml-primer-include-using-gen-entities">
- <title>Using general entities to include files</title>
-
- <para>Suppose you have some content for an SGML book organised into
- files, one file per chapter, called
- <filename>chapter1.sgml</filename>,
- <filename>chapter2.sgml</filename>, and so forth, with a
- <filename>book.sgml</filename> file that will contain these
- chapters.</para>
-
- <para>In order to use the contents of these files as the values for your
- entities, you declare them with the <literal>SYSTEM</literal> keyword.
- This directs the SGML parser to use the contents of the named file as
- the value of the entity.</para>
-
- <example>
- <title>Using general entities to include files</title>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!ENTITY chapter.1 SYSTEM "chapter1.sgml">
-<!ENTITY chapter.2 SYSTEM "chapter2.sgml">
-<!ENTITY chapter.3 SYSTEM "chapter3.sgml">
-<!-- And so forth -->
-]>
-
-<html>
- <!-- Use the entities to load in the chapters -->
-
- &chapter.1;
- &chapter.2;
- &chapter.3;
-</html>]]></programlisting>
- </example>
-
- <warning>
- <para>When using general entities to include other files within a
- document, the files being included
- (<filename>chapter1.sgml</filename>,
- <filename>chapter2.sgml</filename>, and so on) <emphasis>must
- not</emphasis> start with a DOCTYPE declaration. This is a syntax
- error.</para>
- </warning>
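-
- <para>For instance, <filename>chapter1.sgml</filename> might contain
- nothing but the markup for the chapter itself. The content below is
- purely illustrative and is not taken from a real file:</para>
-
- <programlisting><![ CDATA [<h1>Chapter 1</h1>
-
-<p>The text of the first chapter goes here.</p>]]></programlisting>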
- </sect2>
-
- <sect2>
- <title>Using parameter entities to include files</title>
-
- <para>Recall that parameter entities can only be used inside an SGML
- context. Why then would you want to include a file within an SGML
- context?</para>
-
- <para>You can use this to ensure that you can reuse your general
- entities.</para>
-
- <para>Suppose that you had many chapters in your document, and you
- reused these chapters in two different books, each book organising the
- chapters in a different fashion.</para>
-
- <para>You could list the entities at the top of each book, but this
- quickly becomes cumbersome to manage.</para>
-
- <para>Instead, place the general entity definitions inside one file,
- and use a parameter entity to include that file within your
- document.</para>
-
- <example>
- <title>Using parameter entities to include files</title>
-
- <para>First, place your entity definitions in a separate file, called
- <filename>chapters.ent</filename>. This file contains the
- following:</para>
-
- <programlisting><![ CDATA [<!ENTITY chapter.1 SYSTEM "chapter1.sgml">
-<!ENTITY chapter.2 SYSTEM "chapter2.sgml">
-<!ENTITY chapter.3 SYSTEM "chapter3.sgml">]]></programlisting>
-
- <para>Now create a parameter entity to refer to the contents of the
- file. Then use the parameter entity to load the file into the
- document, which will then make all the general entities available
- for use. Then use the general entities as before:</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!-- Define a parameter entity to load in the chapter general entities -->
-<!ENTITY % chapters SYSTEM "chapters.ent">
-
-<!-- Now use the parameter entity to load in this file -->
-%chapters;
-]>
-
-<html>
- &chapter.1;
- &chapter.2;
- &chapter.3;
-</html>]]></programlisting>
- </example>
- </sect2>
-
- <sect2>
- <title>For you to do&hellip;</title>
-
- <sect3>
- <title>Use general entities to include files</title>
-
- <procedure>
- <step>
- <para>Create three files, <filename>para1.sgml</filename>,
- <filename>para2.sgml</filename>, and
- <filename>para3.sgml</filename>.</para>
-
- <para>Put content similar to the following in each file:</para>
-
- <programlisting><![ CDATA [<p>This is the first paragraph.</p>]]></programlisting>
- </step>
-
- <step>
- <para>Edit <filename>example.sgml</filename> so that it looks like
- this:</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!ENTITY version "1.1">
-<!ENTITY para1 SYSTEM "para1.sgml">
-<!ENTITY para2 SYSTEM "para2.sgml">
-<!ENTITY para3 SYSTEM "para3.sgml">
-]>
-
-<html>
- <head>
- <title>An example HTML file</title>
- </head>
-
- <body>
- <p>The current version of this document is: &version;</p>
-
- &para1;
- &para2;
- &para3;
- </body>
-</html>]]></programlisting>
- </step>
-
- <step>
- <para>Produce <filename>example.html</filename> by normalising
- <filename>example.sgml</filename>.</para>
-
- <screen>&prompt.user; <userinput>sgmlnorm -d example.sgml > example.html</userinput></screen>
- </step>
-
- <step>
- <para>Load <filename>example.html</filename> into your web
- browser, and confirm that the
- <filename>para<replaceable>n</replaceable>.sgml</filename> files
- have been included in <filename>example.html</filename>.</para>
- </step>
- </procedure>
- </sect3>
-
- <sect3>
- <title>Use parameter entities to include files</title>
-
- <note>
- <para>You must have taken the previous steps first.</para>
- </note>
-
- <procedure>
- <step>
- <para>Edit <filename>example.sgml</filename> so that it looks like
- this:</para>
-
- <programlisting><![ CDATA [<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-<!ENTITY % entities SYSTEM "entities.sgml"> %entities;
-]>
-
-<html>
- <head>
- <title>An example HTML file</title>
- </head>
-
- <body>
- <p>The current version of this document is: &version;</p>
-
- &para1;
- &para2;
- &para3;
- </body>
-</html>]]></programlisting>
- </step>
-
- <step>
- <para>Create a new file, <filename>entities.sgml</filename>, with
- this content:</para>
-
- <programlisting><![ CDATA [<!ENTITY version "1.1">
-<!ENTITY para1 SYSTEM "para1.sgml">
-<!ENTITY para2 SYSTEM "para2.sgml">
-<!ENTITY para3 SYSTEM "para3.sgml">]]></programlisting>
- </step>
-
- <step>
- <para>Produce <filename>example.html</filename> by normalising
- <filename>example.sgml</filename>.</para>
-
- <screen>&prompt.user; <userinput>sgmlnorm -d example.sgml > example.html</userinput></screen>
- </step>
-
- <step>
- <para>Load <filename>example.html</filename> into your web
- browser, and confirm that the
- <filename>para<replaceable>n</replaceable>.sgml</filename> files
- have been included in <filename>example.html</filename>.</para>
- </step>
- </procedure>
- </sect3>
- </sect2>
- </sect1>
-
- <sect1 id="sgml-primer-marked-sections">
- <title>Marked sections</title>
-
- <para>SGML provides a mechanism to indicate that particular pieces of the
- document should be processed in a special way. These are termed
- &ldquo;marked sections&rdquo;.</para>
-
- <example>
- <title>Structure of a marked section</title>
-
- <programlisting>&lt;![ <replaceable>KEYWORD</replaceable> [
- Contents of marked section
-]]&gt;</programlisting>
- </example>
-
- <para>As you would expect, being an SGML construct, a marked section
- starts with <literal>&lt;!</literal>.</para>
-
- <para>The first square bracket begins to delimit the marked
- section.</para>
-
- <para><replaceable>KEYWORD</replaceable> describes how this marked
- section should be processed by the parser.</para>
-
- <para>The second square bracket indicates that the content of the marked
- section starts here.</para>
-
- <para>The marked section is finished by closing the two square brackets,
- and then returning to the document context from the SGML context with
- <literal>&gt;</literal>.</para>
-
- <sect2>
- <title>Marked section keywords</title>
-
- <sect3>
- <title><literal>CDATA</literal>, <literal>RCDATA</literal></title>
-
- <para>These keywords denote the marked section's <emphasis>content
- model</emphasis>, and allow you to change it from the
- default.</para>
-
- <para>When an SGML parser is processing a document it keeps track
- of what is called the &ldquo;content model&rdquo;.</para>
-
- <para>Briefly, the content model describes what sort of content the
- parser is expecting to see, and what it will do with it when it
- finds it.</para>
-
- <para>The two content models you will probably find most useful are
- <literal>CDATA</literal> and <literal>RCDATA</literal>.</para>
-
- <para><literal>CDATA</literal> is for &ldquo;Character Data&rdquo;.
- If the parser is in this content model then it is expecting to see
- characters, and characters only. In this model the &lt; and &amp;
- symbols lose their special status, and will be treated as ordinary
- characters.</para>
-
- <para><literal>RCDATA</literal> (replaceable character data) is for
- &ldquo;Entity references and character data&rdquo;. If the parser is in
- this content model then it is expecting to see characters
- <emphasis>and</emphasis> entities. &lt; loses its special status, but
- &amp; will still be treated as starting a general entity
- reference.</para>
-
- <para>This is particularly useful if you are including some verbatim
- text that contains lots of &lt; and &amp; characters. While you
- could go through the text ensuring that every &lt; is converted to a
- &amp;lt; and every &amp; is converted to a &amp;amp;, it can be
- easier to mark the section as only containing CDATA. When the SGML
- parser encounters this it will treat the &lt; and &amp; symbols
- embedded in the content as ordinary characters.</para>
-
- <!-- The nesting of CDATA within the next example is disgusting -->
-
- <example>
- <title>Using a CDATA marked section</title>
-
- <programlisting>&lt;para>Here is an example of how you would include some text
- that contained many &amp;lt; and &amp;amp; symbols. The sample
- text is a fragment of HTML. The surrounding text (&lt;para> and
- &lt;programlisting>) are from DocBook.&lt;/para>
-
-&lt;programlisting>
- &lt;![ CDATA [ <![ CDATA [
- <p>This is a sample that shows you some of the elements within
- HTML. Since the angle brackets are used so many times, it's
- simpler to say the whole example is a CDATA marked section
- than to use the entity names for the left and right angle
- brackets throughout.</p>
-
- <ul>
- <li>This is a listitem</li>
- <li>This is a second listitem</li>
- <li>This is a third listitem</li>
- </ul>
-
- <p>This is the end of the example.</p>]]>
- ]]&gt;
-&lt;/programlisting></programlisting>
-
- <para>If you look at the source for this document you will see this
- technique used throughout.</para>
- </example>
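-
- <para>An <literal>RCDATA</literal> marked section is written the same
- way. The following sketch assumes that a general entity called
- <literal>version</literal> has been defined, as in the earlier
- exercise:</para>
-
- <programlisting>&lt;![ RCDATA [
- The &lt;p> tag is not treated as markup here, but &amp;version; is
- still replaced by the value of the entity.
-]]&gt;</programlisting>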
- </sect3>
-
- <sect3>
- <title><literal>INCLUDE</literal> and
- <literal>IGNORE</literal></title>
-
- <para>If the keyword is <literal>INCLUDE</literal> then the contents
- of the marked section will be processed. If the keyword is
- <literal>IGNORE</literal> then the marked section is ignored and
- will not be processed. It will not appear in the output.</para>
-
- <example>
- <title>Using <literal>INCLUDE</literal> and
- <literal>IGNORE</literal> in marked sections</title>
-
- <programlisting>&lt;![ INCLUDE [
- This text will be processed and included.
-]]&gt;
-
-&lt;![ IGNORE [
- This text will not be processed or included.
-]]&gt;</programlisting>
- </example>
-
- <para>By itself, this isn't too useful. If you wanted to remove text
- from your document you could cut it out, or wrap it in
- comments.</para>
-
- <para>It becomes more useful when you realise you can use <link
- linkend="sgml-primer-parameter-entities">parameter entities</link>
- to control this. Remember that parameter entities can only be used
- in SGML contexts, and the keyword of a marked section
- <emphasis>is</emphasis> an SGML context.</para>
-
- <para>For example, suppose that you produced a hard-copy version of
- some documentation and an electronic version. In the electronic
- version you wanted to include some extra content that wasn't to
- appear in the hard-copy.</para>
-
- <para>Create a parameter entity, and set its value to
- <literal>INCLUDE</literal>. Write your document, using marked
- sections to delimit content that should only appear in the
- electronic version. In these marked sections use the parameter
- entity in place of the keyword.</para>
-
- <para>When you want to produce the hard-copy version of the document,
- change the parameter entity's value to <literal>IGNORE</literal> and
- reprocess the document.</para>
-
- <example>
- <title>Using a parameter entity to control a marked
- section</title>
-
- <programlisting>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-&lt;!ENTITY % electronic.copy "INCLUDE">
-]&gt;
-
-...
-
-&lt;![ %electronic.copy [
- This content should only appear in the electronic
- version of the document.
-]]&gt;</programlisting>
-
- <para>When producing the hard-copy version, change the entity's
- definition to:</para>
-
- <programlisting>&lt;!ENTITY % electronic.copy "IGNORE"></programlisting>
-
- <para>On reprocessing the document, the marked sections that use
- <literal>%electronic.copy</literal> as their keyword will be
- ignored.</para>
- </example>
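-
- <para>The same technique extends to content that should appear only in
- the hard-copy version. As a sketch, a second parameter entity
- (<literal>print.copy</literal>, a name chosen here purely for
- illustration) can be given the opposite value:</para>
-
- <programlisting>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-&lt;!ENTITY % electronic.copy "INCLUDE">
-&lt;!ENTITY % print.copy "IGNORE">
-]&gt;
-
-...
-
-&lt;![ %electronic.copy; [
- This content should only appear in the electronic version.
-]]&gt;
-
-&lt;![ %print.copy; [
- This content should only appear in the hard-copy version.
-]]&gt;</programlisting>
-
- <para>Swapping the two values then switches the document from one
- version to the other.</para>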
- </sect3>
- </sect2>
-
- <sect2>
- <title>For you to do&hellip;</title>
-
- <procedure>
- <step>
- <para>Create a new file, <filename>section.sgml</filename>, that
- contains the following:</para>
-
- <programlisting>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
-&lt;!ENTITY % text.output "INCLUDE">
-]&gt;
-
-&lt;html>
- &lt;head>
- &lt;title>An example using marked sections&lt;/title>
- &lt;/head>
-
- &lt;body>
- &lt;p>This paragraph &lt;![ CDATA [contains many &lt;
- characters (&lt; &lt; &lt; &lt; &lt;) so it is easier
- to wrap it in a CDATA marked section ]]&gt;&lt;/p>
-
- &lt;![ IGNORE [
- &lt;p>This paragraph will definitely not be included in the
- output.&lt;/p>
- ]]&gt;
-
- &lt;![ <![ CDATA [%text.output]]> [
- &lt;p>This paragraph might appear in the output, or it
- might not.&lt;/p>
-
- &lt;p>Its appearance is controlled by the <![CDATA[%text.output]]>
- parameter entity.&lt;/p>
- ]]&gt;
- &lt;/body>
-&lt;/html></programlisting>
- </step>
-
- <step>
- <para>Normalise this file using &man.sgmlnorm.1; and examine the
- output. Notice which paragraphs have appeared, which have
- disappeared, and what has happened to the content of the CDATA
- marked section.</para>
- </step>
-
- <step>
- <para>Change the definition of the <literal>text.output</literal>
- entity from <literal>INCLUDE</literal> to
- <literal>IGNORE</literal>. Re-normalise the file, and examine the
- output to see what has changed.</para>
- </step>
- </procedure>
- </sect2>
- </sect1>
-
- <sect1>
- <title>Conclusion</title>
-
- <para>That is the conclusion of this SGML primer. For reasons of space
- and complexity several things have not been covered in depth (or at
- all). However, the previous sections cover enough SGML for you to be
- able to follow the organisation of the FDP documentation.</para>
- </sect1>
-</chapter>
-
-<!--
- Local Variables:
- mode: sgml
- sgml-declaration: "../chapter.decl"
- sgml-indent-data: t
- sgml-omittag: nil
- sgml-always-quote-attributes: t
- sgml-parent-document: ("../book.sgml" "part" "chapter")
- End:
--->