MSG Chapter 2: Entity References

Chapter 2
Entity References
This chapter explains entity references and how you can use them in your documents.
This chapter also contains the syntax for defining entity references.
What is an Entity Reference?
An entity reference is an abbreviation that stands in for a long string in your
document.
In Standard Generalized Markup Language (SGML) and eXtensible Markup Language
(XML), entity references are used in a variety of ways. Even though BASIS supports
only the substitution of strings for entity references in BASIS Generalized Markup
Language (BGML) and XML documents, BASIS preserves and stores entity references
found in an SGML document.
IMPORTANT: Before BASIS 8.0, entity references could be written in upper- or
lowercase, but they were raised to uppercase when parsed. Beginning with the BASIS
8.0 release, entity references are no longer raised to uppercase. Within documents,
markup that does not match the case of the entity reference as originally defined is
not recognized as that entity. When a newly loaded document contains an entity
reference whose case does not match the definition in the markup and style guide, the
reference is loaded as regular text instead of as an entity reference.
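The effect of this change can be sketched in Python. The snippet is only an illustration, not BASIS code: the defined table and the resolve function are hypothetical names, and the fold_case flag stands in for the pre-8.0 behavior.

```python
# Hypothetical entity table, keyed exactly as the entities were defined.
defined = {"XYZ": "XYZ Corporation"}

def resolve(name, fold_case=False):
    """Return the substitution text for an entity name, or None if unrecognized.

    fold_case=True imitates the pre-8.0 behavior (names raised to uppercase).
    fold_case=False imitates BASIS 8.0 and later (exact case match).
    """
    key = name.upper() if fold_case else name
    return defined.get(key)

print(resolve("xyz", fold_case=True))  # XYZ Corporation
print(resolve("xyz"))                  # None: loaded as regular text
print(resolve("XYZ"))                  # XYZ Corporation
```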
Defining Entity References
To define an entity reference, use the DEFINE/ENTITY statement.
For example, in the following document excerpt, XYZ is an entity reference. Notice
that XYZ is delimited by the entity reference open and entity reference close marks
(& and ; respectively).
<H1>Holidays</H1>
<P>Observance of holidays at &XYZ; should follow
local custom for similar organizations in like
circumstances and be consistent with the pursuit of
&XYZ;'s objectives and purposes. Corporate
approval is required before establishing any
holidays.</P>
The DEFINE/ENTITY statement that is placed in the markup and style guide for this
entity reference is
DEFINE/ENTITY XYZ=1
TEXT='XYZ Corporation'
When this document is imported into a BASIS database, the long string XYZ Corporation
replaces every XYZ entity reference in the document.
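The substitution can be pictured with a short Python sketch. It is an illustration under stated assumptions rather than the BASIS converter: expand_entities and the entities table are hypothetical names, the open and close marks are taken to be & and ; as in the excerpt, and the name pattern is a simplification of long_id.

```python
import re

# Hypothetical table built from DEFINE/ENTITY XYZ=1 TEXT='XYZ Corporation'.
entities = {"XYZ": "XYZ Corporation"}

def expand_entities(text, table):
    """Replace each &name; whose name is in the table; leave others untouched."""
    def substitute(match):
        # Unknown names fall back to the original matched text.
        return table.get(match.group(1), match.group(0))
    # Assumed name syntax: a letter or underscore, then letters, digits, underscores.
    return re.sub(r"&([A-Za-z_][A-Za-z0-9_]*);", substitute, text)

excerpt = "<P>Observance of holidays at &XYZ; should follow local custom.</P>"
print(expand_entities(excerpt, entities))
```

A reference whose name is not in the table, such as &xyz; in this sketch, is left in the text unchanged.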
A description of DEFINE/ENTITY and its parameters follows.
Definition of BGML/XML Entities
Purpose:
To define an entity reference. An entity reference is an abbreviation that stands in for
a long string in your document.
Syntax:
DEFine/ENTity
entity_spec=entity_code
TEXT='char_cons_255' | NONE
Parameters:
entity_spec
(Required)
Specifies the entity reference. An entity reference can be in one of three forms: entity id,
entity character, or entity character code.
An entity id is a long_id (a character string from 1 to 32 characters in length). For more
details about long_id, see Appendix A, “Common Syntax.” An entity id appears in the text
delimited by the entity reference open and entity reference close marks, which are defined
on the DEFINE/CONVERTER statement.
An entity character is a single character (with a character code from 1 to 255) enclosed in
single quotes. It appears in the text as a single character without the quotes. This form of
entity reference is known as a short entity reference.
An entity character code is an integer from 1 to 255. It appears in the text as a single
character, not as a character code. In other words, you enter the character whose
character code is the integer you used for the entity character code. (See Example #3.)
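As a rough illustration of the three forms, the following Python sketch classifies an entity_spec and reports the string that would appear in the document text. The function name classify_entity_spec is hypothetical, and the entity id check tests only the length, not the full long_id rules in Appendix A.

```python
def classify_entity_spec(spec):
    """Return (form, text_as_it_appears_in_the_document) for an entity_spec."""
    # Entity character: one character enclosed in single quotes, code 1 to 255.
    if len(spec) == 3 and spec[0] == "'" and spec[2] == "'":
        if not 1 <= ord(spec[1]) <= 255:
            raise ValueError("entity character must have a code from 1 to 255")
        return ("entity character", spec[1])
    # Entity character code: an integer from 1 to 255.
    if spec.isascii() and spec.isdigit():
        code = int(spec)
        if not 1 <= code <= 255:
            raise ValueError("entity character code must be from 1 to 255")
        return ("entity character code", chr(code))
    # Entity id: simplified here to a length check of 1 to 32 characters.
    if 1 <= len(spec) <= 32:
        return ("entity id", spec)
    raise ValueError("not a valid entity_spec")

for spec in ("XYZ", "'@'", "92"):
    print(spec, "->", classify_entity_spec(spec))
```

In this sketch the specification 92 resolves to the backslash character, which is the character entered in the document.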
entity_code
(Required)
Specifies the unique internal code for this entity. An entity_code is an integer from 1 to
65535.
Numbers from 1 to 255 take the least amount of storage and numbers from 256 to 65408
take more storage. Therefore, the lower the number the better for compact data storage.
Your more common entities should be assigned the lower numbers.
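One way to follow this advice is to rank entities by how often they are used and number them in that order. The Python sketch below assumes you already have usage counts; the function name and the counts shown are hypothetical.

```python
def assign_entity_codes(usage_counts):
    """Give the most frequently used entities the lowest entity_code values."""
    # Sort by descending count; ties are broken alphabetically for stable output.
    ranked = sorted(usage_counts.items(), key=lambda item: (-item[1], item[0]))
    return {name: code for code, (name, _count) in enumerate(ranked, start=1)}

counts = {"XYZ": 120, "ABC": 3, "LEGAL": 45}  # hypothetical usage counts
codes = assign_entity_codes(counts)
for name, code in codes.items():
    print(name, code)
```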
TEXT='char_cons_255' | NONE
(Required)
Specifies the ‘text substitution string’ which is substituted for the entity reference in order
to validate, search, or index the data.
This string can contain up to 255 characters.
Key Points:

For more information about common syntax (e.g., long_id or 'char_cons_255'), see
Appendix A, “Common Syntax.”

Rather than using a single character as an entity reference (and requiring others to
remember to use or avoid using this character in their documents), it is better to use
an entity id.

A short entity reference cannot be null. Only character codes from 1 to 255 are valid.

If you define a short entity reference as ‘<’, for example, you should not define any
delimiters in your converters that begin with that character. This prevents any
confusion during markup parsing.

As a document is imported into the system, the reference to an entity is either saved
or lost, depending on the value of the RETAIN_ENTITY_REFS parameter on the
DEFINE/CONVERTER definition.
If RETAIN_ENTITY_REFS=YES, the entity reference is saved as a kind of hidden
embedded object. The text substitution string is put in the data and is NOT
rescanned by the BGML or XML converter. Since the text substitution string is not
rescanned, the converter does not notice any markup (such as start and end tags) that
is embedded in the text substitution. When the document is exported, the original
entity reference is output in the text and the substitution string is removed.
If RETAIN_ENTITY_REFS=NO, the entity reference is replaced by the text
substitution string on import. The text substitution string is now a permanent part of
your document.

Short entity references are not delimited by entity reference open and entity reference
close marks.
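The two RETAIN_ENTITY_REFS settings described above can be modeled with a small Python sketch. This is a simplified model, not the converter's implementation: the function names, the segment list used for storage, and the assumed reference syntax are all hypothetical.

```python
import re

ENTITY = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*);")  # assumed reference syntax

def import_text(text, table, retain):
    """Return a list of stored segments.

    Plain text is stored as str. With retain=True a recognized reference is
    stored as a (name, substitution) pair; with retain=False only the
    substitution string is stored.
    """
    segments, pos = [], 0
    for match in ENTITY.finditer(text):
        name = match.group(1)
        if name not in table:
            continue  # unrecognized: stays part of the surrounding text
        segments.append(text[pos:match.start()])
        segments.append((name, table[name]) if retain else table[name])
        pos = match.end()
    segments.append(text[pos:])
    return segments

def export_text(segments):
    """Write retained references back as &name;; emit everything else as stored."""
    return "".join(f"&{s[0]};" if isinstance(s, tuple) else s for s in segments)

def searchable_text(segments):
    """Text seen by validation, search, and indexing: substitutions in place."""
    return "".join(s[1] if isinstance(s, tuple) else s for s in segments)

table = {"XYZ": "XYZ Corporation"}
source = "Holidays at &XYZ; follow local custom."
kept = import_text(source, table, retain=True)
flat = import_text(source, table, retain=False)
print(export_text(kept))  # original reference restored
print(export_text(flat))  # substitution is now permanent
```

In both cases the substitution string is never passed back through the matcher, which mirrors the point that the text substitution string is not rescanned.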
Examples:
1.
This first example shows an entity id form of an entity reference. Notice, in the
document excerpt given below, that the entity id, XYZ, is delimited by the entity
reference open and entity reference close marks (& and ; respectively).
DEFINE/ENTITY XYZ=1
TEXT='XYZ Corporation'
<H1>Holidays</H1>
<P>Observance of holidays at &XYZ; should follow
local custom for similar organizations in like
circumstances and be consistent with the pursuit of
&XYZ;'s objectives and purposes. Corporate approval
is required before establishing any holidays.</P>
2.
This is an example of a short entity reference. In the document excerpt, notice that
the entity character is not enclosed in single quotes nor is it delimited by entity
reference open and entity reference close marks.
DEFINE/ENTITY '@'=27
TEXT='XYZ Corporation'
<H1>Holidays</H1>
<P>Observance of holidays at @ should follow local
custom for similar organizations in like
circumstances and be consistent with the pursuit of
@'s objectives and purposes. Corporate approval is
required before establishing any holidays.</P>
3.
The entity character code form of an entity reference is used in this example. In the
DEFINE/ENTITY statement, 92 is used, but in the document segment \ is used.
DEFINE/ENTITY 92=32
TEXT='XYZ Corporation'
<H1>Holidays</H1>
<P>Observance of holidays at \ should follow local
custom for similar organizations in like
circumstances and be consistent with the pursuit of
\'s objectives and purposes. Corporate approval is
required before establishing any holidays.</P>
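Because a short entity reference has no delimiters, its expansion can be modeled as plain single-character replacement. The Python sketch below is an assumption-laden illustration (expand_short_entity is a hypothetical name); it also shows why an entity id is usually the safer choice, since every occurrence of the character is affected.

```python
def expand_short_entity(text, char_code, substitution):
    """Replace every occurrence of the character that has the given code."""
    return text.replace(chr(char_code), substitution)

# Character code 92 is the backslash used in Example 3.
print(expand_short_entity("holidays at \\ should follow", 92, "XYZ Corporation"))

# Character code 64 is '@': an unrelated use of the character is replaced too.
print(expand_short_entity("write to info@example", 64, "XYZ Corporation"))
```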