Chapter 2
Entity References

This chapter explains entity references and how you can use them in your documents. It also gives the syntax for defining entity references.

What is an Entity Reference?

An entity reference substitutes an abbreviation for a long string in your document. In Standard Generalized Markup Language (SGML) and eXtensible Markup Language (XML), entity references are used in a variety of ways. BASIS supports only the substitution of strings for entity references in BASIS Generalized Markup Language (BGML) and XML documents. However, BASIS preserves and stores entity references found in an SGML document.

IMPORTANT: Before BASIS 8.0, entity references could be written in uppercase or lowercase, but they were raised to uppercase when parsed. Beginning with the BASIS 8.0 release, entity references are no longer raised to uppercase. Within documents, a reference whose case does not match the case of the entity as originally defined is not recognized as that entity. When a newly loaded document contains an entity reference whose case does not match the definition in the markup and style guide, the reference is loaded as regular text instead of as an entity reference.

Defining Entity References

To define an entity reference, use the DEFINE/ENTITY statement. In the following document excerpt, XYZ is an entity reference. Notice that XYZ is delimited by the entity reference open and entity reference close marks (& and ; respectively).

    <H1>Holidays</H1>
    <P>Observance of holidays at &XYZ; should follow local custom for
    similar organizations in like circumstances and be consistent with
    the pursuit of &XYZ;'s objectives and purposes. Corporate approval
    is required before establishing any holidays.</P>

The DEFINE/ENTITY statement that is placed in the markup and style guide for this entity reference is

    DEFINE/ENTITY XYZ=1 TEXT='XYZ Corporation'

When this document is imported into a BASIS database, the long string XYZ Corporation replaces every XYZ entity reference in the document. A description of DEFINE/ENTITY and its parameters follows.

Definition of BGML/XML Entities

Purpose:

To define an entity reference. An entity reference is used to substitute an abbreviation for a long string in your document.

Syntax:

    DEFine/ENTity entity_spec=entity_code TEXT='char_cons_255' | NONE

Parameters:

entity_spec (Required)

Specifies the entity reference. An entity reference can be in one of three forms: entity id, entity character, or entity character code.

An entity id is a long_id (a character string from 1 to 32 characters in length). For more details about long_id, see Appendix A, “Common Syntax.” An entity id appears in the text delimited by the entity reference open and entity reference close marks, which are defined on the DEFINE/CONVERTER statement.

An entity character is a single character (with a character code from 1 to 255) enclosed in single quotes. It appears in the text as a single character without the quotes. This form of entity reference is known as a short entity reference.

An entity character code is an integer from 1 to 255. It appears in the text as a single character, not as a character code. In other words, you enter the character whose character code is the integer you used for the entity character code. (See Example #3.)

entity_code (Required)

Specifies the unique internal code for this entity. An entity_code is an integer from 1 to 65535. Numbers from 1 to 255 take the least amount of storage and numbers from 256 to 65408 take more storage. Therefore, the lower the number, the more compact the data storage. Assign the lower numbers to your more common entities.

TEXT='char_cons_255' | NONE (Required)

Specifies the ‘text substitution string’ which is substituted for the entity reference in order to validate, search, or index the data. This string can contain up to 255 characters.

Key Points:

For more information about common syntax (e.g., long_id or 'char_cons_255'), see “Common Syntax.”

Rather than using a single character as an entity reference (and requiring others to remember to use or avoid using this character in their documents), it is better to use an entity id.

A short entity reference cannot be null. Only character codes 1 through 255 are valid.

If you define a short entity reference as ‘<’, for example, you should not define any delimiters in your converters which begin with that character. This prevents any confusion during markup parsing.

Short entity references are not delimited by entity reference open and entity reference close marks.

As a document is imported into the system, the reference to an entity is either saved or lost, depending on the value of the RETAIN_ENTITY_REFS parameter on the DEFINE/CONVERTER definition.

If RETAIN_ENTITY_REFS=YES, the entity reference is saved as a kind of hidden embedded object. The text substitution string is put in the data and is NOT rescanned by the BGML or XML converter. Because the text substitution string is not rescanned, the converter does not notice any markup (such as start and end tags) that is embedded in the text substitution. When the document is exported, the original entity reference is output in the text and the substitution string is removed.

If RETAIN_ENTITY_REFS=NO, the entity reference is replaced by the text substitution string on import. The text substitution string is then a permanent part of your document.

Examples:

1. This first example shows the entity id form of an entity reference. Notice, in the document excerpt given below, that the entity id, XYZ, is delimited by the entity reference open and entity reference close marks (& and ; respectively).

    DEFINE/ENTITY XYZ=1 TEXT='XYZ Corporation'

    <H1>Holidays</H1>
    <P>Observance of holidays at &XYZ; should follow local custom for
    similar organizations in like circumstances and be consistent with
    the pursuit of &XYZ;'s objectives and purposes. Corporate approval
    is required before establishing any holidays.</P>

2. This is an example of a short entity reference. In the document excerpt, notice that the entity character is not enclosed in single quotes, nor is it delimited by entity reference open and entity reference close marks.

    DEFINE/ENTITY '@'=27 TEXT='XYZ Corporation'

    <H1>Holidays</H1>
    <P>Observance of holidays at @ should follow local custom for
    similar organizations in like circumstances and be consistent with
    the pursuit of @'s objectives and purposes. Corporate approval
    is required before establishing any holidays.</P>

3. This example uses the entity character code form of an entity reference. The DEFINE/ENTITY statement uses the integer 92, but the document segment uses \, the character whose character code is 92.

    DEFINE/ENTITY 92=32 TEXT='XYZ Corporation'

    <H1>Holidays</H1>
    <P>Observance of holidays at \ should follow local custom for
    similar organizations in like circumstances and be consistent with
    the pursuit of \'s objectives and purposes. Corporate approval
    is required before establishing any holidays.</P>
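
The substitution behavior described in this chapter can be modeled outside BASIS. The following Python sketch is an illustration only and is not BASIS code or a BASIS API: the function names normalize_spec and expand are hypothetical, the default delimiters & and ; are assumed from the examples above, and the pattern used to recognize an entity id is a simplification of long_id. It shows the three entity_spec forms, the case-sensitive matching introduced in BASIS 8.0, and the rule that a substituted string is not rescanned. It does not model entity_code storage or RETAIN_ENTITY_REFS.

```python
import re


def normalize_spec(spec):
    """Classify a (hypothetical) entity_spec value and return (kind, key).

    kind is "id" for a delimited entity id, or "short" for an entity
    character or entity character code (a short entity reference).
    """
    if isinstance(spec, int):  # entity character code, e.g. 92
        if not 1 <= spec <= 255:
            raise ValueError("entity character code must be 1 to 255")
        return "short", chr(spec)
    if len(spec) == 3 and spec[0] == spec[2] == "'":  # entity character, e.g. '@'
        if not 1 <= ord(spec[1]) <= 255:
            raise ValueError("entity character code must be 1 to 255")
        return "short", spec[1]
    if 1 <= len(spec) <= 32:  # entity id, e.g. XYZ
        return "id", spec
    raise ValueError("entity id must be 1 to 32 characters long")


def expand(document, definitions, open_mark="&", close_mark=";"):
    """Replace entity references in document with their substitution strings.

    definitions maps an entity_spec (str or int) to its TEXT string.
    """
    ids, shorts = {}, {}
    for spec, text in definitions.items():
        if len(text) > 255:
            raise ValueError("substitution string is limited to 255 characters")
        kind, key = normalize_spec(spec)
        (ids if kind == "id" else shorts)[key] = text

    # One alternative for delimited references, plus one per short character.
    delimited = (re.escape(open_mark)
                 + r"([^\s" + re.escape(close_mark) + r"]{1,32})"
                 + re.escape(close_mark))
    pattern = re.compile("|".join([delimited] + [re.escape(c) for c in shorts]))

    def replace(match):
        name = match.group(1)
        if name is None:  # a short entity reference matched
            return shorts[match.group(0)]
        # Case-sensitive lookup: an unknown or wrong-case name stays regular text.
        return ids.get(name, match.group(0))

    # A single pass over the input means substituted text is never rescanned.
    return pattern.sub(replace, document)


if __name__ == "__main__":
    print(expand("Holidays at &XYZ; and &xyz;", {"XYZ": "XYZ Corporation"}))
    print(expand("Holidays at @", {"'@'": "XYZ Corporation"}))
    print(expand("Holidays at \\", {92: "XYZ Corporation"}))
```

In this sketch, &xyz; is left untouched when only XYZ is defined, which mirrors how a reference with the wrong case is loaded as regular text. Doing the replacement in one pass is the design choice that keeps a short entity character inside a substitution string from being expanded again.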