
Lesson 4 Defining entities
Objective Define entities in a DTD.

Defining XML Entities

In HTML, you can include special characters through the use of pre-defined entities such as these:
&lt;
&quot;
&copy;

These entities stand for characters that are not available on the keyboard or that the browser might otherwise interpret as markup. For example, &lt; represents the less-than sign (<), which the browser would otherwise read as the start of a tag.
A Document Type Definition (DTD) is used to define the structure and data types for XML documents, and entities are a key component in this process. Entities in a DTD act as placeholders or shortcuts for reusable content, such as text, markup, or external data, allowing for modularity and efficiency in XML documents. They are defined using the <!ENTITY> declaration, which specifies the entity's name and its replacement content. Entities can be internal, where the content is defined directly in the DTD, or external, where the content is pulled from an outside source. Properly defining entities ensures consistency, simplifies maintenance, and enables reuse across multiple XML documents.
To define an internal entity, the syntax is <!ENTITY entityName "replacementText">. For example, <!ENTITY company "Acme Corp"> creates an entity named company that, when referenced as &company; in the XML document, is replaced with "Acme Corp". Internal entities are useful for fixed text like company names, addresses, or boilerplate content that appears repeatedly. For external entities, the declaration references an external resource, such as <!ENTITY chapter SYSTEM "chapter1.xml">, where chapter pulls in the content of the file chapter1.xml. External entities are ideal for including large sections of content or data maintained separately, ensuring the XML document remains modular and easier to update.
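The two declarations above can be placed together in one document. The following is a minimal sketch, assuming a root element named book and a file chapter1.xml that sits next to the document and holds well-formed content; both names are illustrative rather than part of this lesson's material.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!-- Internal entity: the replacement text is given inline -->
  <!ENTITY company "Acme Corp">
  <!-- External parsed entity: the replacement text is read from a file.
       chapter1.xml is assumed to contain something like
       <chapter>...</chapter> -->
  <!ENTITY chapter SYSTEM "chapter1.xml">
]>
<book>
  <publisher>&company;</publisher>
  &chapter;
</book>
```

A parser that reads external entities would replace &chapter; with the content of chapter1.xml. Non-validating parsers are allowed to skip external entities, and many disable them by default for security reasons, so it is worth checking how your parser is configured.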
Parameter entities are a second kind of entity, intended for use within the DTD itself. They are defined with <!ENTITY % entityName "replacementText"> and referenced using %entityName;. They are typically used to define reusable DTD components, such as content models or attribute lists. For instance, <!ENTITY % commonAttrs "id ID #IMPLIED"> allows the commonAttrs entity to be reused in multiple attribute-list declarations. Note that a parameter entity reference may appear inside another declaration only in the external DTD subset; in the internal subset it may appear only between declarations. When defining entities, use unique names, avoid circular references, and check that the replacement text is well formed in the place where it will be used, to prevent parsing errors. By carefully structuring entity declarations, developers can create flexible, maintainable, and scalable XML document frameworks.
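As a sketch of how commonAttrs might be reused, the following shows an external DTD file. The file name common.dtd and the element names catalog and product are assumptions made for illustration. The declarations are kept in an external file because a parameter entity reference inside an ATTLIST declaration is not permitted in the internal subset.

```xml
<!-- common.dtd: hypothetical external DTD file -->

<!-- Declare the parameter entity once -->
<!ENTITY % commonAttrs "id ID #IMPLIED">

<!ELEMENT catalog (product+)>
<!-- %commonAttrs; expands to: id ID #IMPLIED -->
<!ATTLIST catalog %commonAttrs;>

<!ELEMENT product (#PCDATA)>
<!ATTLIST product %commonAttrs;>
```

A document would then point to this file with a declaration such as <!DOCTYPE catalog SYSTEM "common.dtd">. Adding an attribute to commonAttrs would extend both elements at once.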


Built-in Entities

In XML, there are five built-in entities that you need not define but can readily use. These are:
   
&lt;
&gt;
&quot;
&apos;
&amp;
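A short illustration of the five built-in entities in use follows. No DTD is required for these references. The element and attribute names are made up for the example.

```xml
<?xml version="1.0"?>
<samples>
  <!-- &lt; and &amp; must be escaped in character data -->
  <code>if (a &lt; b &amp;&amp; b &gt; c)</code>
  <!-- &quot; lets a double quote appear inside a double-quoted attribute value -->
  <remark text="She said &quot;hello&quot;"/>
  <!-- &apos; stands for the apostrophe -->
  <name>O&apos;Brien</name>
</samples>
```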

  • Declaring general entities
    A general entity is defined in a DTD and referenced in the XML document. In a DTD, you can define your own entities in addition to the five listed above. An entity's replacement text is not limited to a single character; it can be a string of any length.

Defining Multiple Characters

Imagine that you are working on a new software product code-named Maui. You may need to create a host of marketing information for your Web site, and you would not want to manually search for and replace the word Maui on each of 50 pages with the eventual product name. Instead, you could declare an entity named NewProduct, referenced as &NewProduct;, whose replacement text is "Maui" until the name of the product is decided. You would then simply replace the string "Maui" in the entity declaration with the new product name. Once you do this, the new product name appears everywhere that &NewProduct; is referenced.
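One way to arrange this, sketched under the assumption that all pages share a single external DTD file, is to keep the declaration in that file. The file name products.dtd and the element names page and para are hypothetical.

```xml
<!-- products.dtd: hypothetical DTD file shared by every page -->
<!-- Change "Maui" here once the final product name is decided -->
<!ENTITY NewProduct "Maui">
```

Each page then refers to the shared file and uses the entity reference:

```xml
<?xml version="1.0"?>
<!DOCTYPE page SYSTEM "products.dtd">
<page>
  <para>Introducing &NewProduct;, our newest product.</para>
</page>
```

This relies on the parser actually reading the external DTD. Validating parsers do so, but a non-validating parser is not required to, in which case &NewProduct; would not be expanded.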
  • Single XML document
    A single XML document can draw both data and declarations from many different sources in many different files. In fact, some of the data may come directly from databases, CGI scripts, or other nonfile sources. The storage units that hold the pieces of an XML document, in whatever form they take, are called entities. Entity references load these entities into the main XML document. General entity references load data into the root element of an XML document. &lt;, &gt;, &apos;, &quot;, and &amp; are predefined general entity references that refer to the text entities <, >, ', ", and &, respectively. Parameter entity references load data into the document's document type definition (DTD). They begin with a % instead of an &. Unparsed entities point to non-XML, binary data whose type is identified with a notation, and they are referenced through an attribute of type ENTITY. All three kinds of entities are declared in the DTD.
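Unparsed entities are the least familiar of the three kinds, so a small sketch may help. It assumes an image file named logo.png and uses illustrative names for the notation, entity, element, and attribute.

```xml
<?xml version="1.0"?>
<!DOCTYPE product [
  <!-- The notation identifies the type of the non-XML data -->
  <!NOTATION png SYSTEM "image/png">
  <!-- NDATA marks the entity as unparsed and names its notation -->
  <!ENTITY logo SYSTEM "logo.png" NDATA png>
  <!ELEMENT product (#PCDATA)>
  <!-- An attribute of type ENTITY may name an unparsed entity -->
  <!ATTLIST product image ENTITY #IMPLIED>
]>
<product image="logo">Maui</product>
```

Note that the attribute value is the bare entity name, logo, not a reference written as &logo;. The parser does not read logo.png itself; it reports the entity and its notation to the application, which decides what to do with the file.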


Entity as a Replaceable Field

If you have ever performed a "mail merge," you are familiar with using replaceable fields to represent actual text. Think of an entity as a replaceable field representing some other text. The beauty of using entities is that you can define them once in a DTD and then use them throughout a range of documents. Declare an entity using the following syntax:
<!ENTITY entityName "character string represented">

If in your DTD you declared the following entity:
<!ENTITY prodname "ACME Calendar">
you could use the following in your XML file: Thank you for choosing &prodname; as your primary scheduling program.
When rendered by a user agent, the actual text would then read as follows:
Thank you for choosing ACME Calendar as your primary scheduling program.
The next lesson shows you how to create parameter entities to use within a DTD.
