Getting Text Node Contents from XML in ActionScript 3.0

Following up on Mark's comment regarding XML text nodes, I compared the output of a handful of XML methods in different situations. The underlying problem is extracting HTML from an XML element.

With XML entities encoded

The first set of results uses XML data in which the HTML markup has been escaped with XML entities. In the code below, the bold tag has been properly encoded.

// nodeKind is "element"
var node:XML = <value>Some &lt;b&gt;text&lt;/b&gt; is bold</value>;
Code                 Output
node                 Some <b>text</b> is bold
node.*               Some <b>text</b> is bold
node.text()          Some <b>text</b> is bold
node.toString()      Some <b>text</b> is bold
node.toXMLString()   <value>Some &lt;b&gt;text&lt;/b&gt; is bold</value>

As you can see, most of these methods decode the entities automatically. To get the contents without decoding the XML entities, use node.*.toXMLString().
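
A minimal sketch of the two approaches, reusing the node from the example above. The variable names decoded and raw are only illustrative, and the comments restate the results observed in this post rather than documented guarantees.

```actionscript
// Same element as above: the bold tag is escaped with XML entities.
var node:XML = <value>Some &lt;b&gt;text&lt;/b&gt; is bold</value>;

// toString() decodes the entities, giving HTML that can be used directly.
var decoded:String = node.toString();

// toXMLString() on the child nodes leaves the entities encoded.
var raw:String = node.*.toXMLString();

trace(decoded);
trace(raw);
```

The decoded form is the one to use when the string is going to be treated as HTML; the raw form is useful when the text has to be written back into another XML document.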

With unencoded HTML tags

This second example uses unencoded HTML tags. Encoding the XML entities is the better practice, but I've had some clients who wanted to edit the XML file directly, using HTML tags without manually escaping all the entities. The disadvantage is that a single piece of malformed HTML can ruin the entire XML document, but this was acceptable since these were technical users who could fix such problems on their own.

Here are the results using XML with unencoded HTML tags.

// nodeKind is "element"
var node:XML = <value>Some <b>text</b> is bold</value>;
Code                 Output
node                 <value>
                       Some
                       <b>text</b>
                       is bold
                     </value>
node.*               Some
                     <b>text</b>
                     is bold
node.text()          Someis bold
node.toString()      <value>
                       Some
                       <b>text</b>
                       is bold
                     </value>
node.toXMLString()   <value>
                       Some
                       <b>text</b>
                       is bold
                     </value>

There's a lot more variety here, though most of the methods return the entire element, enclosing value tags included, as a string. node.* is the best choice for picking out just the HTML contents.
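
Since node.* returned the HTML contents in both sets of results, a small helper can cover both cases. This is only a sketch: getHtmlContents is a hypothetical name, not part of the XML API, and it relies on the behaviour observed in the results above rather than on anything further.

```actionscript
// Hypothetical helper (not part of the XML class).
function getHtmlContents(node:XML):String {
    // node.* is an XMLList of all children; its toString() returned the
    // HTML in both the encoded and the unencoded case in this post.
    return node.*.toString();
}

var encoded:XML = <value>Some &lt;b&gt;text&lt;/b&gt; is bold</value>;
var unencoded:XML = <value>Some <b>text</b> is bold</value>;

trace(getHtmlContents(encoded));
trace(getHtmlContents(unencoded));
```

Note that for the unencoded case the result is spread over several lines, as shown in the results above.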

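The behaviour of node.text() warrants a little explanation. The XML.text() method returns an XMLList of the direct children of node whose nodeKind() is "text". Since the bold tag has a nodeKind() of "element", that node and its contents aren't included. The missing space in "Someis bold" is consistent with XML.ignoreWhitespace, which defaults to true and drops leading and trailing whitespace from text nodes when the XML is parsed.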
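
To see which children text() skips, one option is to loop over node.* and print each child's nodeKind(). A small sketch using the same unencoded example:

```actionscript
var node:XML = <value>Some <b>text</b> is bold</value>;

// Visit every direct child and report what kind of node it is.
for each (var child:XML in node.*) {
    trace(child.nodeKind() + ": " + child.toXMLString());
}
```

The two text children are the ones node.text() returns; the b child is reported as an element and is left out.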
