XML Tips and Tricks: Line Breaks in XML Documents

Working with XML files has always had a few gotchas. Special characters, especially line breaks, are one of the worst of these. Continue reading this article to discover how to create line breaks in XML documents.

By Daniel Robson

XML was designed as a universal data storage standard. The idea was that any program could read and process an XML file, so data could be shared freely rather than kept hidden in proprietary data silos. It was a reasonable goal, but the lack of a single standard for displaying the contents of XML files has led to several different ways of interpreting the stored data. This article explores how to insert a line break into XML text nodes. It is also possible to add line breaks to attribute values, but the XHTML specification recommends against this.

Line Breaks Across Operating Systems

Because XML is designed as a universal format, it needs to work across all operating systems. Special control characters are used to signify the end of a line, dating all the way back to the line feed (LF, ‘\n’, 0x0A) and carriage return (CR, ‘\r’, 0x0D) characters in the ASCII standard. Early computer systems used the two together as CR+LF, primarily because they interacted directly with teletype terminals, and this convention has carried over to modern Windows-based systems. However, *nix-based systems such as Linux and Mac OS X use just the line feed character, while version 9 and earlier of the Mac operating system used the carriage return as their newline character.
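
The three conventions above are easy to tell apart at the byte level. The following is a minimal sketch in Python (not a language the article otherwise uses); the helper name `count_newline_styles` and the sample bytes are made up for illustration.

```python
import re

def count_newline_styles(data: bytes) -> dict:
    """Count each line-ending convention found in raw bytes."""
    return {
        "CRLF": len(re.findall(rb"\r\n", data)),     # Windows: CR followed by LF
        "LF": len(re.findall(rb"(?<!\r)\n", data)),  # Linux, Mac OS X: LF on its own
        "CR": len(re.findall(rb"\r(?!\n)", data)),   # Mac OS 9 and earlier: CR on its own
    }

# Hypothetical sample mixing all three conventions.
sample = b"windows\r\nunix\nclassic mac\rend"
counts = count_newline_styles(sample)
print(counts)
```

The lookbehind and lookahead keep a CR+LF pair from also being counted as a lone CR or a lone LF.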

Already we can see that handling the wide variety of newline characters may be an issue. Even worse, due to the nature of XML, how these line break characters are interpreted could vary wildly between individual programs. Luckily this issue was foreseen by the designers of XML: by default, an XML parser normalizes every line break in a document (CR+LF or a lone CR) to ‘\n’, the line feed character. However, if you want a different character to indicate a line break in your XML document, it is possible to insert it using a numeric character reference. For instance, ‘\r’, the carriage return character, can be inserted using ‘&#13;’.
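
This normalization can be observed with any conforming parser. As a rough illustration, here is a sketch using Python's standard `xml.etree.ElementTree` module (a choice made for this example, not something the article relies on); the `<note>` documents are invented.

```python
import xml.etree.ElementTree as ET

# Three hypothetical documents that differ only in how the line break is written.
crlf_doc = "<note>first line\r\nsecond line</note>"   # literal CR+LF
cr_doc = "<note>first line\rsecond line</note>"       # literal lone CR
ref_doc = "<note>first line&#13;second line</note>"   # CR as a character reference

crlf_text = ET.fromstring(crlf_doc).text
cr_text = ET.fromstring(cr_doc).text
ref_text = ET.fromstring(ref_doc).text

# Literal line breaks are normalized to a line feed by the parser,
# while the character reference keeps its carriage return.
print(repr(crlf_text))
print(repr(cr_text))
print(repr(ref_text))
```

The reference survives because it is resolved after line-end normalization has already taken place.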

Line Break Handling in Specific Programs

For the vast majority of programs, then, the normalized ‘\n’ line break in XML files should be interpreted correctly. This is certainly the case in C# and Java. As a quick aside, if you are programmatically generating XML files in C#, I would highly recommend reading Microsoft’s article on indenting XML files using XmlDocument or XSL Transforms.

There are, of course, a number of programs that aren’t quite so nice; Flash and ActionScript are two notorious culprits for messing up line breaks. One way to get around Flash’s distaste for whitespace characters is to make use of ‘ignoreWhite’ when loading an XML document:

// Set ignoreWhite before calling load() so whitespace-only text nodes are discarded
var objectName = new XML();
objectName.ignoreWhite = true;
objectName.load("fileToLoad.xml");

This will often fix weird formatting issues, such as double-spaced line breaks. If you find that Flash isn’t rendering line breaks in your XML file at all, then CDATA sections may be the solution. These allow HTML tags to be embedded within an XML node, and can be used specifically to embed the <br /> HTML line break tag, like so:

<![CDATA[first line<br />second line]]>
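
To see why a CDATA section helps, it can be useful to check what a parser actually hands back. The sketch below again uses Python's `xml.etree.ElementTree` purely as an illustration; the `<message>` element is a made-up wrapper, and Flash's own parser may behave differently in its details.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an HTML line break tag wrapped in a CDATA section.
doc = "<message><![CDATA[first line<br />second line]]></message>"

message = ET.fromstring(doc)
cdata_text = message.text   # the CDATA content arrives as plain character data
child_count = len(message)  # the <br /> was never parsed as a child element

print(repr(cdata_text))
print(child_count)
```

Because the markup inside the CDATA section reaches the application as ordinary text, it is up to whatever renders that text (an HTML-aware text field, for example) to turn the tag into a visible line break.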

When dealing with Flash, using numeric character references is also a valid tactic. References like ‘&#13;’ get ignored by the initial Flash parser, and are hence displayed properly when parsed by the browser.

Moving away from Flash, PHP is another language that sometimes has issues properly displaying newline characters. When outputting data from an XML document into HTML, you might find it useful to make use of the ‘nl2br’ function, which inserts an HTML line break before every newline character.
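
For readers outside PHP, the same idea is easy to approximate. The following Python sketch defines a hypothetical `nl2br` helper that mimics the behaviour described above; it is an illustration written for this article, not PHP's implementation, and the `<note>` document is invented.

```python
import html
import re
import xml.etree.ElementTree as ET

def nl2br(text: str) -> str:
    """Insert an HTML line break before every newline sequence (hypothetical helper)."""
    return re.sub(r"(\r\n|\n\r|\n|\r)", r"<br />\1", text)

# Hypothetical XML document whose text node spans two lines.
doc = "<note>first line\nsecond line</note>"
node_text = ET.fromstring(doc).text

# Escape HTML-special characters first, then add the line break tags.
html_fragment = nl2br(html.escape(node_text))
print(html_fragment)
```

Escaping before inserting the tags keeps any stray ‘<’ or ‘&’ in the XML text from being treated as markup in the HTML output.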

Conclusion

Hopefully this article has given a good overview of how to handle line breaks in XML files. If you have any questions, please feel free to leave a note in the comments section, and I’ll try to answer them.