The eXtensible Markup Language (XML) is a markup language that uses textual tags to describe the meaning of its content. It is a text-based format and as such it is both human-readable and machine-readable.
XML was first developed as a means of overcoming the problem of storing and passing data between entities. Today, XML is commonly used in web applications to pass data over the network with the structure of the data embedded in the file itself. HTML is a closely related markup language: both descend from SGML, and XHTML is the variant of HTML that is based on XML.
What are the advantages and disadvantages of storing data in XML and binary format?
What are the advantages of XML over binary data?
Representing numbers and strings in binary can be ambiguous because different systems encode and represent these datatypes differently, which makes binary data difficult to interpret. For example, try saving an int and then a float to disk in binary format using a C program, and then read it back as a float followed by an int!
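The experiment is easy to reproduce without a C compiler. Here is a minimal sketch using Python's struct module, assuming a 4-byte int and a 4-byte IEEE 754 float in little-endian order (the values 1000 and 2.5 are just illustrative):

```python
import struct

# Write an int followed by a float as 8 raw bytes (little-endian, no padding).
payload = struct.pack("<if", 1000, 2.5)

# Read the bytes back in the order they were written.
right_int, right_float = struct.unpack("<if", payload)

# Read the same bytes back in the wrong order: float first, then int.
wrong_float, wrong_int = struct.unpack("<fi", payload)

print(right_int, right_float)  # the original values
print(wrong_float, wrong_int)  # a tiny denormal float and a huge integer
```

Nothing in the bytes themselves says which interpretation is the right one; the reader simply has to know the layout in advance.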
Binary suffers from a number of such representation issues: the endian byte order of integers, the IEEE format for floats, and the different-sized booleans and string formats used across platforms and programming languages, to name a few. Since XML is a text-based format, the representation of numbers and strings is unambiguous once the character encoding is known. What goes in as a printable character must come out as a printable character. Many kinds of corruption are therefore easy to detect: unexpected unprintable characters or malformed tags point to damaged data.
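To make the byte order issue concrete, here is a small sketch (again with Python's struct module, and an arbitrary example value of 1000) that writes the same integer in both byte orders and then reads one with the wrong assumption:

```python
import struct

value = 1000

little = struct.pack("<i", value)  # least significant byte first
big = struct.pack(">i", value)     # most significant byte first

# Same number, two different byte sequences on disk.
print(little.hex(), big.hex())

# A reader that assumes the wrong byte order gets a different number.
misread = struct.unpack(">i", little)[0]
print(misread)

# The decimal text form does not depend on the machine's byte order.
text = str(value)
print(text, int(text) == value)
```

The text form only needs the reader to agree on the character encoding, which an XML file can state in its declaration.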
XML is extensible: you can define new tags for purposes nobody anticipated when the format was designed. Adding new XML tags does not usually break existing code, because a reader can skip the elements it does not know about.
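As a rough illustration, assume a hypothetical sensor document with a temp element and a reader, read_temp, written before a humidity element was added. Neither the document layout nor the function name comes from a real system:

```python
import xml.etree.ElementTree as ET

def read_temp(xml_text):
    """Hypothetical reader written for the original format: it only knows <temp>."""
    root = ET.fromstring(xml_text)
    return float(root.findtext("temp"))

old_doc = "<reading><temp>21.5</temp></reading>"

# A later version of the format adds a tag the old reader has never heard of.
new_doc = "<reading><temp>21.5</temp><humidity>40</humidity></reading>"

print(read_temp(old_doc))
print(read_temp(new_doc))  # the unknown <humidity> element is simply ignored
```

This only holds for readers that look elements up by name. Code that depends on element position or validates against a strict schema can still break.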
Similarly, XML is a very flexible language in that slight formatting variations, such as extra whitespace between tags or the choice of quote character around attribute values, usually don’t matter.
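A quick sketch of what that tolerance looks like in practice, using a made-up reading document and a hypothetical extract helper:

```python
import xml.etree.ElementTree as ET

# The same data written compactly...
compact = '<reading unit="C"><temp>21.5</temp></reading>'

# ...and with different quotes, spacing, and line breaks.
spaced = (
    "<reading   unit='C' >\n"
    "    <temp>21.5</temp>\n"
    "</reading>\n"
)

def extract(xml_text):
    """Hypothetical helper: pull out the unit attribute and the temperature."""
    root = ET.fromstring(xml_text)
    return root.get("unit"), float(root.findtext("temp"))

print(extract(compact) == extract(spaced))
```

A binary format offers no such slack: one extra or missing byte shifts every field that follows it.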
XML is self-describing and text-based.
Binary Advantages over XML
There are some great reasons to use binary data. For one, binary data tends to be a lot smaller than text-based data. Transferring large amounts of data in binary form over a network is therefore much more efficient, as there is none of the overhead introduced by XML tags and character encoding. XML treats all data as alphanumeric and symbol characters. Storing raw binary data, such as an image file, therefore involves encoding its bytes as printable characters, which must be converted back into binary data when the file is read. Depending on the encoding, this adds roughly a third to the data size (Base64) or doubles it (hexadecimal), which makes XML poorly suited to storing and transmitting binary data.
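The size penalty depends on which text encoding is used, and the article does not name one. Here is a minimal sketch assuming the two most common choices, Base64 and hexadecimal, with a synthetic 3072-byte buffer standing in for an image file:

```python
import base64

# Stand-in for raw image bytes: every byte value 0..255, repeated 12 times.
raw = bytes(range(256)) * 12

b64 = base64.b64encode(raw)  # 4 printable characters per 3 bytes
hexed = raw.hex()            # 2 printable characters per byte

print(len(raw), len(b64), len(hexed))

# Decoding is an extra step the reader has to perform.
assert base64.b64decode(b64) == raw
assert bytes.fromhex(hexed) == raw
```

Either way, the payload grows before a single XML tag has been added around it.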
Storing the number 1000 as a standard 32-bit integer in binary format takes 32 bits, exactly the same number of bits as storing the string “1000”, which consists of four characters of 8 bits each. The number 10000 still takes only 32 bits as a binary integer, whereas its string representation, "10000", takes 5 × 8 = 40 bits in an XML file, excluding any tags and metadata.
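The arithmetic can be checked in a few lines. This sketch assumes a 32-bit integer, one byte per character (ASCII or UTF-8 digits), and a hypothetical value tag to show what the markup adds:

```python
import struct

sizes = {}
for n in (1000, 10000, 1000000):
    binary_bits = len(struct.pack("<i", n)) * 8      # fixed-width 32-bit int
    text_bits = len(str(n).encode("ascii")) * 8      # one byte per digit
    sizes[n] = (binary_bits, text_bits)
    print(n, binary_bits, text_bits)

# Wrapping the digits in a (hypothetical) tag adds its own characters.
tagged = "<value>10000</value>"
tagged_bits = len(tagged.encode("ascii")) * 8
print(tagged_bits)
```

For small numbers the text form can actually be shorter than a fixed-width integer, so the gap only opens up as values grow and markup is added.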
Another advantage is the speed of reading binary data. Binary values can be used directly, whereas reading an XML file means parsing the text and then converting it into numbers.
Binary data is simpler than XML. Parsing XML is complicated because there are multiple ways to structure the same data.
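One common case of "multiple ways to structure the same data" is the choice between an attribute and a child element. The sketch below uses a made-up reading document and a hypothetical read_temp function that has to handle both forms:

```python
import xml.etree.ElementTree as ET

# Two equally valid ways to carry the same value.
as_attribute = '<reading temp="21.5"/>'
as_element = "<reading><temp>21.5</temp></reading>"

def read_temp(xml_text):
    """Hypothetical reader that accepts either layout."""
    root = ET.fromstring(xml_text)
    value = root.get("temp")            # try the attribute form first
    if value is None:
        value = root.findtext("temp")   # fall back to the child element
    if value is None:
        raise ValueError("no temperature found")
    return float(value)

print(read_temp(as_attribute), read_temp(as_element))
```

Every additional layout a producer might choose means another branch like this in the consumer, which is where the parsing complexity comes from.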
Conclusion
XML’s flexibility is a burden when it comes to parsing XML data.