I tried using XML on a lark the other day and realized that XSDs are actually somewhat load bearing. Without a schema known beforehand, it's difficult to map XML data to objects in your favorite programming language, because a list with a single element is hard to distinguish from a plain property of the enclosing object.
Maybe this is okay if you know your schema beforehand and are willing to write an XSD. My use case relied on not knowing the schema. Despite my excitement to use a SAX-style parser, I tucked my tail between my legs and switched back to JSONL. Was I missing something?
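To make the single-element problem concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree. The helper naive_to_dict is hypothetical (written for this illustration, not taken from any library), and it ignores attributes and mixed content entirely.

```python
import json
import xml.etree.ElementTree as ET

def naive_to_dict(elem):
    """Hypothetical schema-less mapping: a repeated child tag becomes a
    list, a tag seen only once becomes a plain value."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                # Second occurrence: promote the earlier value to a list.
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

one = naive_to_dict(ET.fromstring("<order><item>apple</item></order>"))
two = naive_to_dict(ET.fromstring(
    "<order><item>apple</item><item>pear</item></order>"))
print(one)  # {'item': 'apple'}
print(two)  # {'item': ['apple', 'pear']}

# JSON carries the distinction in the syntax itself, no schema needed.
print(json.loads('{"item": ["apple"]}'))  # {'item': ['apple']}
```

The shape of the result changes with the number of items, which is exactly what a schema (or JSON's bracket syntax) would have pinned down.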
XML is extensible markup, i.e. it's like HTML that can be applied to tasks outside of representing web pages. It's designed to be written by hand. It has comments! A good use for XML would be declaring a native UI: it's not HTML but it's like HTML.
JSON is a plain text serialization format. It's designed to be generated and consumed by computers whilst being readable by humans.
Neither is a configuration language but both have been abused as one.
This assertion is comically out of touch with reality, particularly when trying to describe JSON as something that is merely "readable by humans". You could not do anything at all with XML without having to employ half a dozen frameworks and tools and modules.
> The complexity about XML comes from the many additional languages and tools built on top of it.
It's not just that, is it? There are also attributes versus child elements, dealing with white space including the xml:space attribute, namespaces, schemas, integration of external document fragments with xinclude:include or &extern;. Each of these is a huge can of worms in its own right. There are probably more that I'm not even aware of right now.
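A few of these can be shown in a handful of lines. This is only a sketch, assuming Python's standard xml.etree.ElementTree; the element names and the urn:example namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same fact, modelled two ways; every consumer has to know which one.
as_attr = ET.fromstring('<user name="Ada"/>')
as_child = ET.fromstring("<user><name>Ada</name></user>")
print(as_attr.get("name"))        # Ada
print(as_child.findtext("name"))  # Ada

# Indentation is not ignored: it shows up as text and tail content.
pretty = ET.fromstring("<user>\n  <name>Ada</name>\n</user>")
print(repr(pretty.text))     # '\n  '
print(repr(pretty[0].tail))  # '\n'

# A default namespace changes every tag name the code has to look for.
ns = ET.fromstring('<user xmlns="urn:example"><name>Ada</name></user>')
print(ns.find("name"))                     # None
print(ns.find("{urn:example}name").text)   # Ada
```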
A few years ago, I wrote a fully functional parser for JSON that is easy to verify for correctness and that isn't just lying around somewhere as a toy, but is actually used (by me) in various projects time and again. Overall, building this parser was almost trivial. With XML, I'm not even sure I would be able to write a correct and complete parser.
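For a sense of scale, a recursive-descent JSON reader fits in well under a hundred lines. The following is a rough sketch and not the parser described above: all names are hypothetical, and it is deliberately incomplete (it does not combine surrogate pairs, does not reject raw control characters in strings, and is lenient about malformed \u escapes).

```python
import re

_NUMBER = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")
_ESCAPES = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
            "n": "\n", "r": "\r", "t": "\t"}

def parse_json(text):
    value, pos = _value(text, _skip(text, 0))
    pos = _skip(text, pos)
    if pos != len(text):
        raise ValueError(f"trailing data at offset {pos}")
    return value

def _skip(text, pos):
    # Advance past JSON whitespace.
    while pos < len(text) and text[pos] in " \t\n\r":
        pos += 1
    return pos

def _value(text, pos):
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    ch = text[pos]
    if ch == "{":
        return _object(text, pos + 1)
    if ch == "[":
        return _array(text, pos + 1)
    if ch == '"':
        return _string(text, pos + 1)
    for literal, result in (("true", True), ("false", False), ("null", None)):
        if text.startswith(literal, pos):
            return result, pos + len(literal)
    match = _NUMBER.match(text, pos)
    if not match:
        raise ValueError(f"unexpected character at offset {pos}")
    token = match.group()
    number = float(token) if any(c in token for c in ".eE") else int(token)
    return number, match.end()

def _string(text, pos):
    # pos points just past the opening quote.
    out = []
    while True:
        if pos >= len(text):
            raise ValueError("unterminated string")
        ch = text[pos]
        if ch == '"':
            return "".join(out), pos + 1
        if ch == "\\":
            esc = text[pos + 1:pos + 2]
            if esc == "u":
                digits = text[pos + 2:pos + 6]
                if len(digits) != 4:
                    raise ValueError(f"bad \\u escape at offset {pos}")
                out.append(chr(int(digits, 16)))
                pos += 6
            elif esc in _ESCAPES:
                out.append(_ESCAPES[esc])
                pos += 2
            else:
                raise ValueError(f"bad escape at offset {pos}")
        else:
            out.append(ch)
            pos += 1

def _array(text, pos):
    items = []
    pos = _skip(text, pos)
    if text[pos:pos + 1] == "]":
        return items, pos + 1
    while True:
        value, pos = _value(text, _skip(text, pos))
        items.append(value)
        pos = _skip(text, pos)
        sep = text[pos:pos + 1]
        if sep == "]":
            return items, pos + 1
        if sep != ",":
            raise ValueError(f"expected ',' or ']' at offset {pos}")
        pos += 1

def _object(text, pos):
    obj = {}
    pos = _skip(text, pos)
    if text[pos:pos + 1] == "}":
        return obj, pos + 1
    while True:
        pos = _skip(text, pos)
        if text[pos:pos + 1] != '"':
            raise ValueError(f"expected string key at offset {pos}")
        key, pos = _string(text, pos + 1)
        pos = _skip(text, pos)
        if text[pos:pos + 1] != ":":
            raise ValueError(f"expected ':' at offset {pos}")
        value, pos = _value(text, _skip(text, pos + 1))
        obj[key] = value
        pos = _skip(text, pos)
        sep = text[pos:pos + 1]
        if sep == "}":
            return obj, pos + 1
        if sep != ",":
            raise ValueError(f"expected ',' or '}}' at offset {pos}")
        pos += 1

print(parse_json('{"ok": [1, 2.5, "hi", true, null]}'))
```

The whole grammar is six value kinds plus whitespace, which is why each function stays small enough to check by reading it.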
But I agree with you that XML-based languages and XML tools make things even worse. I had to work with XML a lot over ten years ago. I still get annoyed when I think about XSLT, or dealing with schemas, or the challenge of finding usable tools that are reasonably compliant with standards.
You can only have a positive view of XML when you think of something like this:
And at that level, I have (almost) no problem with XML. But as soon as things get more demanding and you really take the various aspects of XML's value proposition seriously, you enter a world of pain and despair. At least, that's how it was for me back then. Maybe I would see things differently today, but I'm not really interested in finding out.
First, you're describing the parsing side, while the message I was replying to claimed that it can't be written by hand.
Anyhow, schemas, XInclude and even namespaces are what I was referring to as additional languages or tools.
In your application you use them if you want, they're not really part of XML.
Of course even a parser for plain XML is a lot more complex than one for JSON, but people usually use libraries for that...
In any case, in your application nothing prevents you from using a dumbed-down version of XML, without entities, white space handling, and even only looking at elements and attributes; there were some applications that did that.
That already gives you a format that's easier to read and write manually than JSON.
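As a sketch of what such a subset can look like in practice: the snippet below assumes Python's standard xml.etree.ElementTree and reads only elements and attributes, dropping text and whitespace. The to_tree helper and the server/route vocabulary are invented for this example.

```python
import xml.etree.ElementTree as ET

def to_tree(elem):
    """Hypothetical reader for an elements-and-attributes-only subset:
    text content and whitespace are simply dropped."""
    return (elem.tag, dict(elem.attrib), [to_tree(child) for child in elem])

config = """
<!-- hand-written settings; comments are allowed -->
<server host="localhost" port="8080">
  <route path="/" handler="index"/>
  <route path="/health" handler="ping"/>
</server>
"""

tag, attrs, children = to_tree(ET.fromstring(config))
print(tag, attrs)  # server {'host': 'localhost', 'port': '8080'}
for child in children:
    print(child)
```

Note that every attribute value comes back as a string; the subset gives you structure and comments, not types.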
I had more to say about "attributes versus child elements", but it's taking me too much time, I'll probably do that tomorrow.
I think I understand your point. I only brought parsing into play to illustrate that XML is complicated, not because it's my general focus. I wouldn't classify namespaces, etc. as additional languages and tools, but that's beside the point.
> in your application nothing prevents you from using a dumbed-down version of XML
That's right. And if XML were exactly that, then there wouldn't be so many people frustrated with it. Unfortunately, in a professional work context, you don't always have control over whether it stays within this manageable subset. Sometimes the less pleasant aspects simply come into play, and then you have to deal with the whole complicated mess.
Are you sure about that? I've heard XML gurus say the exact opposite.
This is a very good example of why I detest the phrase “use the right tool for the job.” People say this as an appeal to reason, as if there weren't an obvious follow-up question that different people might answer very differently.
SGML was designed for documents, and it can be written by hand (or by a machine). HTML (another descendant of SGML) is in fact written by hand regularly. When you're using SGML descendants for what they were meant for (documents) they're pretty good for this purpose. Writing documents — not configuration files, not serialized data, not code — by hand.
XML can still be used as a very powerful generic document markup language that is more restricted (and thus easier to parse) than SGML. The problems began when people started using XML for other things, especially for configuration files, data interchange and even programming languages.
So I don't think GP is wrong. The authors of the original XML spec probably envisioned people writing this by hand. But XML is very bad for writing by hand the things that it eventually got used for.
Perfectly sure. XML is eXtensible Markup Language, the generalized counterpart to Hypertext Markup Language.
XML, HTML, SGML are all designed to be written by hand.
You can generate XML, just like you can generate HTML, but the language wasn't designed to make that easy.
Computers don't need comments, matching </end> tags, or whitespace stripping.
There was a time, in the early-mid 2000s when XML was the hammer for every screw. But then JSON was invented and it took over most of those use cases. Perhaps those XML gurus are stuck in a time warp.
XML remains a good way to represent tree structures that need to be human editable.
XML was designed as a document format, not a data structure serialization format. You're supposed to parse it into a DOM or similar format, not a bunch of strongly-typed objects. You definitely need some extra tooling if you're trying to do the latter, and yes, that's one of XSD's purposes.
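A small illustration of why a DOM is the natural target, sketched with Python's standard xml.dom.minidom (the sample markup is made up): mixed content is an ordered sequence of text and element nodes, which has no obvious field-by-field mapping onto a typed object.

```python
from xml.dom import minidom

doc = minidom.parseString("<p>Hello <b>world</b>!</p>")

# Text and elements interleave, and their order carries the meaning.
kinds = [(node.nodeName, node.nodeValue)
         for node in doc.documentElement.childNodes]
print(kinds)  # [('#text', 'Hello '), ('b', None), ('#text', '!')]
```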
That's underselling XML. XML is explicitly meant for data serialization and exchange, XSD reflects that, and it's the reason for JAXB, the Java XML binding tooling.
Don't get me wrong: JSON is superior in many aspects, and XML is utterly overengineered.
But XML absolutely was _meant_ for data exchange, machine to machine.
The design goals for XML are:
XML shall be straightforwardly usable over the Internet.
XML shall support a wide variety of applications.
XML shall be compatible with SGML.
It shall be easy to write programs which process XML documents.
The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
XML documents should be human-legible and reasonably clear.
The XML design should be prepared quickly.
The design of XML shall be formal and concise.
XML documents shall be easy to create.
Terseness in XML markup is of minimal importance.
Or heck, even more concisely from the abstract: "The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."
It's always talking about documents. It was a way to serve up marked-up documents that didn't depend on using the specific HTML tag vocabulary. Everything else happened to it later, and was a bad idea.
the origin of the latter, the edi/xml WG, was the successor of an edi/sgml WG which had started in the early 1990s, and was born out of the desire to get a "universal electronic data exchange" that would work cross platform, vms, mainframes, unix and even DOS hehe, and to leverage the successful sgml doc book interoperability.
was it niche? yes. was it starting in sgml already? and baked into xml/xsd/xslt? I think so.
>XML shall be straightforwardly usable over the Internet.
is machine to machine communication
to me, XML is an example of worse is better, or rather, better is worse. it would never have come out of Bell Labs in the early 70s. Neither would JSON for that matter.
And as for JAXB, it was released in 2003, well into XML's decadent period. The original Java APIs for XML parsing were SAX and DOM, both of which are tag and document oriented.
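To show what "tag oriented" means here, this is a minimal sketch using Python's standard xml.sax module rather than the Java API. The EventLog class and the sample document are hypothetical.

```python
import xml.sax

class EventLog(xml.sax.ContentHandler):
    """Hypothetical handler that records the raw event stream."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Whitespace-only chunks are skipped to keep the log short.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))

handler = EventLog()
xml.sax.parseString(b'<order id="7"><item>apple</item></order>', handler)
for event in handler.events:
    print(event)
```

The parser hands over tags and character data as they stream past; turning that into objects is left entirely to the application.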