I've often wished a data format connoisseur (they exist!) would write a guide to what makes a good data format. I've encountered critiques of and comments on specific formats, but never the principles behind designing a good format or the ways that formats go bad.
The topic covers both configuration files and data interchange formats, though it's debatable whether the two should overlap at all. And yet we have JSON...
TOML seems solid for configuration files, but I've seen an order of magnitude less discussion of it than JSON, YAML or XML.
There are quite a few edge cases with JSON.
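To make a few of those edge cases concrete, here is a minimal sketch using Python's standard `json` module. The behaviors shown are those of this one parser; other implementations may differ, which is part of the problem. The helper name `parses` is made up for illustration.

```python
import json

# Duplicate keys: Python's parser silently keeps the last value.
dup = json.loads('{"a": 1, "a": 2}')

def parses(text: str) -> bool:
    """Hypothetical helper: report whether json.loads accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Trailing commas are rejected, a common annoyance when hand-editing.
trailing_ok = parses("[1, 2,]")

# NaN is not valid JSON, yet json.dumps emits it unless allow_nan=False.
nan_text = json.dumps(float("nan"))

# Large integers survive in Python, but not in consumers that store
# every number as an IEEE 754 double (JavaScript, for example).
big = 2**53 + 1
roundtrip = json.loads(json.dumps(big))
as_double = float(big)

print(dup, trailing_ok, nan_text, roundtrip == big, int(as_double) == big)
```

Passing `allow_nan=False` to `json.dumps` makes the NaN case raise `ValueError` instead of producing output that stricter parsers reject.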
Tim Bray, editor of the JSON RFCs 7159 and 8259, identifies three pain points: commas, timestamps and schemas. See also the TOML discussion of how to handle times (which still seems to be unresolved).
TOML prioritizes human editability more than JSON does (it allows comments and has a friendlier syntax), and simplicity more than YAML does. It has some similarities to .ini, but is better specified.
Dhall is a programmable configuration language that is not Turing-complete.
You can think of Dhall as: JSON + functions + types
So, to a great extent, you can forget about the space-efficiency of your file formats and wire formats if you run them through a generic compression algorithm as a last step, and optimize them entirely for readability, extensibility, and simplicity.
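A rough way to check that claim is to compress a verbose, readable encoding and a terse, hand-optimized one, then compare the gap before and after. The sketch below uses `zlib` from the standard library. The records and field names are invented for illustration, and such repetitive synthetic data overstates how well real data would compress.

```python
import json
import zlib

# Invented sample data: 1000 small sensor readings.
records = [
    {"sensor_id": i % 10, "temperature_celsius": 20 + (i % 7), "status": "ok"}
    for i in range(1000)
]

# Readable encoding: full field names, indentation.
verbose = json.dumps(records, indent=2).encode("utf-8")

# Terse encoding: positional arrays, no whitespace, status as a flag.
terse = json.dumps(
    [[r["sensor_id"], r["temperature_celsius"], 1] for r in records],
    separators=(",", ":"),
).encode("utf-8")

verbose_z = zlib.compress(verbose, 9)
terse_z = zlib.compress(terse, 9)

print("raw bytes:       ", len(verbose), "vs", len(terse))
print("compressed bytes:", len(verbose_z), "vs", len(terse_z))
```

The comparison to look at is the absolute difference between the two encodings, which should shrink sharply once both are compressed.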
There are at least three types of deserialization vulnerabilities: buffer overflows in languages that aren't memory-safe, denial-of-service attacks, and allowing the deserialization of arbitrary classes (which typically means remote code execution).
Java serialization has produced a long series of security issues.
Describes the security vulnerabilities in YAML deserialization that hit Rails in 2013. Nicely points out that even restrictive whitelists can enable attacks. Maybe YAML is just too expressive.
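The third category can be shown in a few lines with Python's `pickle`, which has the same design flaw. This is a minimal sketch: the `Exploit` class and the `safe_loads` helper are hypothetical names, the payload is a harmless `eval("6 * 7")`, and the restricted unpickler follows the pattern from the `pickle` documentation of overriding `find_class`.

```python
import io
import pickle

class Exploit:
    """Hypothetical attacker-controlled object."""
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call eval('6 * 7')".
        # A real attacker would name something far less benign.
        return (eval, ("6 * 7",))

payload = pickle.dumps(Exploit())

# Plain pickle.loads runs the callable named in the payload.
result = pickle.loads(payload)

class SafeUnpickler(pickle.Unpickler):
    # Assumption for this sketch: no globals at all are allowed,
    # so only plain containers and scalars can be loaded.
    ALLOWED = frozenset()

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")
        return super().find_class(module, name)

def safe_loads(data: bytes):
    """Hypothetical helper: unpickle with the restricted unpickler."""
    return SafeUnpickler(io.BytesIO(data)).load()

try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(result, blocked)
```

Even with an allowlist like this, every permitted class widens the attack surface, so a data-only format is usually the safer choice for untrusted input.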
Graydon Hoare's Criterion
(IMO a good criterion for data format robustness is "how much work does a conforming processor have to do to skip a quoted payload") - Graydon Hoare (@graydon_pub), May 18, 2017
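One way to read that criterion is to compare a length-prefixed payload with a delimiter-and-escape one. The sketch below is an illustration rather than anything taken from the tweet: it uses netstrings (`5:hello,`) for the first kind and JSON strings for the second, and both function names are made up. Skipping the netstring costs one length parse and a jump. Skipping the JSON string means inspecting every byte for escapes.

```python
import json

def skip_netstring(buf: bytes, pos: int) -> int:
    """Return the index just past the netstring starting at pos."""
    colon = buf.index(b":", pos)
    length = int(buf[pos:colon])
    end = colon + 1 + length          # jump straight over the payload
    if buf[end:end + 1] != b",":
        raise ValueError("missing netstring terminator")
    return end + 1

def skip_json_string(buf: bytes, pos: int) -> int:
    """Return the index just past the JSON string starting at pos."""
    if buf[pos:pos + 1] != b'"':
        raise ValueError("expected opening quote")
    i = pos + 1
    while i < len(buf):               # must look at every byte
        c = buf[i]
        if c == 0x5C:                 # backslash: skip the escaped byte too
            i += 2
        elif c == 0x22:               # unescaped closing quote
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated string")

raw = b'a"b\\c'                       # payload with a quote and a backslash
net = str(len(raw)).encode() + b":" + raw + b"," + b"X"
js = json.dumps(raw.decode()).encode() + b"X"

print(net[skip_netstring(net, 0):], js[skip_json_string(js, 0):])
```

The JSON skipper only finds the end of the string. A conforming processor would also have to validate the escapes and the encoding, which is more work still.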
Security and DoS vulnerabilities in XML, written from a Python perspective.
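Many of the XML denial-of-service attacks (entity expansion in particular) depend on a DTD, so one blunt defense is to refuse any document that declares one. Below is a minimal sketch using only the standard library. `DoctypeForbidden` and `parse_without_dtd` are hypothetical names, and the input is parsed twice for simplicity. It does not cover every attack.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""

def parse_without_dtd(text: str) -> ET.Element:
    """Hypothetical helper: parse XML only if it has no DOCTYPE."""
    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject
    checker.Parse(text, True)

    # Second pass: build the tree now that no DTD is present.
    return ET.fromstring(text)

safe = parse_without_dtd("<r>ok</r>")

# A tiny entity-expansion document; real attacks nest entities
# so that a few hundred bytes expand to gigabytes.
bomb = '<!DOCTYPE r [<!ENTITY a "aaaa">]><r>&a;&a;</r>'
try:
    parse_without_dtd(bomb)
    rejected = False
except DoctypeForbidden:
    rejected = True

print(safe.text, rejected)
```

For production use, a maintained library written for this purpose is a better choice than a hand-rolled check.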