If you've been watching the evolution of database technologies over the past few years, you've seen how quickly JSON has quickly cemented its position in major database servers. Due to its use in the web front-end, JSON has overtaken XML in APIs, and it’s spread through all the layers in the stack one step at a time.
Most major databases supported XML in some fashion for a while, too, but developer uptake wasn’t universal. JSON adoption among developers is nearly universal today, however. (The king is dead, long live the king!) But how good is JSON support in the databases we know and love? We’ll do a comparison in this blog post.
What is JSON Support?
First – what does it even mean for a database to support JSON?
You could easily argue that JSON support has been in databases for a long time. After all, all you have to do is store a blob of text that is correctly formatted in the database, and applications can store, retrieve, and manipulated as usual. But the level of JSON support that we are seeing in databases today exceeds that. JSON is becoming natively supported as a datatype, with functions and operators and all of the decoration that goes along with it.
For the purposes of this article, we don’t insist upon JSON being a natively supported data type that is distinct from other datatypes. It is enough for JSON to be stored as a blob, and manipulated with functions. However, as you will see, most databases go well beyond that.
JSON Support in MySQL
MySQL was late to the party. For a long time, developers, including VividCortex, stored JSON in blobs in MySQL, and either manipulated it in the application or used see compiled functions or stored procedures to manipulate it. That was always been suboptimal. MySQL, and particularly InnoDB, are not great at storing and manipulating blob data. It can be stored inefficiently, with a lot of overhead.
Luckily, with MySQL 5.7 that all changed. As of MySQL 5.7, JSON has been a natively supported data type.
In addition, MySQL 5.7 introduced a set of JSON functions for manipulating JSON columns directly. This is an even more recent update -- as a previous version of this blog post noted, when MySQL 5.7 first introduced JSON, its function prefixes were designated with
JSN_ , in case JSON function ever became part of the SQL standard. As current MySQL documentation notes, "For functions that take a JSON argument, an error occurs if the argument is not a valid JSON value," and, "Unless otherwise indicated, the JSON functions were added in MySQL 5.7.8."
Lastly in MySQL 5.7, it became possible to index values buried inside JSON documents by using virtual columns.
In the newest MySQL release, MySQL 8.0, several updates were added for document database support, such as
JSON_OBJECTAGG(), along with the JSON operator -
>>. MySQL 8.0's release notes describe the new operator as "similar to the
-> operator, but [it] performs JSON unquoting of the value as well. For a JSON column
mycol and JSON path expression
mypath, the following three expressions are equivalent."
In older versions of MySQL, a set of UDF functions had been available for quite a while for the community to install into their databases and manipulate JSON. There are also community UDFs in the MySQL UDF repository.
JSON Support in PostgreSQL
JSON has a longer history of support in PostgreSQL. I have not used it personally, but friends of mine have been talking to me about it at least since the 9.2 release. In this release, JSON was a natively supported datatype, and there were a couple of functions available for querying and manipulating it.
In the 9.3 release, support was greatly extended. In addition to the functions, there were a half-dozen operators, and these are still available in the latest PostgreSQL release, 9.6. Personally, I find these operators hard to read, and I don’t think they add much to SQL. I think they would be better off as functions. But that’s just my personal preference.
In version 9.4, the JSON datatype was supplemented by a JSONB datatype, which is still supported in 9.6. JSONB includes another half dozen operators, making the SQL look even more like Perl. There are also a couple of dozen JSONB functions, one for each corresponding JSON function, and PostgreSQL 9.6 added JSONB_Insert().
Perhaps one of the biggest advantages of JSONB over plaintext JSON is that it supports indexing. You can create functional indexes in Postgres, of course, so you can still index plaintext JSON; but it’s better and more efficient to index JSONB than JSON.
JSON Support in MongoDB
It may seem a little silly to include this section in this blog post, especially if you are familiar with MongoDB. However, even if it’s a little obvious, we don’t want to leave it unstated. And there are non-obvious things, too.
However, until versions of MongoDB from a few years ago, the data was stored with no compression. The recommended way of mitigating this problem was to use short field names, which wasn’t a very good alternative, frankly. As of MongoDB version 3.0, the storage engine began to include compression by default, much improving the situation. You can also use Percona’s TokuMX, which includes a storage engine with high-performance, transactions, and native compression.
Indexing is natively supported as well, naturally. This has been included since the first MongoDB releases. However, indexing has improved a lot over time, including the addition of sparse indexes (since not all documents will contain all fields you are indexing).
JSON Support in SQL Server
JSON support in Microsoft SQL server arrived in the 2016 release of the product. However, in contrast to the other databases we have discussed, SQL Server doesn't include a native JSON datatype. Instead, the functionality is quite similar to the native XML functionality that has existed in SQL Server for a long time. If you are familiar with SQL Server, you will immediately recognize the
OPENJSON() function. It operates on
NVARCHAR values. You can learn more in Microsoft's documentation on SQL Server 2016, which has been updated recently.
Indexing is also limited. No native JSON indexes; just fulltext indexing.
You can read more details on the MSDN blog.
One of the big reasons why people are interested in JSON support in databases is that they want to use fewer types of databases. The modern technology stack is beginning to introduce significant sprawl as people use different databases in particular areas, taking advantage of their strengths to gain efficiency. However, this polyglot persistence increases the technology surface area enough that it can become quite difficult to monitor, manage, develop, and operate such a diverse set of databases.
One potential answer to this problem is to continue using work horses such as MySQL and Postgres, replacing specialized JSON native databases with the new JSON functionality. For the record, MongoDB is not the only JSON native database. There are many databases, including RethinkDB, that deal with JSON natively.
A big difference between the JSON support in MySQL, Postgres, and MongoDB is that in MongoDB, this is the native transport across the wire as well. JSON is native end to end in this database, whereas in other databases it is typically shoehorned into row and column storage in the client and on the wire, as well as the programming API.
Still, keeping technology diversity down can be a big enough reason to continue using the reliable and trusted databases of yore.