Today I was writing a tool that needed to handle JSON. The content was relatively large and I only needed to pick out a few parts of the structure, so I thought this would be the perfect time to try out the new System.Text.Json JsonDocument.

To be clear, System.Text.Json offers multiple APIs, from high level to low level, and JsonDocument sits somewhere in the middle. In most cases the simpler JsonSerializer APIs will be perfectly adequate and the most appropriate choice.

The benefits of using JsonDocument are:

  • It has asynchronous APIs, for example JsonDocument.ParseAsync, which takes a Stream and, importantly, calls ReadAsync on that underlying stream.
  • It uses very little memory. It does this by:
    • Using shared rented buffers under the hood when reading the underlying stream.
    • Reading UTF-8 content directly, and not having to convert UTF-8 bytes into strings.
    • Only allocating when you ask it for a materialized value.
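
As a minimal sketch of those points (not the code from my tool), the snippet below parses UTF-8 bytes from a Stream with JsonDocument.ParseAsync and only allocates a string when GetString is called. The ParseAsyncSketch class, the ReadNameAsync method and the sample payload are all made up for illustration, and a MemoryStream stands in for a real network or file stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class ParseAsyncSketch
{
    // Hypothetical helper: parses UTF-8 JSON from any Stream and returns the "name" property.
    public static async Task<string> ReadNameAsync(Stream utf8Json)
    {
        // ParseAsync reads the stream asynchronously into pooled buffers.
        using (JsonDocument document = await JsonDocument.ParseAsync(utf8Json))
        {
            // GetString is the point where a managed string is actually allocated.
            return document.RootElement.GetProperty("name").GetString();
        }
    }

    public static async Task Main()
    {
        // Made-up payload; in a real service this would be a response or file stream.
        byte[] payload = Encoding.UTF8.GetBytes("{\"name\":\"example\",\"size\":3}");
        using (var stream = new MemoryStream(payload))
        {
            Console.WriteLine(await ReadNameAsync(stream));
        }
    }
}
```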

For my console application, these aren't properties I particularly need, but for a backend service, web app or microservice, you start to really care about these concerns.

Getting Started

If you are targeting netcoreapp3.0, there's nothing you need to add to your project; just start using the System.Text.Json namespace.

For netstandard2.0 projects or full framework, you'll need to add the following package which is in preview:

<PackageReference Include="System.Text.Json" Version="4.6.0-preview7.19362.9" />

The Problem

I want to parse the dotnet metadata file releases-index.json, found here, and pick out the URL for the channel JSON (2.2, for example). The channel JSON is much larger. In it, I then want to search for a particular version and select the files for that release; you can see an example here.

The Code

// using System;
// using System.Linq;
// using System.Net.Http;
// using System.Text.Json;
// using System.Threading.Tasks;
// using System.Collections.Generic;

var version = "2.2.100";
var channelVersion = ParseChannelVersion(version); // "2.2"

var httpClient = new HttpClient();
using var releasesResponse = await JsonDocument.ParseAsync(await httpClient.GetStreamAsync(
    "https://raw.githubusercontent.com/dotnet/core/master/release-notes/releases-index.json"));

var channel = releasesResponse.RootElement.GetProperty("releases-index").EnumerateArray()
    .First(x => x.GetProperty("channel-version").GetString() == channelVersion);

var channelJson = channel.GetProperty("releases.json").GetString();

using var channelResponse = await JsonDocument.ParseAsync(await httpClient.GetStreamAsync(channelJson));

var files = channelResponse
    .RootElement.GetProperty("releases").EnumerateArray()
    .SelectMany(x =>
    {
        IEnumerable<JsonElement> GetSdks()
        {
            yield return x.GetProperty("sdk");
            if (x.TryGetProperty("sdks", out var sdks))
            {
                foreach (var y in sdks.EnumerateArray())
                    yield return y;
            }
        }

        return GetSdks();
    })
    .First(x => x.GetProperty("version").GetString() == version)
    .GetProperty("files").ToString();
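
The snippet above calls ParseChannelVersion without showing it. One possible shape for it, assuming all it needs to do is keep the major.minor prefix of the SDK version, is sketched below. The ChannelVersionSketch class name and the error handling are my own guesses for illustration rather than the original helper.

```csharp
using System;

public static class ChannelVersionSketch
{
    // Hypothetical helper: keeps the "major.minor" prefix of an SDK version string.
    public static string ParseChannelVersion(string version)
    {
        string[] parts = version.Split('.');
        if (parts.Length < 2)
        {
            throw new FormatException($"Expected at least major.minor, got '{version}'.");
        }

        return parts[0] + "." + parts[1];
    }

    public static void Main()
    {
        Console.WriteLine(ParseChannelVersion("2.2.100")); // prints 2.2
    }
}
```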

When creating the JsonDocument, it could have been tempting to write it as:

JsonDocument.Parse(await httpClient.GetStringAsync("..."));

However, this would have materialized the entire content as a string first, undermining the overall benefits of using this API.

Also notice the use of using var releasesResponse here. This is a new C# 8.0 feature, the using declaration, which means Dispose will be called at the end of the enclosing scope (here, the end of the method), without the extra indentation that comes with the more familiar using block.
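
To make the comparison concrete, here is a small sketch showing the two forms side by side. The UsingDeclarationSketch class and its methods are invented for illustration, and it parses a tiny string only to keep the example short. Both methods dispose the document before returning.

```csharp
using System;
using System.Text.Json;

public static class UsingDeclarationSketch
{
    // C# 8.0 using declaration: the document is disposed when the enclosing scope exits.
    public static int CountWithDeclaration(string json)
    {
        using var document = JsonDocument.Parse(json);
        return document.RootElement.GetArrayLength();
    }

    // The more familiar using block: same disposal, one extra level of indentation.
    public static int CountWithBlock(string json)
    {
        using (var document = JsonDocument.Parse(json))
        {
            return document.RootElement.GetArrayLength();
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountWithDeclaration("[1,2,3]")); // prints 3
        Console.WriteLine(CountWithBlock("[1,2,3]"));       // prints 3
    }
}
```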

It's important to always dispose of JsonDocument. If you don't, the rented buffers are never returned to the pool, which defeats the pooling and puts extra pressure on the GC.
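
One consequence worth sketching: a JsonElement is only valid while its JsonDocument is alive, so if a value needs to outlive the document, JsonElement.Clone() can be used to copy it out first. The CloneSketch class, the ExtractFiles method and the sample JSON below are made up for illustration.

```csharp
using System;
using System.Text.Json;

public static class CloneSketch
{
    // Hypothetical helper: returns the "files" element so that it survives disposal.
    public static JsonElement ExtractFiles(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // Clone copies the element's data so it no longer depends on the
            // pooled buffers that Dispose hands back.
            return document.RootElement.GetProperty("files").Clone();
        }
    }

    public static void Main()
    {
        JsonElement files = ExtractFiles("{\"files\":[{\"name\":\"a.zip\"}]}");
        Console.WriteLine(files.GetArrayLength()); // prints 1
    }
}
```

Without the Clone call, reading from the returned element after the document has been disposed would throw an ObjectDisposedException.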

From then on the API is very easy to use. You can call .GetProperty("property-name").Get*() to get a typed value, and it's only at this point that anything is heap allocated. Where we have optional properties, we use TryGetProperty, which returns a boolean indicating whether the property was found and passes the value back as an out variable.
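
As a small illustration of required versus optional properties, the sketch below reads a required "version" with GetProperty and an optional "size" with TryGetProperty. The TryGetPropertySketch class, the Describe method and the property names are assumptions made up for the example.

```csharp
using System;
using System.Text.Json;

public static class TryGetPropertySketch
{
    // Hypothetical helper: formats a required "version" and an optional "size".
    public static string Describe(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            JsonElement root = document.RootElement;

            // Required property: GetProperty throws KeyNotFoundException if it is missing.
            string version = root.GetProperty("version").GetString();

            // Optional property: TryGetProperty reports whether it exists instead of throwing.
            if (root.TryGetProperty("size", out JsonElement size))
            {
                return version + " (" + size.GetInt32() + " bytes)";
            }

            return version + " (size unknown)";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe("{\"version\":\"2.2.100\",\"size\":42}")); // prints 2.2.100 (42 bytes)
        Console.WriteLine(Describe("{\"version\":\"2.2.100\"}"));             // prints 2.2.100 (size unknown)
    }
}
```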

Conclusion

What's really nice here is that I didn't need to make a tradeoff between readability and performance. I "only paid for what I used", and I was able to avoid the ceremony of creating a set of classes that comprehensively modelled content I was mostly discarding.

More information about System.Text.Json can be found here.