Deserialize JSON field with multiple possible types in Rust

Ever needed a custom serde deserializer because a JSON field can arrive in several different types? Me too! After some research and testing, I found a way to convert all the possible types into a single type that the application can handle.
Rust and strict types - the problem
Rust is a strongly typed language, so every field or variable has exactly one type.
This is great, but it complicates things a bit if you have to process, for example, a JSON file where one field (or many) can have multiple types: boolean, array, integer, string, or maybe even null.
So what can we do about this in Rust? Well, using serde to deserialize the JSON file already helps a lot.
And the fact that you can customize the deserialization (as well as serialization of course) brings us close to a solution.
Customized deserialization
You can do a lot with serde, and most things work pretty much out of the box.
In many cases, deserializing a JSON payload is as easy as defining a matching struct and deriving Deserialize for it.
Even when we're dealing with optional values (i.e., a field that is not always set in the payload), it's just a matter of setting #[serde(default)] for the field in question in the associated struct.
How to handle multiple types
As explained above, it gets a bit more complicated when you want to handle a field that can have several different types. Basically, what we need now is a custom deserializer (and a serializer for the other direction).
In theory, there are at least three ways to do this:
You can create one field per type in your struct and try to assign the value to the matching field depending on its type.
I haven't tried this yet, so I'm not entirely sure it works.
Another way is using an enum like the following code:
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
enum MultipleTypes {
    Str(String),
    Vec(Vec<String>),
    Bool(bool),
    U64(u64),
}
This approach works without any custom deserializer or serializer; you use the enum as the field type in your struct:
#[derive(Debug, Serialize, Deserialize)]
struct MultiStruct {
    key_1: i32,
    key_2: bool,
    key_3: MultipleTypes,
}
Here key_3 has the MultipleTypes type and can therefore hold any of the types listed in the enum.
You could even handle null values by wrapping it in Option, like Option<MultipleTypes>.
For my use case, the downside of this approach was that I would have to match on the enum every time I use the value.
So, there's another way:
You can create a custom deserializer for serde which can be used for the field related to multiple types.
All the deserializer function does is handle the different types and convert them into a single result type.
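use serde::{Deserialize, Deserializer};

fn parse_value<'de, D>(deserializer: D) -> Result<Option<Vec<String>>, D::Error>
where
    D: Deserializer<'de>,
{
    // Helper enum that accepts every shape the field can take.
    // An owned String is used because a borrowed &str fails to match
    // when the JSON string contains escape sequences.
    #[derive(Deserialize)]
    #[serde(untagged)]
    enum AnyType {
        Str(String),
        U64(u64),
        Vec(Vec<String>),
        Bool(bool),
        None, // matches JSON null
    }

    // Convert whatever was found into an optional vector of strings.
    Ok(match AnyType::deserialize(deserializer)? {
        AnyType::Str(v) => Some(vec![v]),
        AnyType::U64(v) => Some(vec![v.to_string()]),
        AnyType::Vec(v) => Some(v),
        AnyType::Bool(v) => Some(vec![v.to_string()]),
        AnyType::None => None,
    })
}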
This is similar to the enum approach above, with the main difference that we're converting all the values into a vector of strings.
Since I didn't need the real types here, it was the best option at the time.
The code for using the custom deserialization function in a struct looks like this:
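#[derive(Debug, Serialize, Deserialize)]
struct MultiStruct {
    key_1: i32,
    key_2: bool,
    // `default` keeps a missing key working: with deserialize_with,
    // serde no longer treats an absent Option field as None on its own.
    #[serde(default, deserialize_with = "parse_value")]
    key_3: Option<Vec<String>>,
}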
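Now, accessing the value in your code requires unwrapping the Option first, for example inside a loop over your items:
// `field.value` is the Option<Vec<String>> produced by the deserializer.
let current_value = match &field.value {
    Some(value) => value,
    None => continue, // skip null values
};
// Now do something with the first element.
// `first()` returns None for an empty array instead of panicking.
let val = current_value.first().cloned();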
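For completeness, here is a small self-contained sketch of that access pattern. The Field struct and the first_values function are hypothetical stand-ins for whatever holds the deserialized value in your application; no serde is needed for this part.

```rust
// Hypothetical item type standing in for whatever holds the parsed value.
struct Field {
    value: Option<Vec<String>>,
}

// Collect the first element of every field that has one.
fn first_values(fields: &[Field]) -> Vec<String> {
    let mut result = Vec::new();
    for field in fields {
        // Skip fields whose value was `null`.
        let current_value = match &field.value {
            Some(value) => value,
            None => continue,
        };
        // `first()` returns `None` for an empty array instead of panicking.
        if let Some(val) = current_value.first() {
            result.push(val.clone());
        }
    }
    result
}
```

Borrowing with &field.value avoids cloning the whole vector just to read one element.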
And that's about it. What's your experience with type conversion in Rust?
Attribution: Article image by Ulrike Leone from Pixabay