It's CBOR, which doesn't seem to pack very efficiently: 250*250*20 * 12
floats take 400 MB per frame, whereas that many doubles (15 million) would
fit in 120 MB raw. serde_cbor also seems to be extremely slow, taking
multiple minutes per frame. The work is already parallelized, though, so
just bumping the parallelism makes this less problematic.
Might be worth testing other formats.
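
For reference, the size arithmetic above can be checked with a small sketch. The grid dimensions are the ones quoted in this note; the constant and function names are made up for illustration and are not taken from the codebase. The CBOR figure assumes every value is written as a standalone double-precision float (one initial byte plus an 8-byte payload, per RFC 8949), which may not be how the data is actually laid out.

```rust
/// Number of values in one frame, using the dimensions quoted in the note.
/// (Hypothetical name, for illustration only.)
const VALUES_PER_FRAME: u64 = 250 * 250 * 20 * 12;

/// Raw size in bytes of `count` values stored back to back with no framing.
fn raw_bytes(count: u64, bytes_per_value: u64) -> u64 {
    count * bytes_per_value
}

/// Size in bytes if each value is encoded as its own CBOR float:
/// one initial byte followed by the payload (8 bytes for a double).
fn cbor_float_bytes(count: u64, payload_bytes: u64) -> u64 {
    count * (1 + payload_bytes)
}

/// Converts a byte count to decimal megabytes (1 MB = 1e6 bytes).
fn megabytes(bytes: u64) -> f64 {
    bytes as f64 / 1e6
}
```

With these, `megabytes(raw_bytes(VALUES_PER_FRAME, 8))` gives the 120 MB raw figure, and `megabytes(cbor_float_bytes(VALUES_PER_FRAME, 8))` gives 135 MB. So the per-float CBOR header alone does not account for the observed 400 MB; the rest of the overhead would have to come from elsewhere in the encoding.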