DcclImproveDefaultCodecs

Toby Schneider edited this page May 26, 2016 · 2 revisions

Discussion for the blueprint: https://blueprints.launchpad.net/dccl/+spec/dccl-improve-default-codecs

Primary rationale: the repeated field implementation in v2 is quite inefficient when you want max_repeat to be any reasonable size. Also, the optional embedded message codec in v2 is buggy: https://bugs.launchpad.net/dccl/+bug/1181026

These changes break the v2 wire protocol so users must opt in explicitly by setting option (dccl.msg).codec_version = 3 in the root DCCL message. Previous messages will function as before (using the v2 codecs).

Improvements thus far:

  1. v3::DefaultMessageCodec uses a presence bit for optional fields: in the encoded representation, one new bit is added at the least significant end. If the bit is false, the entire embedded message is not set; if true, the field is set. This fixes the buggy v2 behavior of optional embedded messages that have required children: https://bugs.launchpad.net/dccl/+bug/1181026
  2. Updated all the v3 repeated codecs that use TypedFieldCodec as the base class, i.e., those that use the FieldCodecBase default implementations of the following methods:
virtual void any_encode_repeated(Bitset* bits, const std::vector<boost::any>& wire_values);
virtual void any_decode_repeated(Bitset* repeated_bits, std::vector<boost::any>* field_values);
virtual void any_pre_encode_repeated(std::vector<boost::any>* wire_values,
                                     const std::vector<boost::any>& field_values);
virtual void any_post_decode_repeated(const std::vector<boost::any>& wire_values,
                                      std::vector<boost::any>* field_values);
virtual unsigned any_size_repeated(const std::vector<boost::any>& wire_values);
virtual unsigned max_size_repeated();
virtual unsigned min_size_repeated();

These codecs now use an integer (sized for min = 0, max = (dccl.field).max_repeat; for example, 3 bits if max_repeat = 7) to indicate the number of elements present in the repeated field, which we'll call N. This size integer is inserted before (i.e., in the lower bits of) the N copies of the encoded field. This greatly reduces the space used when max_repeat is larger than N, since no bits are used for fields from N to max_repeat. (Previously, all fields were repeated max_repeat times and it was the responsibility of the codec to treat the extra fields as not present.)
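To make the space saving concrete, here is a minimal sketch of the size arithmetic described above. It is not DCCL code: the helper names (bits_for_range, v2_repeated_size, v3_repeated_size) are hypothetical, and it assumes every element of the repeated field encodes to the same fixed number of bits.

```cpp
#include <cassert>

// Hypothetical helper (not part of the DCCL API): number of bits needed
// to represent every integer in the range [0, max_value].
unsigned bits_for_range(unsigned max_value)
{
    unsigned bits = 0;
    while (max_value > 0)
    {
        ++bits;
        max_value >>= 1;
    }
    return bits;
}

// v2-style size, assuming fixed-width elements: all max_repeat slots are
// always encoded, whether or not they hold a value.
unsigned v2_repeated_size(unsigned element_bits, unsigned max_repeat)
{
    return element_bits * max_repeat;
}

// v3-style size, assuming fixed-width elements: a count prefix sized for
// [0, max_repeat], followed by only the n elements actually present.
unsigned v3_repeated_size(unsigned element_bits, unsigned max_repeat, unsigned n)
{
    assert(n <= max_repeat); // the count must fit in the prefix
    return bits_for_range(max_repeat) + element_bits * n;
}
```

For example, with 8-bit elements, max_repeat = 7 and N = 2, v3_repeated_size(8, 7, 2) gives 3 + 16 = 19 bits, whereas v2_repeated_size(8, 7) gives 56 bits. Under the same assumptions, the v3 layout is larger than v2 only when N equals max_repeat, and then only by the width of the count prefix.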

Improvements to do:

  1. v3::DefaultStringCodec should use the same idea as the new repeated codecs for the length of the string, rather than being fixed at a maximum of 255.