Protocol-buffers – maximum serialized Protobuf message size

Is there a way to get the maximum size of a protobuf message after serialization?

I am referring to messages that do not contain "repeated" elements.

Please note that I am not referring to the size of a protobuf message with specific content, but to the largest possible size it can reach (the worst case).

In general, any protobuf message can be of any length, because it may contain unknown fields. If you receive a message, you cannot make any assumptions about its length. If you are sending a message that you built yourself, then you can assume that it only contains fields you know about; however, in that case you can also easily compute the exact message size. So it is usually not useful to ask what the maximum size is.

Having said that, you could write code that uses the Descriptor interfaces to iterate over the FieldDescriptors of your message type (MyMessageType::descriptor()).

See: https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor

Similar interfaces exist in Java, Python, and probably other languages.

The following are the rules to be implemented:

Each field consists of a tag followed by some data.

For tags:

>Field numbers 1-15 have a 1-byte tag.
>Field numbers 16-2047 have a 2-byte tag. Larger field numbers need more bytes, up to 5 for the largest allowed field number (536,870,911).
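As a rough sketch of the tag rule: a tag is the field number shifted left by three bits, combined with the wire type, and written as a varint, so its size can be computed directly. The helper names `varint_size` and `tag_size` are made up for this illustration and are not part of any protobuf library.

```python
def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varints encode unsigned values")
    size = 1
    while value >= 0x80:  # each varint byte carries 7 bits of payload
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int) -> int:
    """Size in bytes of a field tag, i.e. varint((field_number << 3) | wire_type)."""
    # The wire type only fills the low 3 bits, so it never changes the size.
    return varint_size(field_number << 3)


print(tag_size(15), tag_size(16), tag_size(2048))  # 1 2 3
```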

For data:

> bool is always one byte.
> The maximum data length of int32, int64, uint64 and sint64 is 10 bytes (yes, unfortunately, an int32 takes 10 bytes if it is negative).
> The maximum data length of sint32 and uint32 is 5 bytes.
> fixed32, sfixed32 and float are always exactly 4 bytes.
> fixed64, sfixed64 and double are always exactly 8 bytes.
>The maximum length of an enum field depends on the maximum enum value:

> 0-127: 1 byte
> 128-16383: 2 bytes
> …and so on at 7 bits per byte, but hopefully your enum is not that big!
>Also note that negative values are encoded as 10 bytes, but hopefully you don't have any.
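To make the varint numbers above concrete, here is a small sketch that computes the encoded size of a single value. It assumes the standard wire-format behaviour: int32, int64 and enum values are sign-extended to 64 bits before encoding, while sint32 uses ZigZag encoding. The function names are hypothetical helpers, not protobuf APIs.

```python
def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def int32_size(value: int) -> int:
    """Encoded size of an int32 (also applies to int64 and enum values)."""
    # Negative values are sign-extended to 64 bits, hence the 10-byte worst case.
    return varint_size(value & 0xFFFFFFFFFFFFFFFF)


def sint32_size(value: int) -> int:
    """Encoded size of a sint32, which is ZigZag-encoded before the varint step."""
    zigzag = ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF
    return varint_size(zigzag)


print(int32_size(-1))   # 10
print(sint32_size(-1))  # 1
```

This is why sint32 is the better choice for fields that are often negative: small negative numbers stay small on the wire.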

>The maximum length of a message-typed field is the maximum length of the message type plus the bytes of the length prefix. The length prefix is again a varint: one byte per 7 bits of integer data.
>The maximum size of a group (you should not use them; they are an old feature that was deprecated before protobuf's public release) is equal to the maximum size of its contents plus a second field tag (see above).
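The two rules for nested content can be sketched as follows, assuming you already know the worst-case size of the nested content in bytes. `message_field_max` and `group_field_max` are illustrative names only.

```python
def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int) -> int:
    """Size in bytes of the tag for the given field number."""
    return varint_size(field_number << 3)


def message_field_max(field_number: int, max_content: int) -> int:
    """Worst case for a sub-message field: tag + length prefix + content."""
    return tag_size(field_number) + varint_size(max_content) + max_content


def group_field_max(field_number: int, max_content: int) -> int:
    """Worst case for a group: start tag + content + end tag, no length prefix."""
    return 2 * tag_size(field_number) + max_content


print(message_field_max(1, 128))  # 131
```

Note that the length prefix grows by one byte once the content can reach 128 bytes, which is easy to miss when adding things up by hand.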

If your message contains any of the following, its maximum length is unbounded:

>Any field of type string or bytes. (Unless you know its maximum length, in which case the maximum is that length plus the length prefix, just like a sub-message.)
>Any repeated field. (Unless you know its maximum number of elements, in which case each element of the list has its own maximum length as if it were a separate field, including the tag. There is no overall length prefix here, unless you use [packed = true], in which case you will have to look up the details.)
>Extensions.
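Putting the rules together, a worst-case calculator might look roughly like the sketch below. It does not use the real Descriptor API: the schema is a hypothetical list of `(field_number, kind, arg)` tuples invented for this example, where `arg` is the list of values for an enum, the known maximum length (or `None`) for string/bytes, and the nested field list for a message. Repeated fields, groups and extensions are not modeled. A result of `None` means "unbounded".

```python
from typing import Optional

# Worst-case data sizes in bytes for scalar types (tag not included).
MAX_SCALAR = {
    "bool": 1,
    "int32": 10, "int64": 10, "uint64": 10, "sint64": 10,
    "uint32": 5, "sint32": 5,
    "fixed32": 4, "sfixed32": 4, "float": 4,
    "fixed64": 8, "sfixed64": 8, "double": 8,
}


def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def max_field_data(kind: str, arg) -> Optional[int]:
    """Worst-case size of one field's data, or None if it is unbounded."""
    if kind in MAX_SCALAR:
        return MAX_SCALAR[kind]
    if kind == "enum":
        # Negative enum values are sign-extended to 64 bits (10 bytes).
        return max(varint_size(v & 0xFFFFFFFFFFFFFFFF) for v in arg)
    if kind in ("string", "bytes"):
        if arg is None:  # no known maximum length
            return None
        return varint_size(arg) + arg
    if kind == "message":
        inner = max_message_size(arg)
        if inner is None:
            return None
        return varint_size(inner) + inner
    raise ValueError(f"unsupported kind: {kind}")


def max_message_size(fields) -> Optional[int]:
    """Sum of tag + worst-case data over all fields, or None if unbounded."""
    total = 0
    for number, kind, arg in fields:
        data = max_field_data(kind, arg)
        if data is None:
            return None
        total += varint_size(number << 3) + data
    return total


# Hypothetical schema, for illustration only.
point = [(1, "sint32", None), (2, "sint32", None)]
shape = [
    (1, "bool", None),        # 1 + 1
    (2, "enum", [0, 1, 2]),   # 1 + 1
    (3, "message", point),    # 1 + 1 + 12
    (16, "string", 200),      # 2 + 2 + 200
]
print(max_message_size(shape))  # 222
```

In real code the same walk would be driven by the FieldDescriptors of your message type rather than by hand-written tuples.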

