Protocol buffers (3): Read a binary file

Contents

  • Proto file
  • Serialization
  • Binary file analysis
  • Deserialization
  • Reference

Blog: blog.shinelee.me | Blog Garden | CSDN

In this article, we will define a relatively complex data structure and directly analyze the serialized binary file.

Proto file

Write the addressbook.proto file by slightly modifying the official example, adding a float field so that we can analyze how floating-point numbers are stored.

    syntax = "proto2";

    package tutorial;

    message Person {
      required string name = 1;
      required int32 id = 2;
      optional string email = 3;

      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }

      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }

      repeated PhoneNumber phones = 4;
      repeated float weight_recent_months = 100 [packed = true];
    }

    message AddressBook {
      repeated Person people = 1;
    }

Generate the encoding and decoding files, addressbook.pb.cc and addressbook.pb.h:

protoc.exe addressbook.proto --cpp_out=.

Serialization

Write the following code, which serializes the address_book object and saves it to the binary file address_book.bin.

    #include <fstream>
    #include "addressbook.pb.h"

    using namespace std;

    int main()
    {
        tutorial::AddressBook address_book;

        // One Person with a name, an id and an email
        tutorial::Person* person = address_book.add_people();
        person->set_id(1);
        person->set_name("Jack");
        person->set_email("Jack@qq.com");

        // First phone number: HOME
        tutorial::Person::PhoneNumber* phone_number = person->add_phones();
        phone_number->set_number("123456");
        phone_number->set_type(tutorial::Person::HOME);

        // Second phone number: MOBILE
        phone_number = person->add_phones();
        phone_number->set_number("234567");
        phone_number->set_type(tutorial::Person::MOBILE);

        // Packed repeated float field
        person->add_weight_recent_months(50);
        person->add_weight_recent_months(52);
        person->add_weight_recent_months(54);

        // Write the serialized bytes to disk in binary mode
        fstream fw("./address_book.bin", ios::out | ios::binary);
        address_book.SerializePartialToOstream(&fw);
        fw.close();

        return 0;
    }

The binary file address_book.bin is 62 bytes in total, and its content is as follows:
[Figure: hex dump of the address_book.bin file]
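To look at the bytes yourself, the file can be printed as hexadecimal. The following is only a minimal sketch: read_bytes and to_hex are hypothetical helper names introduced for illustration and are not part of protobuf.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file into a string of raw bytes (empty if the file cannot be opened).
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Format raw bytes as lowercase two-digit hex values separated by spaces.
std::string to_hex(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof(buf), "%02x", value);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling to_hex(read_bytes("./address_book.bin")) should give the byte sequence that is analyzed in the next section.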

Binary file analysis

As described in the previous article, the key of each field is (field_number << 3) | wire_type, and it is encoded as a varint.

The first field of message AddressBook is repeated Person people, and Person is itself a message. The file is parsed byte by byte below.
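The key computation can be sketched in a few lines. This is an illustration under the rule stated above, not the protobuf library's own code; encode_varint and encode_key are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Encode an unsigned value as a varint: 7 payload bits per byte,
// least-significant group first, high bit set on every byte except the last.
std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Build a field key, (field_number << 3) | wire_type, and varint-encode it.
std::vector<std::uint8_t> encode_key(std::uint32_t field_number,
                                     std::uint32_t wire_type) {
    std::uint64_t key = (static_cast<std::uint64_t>(field_number) << 3) | wire_type;
    return encode_varint(key);
}
```

For field numbers up to 15 the key fits in one byte; for a field number such as 100 the key is 802, which needs two bytes.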

    0a  // (1 << 3) + 2: 1 is the field_number of people, 2 is the wire type for an embedded message
    3c  // 0x3c = 60: the next 60 bytes are the data of Person people

    // Enter message Person below
    0a  // (1 << 3) + 2: name is the first field of Person, field_number=1, 2 is the wire type for string
    04  // the string length of the name field is 4
    4a 61 63 6b  // ASCII codes of "Jack"
    10  // (2 << 3) + 0: the id field, field_number=2, 0 is the wire type for int32
    01  // id is 1
    1a  // (3 << 3) + 2: the email field, field_number=3, 2 is the wire type for string
    0b  // 0x0b = 11: the string length of the email field is 11
    4a 61 63 6b 40 71 71 2e 63 6f 6d  // "Jack@qq.com"

    // The first PhoneNumber, a nested message
    22  // (4 << 3) + 2: the phones field, field_number=4, 2 is the wire type for an embedded message
    0a  // the next 10 bytes are PhoneNumber data
    0a  // (1 << 3) + 2: number is the first field of message PhoneNumber, 2 is the wire type for string
    06  // the string length of the number field is 6
    31 32 33 34 35 36  // "123456"
    10  // (2 << 3) + 0: the PhoneType type field, 0 is the wire type for enum
    01  // HOME; an enum is treated as an integer

    // The second PhoneNumber, a nested message
    22 0a 0a 06 32 33 34 35 36 37 10 00  // interpreted as above; the final 00 is MOBILE

    a2 06  // 1010 0010 0000 0110: the varint-encoded key of weight_recent_months
           // drop the high bit of each byte: 010 0010, 000 0110
           // → 000 0110 010 0010, because the least-significant group is stored first
           // = 802 = (100 << 3) + 2: 100 is the field_number of weight_recent_months,
           // 2 is the wire type of a packed repeated field
    0c     // the next 12 bytes are packed float data, 4 bytes per float
    00 00 48 42  // float 50
    00 00 50 42  // float 52
    00 00 58 42  // float 54
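The same walk can be done in code. The sketch below is a hand-written illustration, not how the generated protobuf parser works: it only handles wire types 0, 2 and 5, and it assumes a little-endian host with 32-bit IEEE 754 floats. Field, read_varint, split_fields and unpack_floats are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

struct Field {
    std::uint32_t number;     // field_number taken from the key
    std::uint32_t wire_type;  // low 3 bits of the key
    std::uint64_t varint;     // decoded value, for wire type 0
    std::string payload;      // raw bytes, for wire types 2 and 5
};

// Decode one varint starting at pos and advance pos past it.
std::uint64_t read_varint(const std::string& buf, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (pos >= buf.size()) throw std::runtime_error("truncated varint");
        std::uint8_t byte = static_cast<std::uint8_t>(buf[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    throw std::runtime_error("varint too long");
}

// Split one message body into its top-level fields without interpreting them.
std::vector<Field> split_fields(const std::string& buf) {
    std::vector<Field> fields;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::uint64_t key = read_varint(buf, pos);
        Field f{static_cast<std::uint32_t>(key >> 3),
                static_cast<std::uint32_t>(key & 7), 0, ""};
        std::size_t len = 0;
        if (f.wire_type == 0) {
            f.varint = read_varint(buf, pos);
        } else if (f.wire_type == 2) {
            len = static_cast<std::size_t>(read_varint(buf, pos));
        } else if (f.wire_type == 5) {
            len = 4;
        } else {
            throw std::runtime_error("wire type not handled in this sketch");
        }
        if (len > buf.size() - pos) throw std::runtime_error("truncated field");
        f.payload = buf.substr(pos, len);
        pos += len;
        fields.push_back(f);
    }
    return fields;
}

// Interpret a packed payload as consecutive 4-byte floats (little-endian host assumed).
std::vector<float> unpack_floats(const std::string& payload) {
    std::vector<float> out;
    for (std::size_t i = 0; i + 4 <= payload.size(); i += 4) {
        float f;
        std::memcpy(&f, payload.data() + i, 4);
        out.push_back(f);
    }
    return out;
}
```

Applying split_fields to the whole file yields a single field (people), and applying it again to that field's payload yields the six fields of Person. Nested messages are handled simply by calling split_fields on a payload.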

Note that when a repeated field is a message, such as PhoneNumber above, its key appears once per element in the encoding. When a repeated field is numeric and declared with packed = true, the key appears only once; without packing, the key appears once per element. proto3 packs such fields by default, whereas proto2 requires the option to be declared explicitly.

At this point the analysis of the binary file is complete, and the decoding code becomes easy to read.
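The difference between the two layouts can be made concrete by building both encodings by hand. This is a sketch for illustration only, assuming a little-endian host with 32-bit IEEE 754 floats; encode_packed and encode_unpacked are hypothetical names, not protobuf API.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append a varint: 7 bits per byte, least-significant group first.
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Append the 4 raw bytes of a float (little-endian host assumed).
void append_float(std::string& out, float f) {
    char raw[4];
    std::memcpy(raw, &f, 4);
    out.append(raw, 4);
}

// Packed: one key with wire type 2, one length, then all values back to back.
std::string encode_packed(std::uint32_t field_number, const std::vector<float>& values) {
    std::string out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    append_varint(out, values.size() * 4);
    for (float v : values) append_float(out, v);
    return out;
}

// Not packed: every value carries its own key with wire type 5 (32-bit).
std::string encode_unpacked(std::uint32_t field_number, const std::vector<float>& values) {
    std::string out;
    for (float v : values) {
        append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 5);
        append_float(out, v);
    }
    return out;
}
```

For the three weights in this article, the packed form takes 2 + 1 + 12 = 15 bytes, while the unpacked form takes 3 × (2 + 4) = 18 bytes. The saving grows with the number of elements.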

Deserialization

Only the decoding code for message Person is pasted here. Note that when it encounters the nested message PhoneNumber, it calls the decoding code of PhoneNumber.

    bool Person::MergePartialFromCodedStream(
        ::google::protobuf::io::CodedInputStream* input) {
    #define DO_(EXPRESSION) if (!PROTOBUF_PREDICT_TRUE(EXPRESSION)) goto failure
      ::google::protobuf::uint32 tag;
      // @@protoc_insertion_point(parse_start:tutorial.Person)
      for (;;) {
        ::std::pair<::google::protobuf::uint32, bool> p = input->ReadTagWithCutoffNoLastTag(16383u);
        tag = p.first;
        if (!p.second) goto handle_unusual;
        switch (::google::protobuf::internal::WireFormatLite::GetTagFieldNumber(tag)) {
          // required string name = 1;
          case 1: {
            if (static_cast< ::google::protobuf::uint8>(tag) == (10 & 0xFF)) {
              DO_(::google::protobuf::internal::WireFormatLite::ReadString(
                    input, this->mutable_name()));
              ::google::protobuf::internal::WireFormat::VerifyUTF8StringNamedField(
                this->name().data(), static_cast<int>(this->name().length()),
                ::google::protobuf::internal::WireFormat::PARSE,
                "tutorial.Person.name");
            } else {
              goto handle_unusual;
            }
            break;
          }

          // required int32 id = 2;
          case 2: {
            if (static_cast< ::google::protobuf::uint8>(tag) == (16 & 0xFF)) {
              HasBitSetters::set_has_id(this);
              DO_((::google::protobuf::internal::WireFormatLite::ReadPrimitive<
                       ::google::protobuf::int32, ::google::protobuf::internal::WireFormatLite::TYPE_INT32>(
                     input, &id_)));
            } else {
              goto handle_unusual;
            }
            break;
          }

          // optional string email = 3;
          case 3: {
            if (static_cast< ::google::protobuf::uint8>(tag) == (26 & 0xFF)) {
              DO_(::google::protobuf::internal::WireFormatLite::ReadString(
                    input, this->mutable_email()));
              ::google::protobuf::internal::WireFormat::VerifyUTF8StringNamedField(
                this->email().data(), static_cast<int>(this->email().length()),
                ::google::protobuf::internal::WireFormat::PARSE,
                "tutorial.Person.email");
            } else {
              goto handle_unusual;
            }
            break;
          }

          // repeated .tutorial.Person.PhoneNumber phones = 4;
          case 4: {
            if (static_cast< ::google::protobuf::uint8>(tag) == (34 & 0xFF)) {
              DO_(::google::protobuf::internal::WireFormatLite::ReadMessage(
                    input, add_phones()));
            } else {
              goto handle_unusual;
            }
            break;
          }

          // repeated float weight_recent_months = 100 [packed = true];
          case 100: {
            if (static_cast< ::google::protobuf::uint8>(tag) == (802 & 0xFF)) {
              DO_((::google::protobuf::internal::WireFormatLite::ReadPackedPrimitive<
                       float, ::google::protobuf::internal::WireFormatLite::TYPE_FLOAT>(
                     input, this->mutable_weight_recent_months())));
            } else if (static_cast< ::google::protobuf::uint8>(tag) == (805 & 0xFF)) {
              DO_((::google::protobuf::internal::WireFormatLite::ReadRepeatedPrimitiveNoInline<
                       float, ::google::protobuf::internal::WireFormatLite::TYPE_FLOAT>(
                     2, 802u, input, this->mutable_weight_recent_months())));
            } else {
              goto handle_unusual;
            }
            break;
          }

          default: {
          handle_unusual:
            if (tag == 0) {
              goto success;
            }
            DO_(::google::protobuf::internal::WireFormat::SkipField(
                  input, tag, _internal_metadata_.mutable_unknown_fields()));
            break;
          }
        }
      }
    success:
      // @@protoc_insertion_point(parse_success:tutorial.Person)
      return true;
    failure:
      // @@protoc_insertion_point(parse_failure:tutorial.Person)
      return false;
    #undef DO_
    }
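The constants 10, 16, 26, 34, 802 and 805 in the generated code are the field keys from the analysis above. The sketch below recomputes them; make_tag and low_byte are hypothetical helpers written for this article and are not protobuf functions.

```cpp
#include <cstdint>

// A tag (key) is (field_number << 3) | wire_type.
constexpr std::uint32_t make_tag(std::uint32_t field_number, std::uint32_t wire_type) {
    return (field_number << 3) | wire_type;
}

// The generated code compares only the low byte of the decoded tag.
constexpr std::uint8_t low_byte(std::uint32_t tag) {
    return static_cast<std::uint8_t>(tag & 0xFF);
}

static_assert(make_tag(1, 2) == 10, "name: length-delimited");
static_assert(make_tag(2, 0) == 16, "id: varint");
static_assert(make_tag(3, 2) == 26, "email: length-delimited");
static_assert(make_tag(4, 2) == 34, "phones: length-delimited");
static_assert(make_tag(100, 2) == 802, "weight_recent_months: packed");
static_assert(make_tag(100, 5) == 805, "weight_recent_months: one 32-bit value");
static_assert(low_byte(802) == 0x22, "low byte of the packed tag");
static_assert(low_byte(805) == 0x25, "low byte of the unpacked tag");
```

Since the switch has already selected the field number, the low-byte comparison inside each case appears to act mainly as a check on the wire type. This is also why case 100 accepts both 802 (packed) and 805 (one float per key): the parser can read either layout.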

That's all.

Reference

  • Protocol Buffer Basics: C++
