The no-argument String.getBytes() method returns the string's bytes in the platform's default encoding. This means the result can differ from one operating system (or JVM configuration) to another!
byte[] a = "中".getBytes();
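To see which encoding the no-argument call actually uses on a given machine, a minimal sketch like the following may help. The class name DefaultCharsetDemo and the helper encodeWithDefault are made up for illustration, and the character is written as the escape \u4e2d so the example does not depend on the encoding of the source file itself. Note that from JDK 18 onward the default charset is UTF-8 unless configured otherwise, so the platform dependence mostly shows up on older runtimes.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class DefaultCharsetDemo {
    // Hypothetical helper: encodes with the JVM's default charset,
    // which is what the no-argument getBytes() uses.
    static byte[] encodeWithDefault(String s) {
        return s.getBytes(Charset.defaultCharset());
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // the character U+4E2D
        // The charset name and the bytes printed here depend on the machine.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("bytes: " + Arrays.toString(s.getBytes()));
    }
}
```

Running this on machines with different default encodings should print different byte arrays for the same string, which is why passing the encoding explicitly is usually safer.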
The String.getBytes(String decode) method returns the byte array representation of the string in the encoding named by decode, for example:
byte[] a = "中".getBytes("GBK"); // length is 2
byte[] b = "中".getBytes("UTF-8"); // length is 3
byte[] c = "中".getBytes("ISO8859-1"); // length is 1
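The byte counts above can be checked with a short sketch. The class name EncodedLengthDemo and the helper encodedLength are illustrative only. UTF-8 and ISO-8859-1 are guaranteed to exist on every Java runtime, while GBK is assumed to be available and is therefore guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedLengthDemo {
    // Hypothetical helper: number of bytes the string occupies in the given charset.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // the character U+4E2D
        System.out.println("UTF-8: " + encodedLength(s, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + encodedLength(s, StandardCharsets.ISO_8859_1));
        // GBK is not one of the charsets every runtime must provide, so check first.
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK: " + encodedLength(s, Charset.forName("GBK")));
        }
    }
}
```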
Going the other way, you can restore the character "中" with new String(byte[], decode), which uses the specified encoding decode to parse the byte[] back into a string.
String s_gbk = new String(a, "GBK");
String s_utf8 = new String(b, "UTF-8");
String s_iso88591 = new String(c, "ISO8859-1");
If you print s_gbk, s_utf8 and s_iso88591, you will find that s_gbk and s_utf8 are both "中", while s_iso88591 is an unrecognized character (it prints as "?", which you can think of as garbled). Why can't "中" be restored after being encoded with ISO8859-1? The reason is simple: the ISO8859-1 encoding table contains no Chinese characters at all, so "中".getBytes("ISO8859-1") cannot produce a correct code value for "中". The encoder substitutes a replacement byte instead, and new String() has nothing left to restore the original from.
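The difference between a lossless and a lossy round trip can be sketched as follows. The class name RoundTripDemo and the helper roundTrip are made up for this illustration; the escape \u4e2d stands for the character discussed above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Hypothetical helper: encode to bytes and decode again with the same charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // the character U+4E2D
        // UTF-8 can represent the character, so the round trip is lossless.
        System.out.println(roundTrip(s, StandardCharsets.UTF_8).equals(s)); // prints true
        // ISO-8859-1 cannot, so the encoder emits its replacement byte '?'.
        System.out.println(roundTrip(s, StandardCharsets.ISO_8859_1));      // prints ?
    }
}
```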
Therefore, when obtaining a byte[] through String.getBytes(String decode), you must make sure that every character in the String actually exists in the encoding table of decode. Only then can the resulting byte[] be restored correctly.
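One way to make that check before encoding is CharsetEncoder.canEncode, sketched below. The class name CanEncodeDemo and the helper canEncode are illustrative; the sketch assumes the charset supports encoding, which holds for the standard charsets used here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CanEncodeDemo {
    // Hypothetical helper: true if every character of s can be encoded in cs.
    static boolean canEncode(String s, Charset cs) {
        return cs.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // the character U+4E2D
        System.out.println(canEncode(s, StandardCharsets.UTF_8));      // prints true
        System.out.println(canEncode(s, StandardCharsets.ISO_8859_1)); // prints false
    }
}
```

Calling such a check first lets the program reject or re-route unsupported text instead of silently writing replacement bytes.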
Note: Sometimes Chinese text has to fit a special requirement (for example, an HTTP header whose content must be ISO8859-1 encoded). In that case the Chinese characters can be carried byte by byte, like this:
String s_iso88591 = new String("中".getBytes("UTF-8"), "ISO8859-1");
The resulting s_iso88591 string is actually three ISO8859-1 characters, one per UTF-8 byte. After these characters are passed to the destination, the destination program applies the reverse operation:
String s_utf8 = new String(s_iso88591.getBytes("ISO8859-1"), "UTF-8");
This yields the correct Chinese character "中", so the protocol is respected and Chinese is still supported.
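The two halves of this trick can be sketched as a pair of helpers. The class name Latin1TransportDemo and the method names wrap and unwrap are made up for illustration. The approach works because ISO8859-1 maps each of the 256 byte values to exactly one character, so the UTF-8 bytes survive the detour unchanged.

```java
import java.nio.charset.StandardCharsets;

public class Latin1TransportDemo {
    // Hypothetical sender side: reinterpret the UTF-8 bytes as ISO-8859-1 characters.
    static String wrap(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Hypothetical receiver side: recover the UTF-8 bytes and decode them properly.
    static String unwrap(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d"; // the character U+4E2D
        String wire = wrap(original);
        System.out.println(wire.length());                  // prints 3
        System.out.println(unwrap(wire).equals(original));  // prints true
    }
}
```

Both sides must agree on UTF-8 as the inner encoding; if the receiver decodes with a different charset, the text comes out garbled.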