**[JAVA] The pitfalls of String.getBytes() across different operating environments**

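###### tags: `Java` `Work Notes`

### The pitfall

One of our projects has to produce a TXT file in a fixed format for another system's API. The specification counts each Chinese character as 2 bytes and each English letter or digit as 1 byte.

Everything worked on my local machine and in the Windows Server test environment. Once the code was deployed to OpenShift, however, the field alignment of the output broke completely.

### Cause

**`String.getBytes()` picks its encoding in the following order of precedence:**

1. The charset passed explicitly: `String.getBytes("charset name")`
2. The encoding configured for the JVM, for example `-Dfile.encoding=UTF-8`
3. The default encoding of the operating system

Generally speaking (on JDK 17 and earlier; from JDK 18 onward the JVM default is UTF-8 regardless of the OS locale):

* In a Chinese OS environment, `getBytes()` defaults to GBK or GB2312.
* In an English OS environment, `getBytes()` defaults to ISO-8859-1.

These encodings differ in how many bytes they produce for a Chinese character:

```java
import java.io.UnsupportedEncodingException;

public class GetBytesDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String str = "HELLO! 哈囉"; // 7 half-width characters + 2 Chinese characters

        byte[] defaultBytes = str.getBytes();          // platform default charset
        byte[] gbkBytes = str.getBytes("GBK");         // 2 bytes per Chinese character
        byte[] isoBytes = str.getBytes("ISO-8859-1");  // each Chinese character becomes a single '?'
        byte[] utf8Bytes = str.getBytes("UTF-8");      // 3 bytes per Chinese character

        System.out.println("Platform default byte length: " + defaultBytes.length);
        System.out.println("GBK byte length: " + gbkBytes.length);
        System.out.println("ISO-8859-1 byte length: " + isoBytes.length);
        System.out.println("UTF-8 byte length: " + utf8Bytes.length);
    }
}
/*
 * Output in a Chinese environment:
 * Platform default byte length: 11
 * GBK byte length: 11
 * ISO-8859-1 byte length: 9
 * UTF-8 byte length: 13
 */
```

The following program writes the first 20 bytes of the same string to a TXT file in each encoding, so the differences can be compared.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class EncodingFileDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] tempMemo1 = new byte[20];
        byte[] tempMemo2 = new byte[20];
        byte[] tempMemo3 = new byte[20];
        byte[] tempMemo4 = new byte[20];
        byte[] tempMemo5 = new byte[20];

        // Full-width text followed by 20 half-width spaces,
        // so every encoding yields at least 20 bytes
        String memo = "ＡＴ哈囉" + " ".repeat(20);
        String memo1, memo2, memo3, memo4, memo5;

        // Keep only the first 20 bytes of each encoding
        System.arraycopy(memo.getBytes(), 0, tempMemo1, 0, 20);
        System.arraycopy(memo.getBytes(StandardCharsets.UTF_8), 0, tempMemo2, 0, 20);
        System.arraycopy(memo.getBytes("BIG5"), 0, tempMemo3, 0, 20);
        System.arraycopy(memo.getBytes(StandardCharsets.ISO_8859_1), 0, tempMemo4, 0, 20);
        System.arraycopy(memo.getBytes("GBK"), 0, tempMemo5, 0, 20);

        // Decode each array with the charset that produced it, then print the character count
        memo1 = new String(tempMemo1);
        System.out.println("Default: " + memo1.length());
        byte2file("D://0DEFAULT.txt", tempMemo1);

        memo2 = new String(tempMemo2, StandardCharsets.UTF_8);
        System.out.println("UTF-8: " + memo2.length());
        byte2file("D://0UTF_8.txt", tempMemo2);

        memo3 = new String(tempMemo3, "BIG5");
        System.out.println("BIG5: " + memo3.length());
        byte2file("D://0BIG5.txt", tempMemo3);

        memo4 = new String(tempMemo4, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1: " + memo4.length());
        byte2file("D://0ISO.txt", tempMemo4);

        memo5 = new String(tempMemo5, "GBK");
        System.out.println("GBK: " + memo5.length());
        byte2file("D://0GBK.txt", tempMemo5);
    }

    // Writes the raw bytes to the given path; the stream is closed automatically
    public static void byte2file(String path, byte[] data) {
        try (FileOutputStream outputStream = new FileOutputStream(path)) {
            outputStream.write(data);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

### Practical example

:::info
**Specification**
English letters, digits, and half-width characters count as 1 byte.
Chinese characters and full-width characters count as 2 bytes.
Please provide a TXT file whose content is: name (10 bytes) + birthday (8 bytes) + address (40 bytes) + terminator `@` (1 byte).
:::

Suppose I need to output:

* Name: 張小明, 6 bytes in total, so 4 half-width spaces must be appended.
* Birthday: 19901231, 8 bytes in total.
* Address: 台北市中正區梅花街小飛巷123號, 29 bytes in total, so 11 half-width spaces must be appended.
* Terminator: `@`.

The resulting TXT file is a single 59-byte record: the padded name, the birthday, the padded address, and the closing `@`.

English letters, digits, and half-width characters basically need no special handling in the code. Chinese and full-width characters, however, need particular care:

```java
// ---- Excerpt from the method that builds the file content ----
// Name handling
byte[] tempName = new byte[10];
String name = peopleDTO.getName() + " ".repeat(10); // append 10 half-width spaces
// Keep exactly the first 10 bytes of the Big5-encoded name
System.arraycopy(name.getBytes("BIG5"), 0, tempName, 0, 10);
name = new String(tempName, "BIG5");

// birthday and address are assumed to be prepared already
// (the address is padded to 40 bytes in the same way)
StringBuilder finalData = new StringBuilder();
finalData.append(name).append(birthday).append(address).append("@");

return TxtUtil.downloadTxt(
        FormatUtil.convertDateToString(new Date(), "yyyyMMdd") + ".TXT",
        finalData.toString());

// ---- TxtUtil.java ----
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;

public final class TxtUtil {

    public static ResponseEntity<Resource> downloadTxt(String fileName, String txtData) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
            // Percent-encode the file name so Chinese characters are not garbled
            fileName = URLEncoder.encode(fileName, StandardCharsets.UTF_8).replaceAll("\\+", "%20");
            // Always encode the content with an explicit charset
            baos.write(txtData.getBytes("BIG5"));
            ByteArrayResource resource = new ByteArrayResource(baos.toByteArray());
            return ResponseEntity.ok()
                    .header(HttpHeaders.CONTENT_DISPOSITION, "attachment;filename*=utf-8''" + fileName)
                    .contentLength(baos.size())
                    .contentType(MediaType.APPLICATION_OCTET_STREAM)
                    .body(resource);
        }
    }
}
```

### Conclusion

If the code has to run in different environments, always obtain the bytes with an explicit charset: `String.getBytes("charset name")`.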
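
The fixed-width padding from the practical example can also be isolated into a small helper. The following is only a minimal sketch, not the project's actual code: `FixedWidthRecord`, `padToBytes`, and `buildRecord` are hypothetical names, and it assumes the receiving system expects Big5 and that an over-long value should be rejected rather than truncated (cutting at a byte offset could split a 2-byte character in half).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Arrays;

public final class FixedWidthRecord {

    private FixedWidthRecord() {
    }

    /**
     * Encodes value with the given charset and returns exactly width bytes,
     * right-padded with half-width spaces (0x20).
     */
    public static byte[] padToBytes(String value, Charset charset, int width) {
        byte[] encoded = value.getBytes(charset); // explicit charset, never the platform default
        if (encoded.length > width) {
            throw new IllegalArgumentException(
                    "Value needs " + encoded.length + " bytes but the field is only " + width);
        }
        byte[] field = new byte[width];
        Arrays.fill(field, (byte) ' ');
        System.arraycopy(encoded, 0, field, 0, encoded.length);
        return field;
    }

    /** Builds name(10) + birthday(8) + address(40) + "@"(1) as raw bytes. */
    public static byte[] buildRecord(String name, String birthday, String address, Charset charset) {
        byte[][] fields = {
            padToBytes(name, charset, 10),
            padToBytes(birthday, charset, 8),
            padToBytes(address, charset, 40),
            padToBytes("@", charset, 1)
        };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] field : fields) {
            out.write(field, 0, field.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Charset big5 = Charset.forName("Big5");
        byte[] record = buildRecord("張小明", "19901231", "台北市中正區梅花街小飛巷123號", big5);
        System.out.println(record.length); // prints 59 (10 + 8 + 40 + 1)
    }
}
```

Because the charset is passed explicitly at every step, the record length does not depend on the machine's locale. Working on bytes until the very end also avoids decoding and re-encoding the padded fields. If the Chinese literals stay in the source file, it may be necessary to compile with `javac -encoding UTF-8` on JDK 17 or earlier.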