# 如何在 MSSQL 環境下,計算字串在 UTF8編碼下的 byte MSSQL中的nvarchar為Unicode(UCS-2),當在處理計算字串的SHA256值時,與一般常見的 UTF8 不同,需要透過一些手段才能讓其把字串當 UTF8 處理後計算 SHA256。 ## 建立 SQL function ```SQL create function [dbo].[ToUTF8](@s nvarchar(max)) returns varbinary(max) as begin declare @i int = 1, @n int = datalength(@s)/2, @r varbinary(max) = 0x, @c int, @c2 int, @d varbinary(4) while @i <= @n begin set @c = unicode(substring(@s, @i, 1)) if (@c & 0xFC00) = 0xD800 begin set @i += 1 if @i > @n return cast(cast('Malformed UTF-16 - two nchar sequence cut short' as int) as varbinary) set @c2 = unicode(substring(@s, @i, 1)) if (@c2 & 0xFC00) <> 0xDC00 return cast(cast('Malformed UTF-16 - continuation missing in a two nchar sequence' as int) as varbinary) set @c = (((@c & 0x3FF) * 0x400) | (@c2 & 0x3FF)) + 0x10000 end if @c < 0x80 set @d = cast(@c as binary(1)) if @c >= 0x80 and @c < 0x800 set @d = cast(((@c * 4) & 0xFF00) | (@c & 0x3F) | 0xC080 as binary(2)) if @c >= 0x800 and @c < 0x10000 set @d = cast(((@c * 0x10) & 0xFF0000) | ((@c * 4) & 0x3F00) | (@c & 0x3F) | 0xe08080 as binary(3)) if @c >= 0x10000 set @d = cast(((@c * 0x40) & 0xFF000000) | ((@c * 0x10) & 0x3F0000) | ((@c * 4) & 0x3F00) | (@c & 0x3F) | 0xf0808080 as binary(4)) set @r += @d set @i += 1 end return @r end ``` ## 資料轉換 ```SQL select dbo.ToUTF8(N'中堃文測試') ``` ## 計算 SHA256 ```SQL select hashbytes('SHA2_256',dbo.ToUTF8(N'中堃文測試')) ``` ## 網路參考資料 * [sevaa/ToUTF8.sql](https://gist.github.com/sevaa/f084a0a5a994c3bc28e518d5c708d5f6#file-toutf8-sql) * [Unicode(UCS-2)與UTF-8的轉換](https://iamzzz.pixnet.net/blog/post/2169145)
×
Sign in
Email
Password
Forgot password
or
By clicking below, you agree to our
terms of service
.
Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet
Wallet (
)
Connect another wallet
New to HackMD?
Sign up