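The INET control is weak at handling encodings. When a page is UTF-8, English text displays fine but Chinese characters do not. I tried a lot of code found online and none of it worked: the conversion was always incomplete, and whenever it hit an odd number of consecutive Chinese characters, the last one turned into "?".
The code below converts UTF-8 data completely and does not have that problem.
Call it like this: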
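Dim bStrRec() As Byte
Dim strRec As String
' Note: the data must be fetched as a byte array (icByteArray = 1).
' In string mode, my guess is that what INET hands back is already incomplete.
bStrRec = Inet.OpenURL(txtUrl.Text, icByteArray)
strRec = Utf8ToUnicode(bStrRec)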
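Function Utf8ToUnicode(ByRef Utf() As Byte) As String
    Dim utfLen As Long

    ' UBound raises an error on an uninitialized array; utfLen then stays -1.
    utfLen = -1
    On Error Resume Next
    utfLen = UBound(Utf)
    On Error GoTo 0
    If utfLen = -1 Then Exit Function

    Dim i As Long, j As Long, k As Long, N As Long
    Dim B As Byte, cnt As Byte
    Dim Buf() As String
    ReDim Buf(utfLen)

    i = 0
    j = 0
    Do While i <= utfLen
        B = Utf(i)

        ' Work out the sequence length from the lead byte.
        If (B And &HFC) = &HFC Then
            cnt = 6
        ElseIf (B And &HF8) = &HF8 Then
            cnt = 5
        ElseIf (B And &HF0) = &HF0 Then
            cnt = 4
        ElseIf (B And &HE0) = &HE0 Then
            cnt = 3
        ElseIf (B And &HC0) = &HC0 Then
            cnt = 2
        Else
            cnt = 1
        End If

        ' The sequence is cut off at the end of the data.
        If i + cnt - 1 > utfLen Then
            Buf(j) = "?"
            Exit Do
        End If

        ' Keep only the payload bits of the lead byte.
        Select Case cnt
            Case 2: N = B And &H1F
            Case 3: N = B And &HF
            Case 4: N = B And &H7
            Case 5: N = B And &H3
            Case 6: N = B And &H1
            Case Else
                ' ASCII passes through; a stray byte >= &H80 is not valid on its own.
                If B < &H80 Then Buf(j) = ChrW$(B) Else Buf(j) = "?"
                GoTo Continued
        End Select

        ' Each continuation byte contributes 6 more bits.
        For k = 1 To cnt - 1
            B = Utf(i + k)
            N = N * &H40 + (B And &H3F)
        Next

        If N <= &HFFFF& Then
            Buf(j) = ChrW$(N)
        ElseIf N <= &H10FFFF Then
            ' ChrW$ only accepts values up to 65535, so emit a surrogate pair.
            N = N - &H10000
            Buf(j) = ChrW$(&HD800& + (N \ &H400)) & ChrW$(&HDC00& + (N And &H3FF))
        Else
            ' Outside the Unicode range.
            Buf(j) = "?"
        End If
Continued:
        i = i + cnt
        j = j + 1
    Loop

    Utf8ToUnicode = Join(Buf, "")
End Function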
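For comparison, here is a rough sketch of a different technique: letting Windows do the conversion through the Win32 MultiByteToWideChar API with the UTF-8 code page (65001). This is not the method used above. The names Utf8ToUnicodeApi and CheckOddChineseRun are made up for illustration, and the sketch assumes 32-bit VB6 with the declarations placed at the top of a standard module. The check feeds in the UTF-8 bytes of three Chinese characters (U+4E2D, U+6587, U+5B57), that is, an odd-length run like the one that caused trouble.

```vb
Option Explicit

' Win32 API declaration (32-bit VB6); pointers are passed as Long.
Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

' Hypothetical alternative to Utf8ToUnicode, for illustration only.
Function Utf8ToUnicodeApi(ByRef Utf() As Byte) As String
    Dim byteCount As Long, charCount As Long
    Dim result As String

    ' UBound fails on an uninitialized array; byteCount then stays 0.
    On Error Resume Next
    byteCount = UBound(Utf) - LBound(Utf) + 1
    On Error GoTo 0
    If byteCount <= 0 Then Exit Function

    ' First call: ask how many UTF-16 code units the result needs.
    charCount = MultiByteToWideChar(CP_UTF8, 0, _
        VarPtr(Utf(LBound(Utf))), byteCount, 0, 0)
    If charCount = 0 Then Exit Function

    ' Second call: convert into a preallocated string buffer.
    result = String$(charCount, vbNullChar)
    MultiByteToWideChar CP_UTF8, 0, _
        VarPtr(Utf(LBound(Utf))), byteCount, StrPtr(result), charCount

    Utf8ToUnicodeApi = result
End Function

' Hypothetical check with an odd number of Chinese characters.
Sub CheckOddChineseRun()
    Dim raw(0 To 8) As Byte
    Dim src As Variant
    Dim i As Long
    Dim expected As String

    ' UTF-8 bytes of U+4E2D, U+6587, U+5B57 (3 bytes each).
    src = Array(&HE4, &HB8, &HAD, &HE6, &H96, &H87, &HE5, &HAD, &H97)
    For i = 0 To 8
        raw(i) = src(i)
    Next

    expected = ChrW$(&H4E2D) & ChrW$(&H6587) & ChrW$(&H5B57)
    Debug.Assert StrComp(Utf8ToUnicodeApi(raw), expected, vbBinaryCompare) = 0
End Sub
```

The same check can be pointed at Utf8ToUnicode by swapping the function name in the Debug.Assert line. That gives a quick way to confirm, without going through INET, that the last character of an odd-length run survives the conversion.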