p2p.wrox.com Forums

p2p.wrox.com Forums (http://p2p.wrox.com/)
-   VBScript (http://p2p.wrox.com/vbscript-77/)
-   -   Unicode UTf-8? System.Text.UTF8Encoding from VBA? (http://p2p.wrox.com/vbscript/29099-unicode-utf-8-system-text-utf8encoding-vba.html)

forum1 May 9th, 2005 03:45 PM

Unicode UTf-8? System.Text.UTF8Encoding from VBA?
 
I'm writing VBA code embedded in an Excel spreadsheet.

I need to convert strings from Unicode UTF-8 to standard string objects.

If you read the MSDN docs, it looks like the way to do this is the System.Text.UTF8Encoding class. But it SEEMS like there is no way to use it from the VBA world. It might be available to Visual Basic .NET only and not exported to VBA. Is that true?

How the heck do I do this?

I can write VBA code that calls CreateObject("System.Text.UTF8Encoding"), which runs fine, but the object it creates doesn't respond to any method calls. :-(

  Dim myconverter
  Set myconverter = CreateObject("System.Text.UTF8Encoding") ' no error

  Dim encodedBytes
  encodedBytes = myconverter.GetBytes("sdsf") ' THIS FAILS - no GetBytes

There is a commercial developer who licenses a DLL to do this from VBA/VBScript, but that seems ridiculous, and it wouldn't work in my architecture anyway; I need it all to be contained within the Excel file.

Any ideas?!?!

Thanks!
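
A possible explanation for the failing call above, offered as a hedged sketch rather than a confirmed diagnosis: COM does not support method overloading, so overloaded .NET methods are exposed to late-bound callers under numbered names. GetBytes_4 is the name commonly reported for the String-to-Byte() overload of UTF8Encoding. This assumes the .NET Framework is installed and registered for COM on the machine, and DemoNetUtf8Encoding is a hypothetical name.

```vba
' Sketch only: assumes the .NET Framework is installed and COM-visible.
Sub DemoNetUtf8Encoding()
    Dim enc As Object
    Dim b() As Byte

    Set enc = CreateObject("System.Text.UTF8Encoding")

    ' Plain "GetBytes" is not callable late-bound because it is overloaded;
    ' GetBytes_4 is the commonly reported name of the String overload (assumption).
    b = enc.GetBytes_4(ChrW$(233))          ' encode U+00E9

    ' Number of UTF-8 bytes produced for the one-character string
    Debug.Print UBound(b) - LBound(b) + 1
End Sub
```

The decoding direction is presumably reachable through the object's GetString method in the same way, but that is untested here.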


BrianWren May 9th, 2005 03:57 PM

Do you know what the difference is between UTF-8 format and a standard string?

In your snippet, you are using a literal string as the argument. Inasmuch as it is not UTF-8 encoded, I wouldn't expect that to fly, even if the creation of the object "MyConverter" did work.

BrianWren May 9th, 2005 04:08 PM

If UTF-8 is unicode, and you want just the bytes of the ASCII characters, you can do this (built in conversion in VBA):
Code:

    Dim bString() As Byte
    Dim strSource As String
    Dim i        As Long

    strSource = "ABC"          ' strSource contains 65  0  66  0  67  0
    bString = strSource        ' bString also contains 65  0  66  0  67  0 (two bytes per character)

    ' Step 2 so that only the low byte of each character is changed
    For i = LBound(bString) To UBound(bString) Step 2
        bString(i) = bString(i) + 1
    Next i
    strSource = bString        ' strSource contains 66  0  67  0  68  0

    Debug.Print strSource      ' Prints "BCD"
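
To make the byte layouts in this snippet concrete, here is a minimal sketch (ShowStringBytes is a hypothetical name) comparing the two built-in conversions. Assigning a String to a Byte array copies the UTF-16 bytes as they are, while StrConv with vbFromUnicode converts to the system ANSI code page, which gives one byte per character for plain ASCII text.

```vba
Sub ShowStringBytes()
    Dim wide() As Byte
    Dim ansi() As Byte

    wide = "ABC"                            ' UTF-16LE: 65 0 66 0 67 0
    ansi = StrConv("ABC", vbFromUnicode)    ' ANSI code page: 65 66 67

    Debug.Print UBound(wide) + 1            ' 6
    Debug.Print UBound(ansi) + 1            ' 3
End Sub
```

Neither conversion produces or consumes UTF-8.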


forum1 May 10th, 2005 04:34 PM

Quote:

Originally Posted by BrianWren
If UTF-8 is unicode, and you want just the bytes of the ASCII characters, you can do this (built in conversion in VBA):

UTF-8 is a variable-length encoding of Unicode that is careful not to confuse code that looks for chars 0-127 (classic ASCII). It encodes Unicode chars in standard 8-bit strings, and every byte with a value of 0-127 is exactly the ASCII character it appears to be.

Learn more here:
 http://en.wikipedia.org/wiki/Utf-8

So, the example you posted doesn't do what I need. I have a string that I pulled in with a textstream.ReadLine() call, and I need to run it through a UTF-8 decoder to get the decoded Unicode string. For instance, a 10-byte string (ten 8-bit bytes) might convert into a 2-10 character Unicode string after variable-length UTF-8 decoding.

This decoding is built into VB, but it seems to be off limits to VBA clients and ASP. Is there no way to do this? Basically, I need to use the System.Text.UTF8Encoding class from VBA, and it seems like it isn't available from the VBA world.

Any ideas?
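
One approach that stays inside the workbook, sketched here under assumptions: the ADODB.Stream object (part of ADO, which ships with Windows) can decode a text file with a named charset, so the UTF-8 decoding happens while reading instead of after ReadLine. ReadUtf8File and its path parameter are hypothetical names, and the sketch reads the whole file at once.

```vba
' Sketch: read a whole UTF-8 text file into a VBA (UTF-16) string.
Function ReadUtf8File(ByVal path As String) As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")

    stm.Type = 2                  ' 2 = adTypeText
    stm.Charset = "utf-8"         ' decode the file's bytes as UTF-8
    stm.Open
    stm.LoadFromFile path
    ReadUtf8File = stm.ReadText   ' with no argument, reads to the end of the stream
    stm.Close
End Function
```

The returned string can then be split into lines with Split(text, vbLf) or similar, depending on the file's line endings.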


faisalpv May 23rd, 2005 03:44 AM

Hi,

Did you manage to do that? I am looking for a solution to exactly the same issue, but in a Word VBA macro. I can't find any way to convert a UTF-8 encoded string to Unicode.

Thanks


anbedesigns March 30th, 2006 01:46 PM

Take a look at http://anbedesigns.alojardominio.com. There you can get Utf82Unicode function.


march11 May 20th, 2011 11:58 AM

try this code
 
You try this snippet of code it worked for me....

Code:

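Function UTF8_Decode(ByVal sStr As String) As String
    ' Expects a string in which each character holds one UTF-8 byte.
    ' Handles 1-, 2- and 3-byte sequences only; 4-byte sequences are not decoded.
    ' The byte values are Long so the 3-byte arithmetic cannot overflow an Integer.
    Dim l As Long, sUTF8 As String
    Dim iChar As Long, iChar2 As Long, iChar3 As Long
    For l = 1 To Len(sStr)
        iChar = Asc(Mid(sStr, l, 1))
        If iChar > 127 Then
            If Not iChar And 32 Then ' bit 5 clear: 2 chars (110xxxxx 10xxxxxx)
                iChar2 = Asc(Mid(sStr, l + 1, 1))
                sUTF8 = sUTF8 & ChrW$(((31 And iChar) * 64 + (63 And iChar2)))
                l = l + 1
            Else ' 3 chars (1110xxxx 10xxxxxx 10xxxxxx)
                iChar2 = Asc(Mid(sStr, l + 1, 1))
                iChar3 = Asc(Mid(sStr, l + 2, 1))
                sUTF8 = sUTF8 & ChrW$(((iChar And 15) * 16 * 256) + ((iChar2 And 63) * 64) + (iChar3 And 63))
                l = l + 2
            End If
        Else ' plain ASCII
            sUTF8 = sUTF8 & Chr$(iChar)
        End If
    Next l
    UTF8_Decode = sUTF8
End Function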
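
As a variation on the function above, here is a minimal sketch that works on a Byte array instead of a string and also covers 4-byte sequences, which have to become a surrogate pair in a VBA string. Utf8BytesToString and DemoUtf8BytesToString are hypothetical names. The input is assumed to be well-formed UTF-8: there is no validation, and a truncated sequence or an unallocated array raises a run-time error.

```vba
' Sketch: decode well-formed UTF-8 bytes into a VBA (UTF-16) string.
Function Utf8BytesToString(b() As Byte) As String
    Dim i As Long, k As Long, n As Long
    Dim cp As Long              ' code point being assembled
    Dim s As String

    i = LBound(b)
    Do While i <= UBound(b)
        If b(i) < &H80 Then             ' 0xxxxxxx: ASCII
            cp = b(i): n = 0
        ElseIf b(i) < &HE0 Then         ' 110xxxxx: 1 continuation byte
            cp = b(i) And &H1F: n = 1
        ElseIf b(i) < &HF0 Then         ' 1110xxxx: 2 continuation bytes
            cp = b(i) And &HF: n = 2
        Else                            ' 11110xxx: 3 continuation bytes
            cp = b(i) And &H7: n = 3
        End If

        ' Each continuation byte (10xxxxxx) contributes 6 more bits
        For k = 1 To n
            cp = cp * 64 + (b(i + k) And &H3F)
        Next k

        If cp >= &H10000 Then
            ' Above U+FFFF: split into a UTF-16 surrogate pair
            cp = cp - &H10000
            s = s & ChrW$(&HD800& + cp \ &H400) & ChrW$(&HDC00& + (cp And &H3FF))
        Else
            s = s & ChrW$(cp)
        End If
        i = i + n + 1
    Loop
    Utf8BytesToString = s
End Function

Sub DemoUtf8BytesToString()
    Dim b(0 To 2) As Byte
    b(0) = &HE3: b(1) = &H81: b(2) = &H82          ' UTF-8 for U+3042
    Debug.Print Hex$(AscW(Utf8BytesToString(b)))   ' 3042
End Sub
```

Working from bytes avoids depending on how the system code page maps bytes 128-255 to characters, which the Asc-based version relies on.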


march11 May 20th, 2011 12:29 PM

trying to read Japanese character set
 
I am reading some cells in Excel with VBA in Access. One of the cells contains Japanese characters that I need to populate in an Access Table.

If I understand things correctly, I need to read the cell from Excel as Binary then convert to UTF-8 in order to save in my Access table.

If this is correct, could you help me with the code to read the cell from Excel with VBA as binary?

Pretty sure I can then pass this value to your UTF-8 converter to create the valid string.

thank you so much!!!
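
For what it's worth, VBA strings are already Unicode (UTF-16), and a cell's Value comes back as such a string, so a binary read followed by UTF-8 decoding may not be needed here. A minimal sketch, run from Access VBA; the workbook path, sheet index, cell address, and the table and field names (tblText, TextValue) are hypothetical placeholders, and ImportCellText is a hypothetical name.

```vba
' Sketch: copy the text of one Excel cell into an Access table.
Sub ImportCellText()
    Dim xl As Object, wb As Object
    Dim rs As Object
    Dim s As String

    Set xl = CreateObject("Excel.Application")
    Set wb = xl.Workbooks.Open("C:\data\source.xlsx", , True)   ' third argument: read-only

    s = CStr(wb.Worksheets(1).Range("A1").Value)   ' already a Unicode string
    wb.Close False
    xl.Quit

    Set rs = CurrentDb.OpenRecordset("tblText")
    rs.AddNew
    rs!TextValue = s                               ' stored as-is, no conversion step
    rs.Update
    rs.Close
End Sub
```

Note that the VBA editor and the Immediate window may display Japanese text as question marks even when the string itself is intact, so check the result in the table rather than with Debug.Print.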

Hacene May 28th, 2015 09:22 PM

Quote:

Originally Posted by march11 (Post 272370)
You try this snippet of code it worked for me....


That is an excellent function.
However, what if the text parsed is in UTF-16 or UTF-32?
By any chance, would you have similar functions for these various types of encoding?

Thank you in advance for any pointer/help.
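
A rough sketch for those two cases, assuming little-endian, well-formed input held in a Byte array. UTF-16LE needs no real decoding because VBA strings use that layout internally; UTF-32LE is four bytes per code point. The function names are hypothetical.

```vba
' Sketch: UTF-16LE bytes map directly onto a VBA string.
Function Utf16LEBytesToString(b() As Byte) As String
    Dim s As String
    s = b                                   ' two bytes per character, copied as-is
    If Len(s) > 0 Then
        ' AscW returns an Integer, and &HFEFF is the Integer -257, so this
        ' comparison matches a leading byte order mark (U+FEFF)
        If AscW(s) = &HFEFF Then s = Mid$(s, 2)
    End If
    Utf16LEBytesToString = s
End Function

' Sketch: UTF-32LE bytes, one code point per four bytes.
Function Utf32LEBytesToString(b() As Byte) As String
    Dim i As Long, cp As Long, s As String
    For i = LBound(b) To UBound(b) - 3 Step 4
        ' The fourth byte is 0 for every valid code point (maximum U+10FFFF)
        cp = b(i) + b(i + 1) * 256& + b(i + 2) * 65536
        If cp = &HFEFF& And i = LBound(b) Then
            ' leading byte order mark: skip it
        ElseIf cp >= &H10000 Then
            ' Above U+FFFF: split into a UTF-16 surrogate pair
            cp = cp - &H10000
            s = s & ChrW$(&HD800& + cp \ &H400) & ChrW$(&HDC00& + (cp And &H3FF))
        Else
            s = s & ChrW$(cp)
        End If
    Next i
    Utf32LEBytesToString = s
End Function

Sub DemoUtf32()
    Dim b(0 To 3) As Byte
    b(0) = &H42: b(1) = &H30                            ' U+3042 as 42 30 00 00
    Debug.Print Hex$(AscW(Utf32LEBytesToString(b)))     ' 3042
End Sub
```

Big-endian input would need the byte order swapped first; that case is not handled here.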

vsrawat December 7th, 2015 05:27 AM

Yes, it worked perfectly. Thanks a million.
 
Quote:

Originally Posted by march11 (Post 272370)
You try this snippet of code it worked for me....


Yes, this code worked perfectly.

I had been stuck for several days trying to read Hindi Devanagari Unicode characters from a tab-delimited file into VBA7 in MS Word 2010 on 32-bit Windows 8.

I had spent several hours searching everywhere on the net.

Then I found your code, and it did the trick.

My problem is fully and perfectly solved.

May God bless you.

Thanks a million.
--
Rawat
India


All times are GMT -4.