Embedded Browser Character Display

  • Embedded Browser Character Display

    While working on the new gbThreads, I have noticed that the threads I download from the forum don't look the same in the embedded browser (Explorer) as they do in current browsers such as Chrome.

    In this example, look at the word "don't" at the end of the sentence.

    [Image: pb_2203.jpg]

    I can see with a hex editor that the apostrophe in the downloaded thread is not a plain ASCII apostrophe, Chr$(39) (hex 27). In addition to this example, I've seen many other places where the downloaded threads do not display characters as expected in the embedded browser.

    I'd like not to have to track down each offending character and correct them, but I'm not sure of a generic solution to display the characters as expected.

    I can use something like PBWin Remove$ to keep only the 0-126 characters in the thread, but that removes the undesirable characters instead of reproducing them as desired.


  • #2
    I don't use the embedded browser, but at first glance it looks like UTF-8 encoding.
    Maybe try PB's Utf8ToChr$ function.


    • #3
      Something to read ’


      • #4
        More info confirming it is UTF-8 encoding

        Dec  Hex  Char  Name
        226  E2   â     Latin Small Letter A With Circumflex
        128  80   €     Euro Sign
        153  99   ™     Trade Mark Sign

        UTF-8 (hex) E2 80 99 (e28099) RIGHT SINGLE QUOTATION MARK
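
To make the table above concrete, here is a small sketch (in Python rather than PowerBASIC, purely so the bytes are easy to check) that reads the same three bytes two ways. It assumes the embedded browser is falling back to Windows-1252, which is what the three displayed characters suggest.

```python
# The three bytes found where the apostrophe should be.
raw = bytes([0xE2, 0x80, 0x99])

# Read as one UTF-8 sequence, they form a single character.
as_utf8 = raw.decode("utf-8")

# Read one byte per character as Windows-1252 (assumed fallback encoding),
# the same bytes turn into three separate characters.
as_cp1252 = raw.decode("cp1252")

print(f"UTF-8 reading:        U+{ord(as_utf8):04X}")
print(f"Windows-1252 reading: {as_cp1252}")
```

The first reading is the single character the poster typed; the second is the three-character sequence shown in post #3.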


        • #5
          Rod,
          Thanks for the suggestion. Doing this to the entire thread seems to fix all the instances where I've seen the problem.

          Code:
             tmp$ = Utf8ToChr$(tmp$)
          I wonder if there's a forum setting that resulted in sending UTF-8. And why only on some strings?

          I don't have the big picture of why this is happening in the first place.
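
For comparison, here is a rough Python sketch of the two approaches discussed so far: stripping everything above 126, versus converting the UTF-8 bytes to a single-byte ANSI code page. The second function is only an approximation of what Utf8ToChr$ appears to do when its result goes into an ANSI string; the function names and the choice of code page 1252 are assumptions for illustration, not PowerBASIC's actual implementation.

```python
def strip_non_ascii(data: bytes) -> bytes:
    """Keep only bytes 0-126, similar in spirit to the filtering idea in post #1."""
    return bytes(b for b in data if b <= 126)


def utf8_to_ansi(data: bytes, codepage: str = "cp1252") -> bytes:
    """Decode UTF-8, then re-encode in a single-byte code page (assumed: cp1252).

    Characters that have no equivalent in the code page become '?'.
    """
    return data.decode("utf-8").encode(codepage, errors="replace")


downloaded = b"the threads I download don\xe2\x80\x99t look the same"

print(strip_non_ascii(downloaded))  # the apostrophe is lost entirely
print(utf8_to_ansi(downloaded))     # the apostrophe becomes one byte, &H92
```

Stripping throws the character away, while converting keeps it as the single byte that Windows-1252 uses for the curly apostrophe.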


          • #6
            UTF-8 is the default encoding used by just about every web server these days. I'd expect all of your forum threads to be UTF-8 encoded.

            Your old Explorer is probably assuming that the page is encoded as "Windows European" instead of UTF-8.

            Right click on the embedded browser window and look at "Encoding".

            In most situations that's not a problem, since plain ASCII text is byte-for-byte identical in both encodings, but the "apostrophe" in your example is one of the more common trouble spots. It's the Unicode character RIGHT SINGLE QUOTATION MARK, &H2019 (encoded in UTF-8 as hex bytes E2 80 99). The text was probably written originally in MS Word, which is an absolute PITA when it converts simple apostrophes/quotes to what it calls "smart quotes". The poster would then have copy/pasted it into the forum.
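
If the browser is guessing the encoding, one possible alternative to converting the text is to tell it the encoding explicitly. The sketch below (Python, for illustration only) inserts a standard charset declaration into a saved page that lacks one. The helper name `declare_utf8` is hypothetical, and whether the embedded control honors the tag depends on how gbThreads loads the page, which the thread does not say.

```python
import re

# Standard HTML charset declaration in its older http-equiv form.
META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'


def declare_utf8(html: str) -> str:
    """Add a UTF-8 charset declaration unless the page already declares a charset."""
    if re.search(r"charset\s*=", html, flags=re.IGNORECASE):
        return html  # an existing declaration (of any charset) is left alone

    # Place the tag right after the first opening <head> tag, if there is one.
    new_html, count = re.subn(
        r"(<head\b[^>]*>)", r"\1" + META, html, count=1, flags=re.IGNORECASE
    )
    if count:
        return new_html

    # Assumed fallback for fragments without a <head>: put the tag first.
    return META + html


page = "<html><head><title>t</title></head><body>don\u2019t</body></html>"
print(declare_utf8(page))
```

The page must still be written to disk as UTF-8 bytes for the declaration to be truthful.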
