It is not very clear what exactly you are looking for... do you want to convert, or "prevent" conversion (in which case I have no idea what you mean!)?
In general, there are localization issues to deal with when converting from ANSI to Unicode.
One possible approach would be to write a small PB/CC app that uses the Windows MultiByteToWideChar() and WideCharToMultiByte() API functions. You can launch a PB/CC app synchronously from a PB/DOS app (under Windows 95 or better) using the SHELL statement.
While this approach will work, it would probably need to use disk files to pass the strings back and forth, and the speed of the SHELL operation _may_ be an issue - without knowing what you want to achieve it is not possible to guess!
The only thing I can suggest would be to search the Internet for some conversion tables and write a DOS app around those. Ugh!
Can you please be more descriptive about what you wish to achieve? Thanks!
Technically, ASCII started out as a 7-bit code (due to the high data costs involved!) and covers just CHR$(0) to CHR$(127). IBM 'introduced' the first 8-bit ASCII table, which became known as the "IBM extended character set" and was widely adopted in the PC world.
When Windows 1.0 was introduced, International characters became a more significant problem, and the solution was a new character set developed by ANSI/ISO and termed "ANSI". It provided a way to specify alternative character sets to cater for Internationalization. DOS 3.0 (or possibly 3.3 - I forget) added "code page" support to provide a similar arrangement for DOS users.
Essentially, ANSI characters below 128 are almost identical to the original 7-bit ASCII definition.
While this all seemed cool and catered to most of the world, some languages like Japanese and Hebrew were still out of reach, since more than 256 possible characters were required (according to my old notes, over 20000 chars for Japanese + Korean, etc).
The solution was to create "double-byte" character sets (DBCS). These were a combination of 8-bit ANSI and new 16-bit character codes, but DBCS were found to be inherently difficult to deal with, since you had to work out whether the next character in a string was 1 or 2 bytes in length, and repeat for each character!
So, along came Unicode - it uses ONLY 16-bit (2 byte) character codes. And yes, Unicode is an ANSI/ISO standard too.
Actually, there were many 8-bit character sets before IBM's. The point here is that ASCII only defines a 7-bit character set, and anything beyond that is nonstandard.
Unfortunately, even a bulky 16-bit character set proved to be insufficiently comprehensive, so Unicode now also comes in a "double word character set" version: just like DBCS, but twice as big. So much for simplicity...