Subject: Re: [Summer of Code]Wide Character Support in curses
To: Brett Lymn <blymn@baesystems.com.au>
From: Ruibiao Qiu <ruibiao@arl.wustl.edu>
List: tech-userlevel
Date: 06/13/2005 10:15:55
On Mon, 13 Jun 2005, Brett Lymn wrote:
> On Sun, Jun 12, 2005 at 04:23:42PM -0500, Ruibiao Qiu wrote:
>> To improve the memory usage, I propose a different structure than my
>> original structure. Essentially, it is about the same as the existing
>> storage structure. That is, the character value is still an 8-bit
>> character.
>
> No. A wide character is 32bits.
Brett,
Thanks for your feedback.
I guess I did not make that clear. Sorry about the confusion. What I really
meant is to keep one storage structure for each display position on the
screen. The 32-bit value of a wide character is split across the value fields
of four character display cells, using bit-mask and shift operations.
Similarly, a 16-bit wide character is split across two cells.
For example, a wide character of width 2 with a value of 0x9ABC would occupy
two storage cells, one with a character value of 0x9A and the other with
0xBC. In addition, the attribute of the first half-character marks it as the
beginning of a wide character, and that of the second marks it as the end.
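To make the split concrete, here is a minimal sketch of the 0x9ABC example.
The names struct cell, CELL_WC_START, CELL_WC_END, store_wide and load_wide
are made up for illustration and are not existing curses identifiers. The
number of cells is passed in explicitly, and the most significant byte is
assumed to go into the first cell.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker bits; not taken from any existing curses header. */
#define CELL_WC_START 0x01u   /* first cell of a wide character */
#define CELL_WC_END   0x02u   /* last cell of a wide character */

/* Hypothetical storage cell, one per display position. */
struct cell {
    uint8_t ch;      /* one 8-bit slice of the character value */
    uint8_t flags;   /* CELL_WC_START / CELL_WC_END markers */
};

/* Split the low `ncells` bytes of `wc` across cells[0..ncells-1],
 * most significant byte first, and mark the first and last cell. */
void
store_wide(struct cell *cells, uint32_t wc, int ncells)
{
    int i;

    assert(ncells >= 1 && ncells <= 4);
    for (i = 0; i < ncells; i++) {
        int shift = 8 * (ncells - 1 - i);

        cells[i].ch = (uint8_t)((wc >> shift) & 0xFFu);
        cells[i].flags = 0;
    }
    cells[0].flags |= CELL_WC_START;
    cells[ncells - 1].flags |= CELL_WC_END;
}

/* Reassemble the value from `ncells` consecutive cells. */
uint32_t
load_wide(const struct cell *cells, int ncells)
{
    uint32_t wc = 0;
    int i;

    assert(ncells >= 1 && ncells <= 4);
    for (i = 0; i < ncells; i++)
        wc = (wc << 8) | cells[i].ch;
    return wc;
}
```

Calling store_wide(row, 0x9ABC, 2) leaves 0x9A in row[0] and 0xBC in row[1],
with the start marker on the first cell and the end marker on the second.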
IMHO, this storage structure does not make cursor positioning more difficult.
Because there will be characters of different widths on a screen (see the
reasons below), moving the cursor up or down simply takes it to the display
cell at the current column, without summing the widths of all the preceding
cells on the current and next lines. It may land in the middle of a wide
character, but with the help of the alignment field, the start of the wide
character can easily be located if necessary.
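As a rough sketch of that lookup, assume the alignment field records how many
cells a position lies from the first cell of its character (0 for a
single-width character or for the first cell of a wide one). That meaning,
and the names struct cell, align and snap_to_char_start, are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical storage cell, one per display position. */
struct cell {
    uint8_t ch;      /* one 8-bit slice of the character value */
    uint8_t align;   /* assumed: distance in cells from the character start */
};

/* Return the column where the character covering `col` begins.
 * The cursor can be moved vertically by keeping `col` unchanged and
 * calling this on the target line; no widths need to be summed. */
int
snap_to_char_start(const struct cell *line, int ncols, int col)
{
    assert(col >= 0 && col < ncols);
    assert(line[col].align <= col);
    return col - line[col].align;
}
```

With a line holding 'A', a two-cell wide character and 'B', the align values
would be 0, 0, 1, 0, so column 2 snaps back to column 1 while the other
columns stay where they are.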
Besides, there were suggestions in the discussion that all wide characters on
the same screen should have the same width. From a wide-character application
user's perspective, I think that is too restrictive. A user may want to mix
single-width and wide characters on the same screen because it saves screen
space and looks nicer. For example, phone numbers and English addresses are
normally displayed in single-width form. Every Chinese input method module
has a single/wide character switch function just for this purpose. So, I
think the storage structure should have a way to indicate the width of a
character, although it does not necessarily need a whole byte.
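One way the width could take less than a byte is to pack it into two spare
bits of an attribute word. The sketch below assumes a 32-bit attribute with
its top two bits free and widths limited to 1 through 4. The bit positions
and the names ATTR_WIDTH_SHIFT, ATTR_WIDTH_MASK, attr_set_width and
attr_get_width are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the top two bits of the attribute hold (width - 1). */
#define ATTR_WIDTH_SHIFT 30
#define ATTR_WIDTH_MASK  (UINT32_C(3) << ATTR_WIDTH_SHIFT)

/* Return `attr` with its width bits replaced; other bits are preserved. */
uint32_t
attr_set_width(uint32_t attr, unsigned width)
{
    assert(width >= 1 && width <= 4);
    return (attr & ~ATTR_WIDTH_MASK) |
        ((uint32_t)(width - 1) << ATTR_WIDTH_SHIFT);
}

/* Extract the width (1 through 4) stored in `attr`. */
unsigned
attr_get_width(uint32_t attr)
{
    return (unsigned)((attr & ATTR_WIDTH_MASK) >> ATTR_WIDTH_SHIFT) + 1;
}
```

Storing width - 1 means an attribute with the two bits cleared reads back as
width 1, so existing single-width cells would need no change.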
Anyway, it is always good to have more people discuss the proposed solutions.
I really appreciate it. I plan to implement the several viable alternative
storage cell structures discussed here, and compare their memory usage and
performance to find the best solution.
Please keep sending your comments and feedback. Thanks.
Ruibiao