(No sign) == Specialist's comment
> == Amateur speech
> > == Specialist's comment

> > It's not an easy problem, though.
>
>
> The regular Bulgarian user would want to use accents very rarely (with one
> middle-sized exception, though). The currently most popular Cyrillic encodings,
> such as Windows-1251, KOI8-?, etc., are not able to handle accents. It is too
> late to invent new encodings, since this would add further disorder
> to the current mess.
>
> There is an ongoing effort in Bulgaria to find a general solution to this
> problem (and a couple of others).
>
> The only (?) flexible and promising way seems to be the use of this technique
> (combining diacriticals). It is flexible, since it can be related in regular
> ways to other aspects of text processing (such as spellchecking).
>
> That's why I think it is important to tackle this problem.
>
> > The only real way to handle
> > it is to convert the sequence of characters to a single character
> > and show that one. Just adding a diacritical to a base character
> > is not likely to yield satisfactory results (particularly not with
> > fonts like unifont that don't have the diacritical itself in the
> > combining diacritical code point...).
>
> If I understand you right, this means that unifont is not completely or
> correctly implemented?
>
> > Of course, not all
> > combinations exist as standalone characters in Unicode...
>
> Yes... And I don't know if it is a flexible thing to keep standalone Unicode
> positions for each accented symbol. This would be very bad for backwards
> encoding compatibility (?), spellcheck ability, and would introduce (again)
> new disorder...
>
> > Some fonts try to work around this by making the combining diacritical
> > a zero-width character thereby causing it to be drawn in the same
> > position as the next character. Still, it's not particularly likely
> > that adding a diacritical to a character yields the same result as
> > designing a character with a diacritical...
>
> Well, I guess this concerns the font's responsibility only.
> I guess applications (incl. Opera) just have to follow the standards, and as
> long as the font handles everything correctly, the application should do the same.
>
>

I agree with the idea, but I think the technical challenges (if this is to be done correctly) are really big. The real problem is that there is no longer a one-to-one mapping between code points and characters. For example, a + combining diaeresis is two code points that should generate the single character a with diaeresis. In this simple case, where the character has its own code point in Unicode, the application can convert the two code points into the single one that represents the character, and draw the glyph for it.

However, if the character does not exist in Unicode as a single code point (and most of the possible combinations don't), the application has no ready-made glyph to draw. The best it can do is to design its own glyph to match the vague description (like a with acute accent and tilde). To be correct, the new glyph must look like the other glyphs in the font, with the diacriticals in the correct positions for the actual character being drawn (should they be above each other or next to each other? How far apart from each other? How far from the base character?). And finally, it must look pretty and be easy for a human to read.

The only real solution is for the font to contain glyphs for all possible combinations of combining characters. (Only humans can decide whether a font design is pretty and readable, and whether it matches the traditional use of the character.) The partial solution is to stick generic diacriticals around the generic base character. Then the font and the font renderer must agree on how to construct these characters, and it will still not look right.
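
To make the "convert two code points into one" case concrete, here is a minimal sketch in Python using the standard `unicodedata` module. It only illustrates the idea; it is not how Opera or any particular renderer does it. The helper names `compose` and `has_precomposed_form` are made up for this example. It relies on Unicode normalization form NFC, which replaces a base character plus combining marks with a precomposed code point wherever Unicode defines one, and leaves the rest as a sequence.

```python
import unicodedata


def compose(text: str) -> str:
    """Return text in NFC: base + combining mark sequences are replaced
    by a single precomposed code point where Unicode defines one."""
    return unicodedata.normalize("NFC", text)


def has_precomposed_form(sequence: str) -> bool:
    """True if the whole sequence collapses to exactly one code point.
    (Hypothetical helper, named for this example only.)"""
    return len(compose(sequence)) == 1


# a + COMBINING DIAERESIS: a precomposed character exists (U+00E4).
a_diaeresis = "a\u0308"

# a + COMBINING ACUTE ACCENT + COMBINING TILDE: no single code point exists,
# so the tilde stays behind as a separate combining mark.
a_acute_tilde = "a\u0301\u0303"

# Cyrillic a (U+0430) + COMBINING ACUTE ACCENT: no precomposed form either.
cyrillic_a_acute = "\u0430\u0301"

for seq in (a_diaeresis, a_acute_tilde, cyrillic_a_acute):
    composed = compose(seq)
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in composed)
    print(f"{len(seq)} code points in, {len(composed)} out: {code_points}")
```

Whenever the result is still longer than one code point, the application is in the hard case described above: the font and the renderer have to place the remaining marks themselves.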