To: vladimirg(at)need.bg
From: Andrew Cunningham andrewc(at)vicnet.net.au
Cc: unicode(at)unicode.org
Date: Fri, 11 Jul 2003 10:40:43 +1000
Subject: Re: Combining diacriticals and Cyrillic

Hi Vladimir,

Yes, in theory your answer is Unicode, i.e. Cyrillic plus combining diacritics, although how well the theory works in practice will differ from operating system to operating system.

I did a quick test on Windows in both word processors and web browsers. Everything displayed correctly (given certain combinations of fonts and applications).

There are two elements that need to be addressed:

1) Appropriate fonts. I only know of two that are suitable. Code2000 (v. 1.13) has the appropriate OpenType tables (I believe it uses the OpenType MarkToBase feature; others on the list will correct me if my memory is faulty). The second font is Doulos SIL (v 0.6, beta). This font has both OpenType tables and Graphite tables. Graphite is a rendering system developed by SIL International.

2) You need a rendering system that supports the features.
On Windows, this means that you will need a version of Uniscribe that supports the use of combining diacritics with Cyrillic characters. Currently none is available, except for the version in the MS Office 2003 Beta. I did a quick test using the two fonts above, and the characters displayed correctly.

So from the point of view of word processing, there is a solution coming. This approach will also work with other applications that support Uniscribe, although you might have to wait until Microsoft releases a service pack that contains the Uniscribe update. I assume that Microsoft will update one or more fonts with the necessary features when they release Office 2003.

I also ran the test in some Graphite-enabled software (WorldPad and a Graphite-enabled version of Mozilla). It seemed to work fine as well.

vladimirg(at)need.bg wrote:
> Dear Ladies and Gentlemen,
>
> Currently there is an ongoing effort in Bulgaria trying to resolve an
> issue concerning the way we write in Bulgarian.
>
> Our problem is:
>
> Usually a regular Bulgarian user does not need to write accented characters.
> There is one middle-sized exception to this, but generally we do fine without
> accented characters. The problem is that in some special cases or more
> serious linguistic work, one definitely needs to be able to write accented
> characters (accented vowels).
>
> One of the ideas is to invent a new ASCII-based encoding, containing
> the accented characters we need. This would introduce additional
> disorder into the current mess of Cyrillic encodings, and would introduce
> problems with automated spellcheck.
>
> Generally I believe it would be best to invent a Unicode-based solution.
>
> Such a solution is, for example, combining diacritical signs with the
> Cyrillic symbols.
>
> I composed a demo page:
> http://v.bulport.com/bugs/opera/426/balhaah_lonex_org/
>
> and then made 10-20 screenshots of the results on Opera and IE on Linux,
> Windows 98 and Windows XP:
> http://v.bulport.com/bugs/opera/426/balhaah_lonex_org/shots.html
>
> You can see that this approach yields _quite_ inconsistent and useless
> results, depending on the font, application and operating system being
> used.
>
> Finally, I wonder if you could give us some advice:
>
> 1.
> Is it possible somehow to improve this approach? I imagine, e.g.,
> that the font could provide prepared combined symbols whenever the
> application asks for a combined Cyrillic+diacritical, instead of
> leaving the application to do the combination.
>
> 2.
> Do you see any other Unicode-based approach to the Bulgarian problem?
>
> 3.
> Do you believe the solution should be looked for outside Unicode?
>
> Please excuse me for wasting your time,
> Vladimir,
> Bulgaria

-- 
Andrew Cunningham
Multilingual Technical Officer
Online Projects Team, Vicnet
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Australia

andrewc(at)vicnet.net.au

Ph. +61-3-8664-7430
Fax: +61-3-9639-2175

http://www.openroad.net.au/
http://www.libraries.vic.gov.au/
http://www.vicnet.net.au/
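
To make the "Cyrillic plus combining diacritics" approach discussed in this thread concrete, here is a minimal Python sketch of what such text looks like at the code point level. It assumes the grave accent (U+0300) as the stress mark, which the thread itself does not specify, and the helper names accent and describe are made up for illustration. It uses only the standard unicodedata module.

```python
import unicodedata

GRAVE = "\u0300"  # COMBINING GRAVE ACCENT (assumed stress mark for this sketch)


def accent(base, mark=GRAVE):
    """Append a combining mark to a base letter and normalize to NFC.

    NFC replaces the pair with a single precomposed character only if
    Unicode defines one; otherwise the two code points are kept as-is.
    """
    return unicodedata.normalize("NFC", base + mark)


def describe(text):
    """List the code points in text with their Unicode character names."""
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch)) for ch in text]


a_grave = accent("\u0430")  # CYRILLIC SMALL LETTER A + grave
i_grave = accent("\u0438")  # CYRILLIC SMALL LETTER I + grave

for s in (a_grave, i_grave):
    print(len(s), describe(s))
```

In this sketch the accented "и" collapses to the single precomposed character U+045D, while the accented "а" stays as two code points (U+0430 followed by U+0300), because Unicode defines no precomposed form for it. Text of the second kind is what depends on the font and the rendering system to position the mark over the base letter.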