Phonetic symbols in Calibri and Cambria

July 6, 2010

This topic was imported from the Typophile platform

John Wells's Phonetic Blog has a new post about the design of phonetic symbols in Calibri and Cambria. John Wells is a British phonetician and editor of the Longman Pronunciation Dictionary.

Here is what he has to say:
Calibri (like the font most of you will see in this blog) has a small cap i without the serifs it really needs for good legibility. It also has too much space before the stress mark, after the ɡ and after the length mark. In Cambria the serifs and stress mark are satisfactory, but the character spacing in the word ɡɑːdn̩ still leaves a lot to be desired.

The 'small cap i' is actually ɪ, the phonetic symbol for the vowel of 'kit'. I have to agree that even in a sans serif design where neither I nor the small cap i have serifs, this phonetic symbol needs serifs as they are an integral part of the symbol's identity. Lucida Sans Unicode is an example where the capital I has no serifs but the symbol ɪ does. Without serifs, I have trouble reading it as anything other than the Turkish dotless i.

Based on the samples on the blog, the spacing does look problematic and I agree with his assessments. I don't know how much of this is the inherent spacing and how much is the rendering and software used with possibly limited kerning support.

If you have any insights to share about the designs from the perspective of type designers to an audience of phoneticians, I would encourage you to comment on the blog itself, and we could also have a Typophile-specific discussion here.

July 6, 2010

Will log a bug against Calibri. Can't repro the spacing of Cambria. Will investigate.

Thanks, Si

July 6, 2010

The spacing problems seem to be a result of rounding errors in Wells’ rendering environment. This is how the spacing appears in GDI under Vista:

[Note that the below combining mark touches the bottom of the letter at some sizes. Correcting this would require size-specific adjustment of GPOS mark positioning data. This can be done in VOLT, but it is a major amount of work.]

July 6, 2010

And Calibri:

I quite agree regarding the phonetic smallcap ɪ. [Actually, I lean more and more to the view that the I in sans serif fonts, especially screen fonts, should have bars top and bottom to aid legibility. These should not be thought of as serifs: the I with barred terminals is a well-attested, indeed common form in writing.]

July 6, 2010

Actually, I lean more and more to the view that the I in sans serif fonts, especially screen fonts, should have bars top and bottom to aid legibility. These should not be thought of as serifs: the I with barred terminals is a well-attested, indeed common form in writing.

You'd get quite a bit of resistance to that, John, from users of sans serif typefaces, many of whom find the barred I quite ugly in an otherwise simple-looking design. I think it depends on which is more important in a particular context: character distinction or smooth flow. Within familiar words, the bar-less I is often a much more pleasing form (depending, always, on the typeface's overall design), but where letters and numbers and perhaps symbols will be mixed together (as in UK and Canadian postal codes, or in serial numbers and similar codes), then the need to be absolutely sure whether that's an I, an l, or a 1 trumps beauty.

I think that if you include both forms in a font, the barred I will usually need to sit in a wider space than the bar-less I. Otherwise, one of them will look wrong.

John

July 6, 2010

John, I agree that the barred I looks less clean and in all-caps words it tends to disrupt spacing. I'm thinking mainly in terms of screen legibility. Of course, with OTL we could contextually vary the form of the I based on proximity to l or 1. :)

My main point is that a barred I is not a serif’d I, so while there may be aesthetic and functional objections to its use in sans serif fonts there shouldn't be any categorical objection.

July 6, 2010

>I agree that the barred I looks less clean and in all-caps words it tends to disrupt spacing.

I-k-ea what you're saying ;-)

July 7, 2010

About the "barred" I, I find it interesting that a specific use of it has developed in a specialised context such as comic book lettering:
http://www.blambot.com/grammar.shtml

December 27, 2011

It seems that Calibri lacks 203F: UNDERTIE symbol. Possibly Cambria does too.

December 27, 2011

Doesn't look like that code point is in any of the usual suspects. How/where is it used?

Cheers, Si

December 27, 2011

This Wikipedia article on French liaison is filled with them: http://en.wikipedia.org/wiki/Liaison_(French)

December 28, 2011

And also the above-mentioned Longman Pronunciation Dictionary.

December 28, 2011

Thanks, adding it to the "list".

Cheers, Si

December 28, 2011

I second the request for the undertie symbol. 2040: CHARACTER TIE would also be welcome (it's the same shape, only flipped and higher up). When I'm working on Google Docs or posting comments online and want to use the undertie symbol, I'm forced to use the underscore as a hack because of the lack of font support.
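For anyone who wants to double-check the two code points before filing a request, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` is made up for illustration. It only reports what the Unicode character database says; it says nothing about whether a given font has the glyphs.

```python
import unicodedata

# The two spacing tie characters mentioned in this thread.
TIE_CODE_POINTS = [0x203F, 0x2040]

def describe(cp: int) -> str:
    """Return 'U+XXXX NAME (category)' for a code point (hypothetical helper)."""
    ch = chr(cp)
    return f"U+{cp:04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for cp in TIE_CODE_POINTS:
    print(describe(cp))
```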

I use the undertie primarily for pronunciation transcriptions of English (for optional compression of syllables, just like the Longman Pronunciation Dictionary) and French (for liaison).

If the tie symbols do end up being added to Calibri and Cambria, I hope they are not designed too wide. In most fonts that do have the tie symbols, they are excessively wide, perhaps because designers, unaware of what the tie symbols are used for, have simply reused the shape of the ligature ties (which need to be wide because they go over two letters) without modification.

It won't do to have an undertie taking up the space of two letters. It would draw way too much attention to itself. The undertie needs to be comparable in width with a single space or a period. To illustrate how narrow the undertie should be, here is a screen grab from the CD-ROM version of the Longman Pronunciation Dictionary:

December 28, 2011

Regarding a point raised earlier in the thread, I discovered a while ago that there are situations where the bars of the phonetic symbol ɪ are indispensable even in a sans design.

Phoneticians don't always use unmodified phonetic symbols, but sometimes combine them with diacritics that are also part of the IPA. This is especially important for narrow transcriptions.

I was exploring ways to express sound variations in different accents of English with shorthand symbols in a way consistent with IPA, and one of the best ways I could come up with for one instance was to use the IPA centralization diacritic, the diaeresis, with the phonetic symbols /i/ and /ɪ/, to produce /ï/ and /ɪ̈/ respectively (doesn't seem to display properly, but image below). However, this pair would not be distinguishable if ɪ lacked the bars.
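One possible reason the second form displays less reliably than the first, offered as a guess rather than a diagnosis: i plus a combining diaeresis has a precomposed code point, while ɪ plus a combining diaeresis does not, so it depends entirely on the font's mark positioning. A minimal Python sketch (the function name `compose` is made up for illustration):

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def compose(base: str) -> str:
    """NFC-normalize base + combining diaeresis (hypothetical helper)."""
    return unicodedata.normalize("NFC", base + DIAERESIS)

centralized_i = compose("i")                 # collapses to a single precomposed code point
centralized_small_cap_i = compose("\u026A")  # ɪ: no precomposed form, stays as base + mark

for label, s in [("i", centralized_i), ("ɪ", centralized_small_cap_i)]:
    print(label, [f"U+{ord(c):04X}" for c in s])
```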

You can also see that Wikipedia uses ɪ with a diaeresis on its vowel chart, because there is no dedicated unmodified phonetic symbol for that particular vowel. So this sort of combination with diacritics isn't entirely rare.

December 28, 2011

One other annoying thing in phonetics is the esh symbol ʃ. This is a particular pain for good overtie and undertie implementations, since the most pedantic representation in IPA of the initial sound in ‘church’ is t͡ʃ or t͜ʃ.

I also somewhat dislike the ʃ design in most fonts that have it, including Cambria which is otherwise my primary face in my documents. The ʃ is always treated as though it’s a cousin to the integral sign ∫, so that it has either a perfectly upright stem or a slightly forward-tilted stem. This is wrong, however. Instead the esh ʃ is more of a cousin to s and S. The versions in fonts designed for linguists, like Gentium and Doulos SIL, have a slight backward tilt to the stem with a bit more compensatory curl in the top and bottom arms. This enhances the S similarity and also makes it less troublesome for kerning and getting the advance width right. The associated symbols like ʄ, ʆ, ʅ, ᶋ, ᶘ, and ᶴ all benefit from the same design considerations. The esh ʃ also has some problems with superscript-level spacing modifiers like ʼ (U+02BC), so that the symbol for the ejective postalveolar affricate tʃʼ (or t͜ʃʼ or t͡ʃʼ if you’re especially picky) ends up having ʼ collide with the upper arm of ʃ in most faces I’ve seen.

In context the difference between dotless ı and small-cap ɪ is not terribly important because the two are basically mutually exclusive. A lot of linguists don’t really know or care that there is a difference. Diacritics can cause problems because the dotless ı is then the same as regular i since the tittle is normally deleted in phonetics use.

Cambria really does a great job on a lot of Latin diacritics, which is why it’s my main font. I particularly like that ḵ and x̱ have the same consistent height for U+0331, for example. Also Cambria’s Greek blends well with the Latin, which is essential for replicating old transcriptions like dùhίdιnάx̣ and xʼύxʼu-ɢa. But for phonetics I use Charis SIL as a companion font; it has about the same weight, and when set off in brackets or slashes it doesn’t look out of place in a sea of Cambria.

I’ve never used Calibri extensively simply because it gets the position of U+0331 wrong in comparison with the precomposed forms, so that the macron below is higher beneath ḵ than it is beneath x̱. That makes the Tlingit word tuḵx̱ʼé ‘anus’ look crappy.
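The mismatch is easier to understand once you see that the two letters reach the font by different routes. A minimal Python sketch, standard library only (the helper name `nfc_code_points` is made up for illustration): k with U+0331 normalizes to a precomposed code point, so a font can draw it with a dedicated glyph, while x with U+0331 stays a base letter plus a combining mark and relies on mark positioning. Whether this is what actually happens inside Calibri is an assumption; the code only shows the Unicode side.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW

def nfc_code_points(base: str) -> list[str]:
    """Code points of base + macron below after NFC normalization (hypothetical helper)."""
    s = unicodedata.normalize("NFC", base + MACRON_BELOW)
    return [f"U+{ord(c):04X}" for c in s]

print("k:", nfc_code_points("k"))  # one precomposed code point
print("x:", nfc_code_points("x"))  # base letter followed by the combining mark
```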

December 28, 2011

The ʃ is always treated as though it’s a cousin to the integral sign ∫, so that it has either a perfectly upright stem or a slightly forward-tilted stem.

Well, the integral sign is a cousin of the S: it stands for a limit of sums, and its shape in Russian mathematical typography shows it, as you can see in the Wiki on the Integral Symbol, of which here is a grab [12K]

December 28, 2011

The integral sign has more freedom in its shape since it doesn't have the constraints of the phonetic esh symbol. The latter has to work as a phonetic symbol alongside the lowercase latin letters and similar symbols that make up the various phonetic alphabets, and in addition it can have diacritics.

Many of the diacritics used in the IPA go below the letter they modify. If the letter has a descender, the same diacritic is used above the letter. So the 'ring below' diacritic for devoicing normally goes below the letter as in 'd̥', but it goes above the letter as in 'ɡ̊'. But the esh symbol has both a descender and an ascender, so it's tricky what to do with such diacritics (the 'ring below' wouldn't make sense since the esh stands for a sound which is already voiceless, but we could combine it with a 'caron below' for voicing). One solution I've seen puts the diacritic as a sort of a spacing modifier after the esh symbol, not under it. The trouble is that some of these diacritics are only encoded as combining diacritics in Unicode and not as spacing modifiers, so you have to cheat by combining the diacritic with a space.
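The convention and the "cheat" described above can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the descender set is hand-picked and incomplete, the function names are made up, and a no-break space is used as the carrier for the isolated mark where a plain space would also fit the description above.

```python
# Hand-picked, non-exhaustive set of letters with descenders (assumption).
DESCENDER_LETTERS = {"g", "\u0261", "j", "p", "q", "y", "\u014B"}  # includes ɡ and ŋ

RING_BELOW = "\u0325"   # COMBINING RING BELOW (devoicing)
RING_ABOVE = "\u030A"   # COMBINING RING ABOVE
CARON_BELOW = "\u032C"  # COMBINING CARON BELOW (voicing)
NBSP = "\u00A0"         # carrier for a mark shown on its own

def devoiced(base: str) -> str:
    """Attach the devoicing ring above letters with descenders, below otherwise."""
    mark = RING_ABOVE if base in DESCENDER_LETTERS else RING_BELOW
    return base + mark

def with_spacing_mark(base: str, mark: str) -> str:
    """Place the mark after the letter, carried by a no-break space."""
    return base + NBSP + mark

print(devoiced("d"), devoiced("\u0261"))
print(with_spacing_mark("\u0283", CARON_BELOW))  # esh followed by a free-standing caron below
```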

December 28, 2011

If the letter has a descender, the same diacritic is used above the letter.

In this entry of James' Tlingit dictionary, I see the diacritics below the g.

December 28, 2011

Putting the diacritics above letters with descenders is a convention, but not a rigid one, and I don't think it applies to all diacritics equally. The underscore diacritic would only look natural below a letter, for example.

December 28, 2011

I tried to convince some of the orthographic community to accept ḡ in place of g̱ and the response ranged between “meh” and “are you also planning to eat kittens and immolate our infants for Moloch the vast stone god of war?”. So, not really a great reception. It is also hard to get existing software to believe that G̱ could be an uppercase pairing with ḡ. The ǥ (U+01E5 ‘Latin Small Letter G with Stroke’) is a potential option to avoid g̱ since most handwritten forms of g̱ actually have the macron crossing the descender. But ǥ case pairs with Ǥ (U+01E4) which is distinctly unsatisfactory, and some fonts have the stroke of ǥ crossing the stem into the bowl rather than across the descender. One attempt by Keri Edwards was to use G̱ and ɢ̱ so the lowercase was actually ɢ (U+0262 ‘Latin Letter Small Capital G’) with the U+0331 ‘Combining Macron Below’ diacritic, but that also breaks case pairing and is unpleasant to some people. So Tlingit in the end has stuck with g̱. It works pretty well in a lot of fonts, particularly with whatever the defaults are on Facebook in most browsers, and also in a few of the free email websites. As you can see here, when web designers try to be smart about font-face selection they usually end up with g̱ being broken somehow.
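The case-pairing complaint can be checked against the default Unicode case mappings, which is what most software falls back on. A minimal Python sketch (the helper name `pairs_with` is made up for illustration; it tests Python's built-in `str.upper` and `str.lower`, not any particular application):

```python
def pairs_with(lower: str, upper: str) -> bool:
    """True if default case mapping converts each form into the other (hypothetical helper)."""
    return lower.upper() == upper and upper.lower() == lower

G_MACRON_BELOW_LOWER = "g\u0331"  # g + COMBINING MACRON BELOW
G_MACRON_BELOW_UPPER = "G\u0331"  # G + COMBINING MACRON BELOW
G_MACRON_ABOVE_LOWER = "\u1E21"   # precomposed g with macron above
G_STROKE_LOWER = "\u01E5"         # g with stroke
G_STROKE_UPPER = "\u01E4"         # G with stroke

print(pairs_with(G_MACRON_BELOW_LOWER, G_MACRON_BELOW_UPPER))  # the existing orthography
print(pairs_with(G_MACRON_ABOVE_LOWER, G_MACRON_BELOW_UPPER))  # the proposed mixed pairing
print(pairs_with(G_STROKE_LOWER, G_STROKE_UPPER))              # the stroke alternative
```

The mixed pairing fails because the default mapping sends the macron-above form to its own precomposed capital rather than to G with a macron below.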

But that situation is an orthography and not phonetic transcription. In phonetic transcription you either go with what the IPA has recommended, which is above with descenders like ɡ̊, or you do whatever you want because your particular transcription tradition has no fixed rules.

And yes, the integral sign is in practice a completely different animal from esh ʃ. The esh has to blend in with the rest of the Latin-ish letters. It also has to have good default spacing and kerning because phonetic symbols aren’t handled with a fancy layout engine like those used for mathematics. For phonetics the esh should ideally have a shape similar to the Russian integral example posted above because that harmonizes well with the rest of the Latin letters in running text.

December 28, 2011

I agree that math fonts and text fonts have different requirements. As for integral signs, here is one that slants backwards (taken from Landau et Lifchitz, Mécanique (Mechanics), Moscow).

(the bar at the left comes from an absolute value, and I kept it to make sure everything is vertically aligned).

December 29, 2011

Very nice integral, Michel :) One day I'll add a Russian-like integral to the XITS math font. It already has upright integrals (inherited from STIX) as alternates.

December 29, 2011

Two things on my wish list for phonetics fonts are slashed zero instead of a slashed circle for ∅ U+2205 ‘Empty Set’ and a gelded question mark alternate for ʔ U+0294 ‘Latin Letter Glottal Stop’.

The first wish is because slashed zero is a far more common null symbol in linguistics than the mathematical practice of a slashed circle (which comes from Nicolas Bourbaki as I recall). It’s available as an alternate for U+0030 ‘Digit Zero’ in many fonts (the OpenType zero option), but as far as I know it’s never been implemented in any font as an alternate for ∅ U+2205 ‘Empty Set’. Currently I use some TeX shenanigans to get the displayed form as 0 in PDF, but the plain text version in the PDF as ∅ so that it’s semantically correct for text processing. That’s an ugly kludge. Many (most?) linguists are less typographically and technically inclined than I am and hence they abuse Ø or ø because these look somewhat better in text than the slashed circle of ∅ found in most typefaces nowadays. Cambria tried hard to get away from the geometric circle with its ‘slashed ovoid’, but I think in the end it has pleased nobody. (I like it, but I still prefer the slashed zero more. A few mathematicians I’ve talked to still prefer the circle more.)

The gelded question mark hearkens back to the original development of the glottal stop symbol. I don’t know who first invented it, but after ʼ was used for a while someone wanted a more visually obvious symbol so they started using a question mark without the ball (hence ‘gelded’). The IPA community invented the form with a stem extending to the baseline and a serifed base, but the gelded form is still in use, though mostly only in handwriting nowadays. To properly replicate forms from older documents a gelded alternate for ʔ U+0294 ‘Latin Letter Glottal Stop’ is really appropriate, but so far I know of no fonts that have implemented it. I suppose that variants of ʕ, ʖ, ʡ, and ʢ might be desirable too, but I think just offering the alternate for ʔ would be a good enough gesture. Anyone implementing a modern face like the Scotch Modern used by the US Government Printing Office would be obliged to make the default glottal stop a gelded question mark for historical veracity.

One thing that I like about the SIL fonts (Charis SIL & Doulos SIL) and Gentium that I haven’t seen much support for elsewhere are the slightly larger and heavier variants of the apostrophic modifier letters, U+02BB – U+02BD and U+02EE. Most fonts simply duplicate the shapes of the quotation marks, but the modifier letters are properly alphabetic letters rather than punctuation. If they are ever so slightly distinct then they are more easily differentiated at the end of quotations, e.g. ‘xʼúxʼ’ or “xʼúxʼ”. One solution, used by Gentium, is to shift the vertical position and change the overall size of the modifier letters a slight bit. The SIL fonts instead change the weight and length of the modifier letters so that they fill the surrounding space a bit more and thus fit better into the word shape (bouma). On this forum the differentiation seems to be an accident since Georgia doesn’t include the spacing modifier letters, so that they are pulled from some other font by browser and layout engine magic.
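The letter-versus-punctuation distinction is recorded in the Unicode character database and can be confirmed with a minimal Python sketch using the standard `unicodedata` module (the helper name `is_letter` is made up for illustration):

```python
import unicodedata

MODIFIER_LETTERS = [0x02BB, 0x02BC, 0x02BD, 0x02EE]  # the apostrophic modifier letters
QUOTATION_MARKS = [0x2018, 0x2019, 0x201C, 0x201D]   # curly single and double quotes

def is_letter(cp: int) -> bool:
    """True if the general category is one of the letter categories (L*)."""
    return unicodedata.category(chr(cp)).startswith("L")

for cp in MODIFIER_LETTERS + QUOTATION_MARKS:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```

This matters beyond glyph design: text processing that splits words on punctuation will keep xʼúxʼ whole only if the modifier letters, not the quotation marks, are used.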

December 29, 2011

I love seeing "bouma" in a thread I wasn't in!
As you were.

hhp
