Discussion:
[Docutils-develop] smartquotes config for French guillemets
jfbu
2017-03-31 10:03:35 UTC
Hi,

(I hope soft wrapped long lines are not a problem)

currently at

https://sourceforge.net/p/docutils/code/HEAD/tree/trunk/docutils/docutils/utils/smartquotes.py#l440

the French guillemets are set up to use the narrow no-break space.

As I discussed off-list with @milde, who advised me to post a summary here for future reference, the choice between a narrow no-break space and a regular no-break inter-word space is debated.

There are two levels of discussion, generally speaking, regarding spacing with punctuation in French:

- national differences: France, Belgium, Switzerland, Québec, Canada (I know Québec is part of Canada ;-) ), etc.

- cultural differences: typographers versus people dealing more exclusively with web content.

The official rules regarding French typography are spelled out in

Lexique des règles typographiques en usage à l'Imprimerie nationale, third edition, 1990.
@ Imprimerie nationale - 1990 - ISBN 2-11-081075-0

Therein it is clearly stated that French guillemets « and » should be used with an inter-word space (non-breaking, of course), not a "thin" or "narrow" space.

Actually, regarding LaTeX, the "frenchb" babel module uses a custom space (with some shrink and stretch) occupying 80% of an inter-word space.

The spacing in front of !, ? and ; is in French typography a "thin space" (une espace fine). This is undisputed even by web graphic designers, who in fact promote the use of the "thin space" everywhere. The Swiss also use a thin space in front of the colon :, but the French use a full inter-word space there, as for French guillemets. This is implemented in LaTeX by babel+frenchb: the space before !, ? and ; is 50% of an inter-word space. It looks definitely shorter than the space used with French guillemets (80%) and the space before the colon (100%).

The use of the narrow no-break space by smartquotes.py is thus currently at odds with the official French reference by Imprimerie Nationale for typography.
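
For reference, the two characters under discussion can be told apart programmatically. The following is a minimal Python 3 sketch using only the standard library; the names NBSP, NNBSP and describe are chosen here for illustration and do not come from smartquotes.py.

```python
import unicodedata

NBSP = "\u00a0"   # regular no-break space (inter-word width)
NNBSP = "\u202f"  # narrow no-break space (the "espace fine")


def describe(char):
    """Return the code point and official Unicode name of a character."""
    return "U+%04X %s" % (ord(char), unicodedata.name(char))


for char in (NBSP, NNBSP):
    print(describe(char))
# prints:
# U+00A0 NO-BREAK SPACE
# U+202F NARROW NO-BREAK SPACE
```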

It would be nice to have the setting configurable, and arguably its default value should follow the official French rules, as implemented (at 80% of an inter-word space) by LaTeX's frenchb module.

I am now collecting various links, and I owe great thanks to @milde, who gave me lots of information:

(in fact I will simply copy paste from his emails to me; the top level quotes are his, nested ones are mine)
the U+202F is used, "narrow no-break space".
This follows the advice in https://fr.wikipedia.org/wiki/Guillemet#Usage_en_fran.C3.A7ais
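En français, on sépare les guillemets typographiques ou français (« »)
de l’expression qu’ils mettent en exergue par une espace insécable
(fine si possible).
[English: In French, typographic or French quotation marks (« ») are
separated from the expression they set off by a no-break space
(thin if possible).]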
I would like to comment here that the Wikipedia page cites no source, and that I am planning to edit it to remove the "(fine si possible)", which is unreferenced and contradicts the

Lexique des règles typographiques en usage à l'Imprimerie nationale, third edition, 1990.
@ Imprimerie nationale - 1990 - ISBN 2-11-081075-0

pages 148-149

I reproduce here the extract from page 148 of the Lexique:

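- Le deux-points, le tiret, les guillemets sont précédés et suivis de l'espace existant entre les mots de la ligne;
  [English: The colon, the dash and the guillemets are preceded and followed by the space used between the words of the line;]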

This says that French guillemets are to be associated (according to that reference) with the inter-word space, not the "espace fine".
IMO, it is undisputed that the space should be "insécable" (no-break) but
there is no clear indication regarding the amount of space.
On the one side, there are proponents of a "full inter-word space"
http://www.graphicvertigo.com/elephant/les-regles-typographiques-de-la-ponctuation-francaise-episode-01/
...
http://damien.jullemier.pagesperso-orange.fr/typ/typ-espace-ponctuation.htm
I am adding this one:
http://typographisme.net/post/Les-espaces-typographiques-et-le-web
... une espace insécable (fine si possible).
which to me indicates that an espace fine is preferable but not always
available.
Also http://www.typoguide.ch says
espace normale « espace fine
espace fine » espace normale
(but I agree that Swiss typographie may have different rules).
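Orthotypographie : les guillemets français sont séparés des mots
qu’ils entourent par une espace, toujours insécable, et qui peut être
une fine, un quart de cadratin ou une espace justifiante, selon les
écoles et... les capacités des logiciels utilisés.
--- http://listetypo.free.fr/ortho/guillemets.html
[English: Orthotypography: French guillemets are separated from the
words they enclose by a space, always non-breaking, which may be a thin
space, a quarter-em space or a justifying space, depending on the school
and... on the capabilities of the software used.]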
See also the essay on this topic by Jean Méron
http://listetypo.free.fr/meron/new/Guilles.pdf
It may depend upon national distinction,
This is also clear. Fortunately, this can be accounted for by the
"country"-subtag. smartquotes.py already has "fr-ch" (without spaces).
...
... une espace insécable (fine si possible).
which to me indicates that an espace fine is preferable but not always
available.
Yes that's indeed what it says, but I wonder how authoritative this should
be treated.
...
(from texdoc babel-french)
The next step is to provide correct spacing after \guillemotleft and
before \guillemotright: a space precedes and follows quotation marks
but no line break is allowed neither after the opening one, nor
before the closing one. \FBguillspace which does the spacing, has
been fine tuned by Thierry Bouche to 80% of an inter-word space but
with reduced stretchability.
This seems a compromise between the "official" rule to use a full
inter-word space and the "aesthetic" rule to use a narrower space.
[...]
IMO, there should be configurability as well as a sensible default.
Could it be possible for Docutils smartquotes to use U+00A0 and not
U+202F in association with the French guillemets ?
After spending some hours reading the sources, I tend to agree that
smartquotes.smartchars.quotes = {
# ...
'fr': (u'« ', u' »', u'“', u'”'), # full no-break space
'fr-CA': (u'« ', u' »', u'“', u'”'), # narrow no-break space
'fr-x-altquot': u'«»“”', # for use with manually set spaces or Babel (LaTeX)
'fr-ch': u'«»‹›',
'fr-ch-x-altquot': (u'« ', u' »', u'‹ ', u' ›'), # narrow no-break space
# ...
}
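
The table above mixes two entry shapes: a plain four-character string and a four-item tuple whose items may carry a space. As a minimal sketch of how both shapes can be read uniformly (the helper name unpack and the reduced table are illustrative assumptions, not Docutils code), with the spaces written as explicit escapes so that their kind is visible:

```python
NBSP = "\u00a0"   # no-break space
NNBSP = "\u202f"  # narrow no-break space

# Illustrative table only, not a copy of smartquotes.py.
quotes = {
    "fr": ("\u00ab" + NBSP, NBSP + "\u00bb", "\u201c", "\u201d"),
    "fr-ch": "\u00ab\u00bb\u2039\u203a",
}


def unpack(entry):
    """Return (primary open, primary close, secondary open, secondary close).

    Accepts either a 4-character string or a sequence of four strings.
    """
    if len(entry) != 4:
        raise ValueError("expected four quote strings, got %d" % len(entry))
    return tuple(entry)


print(unpack(quotes["fr-ch"]))
print([hex(ord(c)) for c in unpack(quotes["fr"])[0]])
```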
For Docutils, configuration is done using the ConfigParser.py module.
Therefore free configuration of replacement strings would require a ...
...
configuration setting to specify typographical quotes for a specific
smart-quotes-custom: it: «»“„
or
smart-quotes-custom: fr: ("« ", " »", "“", "”")
For the time being, as a workaround, you can use the "alt quotes"
without spaces in combination with manually set no-break spaces or
Babel.
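
To make the two proposed value forms concrete, here is a hedged Python sketch of a parser for them. The setting name smart-quotes-custom is only the proposal quoted above, and the function parse_custom_quotes is hypothetical; nothing here describes an existing Docutils setting.

```python
import ast


def parse_custom_quotes(value):
    """Parse 'lang: quotes' into (lang, 4-tuple of quote strings).

    `quotes` is either four characters ("it: «»“„") or a Python tuple
    literal of four strings, which allows spaces inside the items.
    """
    lang, _, spec = value.partition(":")
    lang = lang.strip()
    # strip() also removes no-break spaces, so spaced quotes need the tuple form
    spec = spec.strip()
    if spec.startswith("("):
        entry = tuple(ast.literal_eval(spec))
    else:
        entry = tuple(spec)
    if not lang or len(entry) != 4 or not all(isinstance(s, str) for s in entry):
        raise ValueError("cannot parse quote setting: %r" % (value,))
    return lang, entry


print(parse_custom_quotes("it: \u00ab\u00bb\u201c\u201e"))
```

The tuple form is read with ast.literal_eval, which only accepts literals and so does not execute arbitrary code from a configuration file.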
I should mention that the discussion was motivated by a PR on the Sphinx GitHub site,

https://github.com/sphinx-doc/sphinx/pull/3562

whose aim is to drop the old custom smartypants in favour of the Docutils smart_quotes option.

A previous discussion revealed a problem with the elision character (the apostrophe, in English) being confused with the right single quote (https://sourceforge.net/p/docutils/bugs/313/).

Currently at Sphinx we intend to monkey-patch the current smartquotes with the patch kindly provided by @milde. Once an official Docutils release has incorporated it and we can drop support for earlier Docutils versions, we will simply use the official version.

A further problem with U+202F is that it is well suited to represent the "espace fine" used before high punctuation (apart from the colon). Hence in LaTeX, if the character is declared to the utf8 inputenc, it is complicated to get it to do the right thing both for French guillemets (no-break inter-word space) and for high punctuation (thin space).

As the polyglossia French module does things slightly differently from babel+frenchb, this creates further issues.

Thus, I would like to advocate the use of U+00A0 by default, with a suitable user config setting to use U+202F if so desired, to use no space at all, or to leave it entirely configurable by the Docutils user.
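
The three choices named above (U+00A0 by default, U+202F on request, or no space at all) can be sketched as follows. The option names "normal", "narrow" and "none" and the function french_quotes are assumptions made for this illustration, not an existing Docutils interface.

```python
# Hypothetical option names mapped to the space placed inside the guillemets.
SPACINGS = {
    "normal": "\u00a0",  # no-break space, the proposed default
    "narrow": "\u202f",  # narrow no-break space
    "none": "",          # leave spacing to the author or to Babel
}


def french_quotes(spacing="normal"):
    """Build the four French quote strings for the chosen inner spacing."""
    space = SPACINGS[spacing]
    return ("\u00ab" + space, space + "\u00bb", "\u201c", "\u201d")


for name in ("normal", "narrow", "none"):
    opening = french_quotes(name)[0]
    print(name, [hex(ord(c)) for c in opening])
```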

Thanks for your patience with this very long message, and a great many thanks to @milde for his very kind answers to my queries via direct mail to him, thanks !

Best,

Jean-François B.
(jfbu at http://github.com/sphinx-doc/sphinx)
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Docutils-develop mailing list
Docutils-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/docutils-develop

Please use "Reply All" to repl
Guenter Milde
2017-03-31 12:33:50 UTC
On 2017-03-31, jfbu wrote:

...
Post by jfbu
The official rules regarding French typography are spelled out in
Lexique des règles typographiques en usage à l'Imprimerie nationale,
...
Post by jfbu
- Le deux-points, le tiret, les guillemets sont précédés et suivis de
l'espace existant entre les mots de la ligne;
This says that French guillemets are to be associated (according to
that reference) with the inter-word space, not the "espace fine".
... une espace insécable (fine si possible).
which to me indicates that an espace fine is preferable but not always
available.
...
Post by jfbu
Orthotypographie : les guillemets français sont séparés des mots
qu’ils entourent par une espace, toujours insécable, et qui peut être
une fine, un quart de cadratin ou une espace justifiante, selon les
écoles et... les capacités des logiciels utilisés.
--- http://listetypo.free.fr/ortho/guillemets.html
...
Post by jfbu
[...]
IMO, there should be configurability as well as a sensible default.
...
Post by jfbu
Thus, I would like to advocate the use of U+00A0 per default with a
suitable user config setting to use if so-desired the U+202F or to not
use anything or to leave it entirely configurable by Docutils user.
This is now implemented in the SVN repo (rev 8051).
Post by jfbu
After spending some hours reading the [text] sources, I tend to agree
that the "normal" NBSP is a suitable default for French.
After some more reading and consideration, the French quotes are now defined
as::

'fr': (u'« ', u' »', u'“', u'”'), # full no-break space
'fr-x-altquot': (u'« ', u' »', u'“', u'”'), # narrow no-break space
'fr-ch': u'«»‹›',
'fr-ch-x-altquot': (u'« ', u' »', u'‹ ', u' ›'), # narrow no-break space, http://typoguide.ch/

The --smart-quotes=alt setting now selects narrow spaces inside guillemets
in French documents; the default is a "normal" NBSP.
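
As a rough illustration of how an "alt" switch can select between the entries listed above, here is a self-contained Python sketch. The function select_quotes and its fallback to the primary language subtag are assumptions for this example and are not claimed to match the actual Docutils lookup; the table is re-typed with explicit escapes so that the kind of space is unambiguous.

```python
ALT_SUFFIX = "-x-altquot"

# Re-typed from the definitions above, with explicit escapes for the spaces.
table = {
    "fr": ("\u00ab\u00a0", "\u00a0\u00bb", "\u201c", "\u201d"),
    "fr-x-altquot": ("\u00ab\u202f", "\u202f\u00bb", "\u201c", "\u201d"),
    "fr-ch": "\u00ab\u00bb\u2039\u203a",
    "fr-ch-x-altquot": ("\u00ab\u202f", "\u202f\u00bb", "\u2039\u202f", "\u202f\u203a"),
}


def select_quotes(table, lang, alternative=False):
    """Return the entry for `lang`, or for its primary subtag as a fallback."""
    lang = lang.lower()
    suffix = ALT_SUFFIX if alternative else ""
    primary = lang.split("-")[0]
    for tag in (lang + suffix, primary + suffix):
        if tag in table:
            return table[tag]
    raise KeyError(lang)


print(select_quotes(table, "fr") == table["fr"])
print(select_quotes(table, "fr", alternative=True) == table["fr-x-altquot"])
print(select_quotes(table, "fr-CA") == table["fr"])
```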

The Canadian government seems to promote the same rules as fr:
http://www.btb.termiumplus.gc.ca/tpv2guides/guides/redac/index-fra.html

fr-CH uses simple guillemets as secondary quotes, by default without inner
space, alternatively with small spaces.

The `Imprimerie` frowns upon the use of English quotes as secondary quotes:

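De même, on n'emploiera qu'exceptionnellement dans un texte en français
les guillemets anglais ouvrants (“) et fermants (”).
[English: Likewise, English opening (“) and closing (”) quotation marks
are to be used only exceptionally in a French text.]

--- p. 51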

The current defaults assume that the author heeds this, using " for both
primary and secondary quotes, and that ' signals the exceptional use of
English quotes. It may be an idea to change this to

'fr': (u'« ', u' »', u'’', u'’')

so that the APOSTROPHE character ' becomes a typographical apostrophe in all
positions.
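
The effect of that change can be shown with a deliberately simplified stand-in for the quote conversion. The function educate_single below is hypothetical and much cruder than smartquotes.py: it treats a ' at the start of the text or after whitespace as opening and any other ' as closing.

```python
import re


def educate_single(text, opening, closing):
    """Replace each ' with `opening` at a word start, else with `closing`."""
    def repl(match):
        start = match.start()
        at_word_start = start == 0 or text[start - 1].isspace()
        return opening if at_word_start else closing
    return re.sub("'", repl, text)


sample = "l'usage de 'mot'"
# English secondary quotes: the elision apostrophe is turned into a closing quote.
print(educate_single(sample, "\u201c", "\u201d"))
# Both slots set to U+2019: every ' becomes a typographical apostrophe.
print(educate_single(sample, "\u2019", "\u2019"))
```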


Günter
Dmitry Shachnev
2017-03-31 13:20:36 UTC
Hi Guenter!
Post by Guenter Milde
Post by jfbu
Thus, I would like to advocate the use of U+00A0 per default with a
suitable user config setting to use if so-desired the U+202F or to not
use anything or to leave it entirely configurable by Docutils user.
This is now implemented in the SVN repo (rev 8051).
I think there must be some kind of mistake in your commit: quotes['fr']
still uses U+202F (the narrow non-break space).
>>> from docutils.utils import smartquotes
>>> smartquotes.__file__
'docutils/utils/smartquotes.py'
>>> quotes = smartquotes.smartchars.quotes
>>> quotes['fr'] == quotes['fr-x-altquot']
True
>>> quotes['fr']
(u'\xab\u202f', u'\u202f\xbb', u'\u201c', u'\u201d')
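
The same check can be made readable without decoding the escapes by hand. This is a small Python 3 sketch using only the standard library; the helper name inner_spaces is illustrative, and the tuple is the value reported above rather than one read from an installed Docutils.

```python
import unicodedata


def inner_spaces(entry):
    """Name the space characters found in the primary opening and closing quotes."""
    found = []
    for quote in entry[:2]:
        for char in quote:
            if char.isspace():
                found.append(unicodedata.name(char))
    return found


reported = (u'\xab\u202f', u'\u202f\xbb', u'\u201c', u'\u201d')
print(inner_spaces(reported))
# prints: ['NARROW NO-BREAK SPACE', 'NARROW NO-BREAK SPACE']
```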

--
Dmitry Shachnev
Guenter Milde
2017-03-31 15:48:14 UTC
Post by Dmitry Shachnev
I think there must be some kind of mistake in your commit: quotes['fr']
still uses U+202F (the narrow non-break space).
Indeed. Fixed in Revision 8054.

Thanks for reporting,

Günter
