[Docutils-develop] attribute of SmartQuotes transform
jfbu
2017-05-28 09:10:32 UTC
Hi,

the SmartQuotes class of docutils.transforms.universal
has a method apply() which contains

    # Iterator educating quotes in plain text:
    # (see "utils/smartquotes.py" for the attribute setting)
    teacher = smartquotes.educate_tokens(self.get_tokens(txtnodes),
                                         attr='qDe', language=lang)

The attribute 'qDe' is hard-coded. Could it be made
user-configurable via docutils.conf?

I am asking because I was originally looking for a quick workaround
for a problem with ``--`` being smartquotified into an en-dash

(I know from "utils/smartquotes.py" docstring that I can use ``-\\-``
mark-up; this is Sphinx context)

My ``--`` problem will hopefully get resolved some other way, but perhaps
my query makes sense nevertheless.

Best,

Jean-François



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Docutils-develop mailing list
Docutils-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/docutils-develop

Please use "Reply All" to reply to the list.
Guenter Milde
2017-05-28 14:55:58 UTC
Post by jfbu
the SmartQuotes class of docutils.transforms.universal
has a method apply() which contains
# (see "utils/smartquotes.py" for the attribute setting)
teacher = smartquotes.educate_tokens(self.get_tokens(txtnodes),
attr='qDe', language=lang)
The attribute 'qDe' is hard-coded. Could it be made
user-configurable via docutils.conf ?
I am asking this because originally I was looking for a quick work-around
with a problem with ``--`` being smartquotified into en-dash
(I know from "utils/smartquotes.py" docstring that I can use ``-\\-``
mark-up; this is Sphinx context)
In most cases, I'd recommend an inline literal (``--``).
Post by jfbu
My ``--`` problem will hopefully get resolved otherwise, but perhaps my
query makes sense nevertheless.
You could file a feature request at
https://sourceforge.net/p/docutils/feature-requests/

Günter


jfbu
2017-05-28 18:00:00 UTC
Hi Günter
Post by Guenter Milde
[...]
Post by jfbu
(I know from "utils/smartquotes.py" docstring that I can use ``-\\-``
mark-up; this is Sphinx context)
In most cases, I'd recommend an inline literal (``--``).
This was a Sphinx issue with the ``option`` directive. To fix it, we
have added our own Parser class in order to replace
docutils.transforms.universal.SmartQuotes with our own SphinxSmartQuotes.

This was needed because universal.SmartQuotes has a hard-coded list
of nodes which escape smart quotes processing.

Sphinx needed to add more nodes; here is what it does as of 1.6.2:

    for txtnode in txtnodes:
        nodetype = texttype[isinstance(txtnode.parent,
                                       (nodes.literal,
                                        nodes.literal_block,
                                        addnodes.literal_emphasis,
                                        addnodes.literal_strong,
                                        addnodes.desc_signature,
                                        addnodes.productionlist,
                                        addnodes.desc_optional,
                                        addnodes.desc_name,
                                        nodes.math,
                                        nodes.image,
                                        nodes.raw,
                                        nodes.problematic))]

There was some discussion about whether literal_emphasis (among others)
should inherit from literal, but it has not done so in Sphinx so far,
so for fear of the consequences I did not change that and added it
to the list instead.
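
The inheritance question above can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the real docutils or Sphinx node classes. The only point is that an isinstance() check against a tuple picks up subclasses automatically, while an unrelated class has to be listed explicitly.

```python
# Toy stand-ins for node classes (hypothetical; not docutils/Sphinx code).
class Node:
    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent


class literal(Node):
    pass


class literal_emphasis(Node):
    """Looks literal by name, but does NOT inherit from literal."""


class literal_subclass(literal):
    """Hypothetical node that does inherit from literal."""


TEXTTYPE = {True: 'literal', False: 'plain'}


def classify(txtnode, literal_classes):
    # Same shape as the check quoted above: the parent decides the text type.
    return TEXTTYPE[isinstance(txtnode.parent, literal_classes)]


only_literal = (literal,)
extended = only_literal + (literal_emphasis,)

text_in_emphasis = Node("--", parent=literal_emphasis("x"))
text_in_subclass = Node("--", parent=literal_subclass("x"))

print(classify(text_in_emphasis, only_literal))  # not caught by the base tuple
print(classify(text_in_subclass, only_literal))  # caught through inheritance
print(classify(text_in_emphasis, extended))      # caught once listed
```

Under these assumptions, changing the inheritance would make the extra list entry unnecessary, at the cost of affecting every other place that tests for literal.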
Post by Guenter Milde
Post by jfbu
My ``--`` problem will hopefully get resolved otherwise, but perhaps my
query makes sense nevertheless.
You could file a feature request at
https://sourceforge.net/p/docutils/feature-requests/
I still have this problem of not having a SourceForge account...
The Create Ticket is grayed out for me.

But my feature request would be about making the attr='qDe' argument
passed to educate_tokens() in the apply() method of SmartQuotes
customizable, for example via an attribute.

The node class list in the get_tokens() method could also
be easier to customize (but now that we have subclassed
SmartQuotes and overridden get_tokens(),
we can of course do whatever is needed on our side).

For details of what we needed to do:

https://github.com/sphinx-doc/sphinx/pull/3816

and then

https://github.com/sphinx-doc/sphinx/pull/3819

Best,

Jean-François


jfbu
2017-05-28 18:07:09 UTC
Post by jfbu
and then
https://github.com/sphinx-doc/sphinx/pull/3819
Sorry, I meant

https://github.com/sphinx-doc/sphinx/pull/3818

It was a somewhat hectic Sunday at Sphinx...

Guenter Milde
2017-05-30 20:58:46 UTC
Dear Jean-François,
Post by jfbu
Post by Guenter Milde
[...]
Post by jfbu
(I know from "utils/smartquotes.py" docstring that I can use ``-\\-``
mark-up; this is Sphinx context)
In most cases, I'd recommend an inline literal (``--``).
This was a Sphinx issue with ``option`` directive. To fix it, we
have added our own Parser class in order to replace the
docutils.universal.SmartQuotes by our own SphinxSmartQuotes
It was needed because universal.SmartQuotes has a hard-coded list
of nodes which escape smart quotes processing.
I changed this by storing the lists as class attributes (in the repository
version).
Post by jfbu
Sphinx needed to add more nodes, here is what it does now at 1.6.2
nodetype = texttype[isinstance(txtnode.parent,
(nodes.literal,
nodes.literal_block,
addnodes.literal_emphasis,
addnodes.literal_strong,
addnodes.desc_signature,
addnodes.productionlist,
addnodes.desc_optional,
addnodes.desc_name,
nodes.math,
nodes.image,
nodes.raw,
nodes.problematic))]
This is a long list. However,

block-level elements can simply be skipped.

E.g. nodes.literal_block is a subclass of nodes.FixedTextElement, so its
instances will not be processed by SmartQuotes.

(If it is not a subclass of nodes.FixedTextElement in Sphinx, it can now
be appended to transforms.universal.SmartQuotes.nodes_to_skip.)
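
A minimal sketch of the skip-list idea, using toy classes rather than docutils itself. Only the attribute name nodes_to_skip comes from the message above; SmartQuotesSketch, should_skip() and custom_verbatim are hypothetical names for illustration. Because tuples are immutable, "appending" here means building a longer tuple, done in a subclass so the base class stays untouched.

```python
# Toy model of a skip list stored as a class attribute (not docutils code).
class Element:
    pass


class FixedTextElement(Element):
    pass


class literal_block(FixedTextElement):
    pass


class paragraph(Element):
    pass


class custom_verbatim(Element):
    """Hypothetical extension node that is not a FixedTextElement."""


class SmartQuotesSketch:
    # Block-level node classes whose text is left alone entirely.
    nodes_to_skip = (FixedTextElement,)

    def should_skip(self, node):
        return isinstance(node, self.nodes_to_skip)


class ExtendedSmartQuotes(SmartQuotesSketch):
    # Extend the inherited tuple instead of rewriting the method.
    nodes_to_skip = SmartQuotesSketch.nodes_to_skip + (custom_verbatim,)


base = SmartQuotesSketch()
extended = ExtendedSmartQuotes()

print(base.should_skip(literal_block()))        # skipped via its base class
print(base.should_skip(custom_verbatim()))      # not skipped by default
print(extended.should_skip(custom_verbatim()))  # skipped after extension
```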
Post by jfbu
There was some discussion about whether literal_emphasis (among others)
should have inherited from literal, but it was not the case so far in
Sphinx, so by fear of consequences I did not do it and added it
to the list.
...
Post by jfbu
But my feature request would be about making the attr='qDe'
in teacher arguments in apply() method of SmartQuotes object
a customizable attribute for example
Done in the repo.
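
As an illustration of what such a change makes possible, here is a sketch of the general pattern: the hard-coded argument moves into a class attribute, and a subclass overrides it. The thread does not name the attribute, so smartquotes_action below is an assumed name, and fake_educate() is a hypothetical stand-in for smartquotes.educate_tokens() that only records the attr it receives.

```python
# Pattern sketch only; class and attribute names are assumptions.
def fake_educate(tokens, attr):
    """Stand-in for educate_tokens(): report the attr used for each token."""
    return [(attr, token) for token in tokens]


class SmartQuotesSketch:
    smartquotes_action = 'qDe'  # formerly hard-coded inside apply()

    def apply(self, tokens):
        return fake_educate(tokens, attr=self.smartquotes_action)


class NoDashSmartQuotes(SmartQuotesSketch):
    # Drop the 'D' flag so that ``--`` would no longer be treated as a dash.
    smartquotes_action = 'qe'


print(SmartQuotesSketch().apply(['--']))
print(NoDashSmartQuotes().apply(['--']))
```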


Günter


jfbu
2017-05-30 22:03:01 UTC
Hi Günter

thanks, I will reply tomorrow
(see https://github.com/sphinx-doc/sphinx/issues/3826 for the current Sphinx
status),
but I am currently facing one of those midnight findings, which I wish
to report here.

I was trying to update the Sphinx import of the Docutils smartquotes.py
to the 0.14rc1 level (a backport of sorts for users still on 0.13.1),
but ran into an issue I could not understand with the processing of

''

It comes out OK with Docutils 0.14rc1, but not with Docutils 0.13.1
plus the Sphinx monkey patching, even when updated. I scratched my head
for a while before I realized the transforms were applied with attr='2'.
This comes from

    teacher = smartquotes.educate_tokens(self.get_tokens(txtnodes),
                                         attr='2', language=lang)

in transforms/universal.py of 0.13.1.

This activates the backticks processing. In contrast, with 0.14rc1
the default is attr='qDe', which does not activate the backticks processing.
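
The difference between the two attr values can be made concrete with a small decoder. This is a simplified paraphrase of how the attr string is interpreted, based on my reading of the docstring in utils/smartquotes.py; decode_attr() is a hypothetical helper, and the real function handles further cases (such as '-1' and 'w') that are omitted here.

```python
def decode_attr(attr):
    """Map an attr string to processing flags (simplified sketch)."""
    flags = {'quotes': False, 'backticks': 0, 'dashes': 0, 'ellipses': False}
    if attr == '0':
        return flags  # suppress all transformations
    if attr in ('1', '2', '3'):
        # Numeric presets: everything on, including backticks;
        # the number selects the dash style.
        flags.update(quotes=True, backticks=1, dashes=int(attr), ellipses=True)
        return flags
    # Otherwise each letter switches on one processor.
    if 'q' in attr:
        flags['quotes'] = True
    if 'b' in attr:
        flags['backticks'] = 1
    if 'B' in attr:
        flags['backticks'] = 2
    if 'd' in attr:
        flags['dashes'] = 1
    if 'D' in attr:
        flags['dashes'] = 2
    if 'i' in attr:
        flags['dashes'] = 3
    if 'e' in attr:
        flags['ellipses'] = True
    return flags


old_default = decode_attr('2')
new_default = decode_attr('qDe')
print(old_default)
print(new_default)
```

In this reading, both settings use the same dash style and differ only in the backticks flag, which is consistent with the behaviour described above.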

I tested with 0.14rc1 replacing attr='qDe' by attr = '2' (or '1', iirc)
and then I do get the "bad" processing:

isolated '' becomes ”.

Now I wonder what this has to do with backticks, and whether it is
an issue in Docutils?

All this being said, the processing of '' should never have happened
in a Sphinx context
(it comes from the Python documentation, ``foo = ''``), but
as the link above shows, I forgot desc_annotation yesterday. Because of
that, autodoc-generated documentation displays the problem.

Good night,

Jean-François


jfbu
2017-05-31 06:56:01 UTC
Post by jfbu
isolated '' becomes ”.
Now I wonder what this has to do with backticks, and whether it is
an issue in Docutils ?
Replying to myself: it is all very logical (what else with
computers?) when one looks at educateBackticks(), as it replaces ''
by the language's closing double quote. I have no idea whether this is
an "issue": an isolated '' is unlikely in most contexts apart from a str
definition in Python, which is typically where you do not want any smart
quotes anyway, so you are not likely to see it.
The point is only that 0.13.1 was using attr='2' in
transforms.universal.SmartQuotes() but 0.14rc1 uses attr='qDe', and thus
the isolated '' is converted not into a closing double quote
but into a pair of opening+closing quotes.
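
The backticks step described here can be reproduced with a toy re-implementation. This is a sketch of the behaviour for English quotes only, not the docutils function itself: educate_backticks_sketch() is a hypothetical name, and the real educateBackticks() takes the quote characters from the language setting.

```python
OPENING_DOUBLE = '\u201c'  # left double quotation mark
CLOSING_DOUBLE = '\u201d'  # right double quotation mark


def educate_backticks_sketch(text):
    """Replace ``...'' style quotes, as in the TeX convention."""
    text = text.replace('``', OPENING_DOUBLE)
    return text.replace("''", CLOSING_DOUBLE)


tex_style = educate_backticks_sketch("``here''")
python_source = educate_backticks_sketch("foo = ''")
print(tex_style)
print(python_source)
```

The second call shows the case from this thread: the replacement has no way to tell an empty Python string from the closing half of a TeX-style quotation.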

This is relevant to Sphinx to the extent that since
1.6.1 it hands smart quotes processing over to Docutils
and had no direct control over the default attr setting
in transforms.universal.SmartQuotes(), but thanks to
Günter's latest revisions this is now easier.

(And at Sphinx, under my nefarious influence, most of
the discussion before the 1.6.1 merge of the new SmartQuotes
was devoted to fixing the French issues. That discussion
went through various steps and took time, and during all this
I overlooked that the merged patch was too radical in
removing Sphinx's earlier mechanism for inhibiting
its own smartypants processing.)

Best,
Jean-François


Guenter Milde
2017-05-31 12:24:36 UTC
On 2017-05-31, jfbu wrote:

...
the point is only that 0.13.1 was using attr='2' in
transforms.universal.SmartQuotes() but 0.14rc1 uses attr = 'qDe' and
thus the isolated '' is converted not into a closing double quote but
in a pair of opening+closing quotes.
The change to skip "backtick quote conversion" was motivated by considering:

* reStructuredText uses Character 0x60 ('`' GRAVE ACCENT) as inline
markup character, therefore the ``backtick quotes'' convention (common in
TeX) does not make sense in rST documents. (You would need to escape the
backticks like \`\`here''.)

* Switching off "backtick quote" conversion reduces the risk of
conversion errors and speeds up processing.
This is relevant to Sphinx to the extent that since
1.6.1 it hands over to Docutils the smart quotes processing
and did not have direct control of the attr=default setting
in transforms.universal.SmartQuotes()
The changed output should not be a concern, as both the previous and the
current conversion are most likely unwanted.

What matters is suppressing smart-quote conversion for all literal text:

a) Sphinx developers must consider/handle the additional "literal nodes"
in Sphinx.

b) End users must remember to escape or mark as literal any quotes that
should not be "educated", if setting "smart-quotes" to True.


Because of b) (and the possible mis-conversions), I suggest making the
smart quotes feature an opt-in also in Sphinx.

Günter


jfbu
2017-05-31 14:28:55 UTC
Hi Günter
Post by Guenter Milde
...
* reStructuredText uses Character 0x60 ('`' GRAVE ACCENT) as inline
markup character, therefore the ``backtick quotes'' convention (common in
TeX) does not make sense in rST documents. (You would need to escape the
backticks like \`\`here''.)
* Switching off "backtick quote" conversion reduces the risk of
conversion errors and speeds up processing.
As Sphinx must support older Docutils versions, and is already
monkey-patching docutils.utils.smartquotes.educate_tokens() to incorporate
the intervening fixes, I am also backporting
this choice at https://github.com/sphinx-doc/sphinx/pull/3836

This is simpler than modifying SmartQuotes.apply() via a rewrite
inside the SphinxSmartQuotes class.

I also backported latest language quote additions at
https://github.com/sphinx-doc/sphinx/pull/3832

When used with Docutils 0.14, Sphinx of course drops all these backports.
Post by Guenter Milde
...
The changed output should not bother, as both, previous and current
conversion is most likely unwanted.
indeed ;-)
Post by Guenter Milde
a) Sphinx developers must consider/handle the additional "literal nodes"
in Sphinx.
Yes, this will be done for 1.6.3. We only became aware after the 1.6 release
that our switch to Docutils smart quotes had been done a bit hastily.
Post by Guenter Milde
b) End users must remember to escape or mark as literal any quotes that
should not be "educated", if setting "smart-quotes" to True.
Because of b) (and the possible mis-conversions), I suggest making the
smart quotes feature an opt-in also in Sphinx.
I would tend to agree (and obviously we will be kept busy with smart quotes
again, as there is currently no good interaction with smartquotes-locales),

but afaict Sphinx has always had True as the default for the
html_use_smartypants config value.

It thus looked natural to keep it that way.

Jean-François
Guenter Milde
2017-06-01 13:28:29 UTC
Dear Jean-François,
...
Post by jfbu
Post by Guenter Milde
b) End users must remember to escape or mark as literal any quotes that
should not be "educated", if setting "smart-quotes" to True.
Because of b) (and the possible mis-conversions), I suggest making the
smart quotes feature an opt-in also in Sphinx.
I would tend to agree (and obviously we will be kept busy with smart quotes
again, as currently there is no good interaction with smartquotes-locales)
but afaict Sphinx always has had True as default for
html_use_smartypants config value,
It looked thus natural to keep it that way.
Recommended reading:
http://docutils.sourceforge.net/docs/user/smartquotes.html#why-you-might-not-want-to-use-smart-quotes-in-your-documents
(from the original SmartyPants documentation).

Günter



jfbu
2017-05-31 08:21:58 UTC
Post by Guenter Milde
Post by jfbu
It was needed because universal.SmartQuotes has a hard-coded list
of nodes which escape smart quotes processing.
I changed this by storing the lists as class attributes (in the repository
version).
Thanks! Takeshi did similarly at
https://github.com/sphinx-doc/sphinx/pull/3825/files
and we must do that to support Docutils < 0.14 also.

In the above PR #3825 the node list is still the flawed one from my
last-minute 1.6.2 fix; the further discussion is at
https://github.com/sphinx-doc/sphinx/issues/3826
Post by Guenter Milde
This is a long list. However,
block-level elements can be just skipped.
E.g. nodes.literal_block is an instance of nodes.FixedTextElement and
will not be processed by SmartQuotes.
(If it is not an instance of nodes.FixedTextElement in Sphinx, it can now
be appended to transforms.universal.SmartQuotes.nodes_to_skip.)
As far as I know (and a quick check confirms), literal_block is imported
as-is from docutils.nodes and not modified. Takeshi noticed my mistake and
is focusing the discussion on TextElement instances in Sphinx issue 3826.
Post by Guenter Milde
Post by jfbu
But my feature request would be about making the attr='qDe'
in teacher arguments in apply() method of SmartQuotes object
a customizable attribute for example
Done in the repo.
seen, thanks!

Best,
Jean-François

