[Docutils-develop] docutils.dtd: id vs. ids
Guenter Milde
2017-01-01 22:24:12 UTC
Dear Docutils developers, dear David,

Testing docutils.dtd with ::

xmllint --dtdvalid docutils.dtd standalone_rst_docutils_xml.xml

shows about 100 error messages like ::

...
standalone_rst_docutils_xml.xml:1082: element footnote: validity error : IDREFS attribute backrefs references an unknown ID "id31"
standalone_rst_docutils_xml.xml:154: element reference: validity error : IDREF attribute refid references an unknown ID "topics-sidebars-and-rubrics"
...

and concludes:

Document standalone_rst_docutils_xml.xml does not validate against docutils.dtd

The problem is that XML has no datatype IDS for a list of ID
values (analogous to NMTOKEN/NMTOKENS). Hence, docutils.dtd uses ::

" ids NMTOKENS #IMPLIED

OTOH, there are references to the ids ::

" refid IDREF #IMPLIED ">

" backrefs IDREFS #IMPLIED ">


However, xmllint does not know that the NMTOKENS values are used as IDs
and hence reports validity errors.
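
A minimal sketch of the reference check that the validator cannot do
while `ids` is declared as NMTOKENS: collect every token from the `ids`
attributes and report `refid`/`backrefs` tokens that match none of them.
The function name and the sample document are hypothetical; only the
attribute names come from docutils.dtd.

```python
import xml.etree.ElementTree as ET


def dangling_references(xml_text):
    """Return (tag, attribute, token) for each refid/backrefs token
    that matches no token in any element's `ids` attribute."""
    root = ET.fromstring(xml_text)

    # Every whitespace-separated token in an `ids` attribute is a known id.
    known = set()
    for element in root.iter():
        known.update(element.get("ids", "").split())

    missing = []
    for element in root.iter():
        for attribute in ("refid", "backrefs"):
            for token in element.get(attribute, "").split():
                if token not in known:
                    missing.append((element.tag, attribute, token))
    return missing


# Hypothetical sample, loosely modelled on the error messages above.
SAMPLE = """<document>
  <section ids="topics-sidebars-and-rubrics id1"><title>T</title></section>
  <reference refid="topics-sidebars-and-rubrics">link</reference>
  <footnote backrefs="id31" ids="f1"/>
</document>"""

print(dangling_references(SAMPLE))
# prints [('footnote', 'backrefs', 'id31')]
```

A check of this kind treats the tokens in `ids` as the targets, which is
what the DTD types IDREF/IDREFS cannot express for an NMTOKENS attribute.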

When (as a test) changing the datatype of ids to ID ::

- " ids NMTOKENS #IMPLIED
+ " ids ID #IMPLIED

xmllint reports 22 errors ::

...
standalone_rst_docutils_xml.xml:243: element reference: validity error : IDREF attribute refid references an unknown ID "subtitle"
standalone_rst_docutils_xml.xml:1583: element system_message: validity error : IDREFS attribute backrefs references an unknown ID "id86"


The XML standard says that an id must be unique and that only one id per
element is allowed.
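
A small sketch for locating the elements that conflict with these two
constraints: elements whose `ids` attribute holds more than one token,
and tokens that appear on more than one element. Presumably the
remaining errors come from the first group, since a value with two
tokens is not a single XML name; this is an assumption, not something
the xmllint output states. The function name and the sample are
hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def id_problems(xml_text):
    """Return (multiple, duplicates):
    multiple   -- list of (tag, tokens) for elements with more than one id
    duplicates -- sorted list of ids that occur on more than one element
    """
    root = ET.fromstring(xml_text)
    multiple = []
    counts = Counter()
    for element in root.iter():
        tokens = element.get("ids", "").split()
        counts.update(tokens)
        if len(tokens) > 1:
            multiple.append((element.tag, tokens))
    duplicates = sorted(token for token, n in counts.items() if n > 1)
    return multiple, duplicates


# Hypothetical sample document.
SAMPLE = """<document>
  <section ids="subtitle id1"/>
  <target ids="a"/>
  <target ids="a"/>
</document>"""

multiple, duplicates = id_problems(SAMPLE)
print(multiple)    # elements carrying several ids
print(duplicates)  # ids used on several elements
```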

git blame says that `id` was changed to `ids` (and ID to NMTOKENS) in
October 2005.


What was the reason for multiple ids on one element?

Can we avoid this?
How can we proceed?

1. normalize ids (use the first and change references, say)

a) during parsing
b) in a transform

2. normalize ids just for XML output

3. don't care for xmllint

4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs? (don't test for matching references and id uniqueness)


My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
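
A minimal sketch of what option 1 (keep the first id and change the
references) could look like, written against a generic ElementTree
rather than the Docutils node tree. The function name is hypothetical,
and a real implementation would live in the parser or in a Docutils
transform.

```python
import xml.etree.ElementTree as ET


def normalize_ids(root):
    """Keep only the first id of each element and redirect references
    to the dropped ids. Modifies the tree in place and returns it."""
    alias = {}  # dropped id -> id that is kept

    for element in root.iter():
        tokens = element.get("ids", "").split()
        if len(tokens) > 1:
            for extra in tokens[1:]:
                alias[extra] = tokens[0]
            element.set("ids", tokens[0])

    for element in root.iter():
        for attribute in ("refid", "backrefs"):
            if attribute in element.attrib:
                tokens = [alias.get(t, t) for t in element.get(attribute).split()]
                # dict.fromkeys drops repeats while keeping the order
                element.set(attribute, " ".join(dict.fromkeys(tokens)))
    return root


# Hypothetical sample document.
root = ET.fromstring(
    '<document>'
    '<section ids="subtitle id1"/>'
    '<reference refid="id1"/>'
    '<footnote ids="f1" backrefs="id1 subtitle x"/>'
    '</document>'
)
normalize_ids(root)
print(root.find("reference").get("refid"))
print(root.find("footnote").get("backrefs"))
```

After normalization every `ids` value is a single token, so the
attribute could be declared as ID.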

Günter



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Docutils-develop mailing list
Docutils-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/docutils-develop

Please use "Reply All" to reply to the list.
David Goodger
2017-01-02 19:36:15 UTC
Post by Guenter Milde
Dear Docutils developers, dear David,
xmllint --dtdvalid docutils.dtd standalone_rst_docutils_xml.xml
...
standalone_rst_docutils_xml.xml:1082: element footnote: validity error : IDREFS attribute backrefs references an unknown ID "id31"
standalone_rst_docutils_xml.xml:154: element reference: validity error : IDREF attribute refid references an unknown ID "topics-sidebars-and-rubrics"
...
Document standalone_rst_docutils_xml.xml does not validate against docutils.dtd
The problem is, that in XML there is no datatype IDS for a list of ID
" ids NMTOKENS #IMPLIED
" refid IDREF #IMPLIED ">
" backrefs IDREFS #IMPLIED ">
However, xmllint does not know that the NMTOKENS values are used as IDs
and hence reports validity errors.
- " ids NMTOKENS #IMPLIED
+ " ids ID #IMPLIED
...
standalone_rst_docutils_xml.xml:243: element reference: validity error : IDREF attribute refid references an unknown ID "subtitle"
standalone_rst_docutils_xml.xml:1583: element system_message: validity error : IDREFS attribute backrefs references an unknown ID "id86"
The XML standard says that an id must be unique and that only one id per
element is allowed.
git blame says that `id` was changed to `ids` (and ID to NMTOKENS) in
October 2005.
What was the reason for multiple ids on one element?
I believe it was a practical addition, as we needed it. Maybe because
of multiple hyperlink targets assigned to an object.
Post by Guenter Milde
Can we avoid this?
How can we proceed?
1. normalize ids (use the first and change references, say)
a) during parsing
b) in a transform
2. normalize ids just for XML output
3. don't care for xmllint
4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs? (don't test for matching references and id uniqueness)
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change? What's the purpose, the desired improvement?

I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree. It was never meant to be used
to validate documents (you may be the first to try). Doctree-XML
documents are relatively rare anyway.

You're letting the tail wag the dog. Don't do that.

If a validatable XML output format is needed, a Writer can be
implemented for that. It would have to have a slightly different DTD,
and that's OK.

David Goodger
<http://python.net/~goodger>

Guenter Milde
2017-01-05 13:37:44 UTC
Dear Docutils developers, dear David,
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change?
What's the purpose, the desired improvement?
I want Docutils to produce valid output documents.
Post by David Goodger
I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree.
It was never meant to be used to validate documents (you may be the
first to try).
So I was led astray by interpreting "formally defined" in

The Docutils document structure is formally defined by the `Docutils
Generic DTD`_ XML document type definition, docutils.dtd_

and the reference to docutils.dtd in the DOCTYPE declaration of Docutils XML
documents in the context of XML/DTD conventions.



In case your preference stays
Post by David Goodger
Post by Guenter Milde
3. don't care for xmllint,
we should remove the DOCTYPE declaration from the XML-writer
output and clearly state in our documentation that Docutils XML documents
are `well formed`__ but not valid documents according to the Docutils
Document Type Definition.

__ https://www.w3.org/TR/REC-xml/#sec-well-formed

+1 relatively simple

-1 we miss a chance to spot contradictions between the description of the
data structure and the actual implementation.

...
Post by David Goodger
If a validatable XML output format is needed, a Writer can be
implemented for that. It would have to have a slightly different DTD,
and that's OK.
A slightly different DTD would already achieve valid output without any
Post by David Goodger
Post by Guenter Milde
4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs?
+1 simplest change to get valid XML output

-1 less informative DTD
not making use of the pre-defined Datatypes for internal references.

-0 no test for matching references


My preference remains to
Post by David Goodger
Post by Guenter Milde
2. normalize ids
to one id per element at least for the XML output. This can be
encapsulated in a transform called by the XML writer.

There is no need to rename "ids" to "id". The XML standard (while
allowing only `one ID per element`__) does not prescribe a name for the
ID attribute.

__ https://www.w3.org/TR/REC-xml/#one-id-per-el

+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD

-1 more work (write a transform)

-1 difference between node tree and DTD/XML representation


Günter




David Goodger
2017-01-16 22:26:57 UTC
Post by Guenter Milde
Dear Docutils developers, dear David,
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change?
What's the purpose, the desired improvement?
I want Docutils to produce valid output documents.
Doesn't it already?
Post by Guenter Milde
Post by David Goodger
I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree.
It was never meant to be used to validate documents (you may be the
first to try).
So I was led astray by interpreting "formally defined" in
The Docutils document structure is formally defined by the `Docutils
Generic DTD`_ XML document type definition, docutils.dtd_
and the reference to docutils.dtd in the DOCTYPE declaration of Docutils XML
documents in the context of XML/DTD conventions.
I meant "formally defined" = "as defined by this formal language, an
XML DTD". I didn't mean to imply that the DTD should be used in a
system.
Post by Guenter Milde
In case your preference stays
Post by David Goodger
Post by Guenter Milde
3. don't care for xmllint,
we should remove the DOCTYPE declaration from the XML-writer
output and clearly state in our documentation that Docutils XML documents
are `well formed`__ but not valid documents according to the Docutils
Document Type Definition.
__ https://www.w3.org/TR/REC-xml/#sec-well-formed
+1 relatively simple
-1 we miss a chance to spot contradictions between the description of the
data structure and the actual implementation.
If this is useful, it's a valid reason for making changes. How useful
is it? Is the added utility worth the time and trouble of making the
changes?
Post by Guenter Milde
Post by David Goodger
If a validatable XML output format is needed, a Writer can be
implemented for that. It would have to have a slightly different DTD,
and that's OK.
A slightly different DTD would already achieve valid output without any
Post by David Goodger
Post by Guenter Milde
4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs?
+1 simplest change to get valid XML output
-1 less informative DTD
not making use of the pre-defined Datatypes for internal references.
-0 no test for matching references
I don't like it. The whole point of the DTD is to be informative documentation.

I'd rather have an "invalid" DTD using the IDS type than water it down
this way. We could add a note to the DTD that it's descriptive only,
not to be used for validation.
Post by Guenter Milde
My preference remains to
Post by David Goodger
Post by Guenter Milde
2. normalize ids
to one id per element at least for the XML output. This can be
encapsulated in a transform called by the XML writer.
There is no need to rename "ids" to "id". The XML standard (while
allowing only `one ID per element`_) does not prescribe a name for the
ID-attribute.
__ https://www.w3.org/TR/REC-xml/#one-id-per-el
+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
-1 more work (write a transform)
-1 difference between node tree and DTD/XML representation
That last one is the biggest issue for me. The descriptive/generic DTD
is meant to describe & document the internal Doctree data structure in
Docutils, *not* its XML output. We should not compromise the accuracy
of the descriptive DTD: it has to match the internal data structure.

As a workaround, perhaps we could add a functional DTD (for validation
of the XML output) in addition to the descriptive/generic DTD
(docs/ref/docutils.dtd). And the attribute should be renamed (by the
transform and in this DTD) to "id".

David Goodger
<http://python.net/~goodger>

Guenter Milde
2017-01-17 22:15:55 UTC
Post by David Goodger
Post by Guenter Milde
Dear Docutils developers, dear David,
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change?
What's the purpose, the desired improvement?
I want Docutils to produce valid output documents.
Doesn't it already?
No, the XML output does not validate against the DTD it specifies in the
!DOCTYPE tag.

OTOH, HTML output passes validation with http://validator.w3.org/
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree.
It was never meant to be used to validate documents (you may be the
first to try).
So I was led astray by interpreting "formally defined" in
The Docutils document structure is formally defined by the `Docutils
Generic DTD`_ XML document type definition, docutils.dtd_
and the reference to docutils.dtd in the DOCTYPE declaration of Docutils XML
documents in the context of XML/DTD conventions.
I meant "formally defined" = "as defined by this formal language, an
XML DTD". I didn't mean to imply that the DTD should be used in a
system.
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
3. don't care for xmllint,
...
Post by David Goodger
Post by Guenter Milde
-1 we miss a chance to spot contradictions between the description of the
data structure and the actual implementation.
If this is useful, it's a valid reason for making changes. How useful
is it? Is the added utility worth the time and trouble of making the
changes?
Validating the XML output against the DTD tests whether

* "the description of the data structure in a formal language" follows
the rules of this formal language, and

* the actual output matches the description.

Currently, Docutils fails this test. This is an unsatisfactory
situation for a project that puts so much emphasis on testing.

For future development, validating can ensure that the data structure and
its description in the DTD keep in sync.
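
A sketch of how such a check could be wired into a test run, assuming
the `xmllint` command-line tool is installed. The helper names are
hypothetical and this is not part of the Docutils test suite; `--noout`
and `--dtdvalid` are existing xmllint options.

```python
import shutil
import subprocess


def xmllint_command(dtd_path, xml_path):
    """Build the xmllint call that validates xml_path against dtd_path."""
    return ["xmllint", "--noout", "--dtdvalid", dtd_path, xml_path]


def validates(dtd_path, xml_path):
    """Return True if the document validates, False if it does not,
    and None if xmllint is not available on this system."""
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_command(dtd_path, xml_path),
        capture_output=True,
        text=True,
    )
    # xmllint exits with a non-zero status when validation fails.
    return result.returncode == 0


print(xmllint_command("docutils.dtd", "standalone_rst_docutils_xml.xml"))
```

A functional test could call `validates()` on the expected XML output
and skip the check when it returns None.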

When making changes to the code, testing is a must.

-- policies.txt

Second bonus: cleanly defined interface to the XML world.
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
If a validatable XML output format is needed, a Writer can be
implemented for that. It would have to have a slightly different DTD,
and that's OK.
A slightly different DTD would already achieve valid output without any
Post by David Goodger
Post by Guenter Milde
4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs?
+1 simplest change to get valid XML output
-1 less informative DTD
not making use of the pre-defined Datatypes for internal references.
-0 no test for matching references
I don't like it. The whole point of the DTD is to be informative documentation.
I'd rather have an "invalid" DTD using the IDS type than water it down
this way. We could add a note to the DTD that it's descriptive only,
not to be used for validation.
I prefer a valid DTD with parts of the informative description in the
comments,

--- docutils.dtd (Revision 8012)
+++ docutils.dtd (working copy)
@@ -64,10 +64,10 @@
<!ENTITY % refuri.att
" refuri CDATA #IMPLIED ">

-<!-- Internal reference to the `id` attribute of an element. -->
+<!-- Internal reference to the `ids` attribute of an element. -->
<!ENTITY % refid.att
- " refid IDREF #IMPLIED ">
+ " refid NMTOKEN #IMPLIED ">

-<!-- Space-separated list of id references, for backlinks. -->
+<!-- Space-separated list of `ids` references, for backlinks. -->
<!ENTITY % backrefs.att
- " backrefs IDREFS #IMPLIED ">
+ " backrefs NMTOKENS #IMPLIED ">

over a "DTD-like" document with a note saying "This document uses an
extension of the DTD language.", the definition of the new datatype
`IDS`, and the re-definition of `IDREF` and `IDREFS` to include
references to ids in `IDS`.

...
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
2. normalize ids
to one id per element at least for the XML output. This can be
encapsulated in a transform called by the XML writer.
There is no need to rename "ids" to "id". The XML standard (while
allowing only `one ID per element`_) does not prescribe a name for the
ID-attribute.
__ https://www.w3.org/TR/REC-xml/#one-id-per-el
+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
-1 more work (write a transform)
-1 difference between node tree and DTD/XML representation
That last one is the biggest issue for me. The descriptive/generic DTD
is meant to describe & document the internal Doctree data structure in
Docutils, *not* its XML output. We should not compromise the accuracy
of the descriptive DTD: it has to match the internal data structure.
I agree.
Post by David Goodger
As a workaround, perhaps we could add a functional DTD (for validation
of the XML output) in addition to the descriptive/generic DTD
(docs/ref/docutils.dtd). And the attribute should be renamed (by the
transform and in this DTD) to "id".
-1 no possibility to test the "descriptive" DTD
-1 more work (keep 2 documents up to date),
-1 confusing (this situation would need extra explanation)


With normalization during parsing, we could just change the datatype of
`ids`:

- " ids NMTOKENS #IMPLIED
+ " ids ID #IMPLIED

+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
+1 allows testing the "descriptive" DTD and the XML output
+1 no difference between node tree and DTD/XML representation
+1 simple
+0 allows simplification of functions using the ID
-1 API change
-0 name `ids` but datatype `ID`



With normalization in a transform, we could add the basic attribute `id`
alongside `ids`

<!ENTITY % basic.atts
" ids NMTOKENS #IMPLIED
+ id ID #IMPLIED
names CDATA #IMPLIED

and describe the normalization as an action similar to the handling of
`pending` elements.

+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
+1 allows testing the "descriptive" DTD and the XML output
+1 no difference between node tree and DTD/XML representation
-1 two almost identical attributes


Günter
David Goodger
2017-01-29 22:27:24 UTC
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Dear Docutils developers, dear David,
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change?
What's the purpose, the desired improvement?
I want Docutils to produce valid output documents.
Doesn't it already?
No, the XML output does not validate against the DTD it specifies in the
!DOCTYPE tag.
OTOH, HTML output passes validation with http://validator.w3.org/
Except for the one case of validating Docutils' XML output against its
DTD, everything else is valid, right?
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree.
It was never meant to be used to validate documents (you may be the
first to try).
So I was led astray by interpreting "formally defined" in
The Docutils document structure is formally defined by the `Docutils
Generic DTD`_ XML document type definition, docutils.dtd_
and the reference to docutils.dtd in the DOCTYPE declaration of Docutils XML
documents in the context of XML/DTD conventions.
I meant "formally defined" = "as defined by this formal language, an
XML DTD". I didn't mean to imply that the DTD should be used in a
system.
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
3. don't care for xmllint,
...
Post by David Goodger
Post by Guenter Milde
-1 we miss a chance to spot contradictions between the description of the
data structure and the actual implementation.
If this is useful, it's a valid reason for making changes. How useful
is it? Is the added utility worth the time and trouble of making the
changes?
Validating the XML output against the DTD tests whether
* "the description of the data structure in a formal language" follows
the rules of this formal language, and
* the actual output matches the description.
Currently, Docutils fails this test. This is an unsatisfactory
situation for a project that puts so much emphasis on testing.
For future development, validating can ensure that the data structure and
its description in the DTD keep in sync.
That is useful, yes.
Post by Guenter Milde
When making changes to the code, testing is a must.
-- policies.txt
Second bonus: cleanly defined interface to the XML world.
Also good.

In the rest of your reply, I count 3 different proposed solutions.
Which is your preferred variation?
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
If a validatable XML output format is needed, a Writer can be
implemented for that. It would have to have a slightly different DTD,
and that's OK.
A slightly different DTD would already achieve valid output without any
Post by David Goodger
Post by Guenter Milde
4. use "NMTOKEN/NMTOKENS" instead of "IDREF/S" in
refid and backrefs?
+1 simplest change to get valid XML output
-1 less informative DTD
not making use of the pre-defined Datatypes for internal references.
-0 no test for matching references
I don't like it. The whole point of the DTD is to be informative documentation.
I'd rather have an "invalid" DTD using the IDS type than water it down
this way. We could add a note to the DTD that it's descriptive only,
not to be used for validation.
I prefer a valid DTD with parts of the informative description in the
comments,
--- docutils.dtd (Revision 8012)
+++ docutils.dtd (working copy)
@@ -64,10 +64,10 @@
<!ENTITY % refuri.att
" refuri CDATA #IMPLIED ">
-<!-- Internal reference to the `id` attribute of an element. -->
+<!-- Internal reference to the `ids` attribute of an element. -->
<!ENTITY % refid.att
- " refid IDREF #IMPLIED ">
+ " refid NMTOKEN #IMPLIED ">
-<!-- Space-separated list of id references, for backlinks. -->
+<!-- Space-separated list of `ids` references, for backlinks. -->
<!ENTITY % backrefs.att
- " backrefs IDREFS #IMPLIED ">
+ " backrefs NMTOKENS #IMPLIED ">
over a "DTD-like" document with a note saying "This document uses an
extension of the DTD language.", the definition of the new datatype
`IDS`, and the re-definition of `IDREF` and `IDREFS` to include
references to ids in `IDS`.
Call this one (above) #1: change (loosen) the DTD so it validates
against the existing data, to reflect the data structure.

I'm ±0 on this. Loosening the DTD lessens its usefulness, both
functionally and as documentation.
Post by Guenter Milde
...
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
2. normalize ids
to one id per element at least for the XML output. This can be
encapsulated in a transform called by the XML writer.
There is no need to rename "ids" to "id". The XML standard (while
allowing only `one ID per element`_) does not prescribe a name for the
ID-attribute.
__ https://www.w3.org/TR/REC-xml/#one-id-per-el
+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
-1 more work (write a transform)
-1 difference between node tree and DTD/XML representation
That last one is the biggest issue for me. The descriptive/generic DTD
is meant to describe & document the internal Doctree data structure in
Docutils, *not* its XML output. We should not compromise the accuracy
of the descriptive DTD: it has to match the internal data structure.
I agree.
Post by David Goodger
As a workaround, perhaps we could add a functional DTD (for validation
of the XML output) in addition to the descriptive/generic DTD
(docs/ref/docutils.dtd). And the attribute should be renamed (by the
transform and in this DTD) to "id".
-1 no possibility to test the "descriptive" DTD
-1 more work (keep 2 documents up to date),
-1 confusing (this situation would need extra explanation)
With normalization during parsing,
During *which* parsing? Parsing of reST or parsing of Docutils XML?

What would the normalization consist of?
Post by Guenter Milde
we could just change the datatype of
- " ids NMTOKENS #IMPLIED
+ " ids ID #IMPLIED
Call the above #2: normalization of _______ (fill in the blank).

My vote depends on which parsing gets normalized. If it's the internal
Doctree that gets normalized: -1, because this is the tail wagging the
dog. This backend testing detail should not dictate the frontend
parsing or internal data structure.
Post by Guenter Milde
+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
+1 allows testing the "descriptive" DTD and the XML output
+1 no difference between node tree and DTD/XML representation
+1 simple
+0 allows simplification of functions using the ID
-1 API change
-0 name `ids` but datatype `ID`
-1 on the last one. The name should match and describe the contents.
Post by Guenter Milde
With normalization in a transform, we could add the basic attribute `id`
alongside `ids`
<!ENTITY % basic.atts
" ids NMTOKENS #IMPLIED
+ id ID #IMPLIED
names CDATA #IMPLIED
and describe the normalization as an action similar to the handling of
`pending` elements.
Call the above #3: normalization via transform

Would this merely be a transform of the Doctree into validatable XML,
triggered by the XML writer? And not used for anything other than
that?
Post by Guenter Milde
+1 appropriate attribute types (ID, IDREF, IDREFS) in the DTD
+1 allows testing the "descriptive" DTD and the XML output
+1 no difference between node tree and DTD/XML representation
-1 two almost identical attributes
A transform could very easily remove the "ids" attribute after
transforming it into "id".

Note: an element could have multiple IDs in its "ids" attribute. How
exactly do these get transformed into multiple "id" attributes? I
imagine there would have to be elements (like "target") added
appropriately, one per excess ID.
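
A rough sketch of that idea on a generic ElementTree: the first id
becomes the `id` attribute, `ids` is removed, and each excess id gets an
empty `target` element inserted just before the element (or as first
child when the element is the root). The function name is hypothetical,
and whether the content model allows a `target` at each of these
positions is not checked here.

```python
import xml.etree.ElementTree as ET


def ids_to_id(root):
    """Replace `ids` by a single `id`, adding one <target> per excess id.
    Modifies the tree in place and returns it."""
    # ElementTree has no parent links, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    for element in list(root.iter()):  # snapshot: the tree changes below
        tokens = element.attrib.pop("ids", "").split()
        if not tokens:
            continue
        element.set("id", tokens[0])
        parent = parent_of.get(element)
        for extra in tokens[1:]:
            target = ET.Element("target", {"id": extra})
            if parent is None:
                element.insert(0, target)  # root has no sibling position
            else:
                parent.insert(list(parent).index(element), target)
    return root


# Hypothetical sample document.
root = ET.fromstring(
    '<document ids="doc">'
    '<section ids="subtitle id1 id2"><title>T</title></section>'
    '</document>'
)
ids_to_id(root)
print([(child.tag, child.get("id")) for child in root])
```

References in `refid` and `backrefs` stay untouched with this approach,
because every original id still exists somewhere in the output.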

I like option #3 the best. I'd add in one aspect of option #1: add
comments to the DTD describing the internal data structure. For
example (modified snippet of docutils.dtd):

<!--
Attributes shared by all elements in this DTD:

- `id` is a unique identifier, typically assigned by the system.
(Internally, Docutils uses an `ids` attribute consisting of zero or
more unique identifiers.)
- `names` are identifiers assigned in the markup.
- `dupnames` is the same as `names`, used when it's a duplicate.
- `source` is the name of the source of this document or fragment.
- `classes` is used to transmit individuality information forward.
-->
<!ENTITY % basic.atts
" id ID #IMPLIED
names CDATA #IMPLIED
dupnames CDATA #IMPLIED
source CDATA #IMPLIED
classes NMTOKENS #IMPLIED
%additional.basic.atts; ">

I think it's time for a concrete proposal.

David Goodger
<http://python.net/~goodger>

Guenter Milde
2017-02-03 12:37:02 UTC
Dear Docutils developers, dear David,
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per object
either during parsing or via a transform.
Why do you want to make a change?
What's the purpose, the desired improvement?
I want Docutils to produce valid output documents.
Doesn't it already?
No, the XML output does not validate against the DTD it specifies in the
!DOCTYPE tag.
OTOH, HTML output passes validation with http://validator.w3.org/
Except for the one case of validating Docutils' XML output against its
DTD, everything else is valid, right?
Yes.¹

¹ * The output of functional tests is validated (and re-validated whenever
there is a change in the expected output).

* Generating invalid HTML or ODT or LaTeX documents from valid rST
counts as a valid Docutils bug (except for problems due to custom
preamble, CSS or raw input).
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
I don't care about xmllint at all. The DTD was never meant to be
prescriptive, just descriptive. It merely describes the internal data
structure used by Docutils: the Doctree.
It was never meant to be used to validate documents (you may be the
first to try).
Call this one (above) #0: don't care

...
Post by David Goodger
Post by Guenter Milde
Validating the XML output against the DTD tests whether
* "the description of the data structure in a formal language" follows
the rules of this formal language, and
* the actual output matches the description.
Currently, Docutils fails this test. This is an unsatisfactory
situation for a project that puts so much emphasis on testing.
For future development, validating can ensure that the data structure and
its description in the DTD keep in sync.
That is useful, yes.
...
Post by David Goodger
Post by Guenter Milde
Second bonus: cleanly defined interface to the XML world.
Also good.
In the rest of your reply, I count 3 different proposed solutions.
Which is your preferred variation?
They have rising cost with a nearly stable cost/benefit ratio, so the jury
is still out.

For 0.13.2, I prefer #1 `change (loosen) the DTD so it validates`.
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
+1 simplest change to get valid XML output
...
Post by David Goodger
I'm ±0 on this. Loosening the DTD lessens its usefulness, both
functionally and as documentation.
In the long term,
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
My preference would be to change "ids" to "id" and use one id per
object either during parsing or via a transform.
A detailed analysis and proposal for this will follow in a separate posting.
Post by David Goodger
I think it's time for a concrete proposal.
For 0.13.2, I propose to complete the first part of #1 `loosen the DTD`,
started by you (commit from Sun 02 Oct 2005 03:09:01 CEST) ::

Updated for plural attributes "classes", "ids", "names", "dupnames".


@@ -53,11 +53,11 @@ Attributes shared by all elements in this DTD:
- `class` is used to transmit individuality information forward.
-->
<!ENTITY % basic.atts
- " id ID #IMPLIED
- name CDATA #IMPLIED
- dupname CDATA #IMPLIED
+ " ids NMTOKENS #IMPLIED
+ names CDATA #IMPLIED
+ dupnames CDATA #IMPLIED
source CDATA #IMPLIED
- class NMTOKENS #IMPLIED
+ classes NMTOKENS #IMPLIED
%additional.basic.atts; ">

<!-- External reference to a URI/URL. -->

by the patch ::

--- docutils.dtd (Revision 8017)
+++ docutils.dtd (working copy)
@@ -64,13 +64,13 @@
<!ENTITY % refuri.att
" refuri CDATA #IMPLIED ">

-<!-- Internal reference to the `id` attribute of an element. -->
+<!-- Internal reference to the `ids` attribute of an element. -->
<!ENTITY % refid.att
- " refid IDREF #IMPLIED ">
+ " refid NMTOKEN #IMPLIED ">

-<!-- Space-separated list of id references, for backlinks. -->
+<!-- Space-separated list of `ids` references, for backlinks. -->
<!ENTITY % backrefs.att
- " backrefs IDREFS #IMPLIED ">
+ " backrefs NMTOKENS #IMPLIED ">

<!--
Internal reference to the `name` attribute of an element. On a

so that we can validate the XML with, e.g., ::

xmllint standalone_rst_docutils_xml.xml --dtdvalid docutils.dtd
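
With `refid` and `backrefs` declared as NMTOKEN/NMTOKENS, a validating parser
no longer checks that a reference points at an existing ID. A minimal sketch
of such a check outside the DTD, using only the Python standard library; the
function name `dangling_references` and the sample document are made up for
illustration and are not part of Docutils:

```python
import xml.etree.ElementTree as ET


def dangling_references(xml_text):
    """Return (attribute, value) pairs that point at no known id."""
    root = ET.fromstring(xml_text)
    known_ids = set()
    for element in root.iter():
        # "ids" is a space-separated list (NMTOKENS in docutils.dtd).
        known_ids.update(element.get("ids", "").split())
    dangling = []
    for element in root.iter():
        # "refid" holds one token, "backrefs" a space-separated list.
        for attribute in ("refid", "backrefs"):
            for value in element.get(attribute, "").split():
                if value not in known_ids:
                    dangling.append((attribute, value))
    return dangling


# Hypothetical sample, not taken from standalone_rst_docutils_xml.xml.
SAMPLE = """\
<document>
  <section ids="intro introduction">
    <paragraph><reference refid="introduction">Intro</reference></paragraph>
    <paragraph><reference refid="missing">Oops</reference></paragraph>
  </section>
</document>"""

print(dangling_references(SAMPLE))  # prints [('refid', 'missing')]
```

This only covers the reference check; uniqueness of IDs across elements would
need a separate pass.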


OK to commit?

Günter
David Goodger
2017-02-12 20:43:44 UTC
Permalink
Post by Guenter Milde
For 0.13.2, I propose to complete the first part of #1 `loosen the DTD`
Updated for plural attributes "classes", "ids", "names", "dupnames".
- `class` is used to transmit individuality information forward.
-->
<!ENTITY % basic.atts
- " id ID #IMPLIED
- name CDATA #IMPLIED
- dupname CDATA #IMPLIED
+ " ids NMTOKENS #IMPLIED
+ names CDATA #IMPLIED
+ dupnames CDATA #IMPLIED
source CDATA #IMPLIED
- class NMTOKENS #IMPLIED
+ classes NMTOKENS #IMPLIED
%additional.basic.atts; ">
<!-- External reference to a URI/URL. -->
--- docutils.dtd (Revision 8017)
+++ docutils.dtd (Arbeitskopie)
@@ -64,13 +64,13 @@
<!ENTITY % refuri.att
" refuri CDATA #IMPLIED ">
-<!-- Internal reference to the `id` attribute of an element. -->
+<!-- Internal reference to the `ids` attribute of an element. -->
<!ENTITY % refid.att
- " refid IDREF #IMPLIED ">
+ " refid NMTOKEN #IMPLIED ">
-<!-- Space-separated list of id references, for backlinks. -->
+<!-- Space-separated list of `ids` references, for backlinks. -->
<!ENTITY % backrefs.att
- " backrefs IDREFS #IMPLIED ">
+ " backrefs NMTOKENS #IMPLIED ">
<!--
Internal reference to the `name` attribute of an element. On a
xmllint standalone_rst_docutils_xml.xml --dtdvalid docutils.dtd
OK to commit?
Yes. I would only add a comment, something like: "The NMTOKENS type is
used because XML doesn't support a multiple-ID "IDS" type."

David Goodger
<http://python.net/~goodger>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Docutils-develop mailing list
Docutils-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/docutils-develop

Please use "Reply All" to reply to the list.
Guenter Milde
2017-02-13 09:58:13 UTC
Permalink
Post by David Goodger
Post by Guenter Milde
For 0.13.2, I propose to complete the first part of #1 `loosen the DTD`
...
Post by David Goodger
Post by Guenter Milde
OK to commit?
Yes. I would only add a comment, something like: "The NMTOKENS type is
used because XML doesn't support a multiple-ID "IDS" type."
Done in [r8031].

Thank you for the cooperation.

Günter


Guenter Milde
2017-02-03 13:45:51 UTC
Permalink
...
Post by David Goodger
In the rest of your reply, I count 3 different proposed solutions.
Which is your preferred variation?
As a long-term goal, #3: normalization via transform

(A closer look showed that multiple IDs are added after
parsing, during the "transform" stage. This rules out "normalization
during parsing".)
Post by David Goodger
What would the normalization consist of?
...
Post by David Goodger
Note: an element could have multiple IDs in its "ids" attribute. How
exactly do these get transformed into multiple "id" attributes? I
imagine there would have to be elements (like "target") added
appropriately, one per excess ID.
A transform working on the doctree. Called after
transforms.references.PropagateTargets to clean up:

Walk along the doctree.
In case an object has multiple IDs,

* select one,
* change all REFIDs using others to point to the selected ID, and
* delete all other IDs.
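
The three steps above can be sketched on the XML form of a doctree. This is
only an illustration under stated assumptions, not a Docutils transform: it
works on `xml.etree.ElementTree` elements instead of `docutils.nodes`, the
name `normalize_ids` is hypothetical, and "select one" is arbitrarily taken
to mean "keep the first listed ID":

```python
import xml.etree.ElementTree as ET


def normalize_ids(root):
    """Keep one id per element and redirect references to it (in place)."""
    replacement = {}
    for element in root.iter():
        ids = element.get("ids", "").split()
        if len(ids) > 1:
            keep = ids[0]  # assumption: the first listed id is the one kept
            for other in ids[1:]:
                replacement[other] = keep
            element.set("ids", keep)
    for element in root.iter():
        for attribute in ("refid", "backrefs"):
            if attribute in element.attrib:
                values = element.get(attribute).split()
                # Map dropped ids to the kept one; drop resulting duplicates.
                mapped = dict.fromkeys(replacement.get(v, v) for v in values)
                element.set(attribute, " ".join(mapped))
    return root


doc = ET.fromstring(
    "<document>"
    '<target refid="target-1"/>'
    '<target refid="target-2"/>'
    '<paragraph ids="target-2 target-1">text</paragraph>'
    "</document>"
)
normalize_ids(doc)
print(ET.tostring(doc, encoding="unicode"))
```

References inside the document can be rewritten this way; links from outside
the document to a deleted ID cannot.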

Note: Alternatively, we can modify PropagateTargets so that only one ID
per object is used, whichever is simpler/cleaner.

+1 Docutils DTD can use the ID and IDREF datatypes
-> better testing
-> DTD becomes simpler, more meaningful documentation
Post by David Goodger
My vote depends on which parsing gets normalized. If it's the internal
Doctree that gets normalized: -1, because this is the tail wagging the
dog. This backend testing detail should not dictate the frontend
parsing or internal data structure.
There are additional benefits:

+1 Simpler document model.

+2 Document model matches better the model used for HTML and XML.

-> Cleaner HTML (obsoletes the hack with empty <span>s)

-> Better interface to the XML world (XML experts will feel more at ease
with the docutils XML model, simpler post-processing when the standard
ID and IDREF datatypes can be used).

Docutils pushing an IDS datatype would be the tail wagging the dog.
Only a strong use case merits a deviation from "one ID per object"
rule established in the XML and HTML world.

+1 Simpler code for Docutils transforms and writers
(e.g. no need to assert a single ID (5 instances) or loop over a list (10
instances)).



OTOH, there is a use case for "hand-written" IDs: links to an anchor inside
a generated HTML document:

In some cases, auto-generated section heading IDs are completely
unrelated to the content, e.g., for headings in non-Latin scripts and
for documents with several identical headings.

It should, however, suffice to select one "custom" ID, e.g., the ID of the
first empty "target" directive preceding the object.


I see this as a high-cost, high-benefit option; a more concrete proposal
must be worked out in dialogue.


Thanks

Günter




================================
multiple IDs Test Document
================================

Multiple IDs are given to an object by
`transforms.references.PropagateTargets`:

Propagate empty internal targets to the next element.

Given the following nodes::

<target ids="internal1" names="internal1">
<target anonymous="1" ids="id1">
<target ids="internal2" names="internal2">
<paragraph>
This is a test.

PropagateTargets propagates the ids and names of the internal
targets preceding the paragraph to the paragraph itself::

<target refid="internal1">
<target anonymous="1" refid="id1">
<target refid="internal2">
<paragraph ids="internal2 id1 internal1" names="internal2 internal1">
This is a test.


Examples are

1. section headings with auto-generated and "manually set" id,

2. objects preceded by multiple "chained" target directives,

.. _guenters example:

Günter's Example
----------------

This example heading is transformed to the XML::

<section ids="gunter-s-example guenters-example"
names="günter's\ example guenters\ example">
<title>Günter’s Example</title>

As HTML does not allow multiple IDs, the "excess IDs" are put into
additional empty span elements by the HTML writers::

<div class="section" id="gunter-s-example">
<span id="guenters-example"></span><h1>Günter’s Example</h1>

LaTeX allows multiple labels, so the conversion is straightforward::

\section{Günter’s Example%
\label{gunter-s-example}%
\label{guenters-example}%
}

Discussion
~~~~~~~~~~

Generally, there is no need for more than one ID for an object. This is why
XML and HTML don't provide for multiple IDs.

This allows the _`normalisation of multiple IDs`:

In case an object has multiple IDs,

* select one,
* change all REFIDs using others to point to the selected ID, and
* delete all other IDs.

+1 simpler document model
+1 Docutils DTD can use the ID and IDREF datatypes
+1 simpler code for Docutils writers
(e.g. no need to assert a single ID (5 instances) or loop over a list (10
instances)).

OTOH, there is a use case for "hand-written" IDs: links to an anchor inside
a generated HTML document.

In some cases, auto-generated section heading IDs are completely unrelated
to the content, e.g., for headings in non-Latin scripts and for documents
with several identical headings.

In the example, the URL ``text-multiple-ids.html#guenters-example`` is
better than using the auto-generated id, because it contains a correct
transcription of the name.



multiple target directives
--------------------------

.. _target 1:
.. _target 2:
.. _target 3:

This paragraph with multiple IDs becomes in Docutils XML::

<target refid="target-1"></target>
<target refid="target-2"></target>
<target refid="target-3"></target>
<paragraph ids="target-3 target-2 target-1"
names="target\ 3 target\ 2 target\ 1">This
paragraph with multiple IDs becomes in Docutils
XML:</paragraph>

and in HTML::

<p id="target-3"><span id="target-2"></span><span
id="target-1"></span>This ...




Nonexistent internal targets
----------------------------

Multiple IDs are also created by
`transforms.IndirectHyperlinks.resolve_indirect_targets` for nonexistent
targets (why?).

Examples are references to nonexistent footnotes and citations:

A reference to a nonexistent footnote: [5]_ becomes::

<problematic ids="id4 id1" refid="id3">[5]_</problematic>

and a reference to a [nonexistent]_ citation becomes::

<problematic ids="id6 id2" refid="id5">

Two IDs are auto-generated but only the first ID is used!

So, possibly, this is just a "junk ID" (Bug?)
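
To look for such candidates systematically, one could list every ID that no
`refid` or `backrefs` attribute in the document points to. A rough sketch on
the XML output, standard library only; the function name `unreferenced_ids`
is hypothetical, and the `system_message` element in the sample is an
assumption about the surrounding structure, not copied from real output:

```python
import xml.etree.ElementTree as ET


def unreferenced_ids(xml_text):
    """Return ids that no refid/backrefs attribute in the document uses."""
    root = ET.fromstring(xml_text)
    referenced = set()
    for element in root.iter():
        for attribute in ("refid", "backrefs"):
            referenced.update(element.get(attribute, "").split())
    return [
        identifier
        for element in root.iter()
        for identifier in element.get("ids", "").split()
        if identifier not in referenced
    ]


# Assumed structure: the system_message refers back to the first id only.
SAMPLE = (
    "<document>"
    '<problematic ids="id4 id1" refid="id3">[5]_</problematic>'
    '<system_message ids="id3" backrefs="id4"/>'
    "</document>"
)

print(unreferenced_ids(SAMPLE))  # prints ['id1']
```

An unreferenced ID is not necessarily junk: manually set IDs may exist only
as anchors for links from outside the document.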








David Goodger
2017-02-12 21:10:35 UTC
Permalink
Post by Guenter Milde
...
Post by David Goodger
In the rest of your reply, I count 3 different proposed solutions.
Which is your preferred variation?
As a long-term goal, #3: normalization via transform
(A closer look showed that multiple IDs are added after
parsing, during the "transform" stage. This rules out "normalization
during parsing".)
Post by David Goodger
What would the normalization consist of?
...
Post by David Goodger
Note: an element could have multiple IDs in its "ids" attribute. How
exactly do these get transformed into multiple "id" attributes? I
imagine there would have to be elements (like "target") added
appropriately, one per excess ID.
A transform working on the doctree. Called after
Walk along the doctree.
In case an object has multiple IDs,
* select one,
* change all REFIDs using others to point to the selected ID, and
* delete all other IDs.
Automatically generated IDs are fine to remove. But what if there are
multiple manual explicit IDs on a node? Which do we choose to keep? We
must keep them all.
Post by Guenter Milde
Note: Alternatively, we can modify PropagateTargets so that only one ID
per object is used, whichever is simpler/cleaner.
+1 Docutils DTD can use the ID and IDREF datatypes
-> better testing
-> DTD becomes simpler, more meaningful documentation
Post by David Goodger
My vote depends on which parsing gets normalized. If it's the internal
Doctree that gets normalized: -1, because this is the tail wagging the
dog. This backend testing detail should not dictate the frontend
parsing or internal data structure.
+1 Simpler document model.
+2 Document model matches better the model used for HTML and XML.
You can only have +1/+0/-0/-1. No +2's.
Post by Guenter Milde
-> Cleaner HTML (obsoletes the hack with empty <span>s)
-> Better interface to the XML world (XML experts will feel more at ease
with the docutils XML model, simpler post-processing when the standard
ID and IDREF datatypes can be used).
Docutils pushing an IDS datatype would be the tail wagging the dog.
No, Docutils is the master here. The document model is central to
Docutils. It is influenced by HTML and XML standards, but these are
only used as dumb output formats. They do not dictate the Docutils
document model. We're not trying to push an IDS datatype on these
standards; we've just implemented our document model in a specific
way.

I have been writing a tool to convert my old Evernote notes to reST.
Evernote uses HTML internally, and it's an unholy mess of
inconsistencies. That way lies madness, and Docutils will not move in
that direction.
Post by Guenter Milde
Only a strong use case merits a deviation from "one ID per object"
rule established in the XML and HTML world.
The use case is: the user (document author) wants multiple IDs per
object. Docutils allows this. The HTML and XML output must adapt.

Here's a situation: A document is written. Later on, two sections are
merged into one. The old IDs (old section titles) are preserved as
explicit IDs (targets) so that existing deep hyperlinks still work. So
a single section will end up with multiple IDs. We must keep them all.
Post by Guenter Milde
+1 Simpler code for Docutils transforms and writers
(e.g. no need to assert a single ID (5 instances) or loop over a list (10
instances)).
The code is already written. Stop trying to retroactively simplify, please.
Post by Guenter Milde
OTOH, there is a use case for "hand-written" IDs: links to an anchor inside
In some cases, auto-generated section heading IDs are completely
unrelated to the content, e.g., for headings in non-Latin scripts and
for documents with several identical headings.
It should, however, suffice to select one "custom" ID, e.g., the ID of the
first empty "target" directive preceding the object.
I am strongly opposed to this "solution".
Post by Guenter Milde
I see this as a high cost high benefit option, a more concrete proposal
must be established in dialogue.
The cost is too high, and the benefit is too low.
Post by Guenter Milde
================================
multiple IDs Test Document
================================
Multiple IDs are given to an object by
Propagate empty internal targets to the next element.
<target ids="internal1" names="internal1">
<target anonymous="1" ids="id1">
<target ids="internal2" names="internal2">
<paragraph>
This is a test.
PropagateTargets propagates the ids and names of the internal
<target refid="internal1">
<target anonymous="1" refid="id1">
<target refid="internal2">
<paragraph ids="internal2 id1 internal1" names="internal2 internal1">
This is a test.
Examples are
1. section headings with auto-generated and "manually set" id,
2. objects preceded by multiple "chained" target directives,
Günter's Example
----------------
<section ids="gunter-s-example guenters-example"
names="günter's\ example guenters\ example">
<title>Günter’s Example</title>
As HTML does not allow multiple IDs, the "excess IDs" are put into
<div class="section" id="gunter-s-example">
<span id="guenters-example"></span><h1>Günter’s Example</h1>
I assert that there is nothing wrong with this. No need to "improve" it.
Post by Guenter Milde
\section{Günter’s Example%
\label{gunter-s-example}%
\label{guenters-example}%
}
Discussion
~~~~~~~~~~
Generally, there is no need for more than one ID for an object. This is why
XML and HTML don't provide for multiple IDs.
This is a gross oversimplification. Sometimes this is true, sometimes
it is not. You and I have both shown valid real-world examples where
this is not true.
Post by Guenter Milde
In case an object has multiple IDs,
* select one,
* change all REFIDs using others to point to the selected ID, and
* delete all other IDs.
+1 simpler document model
+1 Docutils DTD can use the ID and IDREF datatypes
+1 simpler code for Docutils writers
(e.g. no need to assert a single ID (5 instances) or loop over a list (10
instances)).
-1: Which to select? What if all are significant/intentional? This
would result in broken links.
Post by Guenter Milde
OTOH, there is a use case for "hand-written" IDs: links to an anchor inside
a generated HTML document.
Exactly.
Post by Guenter Milde
In some cases, auto-generated section heading IDs are completely unrelated
to the content, e.g., for headings in non-Latin scripts and for documents
with several identical headings.
In the example, the URL ``text-multiple-ids.html#guenters-example`` is
better than using the auto-generated id, because it contains a correct
transcription of the name.
multiple target directives
--------------------------
<target refid="target-1"></target>
<target refid="target-2"></target>
<target refid="target-3"></target>
<paragraph ids="target-3 target-2 target-1"
names="target\ 3 target\ 2 target\ 1">This
paragraph with multiple IDs becomes in Docutils
XML:</paragraph>
<p id="target-3"><span id="target-2"></span><span
id="target-1"></span>This ...
Nonexistent internal targets
----------------------------
Multiple IDs are also created by
`transforms.IndirectHyperlinks.resolve_indirect_targets` for nonexistent
targets (why?).
<problematic ids="id4 id1" refid="id3">[5]_</problematic>
and a reference to a [nonexistent]_ citation becomes
<problematic ids="id6 id2" refid="id5">
Two IDs are auto-generated but only the first ID is used!
So, possibly, this is just a "junk ID" (Bug?)
Possibly.

DG
Guenter Milde
2017-04-20 19:47:45 UTC
Permalink
...
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Note: an element could have multiple IDs in its "ids" attribute. How
exactly do these get transformed into multiple "id" attributes? I
imagine there would have to be elements (like "target") added
appropriately, one per excess ID.
Actually, as only the ID-transfer from targets in `PropagateTargets`
creates multiple IDs, there is no need to create additional elements ---
they are already there.
Post by David Goodger
Automatically generated IDs are fine to remove. But what if there are
multiple manual explicit IDs on a node? Which do we choose to keep? We
must keep them all.
We would need to prevent the ID-transfer if there is already a
non-automatic (or content-based) ID on the next element.
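
A sketch of that guard, kept deliberately abstract: it works on plain lists
of ID strings, not on doctree nodes, and is not the `PropagateTargets` code.
The name `guarded_transfer` is hypothetical, and treating IDs of the form
"id" plus digits (such as `id1` in the examples of this thread) as the
automatic ones is an assumption:

```python
import re

# Assumption: auto-generated ids look like "id1", "id31"; everything else
# counts as manual or content-based.
AUTO_ID = re.compile(r"id\d+")


def guarded_transfer(target_ids, element_ids):
    """Return (ids for the element, ids that stay on the targets)."""
    if any(AUTO_ID.fullmatch(i) is None for i in element_ids):
        # The element already has a meaningful id: transfer nothing.
        return list(element_ids), list(target_ids)
    # Otherwise move the target ids over to the element.
    return list(target_ids) + list(element_ids), []


print(guarded_transfer(["guenters-example"], ["gunter-s-example"]))
print(guarded_transfer(["internal1"], ["id1"]))
```

The sketch does not settle what should happen with several chained targets
in front of an element that has no meaningful ID of its own; in that case
all of them are still transferred.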
Post by David Goodger
Post by Guenter Milde
+1 Simpler document model.
+2 Document model matches better the model used for HTML and XML.
You can only have +1/+0/-0/-1. No +2's.
Where is this laid down?
Why not allow this

a) as a means to weight the items, or
b) as a shortcut for

+1 Document model matches better the model used for XML.
+1 Document model matches better the model used for HTML
?

...
Post by David Goodger
Post by Guenter Milde
-> Better interface to the XML world (XML experts will feel more
at ease with the docutils XML model, simpler post-processing when
the standard ID and IDREF datatypes can be used).
Docutils pushing an IDS datatype would be the tail wagging the dog.
No, Docutils is the master here. The document model is central to
Docutils. It is influenced by HTML and XML standards, but these are
only used as dumb output formats.
By restricting Docutils to "dumb" output, we are wasting potential.

...
Post by David Goodger
Post by Guenter Milde
Only a strong use case merits a deviation from "one ID per object"
rule established in the XML and HTML world.
The use case is: the user (document author) wants multiple IDs per
object.
...
Post by David Goodger
Here's a situation: A document is written. Later on, two sections are
merged into one. The old IDs (old section titles) are preserved as
explicit IDs (targets) so that existing deep hyperlinks still work. So
a single section will end up with multiple IDs. We must keep them all.
OK.
Post by David Goodger
Docutils allows this.
The Docutils document model allows this, but rST syntax does not support
this directly, only via additional "target" elements.
Post by David Goodger
The HTML and XML output must adapt.
HTML is the only output format where manually set targets make sense
(because it's the only format where you can have deep links from the
outside). To generate valid HTML, we recreate target elements for the
"excess" IDs in the HTML writer.
It would be simpler to just keep the target elements for "excess" IDs.
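
For reference, the current behaviour (first ID on the element itself, one
empty span per "excess" ID) can be sketched as follows. This is a simplified
illustration, not the code of the Docutils HTML writers; the function name
`open_with_ids` and its parameters are hypothetical:

```python
from html import escape


def open_with_ids(tag, ids, css_class=None):
    """Return the opening tag for `tag` plus empty spans for extra ids."""
    attributes = ""
    if css_class:
        attributes += f' class="{escape(css_class)}"'
    if ids:
        # HTML allows one id per element: the first one goes on the tag.
        attributes += f' id="{escape(ids[0])}"'
    # Every further id gets its own empty <span> as an anchor.
    spans = "".join(f'<span id="{escape(i)}"></span>' for i in ids[1:])
    return f"<{tag}{attributes}>{spans}"


print(open_with_ids("p", ["target-3", "target-2", "target-1"]))
# prints <p id="target-3"><span id="target-2"></span><span id="target-1"></span>
```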
Post by David Goodger
Post by Guenter Milde
I see this as a high cost high benefit option, a more concrete
proposal must be established in dialogue.
The cost is too high, and the benefit is too low.
So I drop this feature request.

Günter

David Goodger
2017-05-03 22:23:42 UTC
Permalink
Post by Guenter Milde
...
Post by David Goodger
Post by Guenter Milde
Post by David Goodger
Note: an element could have multiple IDs in its "ids" attribute. How
exactly do these get transformed into multiple "id" attributes? I
imagine there would have to be elements (like "target") added
appropriately, one per excess ID.
Actually, as only the ID-transfer from targets in `PropagateTargets`
creates multiple IDs, there is no need to create additional elements ---
they are already there.
Sure.
Post by Guenter Milde
Post by David Goodger
Automatically generated IDs are fine to remove. But what if there are
multiple manual explicit IDs on a node? Which do we choose to keep? We
must keep them all.
We would need to prevent the ID-transfer if there is already a
non-automatic (or content-based) ID on the next element.
Post by David Goodger
Post by Guenter Milde
+1 Simpler document model.
+2 Document model matches better the model used for HTML and XML.
You can only have +1/+0/-0/-1. No +2's.
Where is this laid down?
https://www.python.org/dev/peps/pep-0010/
Post by Guenter Milde
Why not allow this
a) as a means to weight the items, or
b) as a shortcut for
+1 Document model matches better the model used for XML.
+1 Document model matches better the model used for HTML
?
That "No +2's" was ¾ ;-). But answering seriously:

Because sometimes we add up the votes, and outside of [-1,+1] would not be fair.

Enthusiasm is fine. But votes outside the range of [-1,+1] are
automatically rounded toward 0.
Post by Guenter Milde
...
Post by David Goodger
Post by Guenter Milde
-> Better interface to the XML world (XML experts will feel more
at ease with the docutils XML model, simpler post-processing when
the standard ID and IDREF datatypes can be used).
Docutils pushing an IDS datatype would be the tail wagging the dog.
No, Docutils is the master here. The document model is central to
Docutils. It is influenced by HTML and XML standards, but these are
only used as dumb output formats.
By restricting Docutils to "dumb" output, we are wasting potential.
I'm not saying we shouldn't use smart features of HTML etc. I'm simply
saying that the features of HTML/XML/etc that we use depend on the
needs of the Docutils format, not the other way around.

The Docutils Doctree format is paramount here. XML is only used as an
expression of the internal Doctree format. If XML can't handle a
Docutils-specific construct, that's XML's problem.
Post by Guenter Milde
...
Post by David Goodger
Post by Guenter Milde
Only a strong use case merits a deviation from "one ID per object"
rule established in the XML and HTML world.
The use case is: the user (document author) wants multiple IDs per
object.
...
Post by David Goodger
Here's a situation: A document is written. Later on, two sections are
merged into one. The old IDs (old section titles) are preserved as
explicit IDs (targets) so that existing deep hyperlinks still work. So
a single section will end up with multiple IDs. We must keep them all.
OK.
Post by David Goodger
Docutils allows this.
The Docutils document model allows this, but rST syntax does not support
this directly, only via additional "target" elements.
How is that not direct reST support?

"""
.. _this is target one:
.. _this is target two:

This Is Target Three
====================

And some text.
"""
Post by Guenter Milde
Post by David Goodger
The HTML and XML output must adapt.
HTML is the only output format where manually set targets make sense
(because it's the only format where you can have deep links from the
outside). To generate valid HTML, we recreate target elements for the
"excess" IDs in the HTML writer.
It would be simpler to just keep the target elements for "excess" IDs.
Why try to fix what ain't broke?

But I'm not sure I follow. Examples, please.

By "It would be simpler to just keep the target elements", do you mean
"in the Doctree" (as opposed to "in the HTML")? I don't have any
objection to that, beyond "why bother?".
Post by Guenter Milde
Post by David Goodger
Post by Guenter Milde
I see this as a high cost high benefit option, a more concrete
proposal must be established in dialogue.
The cost is too high, and the benefit is too low.
So I drop this feature request.
OK.

To be clear: I was objecting to the idea of removing the IDS attribute
from the Doctree, thereby removing that functionality. And I objected
to letting the tail wag the dog, when it should be the other way
around.

David Goodger
<http://python.net/~goodger>

