epub3-to-daisy202 produces invalid daisy202 from valid epub3 #150

Open
bertfrees opened this issue Oct 4, 2018 · 9 comments

Comments

@bertfrees
Member

See daisy/pipeline#529 (comment).

@bertfrees
Member Author

bertfrees commented Oct 22, 2018

It turns out that the NCC file generated by epub3-to-daisy202 has several errors:

  • Missing body element (EPUB 3 to DAISY 2.02 - The generated NCC file is invalid #92)
  • It does not reference any SMIL files (and none are present in the DAISY fileset) (epub3-to-daisy202: text-only daisy 2.02 should also have SMIL files #86)
  • It contains some metadata fields that are not allowed according to the schema (tested with sample books from Farrah Little and validator from Pipeline 2)
    • <meta name="ibooks:specified-fonts" content="true"/>: value of attribute "name" is invalid
    • <meta name="viewport" content="width=device-width"/>: value of attribute "name" is invalid
    • <meta name="dcterms:modified" content="2018-10-22T15:57:05+00:00" />: value of attribute "name" is invalid
    • file type not allowed in DAISY 2.02 fileset: application/vnd.ms-opentype (expected a html, smil, mp2, mp3, wav, jpg, gif, png or css file type)
  • Various other validation issues (Jostein converted the C00000.epub sample book and validated the result with Pipeline 1, see Various fixes to EPUB 3 to DAISY 2.02 #153 (comment))
    • Attribute 'xmlns:...' must be declared for element type 'smil'
    • Attribute 'xmlns:epub' must be declared for element type 'html'
    • Invalid pseudo-function (from CSS validation)
    • File not found: C00000-2-toc.html (in file:/C:/Users/jostein/Desktop/C00000/C00000-01-cover.html)
    • required elements missing (in smil files)
    • required attributes missing (in smil files)
    • bad value for attribute 'name' (in ncc)
    • bad value for attribute 'content' (in ncc)
    • Could not compare calculated duration to stated duration since this information is missing in the NCC
    • NCC schematron: ncc:charset seems not to be in ncc exactly once
    • NCC schematron: ncc:pageFront seems not to be in ncc exactly once
    • NCC schematron: ncc:pageNormal seems not to be in ncc exactly once
    • NCC schematron: dc:identifier seems not to be in ncc exactly once
    • NCC schematron: dc:title seems not to be in ncc exactly once
    • NCC schematron: ncc:tocItems seems not to be in ncc exactly once
    • NCC schematron: ncc:totaltime seems not to be in ncc exactly once
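
The "exactly once" messages above can be reproduced outside the validator with a small check. This is only a rough sketch, not the Schematron rules themselves: the list of names is copied from the messages above, the function names are made up for illustration, and it assumes the NCC parses as well-formed XML.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Names taken from the Schematron messages above; this is not the full
# set of DAISY 2.02 metadata rules.
EXACTLY_ONCE = [
    "ncc:charset", "ncc:pageFront", "ncc:pageNormal",
    "dc:identifier", "dc:title", "ncc:tocItems", "ncc:totaltime",
]

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]

def check_exactly_once(ncc_xml):
    """Return {name: count} for every listed name that does not occur exactly once."""
    root = ET.fromstring(ncc_xml)
    counts = Counter(
        el.get("name")
        for el in root.iter()
        if local_name(el.tag) == "meta" and el.get("name")
    )
    return {name: counts[name] for name in EXACTLY_ONCE if counts[name] != 1}
```

An empty result means every listed name occurs exactly once. A count of 0 means the meta element is missing, and a count above 1 means it is duplicated.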

@bertfrees
Member Author

bertfrees commented Apr 23, 2019

Various other validation issues (Jostein converted the C00000.epub sample book and validated the result with Pipeline 1)

I just tried the Pipeline 2 validator. It also discovers some of the issues, but not all of them:

  • File not found: C00000-2-toc.html (in file:/C:/Users/jostein/Desktop/C00000/C00000-01-cover.html)
  • bad value for attribute 'name' (in ncc)
  • bad value for attribute 'content' (in ncc)

@rdeltour says there is a Java part in Pipeline 1 that was not ported to Pipeline 2, but it seems there are several other differences.

@bertfrees
Member Author

bertfrees commented Apr 23, 2019

The Schematron errors aren't visible in the Pipeline 2 report because the Schematron rules are embedded in the RelaxNG files, and Pipeline 2 doesn't support this.
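
To see which rules are affected, the embedded Schematron elements can be listed from a RelaxNG file. A minimal sketch, assuming the schema is in RelaxNG XML syntax. The function name is hypothetical, and both the Schematron 1.5 and the ISO Schematron namespaces are checked because it is not stated here which one the DAISY 2.02 schemas use.

```python
import xml.etree.ElementTree as ET

# Schematron 1.5 and ISO Schematron namespaces; which one the schemas
# actually use is an assumption to verify against the files.
SCH_NAMESPACES = (
    "http://www.ascc.net/xml/schematron",
    "http://purl.oclc.org/dsdl/schematron",
)

def embedded_schematron(rng_xml):
    """Return the top-most Schematron elements embedded in a RelaxNG schema."""
    root = ET.fromstring(rng_xml)
    found = []

    def walk(el):
        # ElementTree tags look like '{namespace}local'
        ns = el.tag[1:].split("}", 1)[0] if el.tag.startswith("{") else ""
        if ns in SCH_NAMESPACES:
            found.append(el)  # do not descend: children belong to this element
            return
        for child in el:
            walk(child)

    walk(root)
    return found
```

The returned elements could then be copied into a standalone Schematron file and run as a separate validation step.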

@bertfrees
Member Author

bertfrees commented Apr 25, 2019

  • <meta name="viewport" content="width=device-width"/>: value of attribute "name" is invalid
  • <meta name="dcterms:modified" content="2018-10-22T15:57:05+00:00" />: value of attribute "name" is invalid

@josteinaj These two are added in epub3-to-daisy202 (opf-to-html-metadata.xsl), see d51dcad. But they are invalid according to the schema, though not according to Pipeline 1. What should I do with this?

@bertfrees
Member Author

bertfrees commented Apr 25, 2019

  • Attribute 'xmlns:...' must be declared for element type 'smil'
  • Attribute 'xmlns:epub' must be declared for element type 'html'

Are these really validation issues, or are these shortcomings of Pipeline 1? Of course we can make sure that there are no unneeded namespace declarations in the files (EDIT: I did this now), but still... Should they cause errors? I can't reproduce this with the Pipeline 2 validator.
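
For checking whether a generated file still carries unneeded namespace declarations, something like the following could be used. It is a sketch only: the function name is made up, it treats a namespace as "used" only if an element or attribute name is in it, so a prefix that appears only inside attribute values would be reported as unused, and it does not handle a prefix that is redeclared with a different URI.

```python
import io
import xml.etree.ElementTree as ET

def unused_namespace_prefixes(xml_bytes):
    """Return the declared prefixes whose namespace is never used by any
    element or attribute name in the document ('' is the default namespace)."""
    declared = {}  # prefix -> namespace URI
    used = set()   # namespace URIs seen in element or attribute names
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            declared[prefix] = uri
        else:
            # payload is the element; names look like '{namespace}local'
            for name in [payload.tag, *payload.attrib.keys()]:
                if name.startswith("{"):
                    used.add(name[1:].split("}", 1)[0])
    return {prefix for prefix, uri in declared.items() if uri not in used}
```

For an HTML file that declares `xmlns:epub` but has no `epub:*` attributes left, this returns `{"epub"}`.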

@bertfrees
Member Author

  • File type not allowed in DAISY 2.02 fileset: ... (expected a html, smil, mp2, mp3, wav, jpg, gif, png or css file type)

Where can I find more info about the allowed file types? http://www.daisy.org/publications/specifications/daisy_202.html talks about the allowed audio file types, but it doesn't mention any image file types.
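
Until the source of that list is clear, the fileset can at least be checked against the types named in the message. A rough sketch: the extension set is copied from the message above, the function name and the example file names are hypothetical, and matching on file extension is a simplification, since the message reports a media type (application/vnd.ms-opentype), so the validator itself seems to work on media types.

```python
from pathlib import PurePosixPath

# Extensions copied from the validator message quoted above. Accepting only
# these exact spellings (no "htm" or "jpeg") is an assumption.
ALLOWED_EXTENSIONS = {"html", "smil", "mp2", "mp3", "wav", "jpg", "gif", "png", "css"}

def disallowed_files(paths):
    """Return the paths whose extension is not in ALLOWED_EXTENSIONS."""
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower().lstrip(".") not in ALLOWED_EXTENSIONS
    ]
```

For example, `disallowed_files(["ncc.html", "fonts/example.otf"])` reports only the font file, which matches the kind of file the message complains about.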

@bertfrees
Member Author

The SMIL-related issues are not visible in the Pipeline 2 report because these validation results are simply ignored. See 2b04ed2. @josteinaj Do you remember if this was on purpose?

@bertfrees
Member Author

  • Could not compare calculated duration to stated duration since this information is missing in the NCC

In Pipeline 1, time checks are implemented in Java (ValidatorImplD202). In Pipeline 2 this is done in XSLT/XProc.

  • Invalid pseudo-function

This is also implemented in Java in Pipeline 1 (CssFileImpl).

@bertfrees
Member Author

See PR: daisy/pipeline-modules#1
