Feature #6557

closed

Add rendered metadata as pdf file in Morpho export

Added by Matt Jones almost 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 05/21/2014
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id:

Description

Morpho currently exports metadata as both XML and HTML. Users have requested that the rendered metadata also be provided in PDF format. Add stylesheets for rendering as PDF and include the resulting PDF in the export file.

The stylesheets for generating PDF can probably be general enough to be included in the EML package for many to use, and then simply imported and used within Morpho.

See related issue #6053 in Metacat for delivering BagIt packages. Ideally, Morpho's export would produce a BagIt-compatible zip file equivalent to what one gets from Metacat.
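As a rough illustration of what a BagIt-compatible zip involves, here is a minimal sketch using only the Java standard library. It assumes the basic BagIt layout: a single top-level bag directory containing a `bagit.txt` declaration, a `manifest-md5.txt` checksum manifest, and the payload under `data/`. The class name `BagItZipSketch`, the method names, and the bag name are hypothetical; this is not the layout Metacat or Morpho actually produces, which may include additional tag files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
final class BagItZipSketch {

    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] content) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(content);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Writes one file into the zip under the given entry name. */
    private static void writeEntry(ZipOutputStream zip, String name, byte[] content)
            throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content);
        zip.closeEntry();
    }

    /**
     * Builds an in-memory zip holding one bag directory. Payload keys are
     * file names relative to data/ (for example "metadata.xml").
     */
    static byte[] buildBag(String bagName, Map<String, byte[]> payload)
            throws IOException, NoSuchAlgorithmException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        StringBuilder manifest = new StringBuilder();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : payload.entrySet()) {
                String path = "data/" + file.getKey();
                writeEntry(zip, bagName + "/" + path, file.getValue());
                // Manifest line format: <checksum> <path relative to the bag root>
                manifest.append(md5Hex(file.getValue()))
                        .append("  ").append(path).append('\n');
            }
            String declaration =
                    "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n";
            writeEntry(zip, bagName + "/bagit.txt",
                    declaration.getBytes(StandardCharsets.UTF_8));
            writeEntry(zip, bagName + "/manifest-md5.txt",
                    manifest.toString().getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }
}
```

In an export like the one proposed here, the payload map would presumably hold the XML, HTML, and PDF renderings of the metadata alongside any data files.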


Files

metadata.pdf (12.3 KB): example output (rough). ben leinfelder, 05/22/2014 03:16 PM
eml-sample.pdf (14 KB). Lauren Walker, 05/27/2014 03:44 PM
Actions #1

Updated by ben leinfelder almost 10 years ago

A couple of comments/questions:

  • Using Apache FOP would be nice since we are going from XML->PDF. The FO spec offers a great deal of control over document layout, so there are few limits on how we could make it look.
  • Do we want it to look exactly like the existing HTML metadata output?
  • Should it be skinnable, with the ability to add header graphics, change fonts, etc.?

Actions #2

Updated by ben leinfelder almost 10 years ago

Looked into FOP, but it would take a lot of coding.

Looked at HTML -> PDF options, and there is "Flying Saucer", which uses iText. I tried it out and it's very promising. We do need to edit the existing EML XSLTs to produce better "printable" HTML before we convert to PDF, but this is more tractable than writing FO XSLTs from scratch.

To clean up our nasty non-XHTML: http://jtidy.sourceforge.net/howto.html

To generate the PDF from the tidy XHTML: https://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html
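For context, the XML -> HTML step that precedes the PDF conversion can be sketched with the JDK's built-in XSLT support. This is only an illustration: the class name `XsltToHtmlSketch`, the inline stylesheet, and the simplified input document are hypothetical stand-ins, not the real EML stylesheets or schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
final class XsltToHtmlSketch {

    // A toy stylesheet standing in for the EML XSLTs. Using the "xml"
    // output method keeps the generated markup well-formed.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='/'>"
            + "<html><body><h1><xsl:value-of select='//title'/></h1></body></html>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the given XSLT source to the given XML source. */
    static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

If the stylesheets were adjusted to emit well-formed XHTML directly, as this toy example does, the tidy step might become unnecessary, though that depends on how the existing XSLTs are written.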

Actions #3

Updated by ben leinfelder almost 10 years ago

Super simple code to do the transformation:

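// Imports assume JTidy and a Flying Saucer build based on iText 2.x (com.lowagie).
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import com.lowagie.text.DocumentException;
import org.w3c.tidy.Tidy;
import org.xhtmlrenderer.pdf.ITextRenderer;

public void export(String inputFile, String outputFile) throws IOException, DocumentException {
        String tidyFile = inputFile + ".tidy";

        // Clean the HTML into XHTML. Closing both streams here ensures the
        // tidy file is completely written before the renderer reads it.
        try (InputStream in = new FileInputStream(inputFile);
             OutputStream tidyOut = new FileOutputStream(tidyFile)) {
            Tidy tidy = new Tidy();
            tidy.setXHTML(true);
            tidy.parse(in, tidyOut);
        }

        String url = new File(tidyFile).toURI().toURL().toString();

        // Render the XHTML to PDF; the output stream is closed even on failure.
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);
            renderer.layout();
            renderer.createPDF(os);
        }
    }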

Actions #4

Updated by ben leinfelder almost 10 years ago

I've added a class and corresponding test to the EML project that renders a sample EML file as both HTML and PDF using the default CSS. Running 'ant runonetest' executes it (HtmlToPdfTest is the default class to run). The output is written to build/tests/eml-sample.xml.html and build/tests/eml-sample.xml.pdf.

Actions #5

Updated by ben leinfelder almost 10 years ago

  • Project changed from Morpho to EML
  • Assignee changed from ben leinfelder to Lauren Walker

Hoping Lauren can do a bit of work on the layout to make it narrow enough to fit on a page.

Actions #6

Updated by Lauren Walker almost 10 years ago

  • Status changed from In Progress to Resolved

I styled the EML -> HTML output a bit to make it more modern and simple, and made sure that it converts to a PDF without running off the page.

Actions #7

Updated by Lauren Walker almost 10 years ago

Attached is a test EML->HTML->PDF rendering that was generated by running 'ant runonetest' with HtmlToPdfTest.

Actions #8

Updated by ben leinfelder over 9 years ago

  • Project changed from EML to Morpho
  • Target version set to 1.10.3

Moving to the Morpho release for feature tracking even though it is implemented in the utilities project.
