Latest revision as of 16:24, 25 November 2015

This page is dedicated to research on web-to-print approaches and workflows.

Starting points

Goal: transform reflowable (markup-based) digital publications into fixed-layout PDF.
Case study: Beyond Social. How can web-to-print (or wiki-to-print) be applied to Beyond Social?
Requisites:
- an online workflow that can run on a server
- support page numbers
- gather all articles into one document
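
As a rough illustration of the last requisite, the sketch below gathers several articles into a single HTML document that an HTML-to-PDF converter could then paginate. It is only a minimal sketch: the function name `combine_articles`, the `section.article` markup and the sample titles are assumptions made for this example, and fetching the articles from the wiki is not shown.

```python
from html import escape


def combine_articles(articles, title="Combined publication"):
    """Wrap (heading, html_body) pairs in one HTML document, in the given order."""
    sections = []
    for heading, body in articles:
        # Each article gets its own <section>, so print CSS can target it,
        # for instance to start every article on a new page.
        sections.append(
            '<section class="article">\n<h1>{}</h1>\n{}\n</section>'.format(
                escape(heading), body
            )
        )
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n<title>{}</title>\n'
        "</head>\n<body>\n{}\n</body>\n</html>\n"
    ).format(escape(title), "\n".join(sections))


if __name__ == "__main__":
    # Hypothetical sample content
    print(combine_articles([("First article", "<p>one</p>"),
                            ("Second article", "<p>two</p>")]))
```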
Advanced features:
- impositions - how can impositions, instead of a simple stack of pages, be integrated into this workflow?
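
One common imposition is the saddle-stitched booklet, where sheets are printed on both sides, folded and nested. The sketch below only computes the order in which pages would have to be placed. Choosing this particular imposition is an assumption, `booklet_order` is a hypothetical name, and actually placing the pages on sheets would require a separate PDF tool.

```python
def booklet_order(page_count):
    """Return page numbers in print order for a saddle-stitched booklet.

    The page count is padded with blanks (None) up to a multiple of 4.
    Each sheet contributes four entries: front-left, front-right,
    back-left, back-right.
    """
    padded = -(-page_count // 4) * 4  # round up to a multiple of 4
    pages = [p if p <= page_count else None for p in range(1, padded + 1)]
    order = []
    low, high = 0, padded - 1
    while low < high:
        order += [pages[high], pages[low]]          # front of the sheet
        order += [pages[low + 1], pages[high - 1]]  # back of the sheet
        low += 2
        high -= 2
    return order


if __name__ == "__main__":
    print(booklet_order(8))
```

For an 8-page document this gives two sheets: pages 8 and 1 on the front of the first sheet with 2 and 7 on its back, then 6 and 3 with 4 and 5 on the second.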

Possible strategies - software

LaTeX - a document preparation system focused on the creation of PDFs. It uses its own markup, which is supported by Pandoc.
wkhtmltopdf - an HTML to PDF converter based on WebKit.
WeasyPrint - a visual rendering engine for HTML and CSS, written in Python, that can export to PDF.
MediaWiki Collection Extension - the same system used by Wikipedia to create books in PDF and EPUB formats.

assessment

For each strategy try to point out:

summary of the workflow
example prototype
advantages
disadvantages
how can it be integrated into teaching

Strategies

List your researched strategy below, with a bit of documentation that points others in the right direction if they want to try it.

LaTeX

LaTeX is a typesetting/document preparation language, focused on producing typographically correct page-based documents as PDF.

positive aspects

LaTeX is a markup language, in many ways similar to HTML or Markdown. Pandoc supports it well and converts to it reliably from other markups.
Can produce high-quality PDFs, with support for page numbers, hyphenation, bibliography, references and hyperlinks.

LaTeX sample:

\section{Tools}
We organized the work in two spaces: a \textbf{wiki} and a \textbf{website}. The \href{http://beyond-social.org/wiki/index.php/Main_Page}{wiki} was established as the editorial space, while the \href{http://beyond-social.org/}{website}

Can be set to produce more experimental and generative outputs. (See works by Lafkon studio for an idea)

negative aspects

The PDFs produced look academic by default, although this can be changed.
Its use is outmoded and mostly restricted to academia.
Styling is defined by packages imported into the document, an approach that is very different from, and incompatible with, CSS. Styling a LaTeX document:

\documentclass[10pt, a4paper]{book} % document class: book, paper size: A4, font size: 10pt
\usepackage[hmargin=3.0cm, vmargin=2.0cm]{geometry} % document margins

sample output

final remarks

Although LaTeX can be set up to produce very interesting results and can be easily integrated into the current workflow, which is centered around wikis, Pandoc, HTML and CSS, it is a difficult tool to work with, let alone to teach. It might confuse students and contradict our approach to setting up hybrid publishing workflows, which has been based on essential web languages (HTML and CSS) and simple tools (wikis and Pandoc). The advice is to leave LaTeX alone, although it might be an interesting avenue to explore for more experimental projects.

Weasyprint

Research/Web-to-print/WeasyPrint

WeasyPrint is a visual rendering engine for HTML and CSS that can export to PDF. It aims to support web standards for printing. WeasyPrint is free software made available under a BSD license.
It is based on various libraries but not on a full rendering engine like WebKit or Gecko. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on. ^[1]

Can be used as a Python library or as a standalone program. So far, the remarks below refer to its use as a standalone program.

positive aspects

Uses HTML and CSS to lay out the PDF, which means a smoother learning curve from web to print.
Supports features like page size, page numbers, hyphenation in several languages (with the pyphen library) and custom typography, allowing the production of a PDF with a high level of control over the design.
Very simple and easy-to-understand syntax that does not require proficiency with the command line.

Example:

weasyprint http://beyond-social.org/wiki/index.php/Hybrid_Publishing beyondsocial.pdf -s style.css

Example explained:

weasyprint source-html-document pdf-output -s css-file

-s is the flag that includes the CSS file, whose rules override the existing CSS rules used in the web version
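
To run the conversion on a server, the same call could be wrapped in a small script. The sketch below is a minimal Python example that assumes the `weasyprint` executable is on the PATH. The function names `weasyprint_command` and `render_pdf` are hypothetical, and only the `-s` flag from the example above is used.

```python
import subprocess


def weasyprint_command(source, output, stylesheet=None):
    """Build the argument list for the standalone weasyprint program."""
    command = ["weasyprint", source, output]
    if stylesheet is not None:
        # -s adds a stylesheet on top of the CSS used by the web version
        command += ["-s", stylesheet]
    return command


def render_pdf(source, output, stylesheet=None):
    """Run weasyprint; raises CalledProcessError if the conversion fails."""
    subprocess.run(weasyprint_command(source, output, stylesheet), check=True)
```

Calling `render_pdf("http://beyond-social.org/wiki/index.php/Hybrid_Publishing", "beyondsocial.pdf", "style.css")` would issue the same command as the example above.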

negative aspects

Can be difficult to install because of its dependencies. On Debian no issues were experienced. On Mac OS X, we are still trying to get it installed.
It's more than difficult to install, it's very hard! Its dependencies seem to belong to another era, and have small communities and scarce documentation.

sample output

style.css

html, body{
	background-color: #e0e0e0 !important;
	font-family:  "AmericanTypewriter", serif !important; /* the font needs to be in your computer. this is not the final font, please choose a font of your choice */
	color: #000 !important;
}

div#footer ul {
    list-style-type: none !important;
 }

@page{
	size: 8.5in 8.5in;
	background-color: white !important;
        counter-increment: page;
        font-family:  "AmericanTypewriter", monospace !important; 
	color: #000 !important;
  	margin: 1cm;
        font-size: 8pt;
}


h1{
	string-set: doctitle content();  /* not tested - not sure it is working */
	/* stores the h1 text as the named string "doctitle" - will be used later, in the page bottom */
}

img{
	width: 100%;
	page-break-inside: avoid; /* keep each image on a single page */
	}

#catlinks{display: none;}

div#footer{
	background-color: black !important;
	color: #fff !important;	
	/*border-radius: 2cm;*/
	font-family: sans-serif !important;
	font-size: .75em !important;
	text-align: center;
	padding-bottom: .2cm;
	position: absolute;
	bottom: 0;
	width: 100%;
}

@page :left {
  @bottom-right{
    margin: 0;
    /* font-family: inherit; */ /* does not work */
    content: string(doctitle);      
  }
  @bottom-left{
  	margin: 0;
  	content: counter(page); 
  }  
}

@page :right {
  @bottom-right{
    margin: 0;
    /* font-family: inherit; */ /* does not work */
    content: counter(page);      
  }
  @bottom-left{
  	margin: 0;
  	content: string(doctitle); 
  }  
}

wkhtmltopdf

wkhtmltopdf is an open-source project very similar to WeasyPrint, with an identical workflow. Because the two are so similar, this section mostly discusses the differences between them.

positive aspects

wkhtmltopdf is based on the WebKit rendering engine, which eases bug tracking and improves support.
wkhtmltopdf is very easy to install.
wkhtmltopdf can run JavaScript.

negative aspects

WebKit, and thus wkhtmltopdf, has not yet implemented as many advanced CSS printing features as WeasyPrint has.

sample output

"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe" --print-media-type container.html wkhtmltopdf_book.pdf

Usage for teaching

Because it is free and open source, this tool appears very suitable for use in the hybrid publishing workflow that is currently being taught in several courses. Because it adheres to most HTML and CSS rules and can be used simply from the command line, students are not forced to learn yet another language.

built-in browser pdf prints

Many browsers nowadays have built-in PDF rendering engines. When an HTML page is created on the server, users and/or printers can simply press print in their browser and choose to export a PDF.

positive aspects

Less work is needed on the server.
Users can easily customize the look of their PDF.

negative aspects

Publishers have no guarantee that users see the correct layout (Chrome, for example, does this very poorly).
It requires more technical know-how from the user.

sample output

Below is an example print from Chrome:

Usage for teaching

This works quite poorly, so we see no place for it in education. Recent versions of Chrome (post-WebKit) actually perform worse than before, so there is little hope for improvement in the future.

references

[1] “WeasyPrint Documentation — WeasyPrint 0.22 Documentation.” http://weasyprint.org/docs/.
