[printing-discuss] How Do We Get The Right Printable Page Area - Information to be provided by API

Patrick Powell papowell at astart.com
Mon Feb 25 11:31:12 PST 2002


      How Do We Get The Right Printable Page Area?
        PaperSize VS PageSize VS ImageableArea


 (With vague mutterings of 'How the heck did we get
   into this mess in the first place?')
                   Patrick Powell
                 <papowell at lprng.com>

Introduction

  The comments here are presented in an informal manner along with
several discussions of apparently out of scope topics.  This is
due to the nature of the printing issues discussed here.  They
interact and There Is No One True Way To Do Things.

 The comments in this note reference:

   PostScript Printer Description File Format Specification
       Version 4.3, 9 Feb 1996

  A slew of PPD files scrounged from about 80 users and from
  distributed HP, QMS, Lexmark, Epson, Brother, etc. drivers.

  I will make the brutal assumption that most of the interest
of the readers in this list is in the area of 'rasterizers',
not in the area of the actual 'fling the rasterized output'
to the printer area.  That is,  the 'rasterizer' wants information
about the printer,  but it really does not want to take care
of the messy details of sending the job to the printer.  It would
quite happily send the rasterized job output to a file,
a socket, or a pipe.  (Note:  I did not say what the output format
was,  just that it was 'rasterized'.  :-.)

  The problem faced by the folks doing rasterization is that some
printers are too smart and some are REALLY dumb.  However,
there is a common ground that has to be dealt with by all users:

          WHAT IS THE PRINTABLE AREA WE USE?

  a) Paper Size from Generic Information
  b) Printer Specific Information

I would like to brutally propose that we adopt the following standard
terms/options/conventions for paper sizes.  In addition,  I will
also discuss how these can be supported by various PPD files,
databases, etc.  Also,  I will show how this information can be
organized in such a way as to make it available for use by 'filters'
AKA rasterizers,  GUI interfaces,  and various print spooling
systems.

Note that the 'implementation' of the way that this information
can be provided is a matter of choice.  There are natural mappings
into Perl, Python, LISP (retch!),  and several 'perl object support'
libraries for C and C++.

PAPER/PAGE SIZES NAMES

Raster/PostScript based printers require their input to be
prepared with knowledge of the output printable area and possibly
the device resolution.

I have thought about the various ways to do this and after much
consideration have reluctantly come to the conclusion that the
following is just about the best compromise that can be made.

Let's avoid all of the wishful thinking about the Way Things Should
Be and get down to the nitty-gritty.

There are about 8 (12? maybe a few more) standard paper sizes in
the paper industry.  I called up a major paper supplier and was
told that in their precut paper products line 8.5x11 and legal
(8.5 x 14) were over 90% of their business in the US.

Of course there is a bit of cause and effect here - the more they
can standardize on a few sizes and not have to stock/produce other
sizes the more volume, the more efficiency, prices are lower,  and
they sell more of these sizes because it is the cheapest size...

The European market has the same breakdown - a couple of sizes are
standard,  rest are marginal.

And finally, let's stop debating about the best dimension set to
use.  Give up.  You know and I know that if we use anything but
points (1 inch is 72 PostScript points) then we are into a hellacious
backwards compatibility problem with the various PostScript PPD
files and other references for information, and we will have major
headaches.  And we already have enough problems.

The various Laser/Raster printer manufacturers have studied these
figures and usually set up the paper trays or 'auto size recognition
facilities' for their printers for the most popular sheet sizes
for their market.

By convention,  page sizes (ok, paper sizes if you
insist) are stated in Portrait mode (shortest dimension first).

KEY        Description                   Dimensions   Aliases
                                         in points
a0         iso a0                        2384x3370
a1         iso a1                        1684x2384
a2         iso a2                        1191x1684
a3         iso a3                        842x1191
a4         iso a4                        595x842
ansic      ansi c                        1224x1584
ansid      ansi d                        1584x2448
ansie      ansi e                        2448x3168
archa      arch a                        648x864
archb      arch b                        864x1296
archc      arch c                        1296x1728
archd      arch d                        1728x2592
arche      arch e                        2592x3456
b1         jis b1                        2064x2920
b2         jis b2                        1460x2064
b3         jis b3                        1032x1460
b4         jis b4                        729x1032
b5         jis b5                        516x729
c5         c5 envelope 6.375"x9"         459x649
comm10     comm 10 envelope 4.125"x9.5"  297x684
dl         dl envelope 11cmx22cm         312x624
executive  executive 7.25"x10.5"         522x756
legal      us legal 8.5"x14"             612x1008     8.5x14
letter     us letter 8.5"x11"            612x792      ansia
monarch    monarch envelope 3.875"x7.5"  279x540
tabloid    tabloid 11"x17"               792x1224     ledger 11x17 ansib

I have brutally ripped these out of various PPD files
and since Adobe started this, and everybody else appears to
have copied these values in order to be compatible,  they
appear to be identical in all the PPD files I have tried.

<aside>

If there are some printer vendors that use
different values for page sizes, then shame
on them.  We may need to provide 'translation
files' that convert from their units to the
normalized units.

</aside>

We also define one or more aliases, to be compatible with legacy
applications.  I like aliases - you can add as many as you like
for your local purposes.  You can even use localization based
aliases, i.e. strings that are in another language or encoding.
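
As a rough illustration of the table and the alias idea, here is a
minimal Python sketch.  The names PAPER_SIZES, ALIASES, lookup_size
and inches_to_points are made up for this example, only a few rows
of the table are copied in, and the alias list is just a sample.

```python
# Hypothetical sketch: a few rows of the size table, keyed by canonical name.
# Dimensions are in points, portrait orientation (shortest dimension first).
PAPER_SIZES = {
    "a4": (595, 842),
    "legal": (612, 1008),
    "letter": (612, 792),
    "tabloid": (792, 1224),
}

# Aliases map to canonical keys; local or localized names can be added freely.
ALIASES = {
    "ledger": "tabloid",
    "11x17": "tabloid",
    "lettre": "letter",  # example of a localized alias
}


def lookup_size(name):
    """Return (width, height) in points for a key or alias, or None."""
    key = name.strip().lower()
    key = ALIASES.get(key, key)
    return PAPER_SIZES.get(key)


def inches_to_points(inches):
    """Convert inches to PostScript points (72 per inch), rounded."""
    return round(inches * 72)
```

Keeping the aliases in a separate table means a site can add its own
names without touching the canonical size list.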


USAGE 

The usage of these options can be implemented in several ways.
The following are examples and suggestions, but most print
spooler/rasterization programs are pretty close to using them
anyways.

1. You brutally define a set of 'options' for your spooler/
   job converter/job flinger corresponding to the sheet
   sizes. I.e.:
     cups:   lpr -o legal
     lprng:  lpr -o legal   (yes, -o == -Z)
             lpr -Z legal
     lp:     lp -o legal    (various SysV implementations)
     foomatic support: -o legal 
      or (<retch/>) lpr "-Jlegal"

     gimp (or others)  -o legal
       OR select the 'legal' image size from the
       option list.
     
       (My apologies to the GIMP folks, but I
        am writing on a system that does not have
        gimp installed and am doing this from memory.
        And I just got a 'memory fault, core dump'
        message from the frontal lobes.)

2. You define a 'pagesize=xxx' or 'sheetsize=xxx' option
   whose value is one of the default sheet sizes.

     cups:   lpr -o pagesize=legal
     lprng:  lpr -o pagesize=legal
             lpr -Z pagesize=legal
     foomatic support: -o pagesize=legal 
      or (<retch/>) lpr "-Jlegal"

     gimp (or others)  -o pagesize=xxx

3. Desperation Escape hatch

   Of course,  you need an escape hatch for
   setting just the raster area.  You can do
   this with

      -o "imageablearea=HORxVER"
             ==
             -o "imageablearea=[0 0 HOR VER]"
      -o "imageablearea=[llx lly urx ury]"

   I suggest that this be used only when trying to
   generate special output for a specific purpose.

4. Coward's Way Out

   And of course,  we have:
      -o "pagesize=default"
   Which uses the default size for the rasterizable area.
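
The four forms above can be folded into one small resolver.  This is
only a sketch under some assumptions: the SIZES table is a made-up
subset of the size list, DEFAULT_SIZE stands in for whatever the
site or printer default is, and resolve_area is a hypothetical name,
not part of any spooler.

```python
import re

# Hypothetical subset of the size table, in points.
SIZES = {"letter": (612, 792), "legal": (612, 1008), "a4": (595, 842)}
DEFAULT_SIZE = "letter"  # assumption: a site or printer default


def resolve_area(option):
    """Map one -o option value to [llx, lly, urx, ury] in points."""
    name, _, value = option.strip().partition("=")
    name = name.lower()
    if not value:
        # Form 1: bare size name, e.g. "legal"
        value, name = name, "pagesize"
    if name in ("pagesize", "sheetsize"):
        # Forms 2 and 4: pagesize=legal, pagesize=default
        key = value.lower()
        if key == "default":
            key = DEFAULT_SIZE
        if key not in SIZES:
            raise ValueError("unknown page size: " + value)
        width, height = SIZES[key]
        return [0.0, 0.0, float(width), float(height)]
    if name == "imageablearea":
        # Form 3: HORxVER or [llx lly urx ury]
        m = re.fullmatch(r"(\d+(?:\.\d+)?)x(\d+(?:\.\d+)?)", value)
        if m:
            return [0.0, 0.0, float(m.group(1)), float(m.group(2))]
        nums = value.strip("[] ").split()
        if len(nums) == 4:
            return [float(n) for n in nums]
    raise ValueError("unrecognized option: " + option)
```

For example, resolve_area("legal") and resolve_area("pagesize=legal")
give the same box, and the escape hatch forms pass their numbers
straight through.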

PPD FILES, PageSize, and PaperDimension

At this point you have probably guessed that some of this stuff
is embedded in the PPD files.  The PaperDimension and PageSize
'keywords' have the effects described below.

The PostScript PPD file approach is to provide a set of
PaperDimension, PageSize, PageRegion, and ImageableArea values.

What is the relationship between PaperDimension, PageSize,
PageRegion, and ImageableArea?


First, let's deal with the PaperDimension stuff in the PPD file.
If you read the PPD Reference, you will notice an amazing lack of
discussion on the interaction between the PageSize and PaperDimension
stuff.  It is almost as though the folks at Adobe who were doing
the PostScript raster conversion and the folks who were doing the
paper feed control stuff got together at the last minute and
discovered that they had been using different names for the same
thing.  So they simply decided to keep BOTH entries.   The paper
feed folks were happy and went off to build more and bigger cassette
feeds while the raster conversion folks were left to finish off
the PPD spec.  They did this by simply IGNORING any interactions.
If it's not specified, it doesn't matter.

I suspect strongly that they intended the 'PaperDimension' stuff
to be simply used for folks who wanted the 'right size' of paper.


Now let's move on to PageRegion and PageSize.

I quote from Section 5.15 of the PPD Reference:

On some devices,  the imageable area of a given page size
varies as a result of the current resolution,  amount
of memory, direction of paper feed, and other factors.
... the available imageable area will not be smaller than
that shown in the PPD file and all marks made within the
imageable area will be visible.

In other words, this is a guaranteed minimum, not an
allowable maximum.  For those folks who want to get the
allowable maximum,  you are going to have to massage the
values in the PPD file...  Sigh...  More on this later.

Also, from the PPD Reference:

The ... PageRegion sets the imageable area to the
appropriate media type (sic) without explicitly setting
the source of the media.  It is intended to be used
in conjunction with Manual Feed so that the imageable area
is appropriate for the media to be fed.  It is also
used instead of the PageSize invocations when the user
specifies an input tray and page size because the PageSize
invocations generally select an input tray and would
override the user's previous selection of a specific
input tray.

<comment>
Ummm... actually this is not quite the case in practice.
When you select 'manual feed' most (all?) PostScript
printers will ignore any other selections.  The dimensions
of the 'PageSize' selection are used to set the 'resolvable
area', but that is the end of their effect.

To further confuse the issue,  the effects of combinations
of 'bin selection' and 'page size' depend on the model
of printer,  version of firmware,  and the paper in the
various trays, whether the trays have 'auto size sensing'
capabilities,  and what 'administrative defaults' have
been set by using PJL.  Totally a mess.
</comment>

And also from the PPD Reference Manual:

To print manager authors:  An invocation string supplied
by PageSize will usually override an invocation string
supplied by PageRegion.  Therefore, if for some reason
both a PageRegion and a PageSize invocation for a single
page are going into the output file,  the PageRegion
invocation must come after the PageSize selection.

<comment>
Note that the reference weasels^H^H^H^H^H^H^Hauthors
use the word 'usually'.  So it does not ALWAYS override
it,  and is not ALWAYS required to. :-)

And note that it may not have the desired effect :-)
</comment>

And also from the PPD Reference Manual:

The PageSize invocations will establish both an input
slot and a frame buffer.

--- brief pause to take some aspirin -----

After working with this for a couple of years, and doing
a tremendous amount of testing on Adobe, HP, and other
third party implementations of PostScript interpreters,
I have come to the following conclusions:

a)  If you have a rasterizer or something generating
    PostScript,  then it should be aware of the bounding
    boxes, or you should tell it.  Don't put the
    PageRegion stuff into the output PostScript file
    or evil things will happen later.

    This is an experimental result,  and again, you are
    invited to experiment.  Just be prepared to spend LOTS
    and LOTS of time doing it.  And then upgrade the
    firmware and try again,  and then on different models.
    :-)

b)  If you put PageSize selection codes into the output,
    it RESTRICTS the input trays to those which contain
    media 'compatible with' the requested page size.

c)  If you have multiple input bins with the same media,
    you can then force selection of the appropriate input
    bin by using input bin selection commands.

d)  If you do NOT specify the page size,  then the PostScript
    printer will assume the DefaultPageSize value,  and then
    based on this will use an input bin.

    Just to make life REALLY interesting,  most of the time
    when you print a PCL job,  the default input bin is NOT
    the same as the PostScript default input bin.

    Umm... And if your printer has PJL support, you can
    override the defaults for PageSize and bin selection,
    but this only works on the defaults, not on the actual
    values used.

<I suggest some more aspirin at this point/>

Now let's get back to ImageableArea.

As far as I can tell the REAL thing that matters
is the PageSize and ImageableArea information.
Most PPD files contain values for their supported
PageSizes.  And of course, you can always assume that
the ImageableArea for a page size of NN x MM is
[0,0,NN,MM] (or [0,0,NN-1,MM-1] depending on your assumptions
about Clipping/Bounding Box conversions).
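
To make this concrete, here is a small Python sketch that pulls the
*PaperDimension and *ImageableArea entries out of PPD text and falls
back to [0,0,NN,MM] when no ImageableArea is given.  The function
names are invented for this note, SAMPLE is a made-up fragment (the
Letter numbers are the ones used in the example structure further
down), and the parser only handles the simple one-line quoted form.

```python
import re

# Matches: *Keyword Option[/Translation]: "value"
_LINE = re.compile(
    r'^\*(PaperDimension|ImageableArea)\s+([^/:\s]+)(?:/[^:]*)?:\s*"([^"]*)"'
)


def parse_ppd_sizes(text):
    """Return {option: {"PaperDimension": [...], "ImageableArea": [...]}}."""
    sizes = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            keyword, option, value = m.groups()
            numbers = [float(v) for v in value.split()]
            sizes.setdefault(option, {})[keyword] = numbers
    return sizes


def imageable_area(sizes, option):
    """Use the PPD ImageableArea if present, else assume the full sheet."""
    entry = sizes[option]
    if "ImageableArea" in entry:
        return entry["ImageableArea"]
    width, height = entry["PaperDimension"]
    return [0.0, 0.0, width, height]


# Made-up sample fragment for illustration only.
SAMPLE = '''*PaperDimension Letter: "612 792"
*ImageableArea Letter: "14.16 12.12 597.84 780.12"
*PaperDimension Legal/US Legal: "612 1008"
'''
```

Remember that the PPD value is the guaranteed minimum, so a caller
who wants the allowable maximum still has to massage the result.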

If you are generating PostScript, do not put the
'Want this imageable area' PostScript request into the
output PostScript (or other output format), but only
provide this information to the rasterizer.  You should
send the page size requests and/or a 'manual feed' request
to the printer.

Interface To PaperDimension/PageSize/ImageableArea Information

I am going to use Perl notation here.  You may
decide that a particular API could represent the
information in another way.

The concept here is that the information is available
in some form (file, etc.),  and that there is a method
available to request and get this information.

Part of the information is the 'pagesize' option with
a set of values.  The rasterizer step is usually
interested only in the sizes, not the commands to do
pagesize selection,  but it may need to embed these
if the downstream spooling system does not support the
functionality.

hplaserjet4 => {  # canonical name of printer stuff
   # ....          more on this at another time

 options => {
   pagesize => {
         # stuff for the GUI to mumble over
         name => {
             default => "Page Size Selection",
             fr => "Selection de Page",
         },
         #  the help information.
         #  - you may want to 
         help => {
             default => "
Selects the page size for
the print job
",
             fr => "...",
         },
         # We only allow the predefined options
         option_type => "ORDINAL",
         # now we have the default value for 
         default => "letter",
         option_values => {
            "letter" => {
               aliases => {
                default => [],
                fr => [ "lettre" ],
                },
               description => {
                 default => "us letter 8.5\"x11\"",
                 fr => "page lettre d'US", # or whatever
                },
               help => {
                  default => "
                     Select letter size paper.  May also select
                     input tray containing letter size paper.",
                  fr => " ....",
                },
               papersize => "612 792",
               imageablearea => "14.16 12.12 597.84 780.12",
               postscript => {
                 # what things in the conflicts section does
                 # this affect?
                 ppd_vars => { PageSize => "Letter" },
                 OrderDependency => "40 AnySetup *PageSize",
                 # stuff from the PPD file on how to generate
                 # postscript to do pagesize selection
                 code => "
%%BeginFeature: *PageSize Letter
    1 dict
    dup /Policies 2 dict dup /PageSize 2 put dup /MediaType 0 put put
    setpagedevice
    2 dict
    dup /PageSize [612 792] put
    dup /ImagingBBox null put
    setpagedevice
%%EndFeature
",
                 # this is what is used just to put a request
                 # into the output file.  Super clever backends
                 # may do the expansion... hopefully.
                 request => "
%%IncludeFeature: *PageSize Letter
",
               },
               pcl => {  # this is what we need for PCL 
                  # PCL 5 page size command, 2 = letter
                  code => "\033&l2A",
               },
               pjl => {  # this is what we need for PJL 
                  # single quotes so Perl does not interpolate @PJL
                  code => '@PJL SET PAPER=LETTER',
               },
                # and spooling system requests if necessary
               lprng => "-opagesize=letter",
               cups => "-opagesize=letter",
             }
          } } } }


Let's see how this information could be used.

First,  I have left out the 'GUI Interface' organization
that is present in the PPD file stuff.  I will address this
in another document.

The concept is that this information would be available
via an API call.  The actual format of the returned information
would depend on the API interface, but the general content
would be approximately the same.

The GUI/setup level would use the information in the
Name and other fields to generate a set of selections and/or
present a menu to users.

Once this is done they would then generate the necessary
options to call the formatter du jour or the spooler du jour.
Note that some assistance can be given by adding the necessary
options to the information. 
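
As a sketch of what the GUI/setup level might do with this, the
Python below builds a menu from a cut-down copy of the pagesize
entry.  PAGESIZE, localized and build_menu are hypothetical names,
the "legal" value is added only to make the menu have two items, and
the fall back to the 'default' string when a language is missing is
my assumption about how localization would work.

```python
# Hypothetical, cut-down entry in the spirit of the pagesize structure.
PAGESIZE = {
    "name": {"default": "Page Size Selection", "fr": "Selection de Page"},
    "default": "letter",
    "option_values": {
        "letter": {"description": {"default": 'us letter 8.5"x11"'}},
        "legal": {"description": {"default": 'us legal 8.5"x14"'}},
    },
}


def localized(strings, lang):
    """Pick the string for lang, falling back to the 'default' entry."""
    return strings.get(lang, strings["default"])


def build_menu(option, lang="default"):
    """Return (title, [(value, label, is_default), ...]) for a GUI."""
    title = localized(option["name"], lang)
    items = [
        (value, localized(info["description"], lang), value == option["default"])
        for value, info in option["option_values"].items()
    ]
    return title, items
```

The selected value (for example "legal") is then what gets passed on
as -o pagesize=legal to the formatter or spooler.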

The formatter would be provided the same information,
together with the selected 'pagesize' or (in desperation
cases, imageablearea=NNxMM  or
imageablearea=[llx lly urx ury]).

If the formatter is only doing rasterization (late binding
model),  it will just make use of the pagesize or imageablearea
information.  It will generate jobs with no other information.

If the formatter is doing a complete rasterization and
page/information selection (early binding model),  it
will put in page size requests and all of the other
information needed for page selection, input bin selection,
etc.
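
The difference between the two models can be sketched in a few lines
of Python.  LETTER is a trimmed-down, hypothetical copy of the
"letter" entry (the PostScript in "code" is shortened for the
example), and setup_section and raster_box are names invented here.
The "request" mode is the in-between case where only an
%%IncludeFeature comment is written and a backend does the expansion.

```python
# Hypothetical, trimmed-down copy of the "letter" entry.
LETTER = {
    "papersize": "612 792",
    "imageablearea": "14.16 12.12 597.84 780.12",
    "postscript": {
        "request": "%%IncludeFeature: *PageSize Letter\n",
        # Simplified stand-in for the PPD invocation code.
        "code": (
            "%%BeginFeature: *PageSize Letter\n"
            "<< /PageSize [612 792] /ImagingBBox null >> setpagedevice\n"
            "%%EndFeature\n"
        ),
    },
}


def raster_box(entry):
    """Bounding box the rasterizer works in, as four floats in points."""
    return [float(v) for v in entry["imageablearea"].split()]


def setup_section(entry, binding):
    """Return the page setup text a formatter adds to a PostScript job."""
    if binding == "late":
        return ""  # late binding: job carries no selection information
    if binding == "request":
        return entry["postscript"]["request"]
    if binding == "early":
        return entry["postscript"]["code"]
    raise ValueError("unknown binding model: " + binding)
```

In all three cases the rasterizer itself only needs raster_box(); the
binding model decides what, if anything, goes to the printer.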

Resolution

The other information that is needed by the rasterizer
is the resolution of the device.  There are two purposes
to this information:  to tell the rasterizer what it needs
to generate and to prepare the printer to generate output
with this resolution.

Experiments with selections of PostScript and PCL printers
have indicated that most printers use the resolution as
a 'quality of service' request,  rather than as an indication
that the job contains information in a specific format. 
There are exceptions to this,  of course.  When generating
jobs in a vendor specific pre-rasterized format
the rasterizer is responsible for providing resolution
information in the job so that the printer can interpret the
supplied raster image and place the pixels in the appropriate
location on a page.   In these cases the resolution
information is embedded in the print job 'data' and cannot
be modified by the downstream printing steps.

Again, there are exceptions to this as well.  On some high
end printers intended for high resolution printing you can
'downgrade' the output resolution to generate proofs at
reduced resolution.  But usually in this case you generate
at high resolution, proof at low resolution,  and then
reprint at high resolution.  This 'draft quality' printing
is usually handled by a user interface on the printer.

The PPD Specification *Resolution information states
categorically that *Resolution options have the
format:

 300x300dpi
   or 300dpi => implies 300x300dpi

The 300 can be replaced by the appropriate resolution.
This makes it quite easy to determine the available
resolutions on a printing device.
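
A short Python sketch of reading these option names, and of what the
rasterizer does with the result.  parse_resolution and raster_pixels
are names made up for this note, and truncating to whole pixels is
my choice so the raster never spills outside the imageable area.

```python
import re

# Accepts "300dpi" or "600x300dpi".
_RES = re.compile(r"^(\d+)(?:x(\d+))?dpi$")


def parse_resolution(option):
    """Return (xdpi, ydpi); a single number implies a square resolution."""
    m = _RES.match(option.strip())
    if not m:
        raise ValueError("bad resolution option: " + option)
    xdpi = int(m.group(1))
    ydpi = int(m.group(2)) if m.group(2) else xdpi
    return (xdpi, ydpi)


def raster_pixels(area, resolution):
    """Pixel width and height of an imageable area given in points."""
    llx, lly, urx, ury = area
    xdpi, ydpi = resolution
    # 72 points per inch; truncate so the raster stays inside the area.
    return (int((urx - llx) * xdpi / 72), int((ury - lly) * ydpi / 72))
```

For a full letter sheet at 300dpi this gives 2550 x 3300 pixels.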

Of course, you may want to do some aliasing (low, medium,
high?),  or provide internationalization for these
values.


hplaserjet4 => {  # canonical name of printer stuff
   # ....          more on this at another time

 options => {
   resolution => {
         # stuff for the GUI to mumble over
         name => "Resolution",
         #  the help information.  This can be
         #  pretty elaborate - short help and long
         #  help
         help => {
          default => "
             Set desired device resolution to 300 x 300 dpi
             ",
          fr => " ....",
         },
         description => {
             default => "Device Resolution",
             fr => "Resolution",
         },
         # We only allow the predefined options
         option_type => "ORDINAL",
         # now we have the default value for 
         default => "300dpi",
         option_values => {
            "300dpi" => {
               description => {
                 default => "300 x 300 dpi",
                 fr => "120 x 120 dpcm",
                },
               help => {
                  default => "
                     Set desired device resolution to 300 x 300 dpi
                     ",
                  fr => " ....",
                },
               postscript => {
                 # what things in the conflicts section does
                 # this affect?
                 ppd_vars => { Resolution => "300dpi" },
                 OrderDependency => "10.0 DocumentSetup *Resolution",
                 # stuff from the PPD file on how to generate
                 # postscript to do resolution selection
                 code => "
%%BeginFeature: *Resolution 300dpi
<< /HWResolution [300 300]>>  setpagedevice
%%EndFeature
",
                 # this is what is used just to put a request
                 # into the output file.  Super clever backends
                 # may do the expansion... hopefully.
                 request => "
%%IncludeFeature: *Resolution 300dpi
",
               },
               pcl => {  # this is what we need for PCL 
                  # PCL 5 raster graphics resolution command
                  code => "\033*t300R",
               },
               pjl => {  # this is what we need for PJL 
                  # single quotes so Perl does not interpolate @PJL
                  code => '@PJL SET RESOLUTION=300',
               },
                # and spooling system requests if necessary
               lprng => "-oresolution=300dpi",
               cups => "-oresolution=300dpi",
             }
          } } } }



