    Merge External PCLs

    (As of 3.2.001.06)

    You can use Eclipse Form Conversion Service (EFCS) to get PCL5S files instead of doing it locally with the PCLExtract tool.

    It is very common for business transaction documents to have not only the usual dynamic header, details, and trailer information, but also some static content pages such as Terms and Conditions or Standard Policies. Because that content is quite possibly part of a legally binding document, it must remain exactly as authored.

    DocOrigin supports the direct integration of external PCL pages, without any need for conversion to a DocOrigin form design file. However, for DocOrigin Merge to integrate external PCL files they must be split into separate one-page PCL files and "stripped" with the DocOrigin PCLExtract tool. The resultant PCL5-stripped (PCL5S) single-page files must be named according to a simple rule so that DocOrigin can pick them up automatically.

    Achieving External PCL File Integration

    Consider: You have a normal DocOrigin form design file (.xatw) and an external document, let's say Ts&Cs.docx. First, you must print that document into single-page PCL files (see more about that below). Let's say that you get two single-page files out of it: Ts&Cs.1.pcl and Ts&Cs.2.pcl. Second, you must "strip" those files with the PCLExtract tool to produce the PCL5S files Ts&Cs.1.pcl5s and Ts&Cs.2.pcl5s. Those files must be placed in a single folder, must have a .pcl5s extension, and must share the same base name (Ts&Cs in our example). A common practice is to keep all external files that are to be merged into documents in one folder, perhaps $U/Inserts; the choice is yours. The task is then to integrate the Merge-produced output and those PCL5S files into a single output PCL file.

    Even if your external file is only a single page, it still needs to be preprocessed to produce a version that is suitable for dynamic inclusion in a merged document. See more about that below. It is expected that the files to be inserted are quite static -- changing rarely. The preprocessing needs to be done only once, NOT on every run of Merge. It is expected that you would keep the preprocessed files in a single standard folder of your choice.

    At the appropriate place (or places) in your form design, you must define what we term a 'placeholder page'. That is quite likely a blank page (though it need not be totally blank). Each placeholder page represents the place where external PCL page(s) are to be inserted.

    Each placeholder page is required to have an object tag to identify which external single-page PCL file(s) should be inserted at that point.

    The tag name is: Underlay
    The tag value is: The external PCL file name;Pages=n

    Notes on the external file name

    1. The file name must be fully pathed; DocOrigin folder mappings may be used. e.g. $$F, or $U/Inserts.
    2. The file name in the tag value must NOT specify the file extension; .pcl5s is assumed and added automatically. Hence the files to be inserted must have a .pcl5s extension.
    3. The file name must NOT include the page number suffix.
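
    The three rules above can be checked before a tag value ever reaches Merge. The following is a minimal sketch, not DocOrigin code: the function name checkUnderlayFileName is hypothetical, it expects only the file name part of the tag value (anything before ";Pages="), and the page-suffix test is a heuristic that would also flag a base name that legitimately ends in a dot and digits.

```javascript
// Hypothetical pre-flight check; not part of DocOrigin.
// Returns a list of problems found in the file name part of an Underlay tag value.
function checkUnderlayFileName(name) {
  const problems = [];
  // Rule 1: the name must be pathed (a folder mapping such as $U/... counts).
  if (!/[\\/]/.test(name)) {
    problems.push("not fully pathed");
  }
  // Rule 2: no extension; .pcl5s is assumed and added automatically.
  if (/\.pcl5s$/i.test(name) || /\.pcl$/i.test(name)) {
    problems.push("remove the file extension");
  }
  // Rule 3: no page number suffix such as .1 or .2 (heuristic).
  if (/\.\d+(\.pcl5s|\.pcl)?$/i.test(name)) {
    problems.push("remove the page number suffix");
  }
  return problems;
}

console.log(checkUnderlayFileName("$U/Inserts/Ts&Cs"));
console.log(checkUnderlayFileName("$U/Inserts/Ts&Cs.1.pcl5s"));
```

    An empty list means the name passed all three checks; it does not prove that the prepared files actually exist in that folder.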

    Notes on the ;Pages=n specification

    1. Most commonly you would leave that specification out and get all the pages in the referenced external file.
    2. You can use the ;Pages= element to reference specific pages. The ;Pages= value can take the form of a single page number, a comma-separated list of individual numbers, a dash-separated range of pages, or a comma-separated mixture of those forms. You may also use an asterisk (*) as an explicit indication that you want all pages.
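
    To make the accepted forms concrete, here is a minimal sketch of how a ;Pages= value could be expanded into a list of page numbers. It is an illustration only, not Merge's actual parser: the function name parsePages and the pageCount parameter are hypothetical, and how Merge itself treats whitespace, duplicates, or descending ranges is not stated in this document.

```javascript
// Illustrative parser for the value after ";Pages=" (not DocOrigin code).
// pageCount is the number of prepared page files; it is used to expand "*".
function parsePages(spec, pageCount) {
  const text = spec.trim();
  if (text === "" || text === "*") {
    // No specification, or an explicit asterisk: all pages.
    return Array.from({ length: pageCount }, (_, i) => i + 1);
  }
  const pages = [];
  for (const part of text.split(",")) {
    // Each item is a single number or a dash-separated range.
    const m = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (!m) throw new Error("Bad page item: " + part);
    const first = Number(m[1]);
    const last = m[2] === undefined ? first : Number(m[2]);
    // Assumption: page numbers start at 1 and ranges ascend.
    if (first < 1 || last < first) throw new Error("Bad page range: " + part);
    for (let p = first; p <= last; p++) pages.push(p);
  }
  return pages;
}

console.log(parsePages("1,3-5,7", 8)); // [ 1, 3, 4, 5, 7 ]
```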

    In our example, it could be something like: $$F/Ts&Cs. Merge will look for its pages, i.e. Ts&Cs.1.pcl5s, Ts&Cs.2.pcl5s, etc.

    Having defined the placeholder pages with their Underlay tags, what happens? The normal pagination process runs, and your placeholder page(s) end up as some physical page(s) in your document. Their placement depends on how many instances of panes were needed to contain all of the data in your data stream; it is dynamic. At the end of pagination, Merge detects the placeholder pages (because they have an Underlay tag) and proceeds with the external file integration. Based on the tag value and the number of relevant external files that it finds, Merge will, at least conceptually, clone additional placeholder pages so as to contain all the referenced external pages. The external page is placed first and hence ends up "beneath" (i.e. underlays) the Merge-generated placeholder page, which overlays it. (The placeholder page need not be blank. For example, you could "stamp" a policy number on the integrated external pages, or cover the external document's own page numbers with an opaque white rectangle and then place your own page numbers on top.)

    If you are so inclined, you could set these Underlay tag values dynamically, with script. Indeed, you can insertPage, or clone (placeholder) pages dynamically as well.

    For example, the Underlay tag value could be set dynamically on a page object with:

    this.setTag("Underlay", "$U/Stitch/PolicyRiderReFlooding");

    That tag, and the prepared external files, are all you need to make it work.
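
    When the tag value is computed in script, it helps to assemble it from its parts in one place. The sketch below is an assumption-laden illustration: the helper name underlayTagValue is hypothetical, and only the value format (path, then an optional ;Pages= element) comes from this page.

```javascript
// Hypothetical helper: composes an Underlay tag value from its parts.
function underlayTagValue(folder, baseName, pages) {
  // Join folder and base name with a single forward slash.
  const path = folder.replace(/[\\/]+$/, "") + "/" + baseName;
  // Omitting ;Pages= requests all pages of the external file.
  return pages ? path + ";Pages=" + pages : path;
}

console.log(underlayTagValue("$U/Inserts", "Ts&Cs"));          // $U/Inserts/Ts&Cs
console.log(underlayTagValue("$U/Inserts", "Ts&Cs", "1,3-4")); // $U/Inserts/Ts&Cs;Pages=1,3-4

// In a DocOrigin script on the placeholder page, the result would be
// passed to setTag in the same way as the example above, e.g.:
//   this.setTag("Underlay", underlayTagValue("$U/Inserts", "Ts&Cs", "1,3-4"));
```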

    Preparing External Files

    To function efficiently the external files that you wish to assemble into your documents must be 'prepared' for that operation. These preparations need be done only once to then allow those prepared files to be used in many, many Merge runs.
    This is in the context of producing PCL-based documents. For PDF-based document assembly these preparations are unnecessary; see Merge External PDFs.

    The preparations involve a two-step process:

    1. Print the external file into single page PCL files
    2. "Strip" each single page file of context-altering commands

    The Eclipse Form Conversion Service (EFCS) provides the means to quickly accomplish those steps using the EFCS server. However, you may also do the preparations in your own local environment.

    Step 0: Get Organized!

    You deal with a lot of files. It's sane to organize them into folders rather than have them spread out willy-nilly. We recommend that you create a folder to hold the external files that are targeted to be inserted into documents. Perhaps: $U/Inserts or $U/ExternalDocs -- please choose a folder and collect your external documents there. As we've just said, these external documents need to have some one-time preparations done with them. You may choose to keep the results of those preparations in the same folder, in a subfolder, or some other folder, but do make that choice and follow an organized plan of file location.

    File naming. The naming convention that you MUST follow is this: given an external file ABC.xxx, the single-page PCL5S files must be named ABC.1.pcl5s, ABC.2.pcl5s, etc. (not .01., .02., ... but .1., .2., etc.). Remember that your Underlay object tag value would stop at ABC; it does not include the page number suffix. It is extremely helpful to have the prepared PCL5S files in a consistent location so that you can specify your Underlay tag values with confidence.
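
    A folder listing can be checked against this convention before Merge runs. The following is a minimal sketch under stated assumptions: the function name checkPageFiles and its return shape are hypothetical, it works on a plain list of file names rather than reading a folder, and it only reports gaps in the numbering; how Merge reacts to a gap is not described here.

```javascript
// Illustrative check of prepared file names for one base name.
// fileNames: names found in the inserts folder (no folder part).
function checkPageFiles(baseName, fileNames) {
  const prefix = baseName + ".";
  const suffix = ".pcl5s";
  const found = new Set();
  const badNames = [];
  for (const name of fileNames) {
    if (!name.startsWith(prefix) || !name.toLowerCase().endsWith(suffix)) continue;
    const middle = name.slice(prefix.length, name.length - suffix.length);
    if (/^[1-9]\d*$/.test(middle)) {
      found.add(Number(middle));
    } else {
      badNames.push(name); // e.g. zero-padded ABC.01.pcl5s
    }
  }
  // Count consecutive pages starting at 1.
  let pageCount = 0;
  while (found.has(pageCount + 1)) pageCount++;
  // Pages that exist but sit beyond a gap in the numbering.
  const afterGap = [...found].filter((n) => n > pageCount).sort((a, b) => a - b);
  return { pageCount, badNames, afterGap };
}

console.log(checkPageFiles("ABC", ["ABC.1.pcl5s", "ABC.2.pcl5s", "ABC.03.pcl5s"]));
```

    In that call, ABC.03.pcl5s is reported in badNames because of the zero padding, and pageCount is 2.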

    It may be that you will want to invent a 'folder mapping' and define it in your $O/Paths.prm file. E.g. $W=$U/Inserts so that your Underlay tag values can always use "$W/...". Such folder mappings can be very helpful when moving between Dev, QA, and production.

    Printing Files in PCL5 Format

    The external files need to be printed in PCL5 format, not PCL6. To do that you need to install a virtual PCL5 printer and print your file (docx, pdf, etc.), one page at a time, to that virtual printer. We recommend installing the official HP Universal Print Driver. The last version of that driver that produces pure PCL5 output is version 6.1.0.20062 (upd-pcl5-x64-6.1.0.20062.exe). Do an internet search for that name, download and install it. The driver is quite old and will be found on non-HP sites. DO NOT USE a current PCL6 printer driver. The results are unusable for document assembly purposes.

    You should install the PCL5 driver in the usual manner as a regular local printer. Specify settings that target port "FILE (Print to File)". After that you will be able to print as usual, targeting the installed virtual printer and changing print properties such as color, DPI, etc. After each print, the driver will ask you where to store your PCL5 file. Follow the prescribed file naming conventions. Remember to print one page at a time, directing each page to a different file.

    Preparing your PCL file for integration

    Once you get your one-page PCL file (say Ts&Cs.1.pcl) you have to apply the DocOrigin PCLExtract tool on it like this:

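    PCLExtract.exe -strip -in "Ts&Cs.1.pcl" -out "./stripped/Ts&Cs.1.pcl5s"

    You must, of course, specify the applicable paths for the files. File names that contain an ampersand (&), as in this example, must be quoted, because the Windows command prompt otherwise treats & as a command separator. Do keep in mind the file organization decisions you made in "Step 0".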

    Shortcut for PDF-based External Files

    If your source document is a PDF then DocOrigin can automate this page-by-page printing effort. DocOrigin includes a tool at ...\DO\Bin\Java\PDFProcessor.jar. For usage information, run

    java -jar "%DO_ROOT%\DO\Bin\Java\PDFProcessor.jar"

    One great usage of that tool is to have it list all of your printers. Hopefully, you will see the PCL5-based virtual printer that you installed. Use:

    java -jar "%DO_ROOT%\DO\Bin\Java\PDFProcessor.jar" -listPrinters

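    For PDF files you can automate page-by-page printing by using the -autoSplit option of PDFProcessor.jar. For example:

    java -jar "%DO_ROOT%\DO\Bin\Java\PDFProcessor.jar" -autoSplit
        -printer "HP Universal Printing PCL 5"
        -in "Ts&Cs.pdf"
        -out "%DO_ROOT%\User\Inserts"

    Enter that as a single command line; it is wrapped here only for readability. The input file name is quoted because the Windows command prompt would otherwise treat the ampersand (&) as a command separator.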

    Unfortunately, PDFProcessor.jar is not privy to DocOrigin folder mappings, i.e. it does not understand $U, $E, $F etc. You must specify pathing without the benefit of using those folder mapping $X names. Given the availability of this shortcut it would not be surprising if you chose, for example, to print your .docx file to PDF format, then use PDFProcessor.jar on the resultant .pdf file to complete the preparations.
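
    Because the folder mappings have to be expanded by hand for PDFProcessor.jar, a small helper can do the substitution. This is purely an illustration and not DocOrigin's own resolver: the function name expandMappings, the mapping table, and the C:/DocOrigin/User folder are all made up, and names such as $$F are deliberately left untouched because their meaning is not defined on this page.

```javascript
// Illustrative only: expands leading $X-style folder mappings to plain paths.
// The mapping table passed in is an assumption; use your site's real folders.
function expandMappings(path, mappings, depth = 0) {
  if (depth > 10) throw new Error("Mappings nest too deeply: " + path);
  // Match a leading "$" plus one letter, followed by a separator or the end.
  const m = /^\$([A-Za-z])(?=[\\/]|$)/.exec(path);
  if (!m) return path; // no leading $X mapping
  const target = mappings["$" + m[1]];
  if (target === undefined) throw new Error("Unknown mapping $" + m[1]);
  // A mapping may itself start with another mapping, e.g. $W=$U/Inserts.
  return expandMappings(target + path.slice(2), mappings, depth + 1);
}

const exampleMappings = { "$U": "C:/DocOrigin/User", "$W": "$U/Inserts" };
console.log(expandMappings("$W/Ts&Cs.pdf", exampleMappings));
// C:/DocOrigin/User/Inserts/Ts&Cs.pdf
```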

    The single-page PCL files produced in the output folder will be named base.n.pcl, where base is the base name of the input PDF file and n is the physical page number, from 1 to the last page. The extension is always .pcl. These files must still be processed to strip away the commands that alter the PCL print environment context, which makes them suitable for document assembly. As specified above, use PCLExtract -strip ... for that, and ensure that the final file names are base.n.pcl5s.
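
    The strip step is the same for every page, so it lends itself to a small planning script. The sketch below only builds argument lists and runs nothing. It uses the PCLExtract options shown earlier on this page (-strip, -in, -out); the function name planStripCommands and the folder names are hypothetical.

```javascript
// Illustrative planner; it only builds argument lists, it runs nothing.
// Uses the PCLExtract options shown earlier: -strip, -in, -out.
function planStripCommands(fileNames, inFolder, outFolder) {
  const plan = [];
  for (const name of fileNames) {
    // Match base.n.pcl, where n is the page number.
    const m = /^(.+)\.(\d+)\.pcl$/i.exec(name);
    if (!m) continue; // skip anything that is not a single-page PCL file
    // Number() drops any zero padding so the result is base.n.pcl5s.
    const outName = m[1] + "." + Number(m[2]) + ".pcl5s";
    plan.push({
      page: Number(m[2]),
      args: ["-strip", "-in", inFolder + "/" + name, "-out", outFolder + "/" + outName],
    });
  }
  return plan;
}

const plan = planStripCommands(["Ts&Cs.1.pcl", "Ts&Cs.2.pcl", "notes.txt"], "C:/Work/Pcl", "C:/Work/Inserts");
for (const step of plan) console.log("PCLExtract.exe", step.args.join(" "));
```

    Passing each args array to a process launcher that takes an argument list (for example Node's child_process.execFile) avoids shell quoting problems with names such as Ts&Cs.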

    Once you have established your organizational conventions and dev-test-production operational procedures, you will undoubtedly create commands that facilitate these operations.

    Limitations

    The best results can be achieved when you do not mix document properties. For example, if you want Merge to produce a 600 DPI color PCL document then your external document(s) should be printed using color and 600 DPI. Furthermore, if you design your placeholder page to be Letter and Portrait then your external document should be printed accordingly.

    Note that you may choose to experiment with adjusting your printer properties. Mixing properties may produce the desired results in some cases, but we do not guarantee consistent results; after all, these are external files whose internal constructs are unknown to DocOrigin.

    Warning: Merging and underlaying PCL is a technically challenging task. To succeed, DocOrigin cannot apply all of its normal PCL optimization techniques. This may result in bigger output files, especially when many external pages are used.

    You must weigh this cost against the price of converting external documents to design file format again every time their versions change, and against concerns about the accuracy of any such conversions.

    Why do we insist on single-page PCL files? A multi-page PCL file will use shared resources, fonts, images, etc. It is technically impossible to find and extract these resources. So each page is printed independently and becomes self-sufficient.

    PCL emulators are often used in testing. These can be great but are not as definitive as actually printing to paper. Indeed there are many models of printer which have their own PCL interpreter code, not always producing the same results. While you may use a PCL emulator initially, a) don't be alarmed if some things are not right, and b) final testing on actual printers is always advised.

    See Also

    Merge External PDFs
    domObj.setTag