    ConvertDatToXml Filter

    This is the preferred filter for converting Field Nominated Format (FNF) data into XML. It is a binary executable, not a script, so it runs faster. Unlike its predecessors, it handles multiple documents in one data file. It also takes the target form name as input, so it does a much better job of matching the case of names used in the FNF file with those used in the form. Internally, it creates a document DOM and then writes that out as an XML file. The result is an XML file tailored for DocOrigin use with the form being used.

    Conversion of .dat files is very common. For that reason, if the data file has a .dat or .fnf extension and no filters are specified, the ConvertDatToXml filter is applied automatically.

    The default options will be used. To override any ConvertDatToXml Filter options:

    • Copy the Default-ConvertDatToXml.prm file from $R\DO\Bin ($R is the root folder of your DocOrigin installation)
    • Paste the file into your $U\Overrides folder
    • Remove the Read Only attribute from the file properties
    • Remove the "Default-" file name prefix
    • Add any options from the list below. 
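
    The copy, rename, and attribute steps above can be sketched as a small script. This is only an illustration: the helper name install_override and the idea of passing the two folders as arguments are assumptions, and the script does not edit the options inside the file.

```python
import os
import shutil
import stat
from pathlib import Path


def install_override(bin_dir: Path, overrides_dir: Path) -> Path:
    """Copy Default-ConvertDatToXml.prm into the overrides folder as an editable file.

    bin_dir stands for $R\\DO\\Bin and overrides_dir for $U\\Overrides;
    both are supplied by the caller (hypothetical helper, not part of DocOrigin).
    """
    source = bin_dir / "Default-ConvertDatToXml.prm"
    # The "Default-" file name prefix is dropped for the override copy.
    target = overrides_dir / "ConvertDatToXml.prm"
    overrides_dir.mkdir(parents=True, exist_ok=True)
    # copyfile copies the content only, so the read-only bit is not carried over.
    shutil.copyfile(source, target)
    # Make sure the copy is writable before anyone edits it.
    os.chmod(target, target.stat().st_mode | stat.S_IWRITE)
    return target
```

    After running it, open the resulting ConvertDatToXml.prm and add the options you need.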

    ^Global Processing

    ^global fields in FNF files are not the same as fields marked "global" in DocOrigin Design. To instantiate ^global fields, see the "-substitute" parameter below.

    Additional parameters:

    • -in (required)

    This is used to provide the name of the input .DAT file. This argument is common to all DocOrigin filter processes. Note that it is supplied automatically when this app is used as a Merge filter. However, if this app is run standalone, the -in parameter must be specified.

    • -out (required)

    This is used to provide the name of the output XML file. This argument is common to all DocOrigin filter processes. Note that it is supplied automatically when this app is used as a Merge filter. However, if this app is run standalone, the -out parameter must be specified.

    • -form (required)

    Used to provide the name of the DocOrigin form file(s) to be used for field name searching. If more than one form is required, the names of the forms should be separated with ';' as they are when invoking the ‘DocOriginCombineForms’ executable.

    • -formlist (optional)

    Used to provide the name of a file into which is written the list of form files referenced in ^form lines within the .DAT file. This file is typically a temporary file whose content is used to re-establish the list of forms for Merge to use when it's actually merging the resulting XML and the forms.
    Usage of this option is a little tricky. Please refer to the command line option -useFilterFormList.

    • -oneChild paneName (optional)

    This is used, somewhat rarely, as a means to put into effect the notion of letting the data stream control the order of the panes rather than having the form design specify the order of the panes. See the "Data Order" discussion at the end of this list of options.

    • -prefix (optional)

    This is used to specify a character other than ^ as the prefix for hat command lines in the .DAT file. At JetForm, the character was always called 'hat' and not 'caret'.

    • -newline (optional) (as of 3.1.001.13)

    This is used to specify a different alternate newline character. By default, ~ is the alternate newline character, but you can change that to some other character. If you use -newline but do not provide a character to use, then the alternate newline character processing will be turned off, i.e. it will not look for ~.

    • -rename (optional)

    Takes two strings separated by a colon (:), which are added to a table of rename pairs.

    Example: -rename InvoiceNo:InvoiceNumber - this tells ConvertDatToXml to replace InvoiceNo with InvoiceNumber wherever it encounters it used as a field name in a ^field or ^global line in the .DAT file.

    Note that there can be any number of -rename arguments on the command line, although they are more likely to be placed in a .prm file than typed on the command line.
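
    To make the idea of a rename table concrete, here is a minimal sketch of how such pairs could be collected and applied to field names. It is not the filter's actual implementation; the function names are hypothetical, and the case-insensitive lookup is an assumption.

```python
def parse_rename_args(specs):
    """Build a rename table from 'old:new' strings such as 'InvoiceNo:InvoiceNumber'."""
    table = {}
    for spec in specs:
        old, sep, new = spec.partition(":")
        if not sep or not old or not new:
            raise ValueError(f"expected old:new, got {spec!r}")
        # Assumption: names are looked up without regard to case.
        table[old.lower()] = new
    return table


def rename_field(name, table):
    """Return the replacement name if one is registered, else the name unchanged."""
    return table.get(name.lower(), name)
```

    With the table built from ["InvoiceNo:InvoiceNumber"], a field named InvoiceNo is emitted as InvoiceNumber, and every other field name passes through untouched.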

    • -split splitText (optional)

    This provides a string that is to be processed as a document separator whenever it is found in the .DAT file.

    This argument is optional. However, if you do not provide any -split argument, the entire .DAT file will generate a single document in the XML file. That might be what the user wants, but more likely it is not.

    A popular string to provide is: ^page 1. That is often used in .DAT files to indicate the end of one document's data and the beginning of the next.

    Example: -split "^field AccountNumber" indicates that whenever a line in the .DAT file that contains only the string ^field AccountNumber is encountered a new document should be started in the output XML file. When the string to search for includes spaces, it's necessary to put quotes around the string so that command processing will not treat it as multiple separate command line arguments.

    ConvertDatToXml always removes leading and trailing blanks and line-ending characters from the line before it checks for a match.

    The splitText string may contain the special wildcard characters * and ?. In that case, the comparison treats those characters in the same way as Unix and DOS do for file name searching: * matches any number of characters (including none) and ? matches any single character. If you are actually searching for a string that includes a ? or a *, you need to precede that character with a backslash in the string. If you are searching for a backslash, you need to include two backslashes together in your string.

    The string comparison, when checking the line from the .DAT file, is case-insensitive.

    Note that there can be any number of -split arguments on the command line.
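
    The matching rules described for -split (trimmed line, whole-line comparison, * and ? wildcards, backslash escapes, case-insensitivity) can be illustrated with a short sketch. It mirrors the description above rather than the filter's real code, and the function names are hypothetical.

```python
import re


def split_pattern_to_regex(split_text):
    """Translate a splitText string into a compiled, case-insensitive regex."""
    parts = []
    i = 0
    while i < len(split_text):
        ch = split_text[i]
        if ch == "\\" and i + 1 < len(split_text):
            # A backslash makes the next character literal (\*, \?, \\).
            parts.append(re.escape(split_text[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # any number of characters, including none
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def is_split_line(line, patterns):
    """True if the trimmed line matches any split pattern in its entirety."""
    stripped = line.strip(" \r\n")  # drop blanks and line endings at both ends
    return any(p.fullmatch(stripped) for p in patterns)
```

    For example, with patterns built from "^page 1" and "^field Acc*", the line "  ^PAGE 1 " starts a new document, while "^page 10" does not.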

    • -substitute "oldCommandText:newCommandText" (optional) (As of 3.0.003.11)

    If a -substitute specification is given, a command line of the data file can be changed before it is further processed. An example of usage would be:

    -substitute "^global JF123:^field JF123"

    Such a change would mean that the field would be treated as a field and not just the resetting of the current value of a global. Thus it could cause a new instance of a pane to be emitted, whereas a mere change in a global value would never do that.

    Another example would be:

    -substitute "^page 2:^subform TsAndCs" 

    This can make it easier to correlate old .dat file page numbers with pane names.

    The oldCommandText is case-insensitive (since JetForm data files are case-insensitive).

    It need not be the entire command line; using just a substring of it is permitted. No wildcard characters are supported. This is useful for commands such as ^group or ^subform: substitute the command with ^field, then place a hidden field of the same name in the pane to instantiate the pane.

    Note that there can be any number of -substitute arguments on the command line. If there is more than one -substitute specification, all instances of the first substitution are applied to the .dat command line being processed before the next -substitute specification is applied.
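
    The ordering rule for multiple -substitute specifications can be sketched as follows. This is an illustration of the described behaviour, not the filter's code: the function name is hypothetical, and splitting each specification at its first colon is an assumption.

```python
import re


def apply_substitutions(command_line, specs):
    """Apply each 'old:new' specification in turn to one data-file command line."""
    for spec in specs:
        old, sep, new = spec.partition(":")  # assumption: split at the first colon
        if not sep or not old:
            raise ValueError(f"expected old:new, got {spec!r}")
        # Literal, case-insensitive substring match; no wildcards.
        pattern = re.compile(re.escape(old), re.IGNORECASE)
        # All instances of this substitution are replaced before the next spec runs.
        command_line = pattern.sub(lambda _match: new, command_line)
    return command_line
```

    In this sketch, "^GLOBAL JF123" with the specification "^global JF123:^field JF123" becomes "^field JF123". Because matching is by substring, a short oldCommandText such as "^page 2" would also match inside "^page 20" here, so it pays to make the text as specific as the data requires.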

    • -symbolset codepage (optional)

    If your input data is in a codepage other than UTF-8, indicate which codepage it is in so that it is converted to UTF-8. To see a list of supported codepages, run uconv -l in the DO/Bin folder of your installation. See also Codepage Handling.

    • -allowRTF Y|N (optional)

    Enables or disables conversion of field data to RTF globally for the ConvertDatToXml run. By default, this conversion is enabled. Note that specific fields expected to have RTF content can instead be handled later in Merge using the _inlineToRtf (Central "\x" to RTF) script function, which converts Central inline text commands to RTF.

    • -trim (optional)

    This indicates whether to trim blanks from the value string before storing it for a field. The valid settings are: left, start, right, end, both and none.

      • left and start indicate to trim blanks on only the left end of the string.
      • right and end indicate to trim blanks on only the right end of the string.
      • both indicates to trim blanks on both the left and the right end of the string.
      • none indicates to do no trimming of blanks. none is the default setting for -trim.
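
    The trim settings listed above map onto ordinary string trimming. The sketch below shows the intended effect of each setting on a field value; the function name is hypothetical and only the blank (space) character is trimmed.

```python
def trim_value(value, mode="none"):
    """Trim blanks from a field value according to a -trim style setting."""
    if mode in ("left", "start"):
        return value.lstrip(" ")
    if mode in ("right", "end"):
        return value.rstrip(" ")
    if mode == "both":
        return value.strip(" ")
    if mode == "none":  # the default: leave the value untouched
        return value
    raise ValueError(f"unknown trim setting: {mode!r}")
```
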
    • -verbose 100 (optional)

    If this is provided to the filter, as opposed to Merge itself, then the filter will report which fields were dropped, because of no name match, as it does the conversion.
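
    For a standalone run, the options in this list can be assembled into an argument list. The sketch below is illustrative only: the helper build_filter_args, the spelling of the executable name, and the file and form names in the example are assumptions, and only a few of the options are covered.

```python
def build_filter_args(in_file, out_file, form, splits=(), renames=(), trim=None):
    """Assemble a standalone ConvertDatToXml argument list (hypothetical helper)."""
    # -in, -out and -form are required when the filter is run standalone.
    args = ["ConvertDatToXml", "-in", in_file, "-out", out_file, "-form", form]
    for split_text in splits:
        # Each separator gets its own -split argument.
        args += ["-split", split_text]
    for old, new in renames:
        # Each rename pair gets its own -rename argument.
        args += ["-rename", f"{old}:{new}"]
    if trim is not None:
        args += ["-trim", trim]
    return args
```

    Keeping the arguments as a list (for example, to hand to subprocess.run) avoids having to quote values such as "^page 1" that contain spaces.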

    Data Order

    In DocOrigin the idea is that the data stream provides data and the form design provides the layout, including the order of the dynamically instantiated panes. Adobe® (JetForm) Central users are used to the data stream taking full control. Often those data streams were constructed to emit subform X, then subform Y, then subform A, and so on, in any order. The form design had no say whatsoever. ConvertDatToXml, via its -oneChild option, provides a means to cope with such data streams.

    The Central form designs essentially had a sea of unordered subforms. The data stream could reference any of those subforms in any order. Essentially the data stream "drew" the document by continually saying what should come out next. To support this concept in DocOrigin you take all of those unordered subforms, now converted to panes, and place them all inside a parent of a chosen name. (It must be named). For description purposes let's say the chosen name of the parent pane was DataOrder -- but you can pick any name.

    So you now have a parent pane (DataOrder) and a whole bunch of loose, unordered, child panes. The parent pane must be marked as Allow Multiple, and probably Allow Split. The child panes must always be marked as not Mandatory. They may or may not be marked as Allow Split. So the structure is like this:

      DataOrder   (Allow Multiple)
         Pane1    (all of these are marked optional)
         Pane5
         Pane3
         PaneX
         PaneQ
         ...

    When you run ConvertDatToXml you would use:

    -filter 'ConvertDatToXml -oneChild DataOrder'

    That will very much affect the structure of the XML file that ConvertDatToXml produces. It will never allow the chosen pane (DataOrder in our example) to have more than one child. So, if the data stream had subforms in the order PaneX, Pane5, Pane3, what would come out in the XML is a structure as follows:

    <DataOrder>
    	<PaneX> ... fields for this pane</PaneX>
    </DataOrder>
    <DataOrder>
    	<Pane5> ... fields for this pane</Pane5>
    </DataOrder>
    <DataOrder>
    	<Pane3> ... fields for this pane</Pane3>
    </DataOrder>
      ...

    That will cause Merge to create new instances of parent pane DataOrder as needed and each time create an instance of a single child pane. In that way, the document output order is not controlled by the form design, but by the data stream.
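
    The one-child-per-parent structure can be reproduced with a few lines of code, which may help when checking what to expect in the output. This is a sketch of the resulting shape only, not of how ConvertDatToXml works internally; the function name is hypothetical, and representing each field as a child element of its pane is an assumption.

```python
import xml.etree.ElementTree as ET


def wrap_one_child(parent_name, panes):
    """Return one parent element per pane, each holding exactly one child pane.

    panes is a sequence of (pane_name, fields) in data-stream order,
    where fields maps field names to values.
    """
    wrappers = []
    for pane_name, fields in panes:
        parent = ET.Element(parent_name)            # a fresh parent for every pane
        child = ET.SubElement(parent, pane_name)    # its single child
        for field_name, value in fields.items():
            ET.SubElement(child, field_name).text = value
        wrappers.append(parent)
    return wrappers
```

    Calling wrap_one_child("DataOrder", ...) with panes in the order PaneX, Pane5, Pane3 yields three DataOrder elements in that same order, each with a single child.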

    Of course, we think that the form design should control the layout and output order and that the data should be mere data. However, you must deal with the data you are given.