The XML Configuration File

First, here is a sample XML merge configuration file:

<?xml version="1.0" encoding="utf-8"  ?>
<documentmerger xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" type="pdf" created="2011-11-01">
  <sourcedocuments>
    <sourcedocument id="bills" type="pdf">
      <name>dummy.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="10"/>
        <page number="20"/>
        <page number="15"/>
        <page number="34"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="privatefiles" type="pdf">
      <name>dummy2.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="10"/>
        <page number="20"/>
        <page number="15"/>
        <page number="102"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="incomingchecks" type="pdf">
      <name>dummy3.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="34"/>
        <page number="35"/>
        <page number="36"/>
        <page number="37"/>
        <page number="38"/>
        <page number="39"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="learning" type="pdf">
      <name>dummy4.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="102"/>
        <page number="22"/>
        <page number="33"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="getall" type="pdf">
      <name>dummy5.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <allpages/>
      </pages>
    </sourcedocument>
    <sourcedocument id="development" type="pdf">
      <name>dummy6.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="15"/>
        <page number="16"/>
        <page number="22"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="screenshots" type="image">
      <name>imagegroup1</name>
      <filepath>D:\importfiles\images</filepath>
      <pages>
        <page content="tigers.jpeg"/>
        <page content="house.png"/>
        <page content="lambo.jpg"/>
        <page content="pamela.jpg"/>
      </pages>
    </sourcedocument>
  </sourcedocuments>
  <destinationdocuments>
    <destinationdocument id="Merger1" type="samedoc" docname="merger1.pdf" pagesize="A4">
      <inputdirectory>D:\importfiles</inputdirectory>
      <outputdir>same</outputdir>
      <inserts>
        <insert betweenstart="10" betweenend="11" sourcedocumentid="bills" replace="false"/>
        <insert betweenstart="15" betweenend="16" sourcedocumentid="privatefiles" replace="false"/>
        <insert betweenstart="20" betweenend="21" sourcedocumentid="incomingchecks" replace="false"/>
        <insert betweenstart="100" betweenend="101" sourcedocumentid="learning" replace="false"/>
        <insert betweenstart="201" betweenend="207" sourcedocumentid="learning" replace="true"/>
        <insert betweenstart="220" betweenend="221" sourcedocumentid="screenshots" replace="false"/>
      </inserts>
      <extracts>
        <pages>
          <page number="2"/>
          <page number="10"/>
          <page number="15"/>
        </pages>
      </extracts>
    </destinationdocument>
    <destinationdocument id="Merger2" type="samedoc" docname="mergeintome.pdf" pagesize="A4">
      <inputdirectory>D:\importfiles</inputdirectory>
      <outputdir>same</outputdir>
      <inserts>
        <insert betweenstart="10" betweenend="11" sourcedocumentid="bills" replace="false"/>
        <insert betweenstart="15" betweenend="16" sourcedocumentid="privatefiles" replace="false"/>
        <insert betweenstart="20" betweenend="21" sourcedocumentid="incomingchecks" replace="false"/>
        <insert betweenstart="100" betweenend="101" sourcedocumentid="learning" replace="false"/>
        <insert betweenstart="201" betweenend="207" sourcedocumentid="learning" replace="true"/>
        <insert betweenstart="290" betweenend="291" sourcedocumentid="screenshots" replace="false"/>
      </inserts>
      <extracts>
        <pages>
          <page number="2"/>
          <page number="10"/>
          <page number="15"/>
        </pages>
      </extracts>
    </destinationdocument>
    <destinationdocument id="Merger3" type="newdoc" docname="doc1.pdf" newdocname="newdoc1.pdf" deleteold="false" pagesize="A4">
      <inputdirectory>D:\importfiles</inputdirectory>
      <outputdir>D:\outputfiles</outputdir>
      <inserts>
        <insert betweenstart="10" betweenend="11" sourcedocumentid="bills" replace="false"/>
        <insert betweenstart="15" betweenend="16" sourcedocumentid="privatefiles" replace="false"/>
        <insert betweenstart="20" betweenend="21" sourcedocumentid="incomingchecks" replace="false"/>
        <insert betweenstart="100" betweenend="101" sourcedocumentid="learning" replace="false"/>
        <insert betweenstart="201" betweenend="207" sourcedocumentid="learning" replace="true"/>
      </inserts>
      <extracts>
        <pages>
          <page number="2"/>
          <page number="10"/>
          <page number="15"/>
        </pages>
      </extracts>
    </destinationdocument>
  </destinationdocuments>
</documentmerger>
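
As a rough illustration of how such a file could be read, here is a minimal Python sketch that uses only the standard library. It is not the merger's own loader; the function name `read_source_documents` and the trimmed-down `CONFIG` string are assumptions made for this example, and only the element and attribute names come from the sample above.

```python
import xml.etree.ElementTree as ET

# Trimmed-down config reusing element names from the sample above.
# A raw string keeps the Windows backslashes intact.
CONFIG = r"""<documentmerger type="pdf" created="2011-11-01">
  <sourcedocuments>
    <sourcedocument id="bills" type="pdf">
      <name>dummy.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <page number="10"/>
        <page number="20"/>
      </pages>
    </sourcedocument>
    <sourcedocument id="getall" type="pdf">
      <name>dummy5.pdf</name>
      <filepath>d:\importfiles</filepath>
      <pages>
        <allpages/>
      </pages>
    </sourcedocument>
  </sourcedocuments>
</documentmerger>"""


def read_source_documents(xml_text):
    """Hypothetical helper: map each sourcedocument id to its settings."""
    root = ET.fromstring(xml_text)
    sources = {}
    for doc in root.findall("./sourcedocuments/sourcedocument"):
        pages = doc.find("pages")
        page_items = [] if pages is None else pages.findall("page")
        sources[doc.get("id")] = {
            "type": doc.get("type"),
            "name": doc.findtext("name"),
            "filepath": doc.findtext("filepath"),
            # <allpages/> selects the whole document instead of single pages.
            "allpages": pages is not None and pages.find("allpages") is not None,
            # PDF sources carry "number", image sources carry "content".
            "pages": [p.get("number") or p.get("content") for p in page_items],
        }
    return sources


sources = read_source_documents(CONFIG)
for source_id, info in sources.items():
    print(source_id, info["name"], info["pages"], info["allpages"])
```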


This configuration file is divided into the following logical sections:

| Tag | Meaning | Comment |
|---|---|---|
| documentmerger | Root element | This is the root element tag. |
| sourcedocuments | Holds all source documents to be inserted into the destination documents. | |
| sourcedocument | Defines a single source document. | |
| sourcedocument/name | The name of the source document. | e.g. source_one.pdf |
| sourcedocument/filepath | File path to the source document. | e.g. c:\sourcedocuments |
| sourcedocument/pages | Defines the pages from the source document to be inserted into the destination document. | |
| sourcedocument/pages/page | A single page to be inserted. | |
| destinationdocuments | Holds all destination documents into which the source document pages will be inserted. | |
| destinationdocument | Defines a single destination document. | |
| destinationdocument/inputdirectory | Defines where to load the destination document from. | e.g. c:\destinationdocuments |
| destinationdocument/outputdir | Defines the output directory for the modified document. | e.g. c:\merged_output |
| destinationdocument/inserts | Defines the inserts for the current destination document. | |
| destinationdocument/inserts/insert | Defines a single insert for the destination document. | |
| destinationdocument/extracts | Defines which pages to extract during the merge. | NOT IMPLEMENTED IN THIS VERSION |
| destinationdocument/extracts/extract | Defines a single extract for the current destination document. | |

Tag Attributes

Attributes For Tag sourcedocument

| Attribute | Meaning | Comment |
|---|---|---|
| id | Defines the unique id of the source document. | e.g. "sourceone" |
| type | Defines whether the source is a PDF document or a set of images to insert. | Set to "pdf" for a PDF file, or to "image" for images. |

Attributes For Tag page

| Attribute | Meaning | Comment |
|---|---|---|
| number | A specific page number of the current source document to merge. | e.g. "10" |
| content | Used only when the sourcedocument has type="image". | The name of the image to insert, e.g. "dollars.jpg" |
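
The interplay between the sourcedocument type and the page attributes can be sketched as follows. This is an illustrative Python sketch, not the tool's actual code: the helper `resolve_pages` is hypothetical, and joining the image name onto the filepath is an assumption about how image sources are located.

```python
import ntpath  # Windows-style path handling, available on every platform
import xml.etree.ElementTree as ET


def resolve_pages(sourcedocument):
    """Hypothetical helper: page numbers for "pdf", image paths for "image"."""
    doc_type = sourcedocument.get("type")
    folder = sourcedocument.findtext("filepath", default="")
    resolved = []
    for page in sourcedocument.findall("./pages/page"):
        if doc_type == "image":
            content = page.get("content")
            if content is None:
                # Also catches a misspelled attribute name.
                raise ValueError("image page without a content attribute")
            # Assumption: the image lives in the sourcedocument's filepath.
            resolved.append(ntpath.join(folder, content))
        else:
            number = page.get("number")
            if number is None:
                raise ValueError("pdf page without a number attribute")
            resolved.append(int(number))
    return resolved


IMAGES = ET.fromstring(r"""<sourcedocument id="screenshots" type="image">
  <name>imagegroup1</name>
  <filepath>D:\importfiles\images</filepath>
  <pages>
    <page content="tigers.jpeg"/>
    <page content="lambo.jpg"/>
  </pages>
</sourcedocument>""")

print(resolve_pages(IMAGES))
```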

Attributes For Tag destinationdocument

| Attribute | Meaning | Comment |
|---|---|---|
| id | Defines the unique id of the destination document. | e.g. "destinationone" |
| type | Defines whether the merge is performed on the same document or a new one is created. | Set to "newdoc" to create a new document without replacing the old one. Set to "samedoc" if the current document should be replaced. |
| docname | The filename of the destination document. | e.g. "destone.pdf" |
| newdocname | Use only in conjunction with type set to "newdoc". | The name for the new destination document, e.g. "destonenew.pdf" |
| deleteold | Delete the current input document? Use only with type set to "newdoc". Attention! The old file will be deleted. | Set to "true" to delete the current destination document, "false" otherwise. |
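
To make the difference between "samedoc" and "newdoc" concrete, the following Python sketch derives the input and output paths of a destination document. The helper `plan_output` is hypothetical, and treating an outputdir value of "same" as "use the input directory" is an assumption based on the sample file, not documented behavior.

```python
import ntpath  # Windows-style path handling, available on every platform
import xml.etree.ElementTree as ET


def plan_output(dest):
    """Hypothetical helper: (input_path, output_path, delete_old) for one
    destinationdocument element."""
    input_dir = dest.findtext("inputdirectory", default="")
    input_path = ntpath.join(input_dir, dest.get("docname"))
    if dest.get("type") == "newdoc":
        out_dir = dest.findtext("outputdir", default="same")
        if out_dir == "same":  # assumption: "same" means the input directory
            out_dir = input_dir
        output_path = ntpath.join(out_dir, dest.get("newdocname"))
        # deleteold only applies to "newdoc".
        delete_old = dest.get("deleteold") == "true"
    else:
        # "samedoc": the current document itself is replaced.
        output_path = input_path
        delete_old = False
    return input_path, output_path, delete_old


MERGER3 = ET.fromstring(r"""<destinationdocument id="Merger3" type="newdoc"
    docname="doc1.pdf" newdocname="newdoc1.pdf" deleteold="false" pagesize="A4">
  <inputdirectory>D:\importfiles</inputdirectory>
  <outputdir>D:\outputfiles</outputdir>
</destinationdocument>""")

print(plan_output(MERGER3))
```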

Attributes For Tag insert

| Attribute | Meaning | Comment |
|---|---|---|
| betweenstart | Insertion starts at this page. | Set the page here, e.g. "10" |
| betweenend | All pages from the given source are inserted before this page. | Set the page here, e.g. "20". Must not be smaller than the value of the betweenstart attribute. |
| sourcedocumentid | The id of the sourcedocument whose pages should be inserted. | Use the value of a sourcedocument id attribute here. |
| replace | Replace the pages between betweenstart and betweenend. | Set to "false" to not replace, "true" to replace. |
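
The constraints on the insert attributes can be checked before a merge is run. The Python sketch below is one possible pre-flight check, not part of the tool: the function `validate_inserts` and the small `CONFIG` string are made up for this example, and the rules it enforces are only those stated in the table above.

```python
import xml.etree.ElementTree as ET

CONFIG = """<documentmerger>
  <sourcedocuments>
    <sourcedocument id="bills" type="pdf"/>
  </sourcedocuments>
  <destinationdocuments>
    <destinationdocument id="Merger1" type="samedoc" docname="merger1.pdf">
      <inserts>
        <insert betweenstart="10" betweenend="11" sourcedocumentid="bills" replace="false"/>
        <insert betweenstart="30" betweenend="20" sourcedocumentid="invoices" replace="yes"/>
      </inserts>
    </destinationdocument>
  </destinationdocuments>
</documentmerger>"""


def validate_inserts(root):
    """Hypothetical check: return a list of problems found in insert tags."""
    known_ids = {doc.get("id") for doc in root.iter("sourcedocument")}
    problems = []
    for dest in root.iter("destinationdocument"):
        for ins in dest.findall("./inserts/insert"):
            where = f'{dest.get("id")}: insert for "{ins.get("sourcedocumentid")}"'
            try:
                start = int(ins.get("betweenstart"))
                end = int(ins.get("betweenend"))
            except (TypeError, ValueError):
                problems.append(f"{where}: betweenstart/betweenend must be page numbers")
                continue
            if end < start:
                problems.append(f"{where}: betweenend {end} is smaller than betweenstart {start}")
            if ins.get("sourcedocumentid") not in known_ids:
                problems.append(f"{where}: unknown sourcedocumentid")
            if ins.get("replace") not in ("true", "false"):
                problems.append(f"{where}: replace must be true or false")
    return problems


for problem in validate_inserts(ET.fromstring(CONFIG)):
    print(problem)
```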

 

More to come very soon.

Last edited Nov 13, 2011 at 11:42 PM by embeducation, version 7
