Gruppo CMT: transforming PDF invoices into FEPA

Accounting dept., Sales dept., Public Administration

www.gruppocmtrading.it

Sector: Service provider

The problem

Gruppo CMT decided to rely on Arket to provide its customers with FEPA (Electronic Invoice for Public Administrations) management services. In particular, they chose our Station solution, which converts invoices from PDF into the specific XML format required for FEPA.

Converting PDF to XML is a laborious procedure: data must be extracted from PDF documents whose layouts often differ from one another, and then structured in a specific format.

The solution: Station

Station provides a handy, intuitive graphical tool that makes it easy to map invoices received in PDF format. The data extracted from the PDF then undergoes two levels of transformation (using Station's “Data Transform” tool):

At the first level (specific to each individual PDF invoice layout), all the data extracted from the PDF is transferred into a simplified document model, common to all mappings, in which header data is simply distinguished from row data.
Through a second Data Transform (the same for all kinds of invoices), the “normalized” first-level data feeds a more complex model that reflects the structure of the XML file for electronic invoicing. Through an Outmapper, this complex model populates a database, which is then used to create the final XML file for FEPA.
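
The two levels can be pictured with a minimal Python sketch. It is only an illustration of the idea, not Station's Data Transform tool: the class `SimpleInvoice`, the functions `first_level_layout_a` and `second_level`, and all field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleInvoice:
    """First-level model (hypothetical): header data kept apart from row data."""
    header: dict = field(default_factory=dict)
    rows: list = field(default_factory=list)


def first_level_layout_a(extracted: dict) -> SimpleInvoice:
    """Layout-specific step for an assumed 'layout A' PDF mapping.

    Each new PDF layout would get its own function like this one.
    """
    return SimpleInvoice(
        header={"number": extracted["Invoice No."], "date": extracted["Date"]},
        rows=[
            {"description": line["Item"], "amount": line["Total"]}
            for line in extracted["Lines"]
        ],
    )


def second_level(invoice: SimpleInvoice) -> dict:
    """Layout-independent step: build a nested model closer to the final XML."""
    return {
        "general_data": {
            "number": invoice.header["number"],
            "date": invoice.header["date"],
        },
        "lines": [
            {"line_number": i, **row}
            for i, row in enumerate(invoice.rows, start=1)
        ],
    }
```

Supporting a further layout in this sketch would mean writing one more first-level function; `second_level` stays untouched.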

The need to pass through a simplified document model first derives from the fact that invoices arrive structured in many different ways: without it, the full FEPA structure would have to be reconfigured from scratch for each new layout received.

This division into two levels simplifies adding new types of PDF invoices to the process: when users add a new mapping (e.g. a new customer or a new invoice layout), they only have to concentrate on the first-level transformation (distinguishing header data from row data) and not on the complex structure of the FEPA files to be produced.

One example of the complexity that can arise when mapping an invoice involves the position of the reference order data: if the invoice refers to a single order, the order data will be in the header; if instead the invoice contains items from different orders, the order data will be in a row, next to the item it refers to. Depending on whether the data is in the header or in a row, the XML must be structured in a completely different way.
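
As a rough illustration of why the two cases produce different XML, the Python sketch below builds the order reference blocks from header data or from row data. It is not Station's implementation: the function `order_references` and the `order_id` key are hypothetical, and the element names (`DatiOrdineAcquisto`, `IdDocumento`, `RiferimentoNumeroLinea`) are borrowed from the FatturaPA format as an assumption, to be checked against the official specification.

```python
import xml.etree.ElementTree as ET


def order_references(header: dict, rows: list) -> list:
    """Return one order block per order (hypothetical helper).

    Order in the header: a single block with no line references.
    Order in the rows: one block per order, listing the lines it covers.
    """
    if "order_id" in header:
        block = ET.Element("DatiOrdineAcquisto")
        ET.SubElement(block, "IdDocumento").text = header["order_id"]
        return [block]

    # Collect the 1-based line numbers that belong to each order.
    lines_by_order = {}
    for line_number, row in enumerate(rows, start=1):
        if "order_id" in row:
            lines_by_order.setdefault(row["order_id"], []).append(line_number)

    blocks = []
    for order_id, line_numbers in lines_by_order.items():
        block = ET.Element("DatiOrdineAcquisto")
        for number in line_numbers:
            ET.SubElement(block, "RiferimentoNumeroLinea").text = str(number)
        ET.SubElement(block, "IdDocumento").text = order_id
        blocks.append(block)
    return blocks
```

The person mapping the PDF only decides where `order_id` lives; the branching on header versus rows sits entirely in the shared second step.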

Advantages of the solution

Thanks to the double transformation, whoever maps a new PDF only has to specify whether the data is in a row or in the header. The second transformation then takes care of correctly populating the nodes of the FEPA XML file.

The selected software