PDF Data source in Informatica

+1 vote
579 views

How does Informatica handle unstructured data sources like PDF? If a tabular report is stored as a PDF, can we read it out of the PDF as tabular data (like a DataTable in .NET)?

posted May 9, 2014 by Rohini Agarwal

2 Answers

0 votes

PDF is actually quite structured internally. More recent revisions of the PDF specification may provide a way to hold the data ready for external processing, but the main goal of a PDF document is to describe a document for printing, so that all kinds of environments and devices can print it with results as similar as possible.

Whether any extra data is provided, beyond where to print the text and lines that form a table, depends largely on the creator of the PDF.
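To illustrate what "where to print text" means for a table, here is a minimal Python sketch, not an Informatica feature. It assumes some external PDF text-extraction step has already produced positioned fragments as `(x, y, text)` tuples; the sample data, the function name `rows_from_fragments`, and the `y_tolerance` parameter are all hypothetical. It groups fragments with nearly the same y coordinate into rows and orders each row by x.

```python
def rows_from_fragments(fragments, y_tolerance=2.0):
    """Rebuild table rows from positioned text fragments.

    fragments: iterable of (x, y, text); y grows upward, as in PDF page space.
    Fragments whose y values differ by at most y_tolerance share a row.
    """
    rows = []  # each entry: (y of the first fragment in the row, [(x, text), ...])
    # Visit fragments from the top of the page to the bottom.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if rows and abs(rows[-1][0] - y) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Within each row, order the cells left to right.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


# Hypothetical fragments, deliberately out of order and slightly misaligned.
fragments = [
    (50, 700, "Region"), (150, 700, "Sales"),
    (150, 680.5, "120"), (50, 680, "North"),
    (50, 660, "South"), (150, 660, "95"),
]
table = rows_from_fragments(fragments)
for row in table:
    print(row)
```

Real reports are usually messier (wrapped cells, merged columns, multi-page tables), so a position-based approach like this tends to need tuning per report layout.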

answer May 12, 2014 by Shweta Singh
0 votes

We can't read data directly from a PDF source file.

answer Jun 11, 2014 by Shatark Bajpai
Similar Questions
+1 vote

I have a mapping which I need to be able to run against multiple source schemas (having the same structure), one schema at-a-time. Given the number of schemas, I would rather not set up a session for each schema in order to specify a particular mapping connection, as that will require new sessions to be added as new schemas are added.

Is it possible to set up a workflow in such a way that the data source connection for a mapping within a session is defined (or passed in as a parameter of some sort) at run-time?

+2 votes

I want to know how to rename an ODBC data source in Informatica.

...