The first step in extracting information from a PDF file is to convert it to plain text with the pdftotext -layout command, which preserves the physical layout of each page.
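For context, here is a minimal sketch of how that step can be driven from Python. It assumes the pdftotext binary (from Poppler or Xpdf) is installed and on the PATH; the helper names `build_pdftotext_command`, `split_pages` and `pdf_to_pages` are my own and do not come from any library.

```python
import subprocess


def build_pdftotext_command(pdf_path):
    # "-layout" keeps the physical layout; "-" writes the text to stdout.
    return ["pdftotext", "-layout", pdf_path, "-"]


def split_pages(text):
    # pdftotext separates pages with a form feed character ("\f").
    pages = text.split("\f")
    # Drop the empty chunk left after the final form feed, if any.
    if pages and pages[-1].strip() == "":
        pages.pop()
    # Each page becomes a list of lines.
    return [page.splitlines() for page in pages]


def pdf_to_pages(pdf_path):
    # Run pdftotext and return a list of pages, each a list of text lines.
    # Assumption: the output is decoded as UTF-8, the tool's usual default.
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return split_pages(result.stdout)
```

The result is one list of strings per page, which is the representation the rest of the question takes as its starting point.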
Is there a Python tool or library that describes the layout of a page as a grid of ASCII characters and helps to identify and extract the useful pieces of information? For example, a function that selects N characters from line I, starting at column Y.
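To make the request concrete, this is roughly the kind of function I have in mind. It is only a sketch: `select` is a hypothetical name, and it assumes the page is already held as a list of strings with 0-based line and column numbers.

```python
def select(lines, line, col, n):
    """Return up to n characters from lines[line], starting at column col.

    Line and column numbers are 0-based. A line index outside the page
    yields an empty string, and a short line yields whatever characters
    exist in the requested range.
    """
    if col < 0 or n < 0:
        raise ValueError("col and n must be non-negative")
    if not 0 <= line < len(lines):
        return ""
    return lines[line][col:col + n]


# Example page: labels in the first 14 columns, values after that.
page = [
    "Invoice no.".ljust(14) + "12345",
    "Date".ljust(14) + "2024-01-31",
]
invoice_number = select(page, 0, 14, 5)
invoice_date = select(page, 1, 14, 10)
```

Plain string slicing already covers this case, so the open question is whether a library offers anything richer on top of it.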
If no such tool exists, what do you think is the best structure for describing a two-dimensional page layout in Python?
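As a starting point for discussion, one candidate structure is a list of equal-width strings wrapped in a small class. This is a sketch under my own assumptions, not an established API: the class name `PageGrid` and its methods `block` and `find` are hypothetical.

```python
class PageGrid:
    """A page stored as a rectangle of characters (rows of equal width)."""

    def __init__(self, lines):
        # Pad every line with spaces so that all rows have the same width.
        self.width = max((len(line) for line in lines), default=0)
        self.rows = [line.ljust(self.width) for line in lines]
        self.height = len(self.rows)

    def block(self, top, left, height, width):
        # Return a rectangular region as a list of strings (0-based indices).
        return [row[left:left + width] for row in self.rows[top:top + height]]

    def find(self, needle):
        # Return (row, column) of the first occurrence of needle, or None.
        for r, row in enumerate(self.rows):
            c = row.find(needle)
            if c != -1:
                return (r, c)
        return None


grid = PageGrid(["Name   Qty", "Bolt     4", "Nut"])
header_position = grid.find("Qty")
quantity_column = grid.block(1, header_position[1], 2, 3)
```

Padding the rows means a rectangular region can be cut out without checking the length of each line, and `find` lets a label such as a column header serve as the anchor for the region to extract.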