I want to read a PDF and get a list of its pages and each page's size. I don't need to manipulate it in any way, just read it.
I'm currently trying out pyPdf, and it does everything I need except give me page sizes. I understand that I will probably have to iterate through the pages, as page sizes can vary within a PDF document. Is there another library or method I can use?
I tried using PIL; some online recipes even show d=Image(imagefilename) usage, but it NEVER reads any of my PDFs. It reads everything else I throw at it, even some formats I didn't know PIL could handle.
Any guidance appreciated. I'm on 64-bit Windows 7 with Python 2.5 (because I also do GAE stuff), but I'm happy to do it on Linux or a more modern Python.
Solution
This can be done with PyPDF2:
>>> from PyPDF2 import PdfFileReader
>>> input1 = PdfFileReader(open('example.pdf', 'rb'))
>>> input1.getPage(0).mediaBox
RectangleObject([0, 0, 612, 792])
(PyPDF2 was formerly known as pyPdf, and it still refers to pyPdf's documentation.)
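Since page sizes can vary within one document, here is a minimal sketch of looping over every page and collecting each size. It assumes the older PyPDF2/pyPdf reader interface shown above (getNumPages(), getPage(i), mediaBox); the helper names box_size, page_sizes and points_to_inches are made up for illustration. It also assumes the default PDF unit of 1/72 inch and ignores any page rotation, so treat the numbers as the raw media box rather than the displayed size.

```python
POINTS_PER_INCH = 72.0  # assumption: default PDF user-space unit (1/72 inch)


def box_size(box):
    """Return (width, height) in points for an [x0, y0, x1, y1] box."""
    x0, y0, x1, y1 = [float(v) for v in box]
    return (abs(x1 - x0), abs(y1 - y0))


def page_sizes(reader):
    """Return a list of (width, height) tuples in points, one per page.

    `reader` is assumed to expose getNumPages() and getPage(i), as the
    PdfFileReader in the answer above does.
    """
    sizes = []
    for i in range(reader.getNumPages()):
        # Each page carries its own mediaBox, so sizes may differ per page.
        sizes.append(box_size(reader.getPage(i).mediaBox))
    return sizes


def points_to_inches(size):
    """Convert a (width, height) tuple from points to inches."""
    return tuple([v / POINTS_PER_INCH for v in size])
```

With the reader from the answer, page_sizes(input1) should give one tuple per page, and points_to_inches((612.0, 792.0)) works out to (8.5, 11.0), i.e. US Letter. Newer releases of the library renamed these classes and methods, so the calls may need adjusting there.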