Java PDF-to-JPG sharpness: converting PDF files to high-resolution images with the Java library PDFBox

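I recently needed to convert PDF files into high-resolution images, using the pdfbox and fontbox libraries. The renderImageWithDPI method lets you specify the rendering resolution. The higher the DPI, the sharper the output, but the conversion also takes longer and the resulting images are larger.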
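To make the DPI trade-off concrete, here is a minimal sketch of how the DPI value translates into pixel dimensions. It relies on the fact that PDF page sizes are measured in points (72 per inch). The class and method names are hypothetical, the 4-bytes-per-pixel figure is an assumption about how an RGB raster is held in memory, and PDFBox's own rounding may differ by a pixel.

```java
final class DpiEstimate {
    // PDF page sizes are expressed in points; 72 points make one inch.
    static final float POINTS_PER_INCH = 72f;

    /** Approximate pixel length of a side that is lengthPt points long at the given DPI. */
    static int pixels(float lengthPt, float dpi) {
        return Math.round(lengthPt * dpi / POINTS_PER_INCH);
    }

    /** Rough in-memory raster size in bytes, assuming 4 bytes per pixel. */
    static long rasterBytes(float widthPt, float heightPt, float dpi) {
        return 4L * pixels(widthPt, dpi) * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        float widthPt = 595f;  // roughly an A4 page
        float heightPt = 842f;
        for (float dpi : new float[] {72f, 100f, 200f, 300f}) {
            System.out.printf("%.0f dpi -> %d x %d px, about %.1f MB in memory%n",
                    dpi, pixels(widthPt, dpi), pixels(heightPt, dpi),
                    rasterBytes(widthPt, heightPt, dpi) / 1_000_000.0);
        }
    }
}
```

Because both sides scale with the DPI, doubling the DPI roughly quadruples the pixel count, which is why rendering time and file size grow quickly at high settings.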

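Note: as Adobe software grows more capable and supports more and more formats, some Java libraries can no longer convert every file. Files that use newer format features may therefore fail to convert.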

1 Add the dependencies

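<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>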

2 The code

import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.commons.lang3.ArrayUtils; // assumed to be Apache Commons Lang
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

// FileUtil, IOUtil and log (an SLF4J-style logger) are the project's own helpers.

public static void convertPdf2Image(String pdfPath, String imageDirPath) {
    log.info("start convert pdf file:[{}] to image path:[{}]", pdfPath, imageDirPath);
    if (!new File(pdfPath).exists()) {
        log.info("pdfFilename:[{}] not exist", pdfPath);
        return;
    }
    if (!new File(imageDirPath).exists()) {
        log.info("imageDir:[{}] not exist", imageDirPath);
        return;
    }
    byte[] pdfContent = FileUtil.getFileContentByte(pdfPath);
    String filename = FileUtil.getFilename(pdfPath);
    float dpi = 200;
    convertPdf2Image(pdfContent, filename, imageDirPath, dpi);
    log.info("convert pdf file:[{}] to image success", filename);
}

private static void convertPdf2Image(byte[] pdfContent, String pdfFilename, String imageDirPath, float dpi) {
    log.info("convert pdfFilename:[{}] to imageDir:[{}] with dpi:[{}]", pdfFilename, imageDirPath, dpi);
    if (ArrayUtils.isEmpty(pdfContent)) {
        return;
    }
    // To keep the output legible, use at least 90 DPI
    if (dpi < 90) {
        dpi = 90;
    }
    // Output files are named <imageDir>/<pdfFilename>_<pageIndex>.png
    String baseDir = imageDirPath;
    if (baseDir.endsWith("/") || baseDir.endsWith("\\")) {
        baseDir += pdfFilename + "_";
    } else {
        baseDir += File.separator + pdfFilename + "_";
    }
    PDDocument document = null;
    BufferedOutputStream outputStream = null;
    try {
        document = PDDocument.load(pdfContent);
        int pageCount = document.getNumberOfPages();
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        String imgPath;
        for (int i = 0; i < pageCount; i++) {
            imgPath = baseDir + i + ".png";
            outputStream = new BufferedOutputStream(new FileOutputStream(imgPath));
            // Render page i (zero-based) at the requested DPI
            BufferedImage image = pdfRenderer.renderImageWithDPI(i, dpi, ImageType.RGB);
            ImageIO.write(image, "png", outputStream);
            outputStream.close();
            log.info("convert to png, total[{}], now[{}], ori:[{}], des[{}]", pageCount, i + 1, pdfFilename, imgPath);
        }
    } catch (IOException e) {
        log.error("convert pdf to image error, pdfFilename:" + pdfFilename, e);
    } finally {
        IOUtil.closeSilently(outputStream);
        IOUtil.closeSilently(document);
    }
}

// Code of IOUtil.closeSilently
public static void closeSilently(Closeable io) {
    if (io != null) {
        try {
            io.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
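The page-writing step can also be expressed with try-with-resources and java.nio.file.Path, which removes the manual separator handling and the explicit close calls. The following is only a sketch under that assumption, not the code used in this post. PageImageWriter, pagePath and writePage are hypothetical names, and a blank BufferedImage stands in for the image returned by renderImageWithDPI so that the example runs without PDFBox.

```java
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

final class PageImageWriter {
    /** Builds <dir>/<baseName>_<pageIndex>.png without manual separator handling. */
    static Path pagePath(Path dir, String baseName, int pageIndex) {
        return dir.resolve(baseName + "_" + pageIndex + ".png");
    }

    /** Writes one rendered page as PNG; the stream is closed even if writing fails. */
    static Path writePage(BufferedImage image, Path dir, String baseName, int pageIndex)
            throws IOException {
        Path target = pagePath(dir, baseName, pageIndex);
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            // ImageIO.write returns false when no suitable writer is registered.
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("no PNG writer available for " + target);
            }
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pdf-pages");
        // Stand-in for the BufferedImage returned by renderImageWithDPI.
        BufferedImage page = new BufferedImage(16, 16, BufferedImage.TYPE_INT_RGB);
        System.out.println(writePage(page, dir, "sample", 0));
    }
}
```

Checking the boolean returned by ImageIO.write is a design choice worth keeping in any variant, since a missing writer otherwise leaves an empty file without raising an exception.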

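Problems encountered in practice

1) ERROR o.a.p.contentstream.PDFStreamEngine 911 - Cannot read JBIG2 image: jbig2-imageio is not installed

2) Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed

3) java.lang.IllegalArgumentException: Numbers of source Raster bands and source color space components do not match at java.awt.image.ColorConvertOp.filter

The first two problems are resolved by adding the JAI and JBIG2 plugins, that is, the jai-imageio-core, jai-imageio-jpeg2000 and jbig2-imageio dependencies. The imageio-jpeg plugin from TwelveMonkeys, which also appears in the dependency list that follows, is presumably there for the third error, which the Stack Overflow questions in the references associate with JPEG images that the JDK's default reader cannot decode.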

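<dependency>
    <groupId>com.twelvemonkeys.imageio</groupId>
    <artifactId>imageio-jpeg</artifactId>
    <version>3.4.2</version>
</dependency>

<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>

<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.3.0</version>
</dependency>

<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.2</version>
</dependency>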
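Since these plugins are discovered through the standard ImageIO service mechanism, one way to confirm that they were picked up is to ask ImageIO whether a reader is registered for each format. The sketch below uses only JDK classes. ImageIoPluginCheck and hasReader are hypothetical names, and the format names "JBIG2" and "JPEG2000" are my assumption about what the plugins register, so adjust them if the check reports false with the jars on the classpath.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

final class ImageIoPluginCheck {
    /** True if at least one ImageIO reader is registered for the given format name. */
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Re-scan the classpath so that plugin jars added at runtime are registered.
        ImageIO.scanForPlugins();
        // "JBIG2" and "JPEG2000" are assumed format names; "JPEG" and "PNG" ship with the JDK.
        for (String format : new String[] {"JBIG2", "JPEG2000", "JPEG", "PNG"}) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
    }
}
```

Running this once before and once after adding the dependencies shows whether the plugin jars actually reached the runtime classpath.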

Sample problem files

https://github.com/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/000208-p1.pdf

https://github.com/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/001659-p14.pdf

https://github.com/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/main%20doc.pdf

https://github.com/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/573636.pdf

References

https://stackoverflow.com/questions/42169154/pdfbox1-8-12-convert-pdf-to-white-page-image

https://stackoverflow.com/questions/20424796/pdf-box-generating-blank-images-due-to-jbig2-images-in-it

https://blog.csdn.net/qq_15801963/article/details/80746830

https://my.oschina.net/u/2345654/blog/1058192

https://stackoverflow.com/questions/18351583/illegalargumentexception-numbers-of-source-raster-bands-and-source-color-space

https://stackoverflow.com/questions/10416378/imageio-read-illegal-argument-exception-raster-bands-colour-space-components