# Generating website PDFs with nest.js and puppeteer

## Introduction
When Node needs to read a website and produce a PDF or a screenshot, fetching the page with an HTTP client works like a crawler: you only get the HTML content, and there is no way to turn that HTML code back into a rendered page. The page has to be rendered by a browser first and then converted to a PDF or an image. Node's puppeteer package covers exactly this need, and the steps below implement it with nest.js.

## Installing the packages
After creating the project, install the required packages:
```bash!
npm install nest-puppeteer puppeteer
npm install -D @types/puppeteer
```

## Creating the service
The service returns its result in three ways. `runBrowser` scrolls through the page to handle lazy-loaded websites, so reading a page takes at least about 20 seconds. If you only need a plain screenshot, you can skip this step. Be aware, though, that some websites lazy-load their images as well. If that does not matter to you, you can adjust the wait time to produce the screenshot faster.

==src\app.service.ts==
```typescript!
import { Injectable } from '@nestjs/common';
import { InjectBrowser } from 'nest-puppeteer';
import { Browser, Page } from 'puppeteer';
import { mkdir } from 'node:fs/promises';
import * as path from 'path';

@Injectable()
export class AppService {
  constructor(
    @InjectBrowser() private readonly browser: Browser,
  ) { }

  // Save the rendered page as a PDF file and return the file path
  async getWebpage(url: string, outputFilename: string) {
    let page = await this.browser.newPage();
    page = await this.runBrowser(url, page);
    const pdfPath = path.resolve(__dirname, '..', 'pdfs', outputFilename);
    // Make sure the output directory exists before writing the file
    await mkdir(path.dirname(pdfPath), { recursive: true });
    await page.pdf({ path: pdfPath, format: 'A4' });
    await page.close();
    return pdfPath;
  }

  // Return the rendered page as a PDF buffer
  async getWebpagePdfBuffer(url: string) {
    let page = await this.browser.newPage();
    page = await this.runBrowser(url, page);
    const pdfBuffer = await page.pdf({ format: 'A4' });
    await page.close();
    return pdfBuffer;
  }

  // Return the rendered page as an image buffer
  async getWebpageImage(url: string) {
    let page = await this.browser.newPage();
    page = await this.runBrowser(url, page);
    // fullPage is required to capture the whole page
    const screenshotBuffer = await page.screenshot({ type: 'png', fullPage: true });
    await page.close();
    return screenshotBuffer;
  }

  async runBrowser(url: string, page: Page) {
    await page.goto(url, { waitUntil: 'networkidle0' });

    // lazy-load: measure the full height of the page body
    const bodyHandle = await page.$('body');
    const bodyBox = await bodyHandle?.boundingBox();
    await bodyHandle?.dispose();
    const viewport = page.viewport();
    // Skip scrolling when the page or the viewport cannot be measured
    if (!bodyBox || !viewport) {
      return page;
    }
    const height = bodyBox.height;

    // Scroll one viewport at a time, pausing to let content load
    const viewportHeight = viewport.height;
    let viewportIncr = 0;
    while (viewportIncr + viewportHeight < height) {
      await page.evaluate(_viewportHeight => {
        window.scrollBy(0, _viewportHeight);
      }, viewportHeight);
      await this.wait(20); // pause in milliseconds
      viewportIncr = viewportIncr + viewportHeight;
    }

    // Scroll back to top
    await page.evaluate(() => {
      window.scrollTo(0, 0);
    });

    return page;
  }

  wait(ms: number) {
    return new Promise<void>(resolve => setTimeout(() => resolve(), ms));
  }
}
```

To use Puppeteer, import the package's `PuppeteerModule` in the module.

==src\app.module.ts==
```typescript
...
import { PuppeteerModule } from 'nest-puppeteer';

@Module({
  imports: [
    PuppeteerModule.forRoot({ headless: true }), // headless: run the browser without a visible window
  ],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule { }
```

## Creating the controller and testing
```typescript!
import { Controller, Get, HttpStatus, Query, Res } from '@nestjs/common';
import { AppService } from './app.service';
import { createReadStream } from 'node:fs';
import { Response } from 'express';

@Controller()
export class AppController {
  constructor(private readonly appService: AppService) { }

  // Save the PDF to disk, then stream the file back
  @Get('pdf')
  async getPdf(@Query('url') url: string, @Res() res: Response) {
    const outputFilename = 'output.pdf';
    const pdfPath = await this.appService.getWebpage(url, outputFilename);
    const fileStream = createReadStream(pdfPath);
    res.set({
      'Content-Type': 'application/pdf',
      'Content-Disposition': `attachment; filename=${outputFilename}`,
    });
    fileStream.pipe(res);
  }

  // Return the PDF directly from memory
  @Get('pdfs')
  async getPdfBuffer(@Query('url') url: string, @Res() res: Response) {
    const outputFilename = 'output.pdf';
    const pdfBuffer = await this.appService.getWebpagePdfBuffer(url);
    res.set({
      'Content-Type': 'application/pdf',
      'Content-Disposition': `attachment; filename=${outputFilename}`,
    });
    res.status(HttpStatus.OK).end(pdfBuffer);
  }

  // Return a full-page PNG screenshot
  @Get('image')
  async getImage(@Query('url') url: string, @Res() res: Response) {
    const imageBuffer = await this.appService.getWebpageImage(url);
    res.setHeader('Content-Type', 'image/png');
    res.status(HttpStatus.OK).end(imageBuffer);
  }
}
```

Once this is done, you can test with Postman by passing in a URL to generate the PDF or the image.

![image](https://hackmd.io/_uploads/HJt2TvCi1g.png)

#### References
> https://www.npmjs.com/package/nest-puppeteer
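
#### Supplement: checking the `url` parameter

The controller hands the `url` query parameter straight to `page.goto`. Below is a minimal sketch of a guard, assuming you only want absolute `http`/`https` URLs to reach the browser. `parseTargetUrl` is a hypothetical helper name, not part of nest-puppeteer or puppeteer, and it relies only on the standard `URL` class.

```typescript
// Hypothetical helper (not part of nest-puppeteer or puppeteer).
// Accepts only absolute http/https URLs and returns the parsed URL.
function parseTargetUrl(raw: string | undefined): URL {
  if (!raw) {
    throw new Error('Missing "url" query parameter');
  }

  let parsed: URL;
  try {
    parsed = new URL(raw); // throws a TypeError on malformed input
  } catch {
    throw new Error(`Not a valid absolute URL: ${raw}`);
  }

  // Reject schemes such as file: or javascript:
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}
```

One way to use it would be to call `parseTargetUrl(url).href` at the top of each controller method and answer with a 400 status when it throws, so that no browser page is opened for bad input.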