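# nest.js puppeteer: generating a PDF from a website
## Introduction
When a Node application needs to load a website and produce a PDF or a screenshot, fetching the page with an HTTP client works like a crawler: it only retrieves the HTML source and cannot turn that HTML into a rendered page. The page has to be rendered by a browser first and then converted to a PDF or an image. The puppeteer package for Node does exactly this, and the rest of this post implements it with nest.js.
## Install the packages
After creating the project, install the required packages: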
```bash!
npm install nestjs-puppeteer puppeteer
npm install -D @types/puppeteer
```
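## Create the service
The service returns its result in three ways: a PDF saved to disk, a PDF buffer, and an image buffer. `runBrowser` scrolls through the page so that lazy-loaded websites load their content, which makes a page take at least about 20 seconds to read. If you only need a plain screenshot, you can skip the scrolling. Keep in mind, though, that some websites lazy-load their images as well; if that does not matter to you, you can shorten the wait time to produce the screenshot faster.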
==src\app.service.ts==
```typescript!
import { Injectable } from '@nestjs/common';
import { InjectBrowser } from 'nestjs-puppeteer';
import { Browser, Page } from 'puppeteer';
import * as path from 'path';

@Injectable()
export class AppService {
  constructor(
    @InjectBrowser() private readonly browser: Browser,
  ) { }

  // Save the rendered page as a PDF file, then return the file path
  async getWebpage(url: string, outputFilename: string) {
    const page = await this.browser.newPage();
    try {
      await this.runBrowser(url, page);
      // The target directory (../pdfs relative to the compiled file) must already exist
      const pdfPath = path.resolve(__dirname, '..', 'pdfs', outputFilename);
      await page.pdf({ path: pdfPath, format: 'A4' });
      return pdfPath;
    } finally {
      await page.close(); // close the tab even if rendering fails
    }
  }

  // Return the rendered page as a PDF buffer
  async getWebpagePdfButter(url: string) {
    const page = await this.browser.newPage();
    try {
      await this.runBrowser(url, page);
      return await page.pdf({ format: 'A4' });
    } finally {
      await page.close();
    }
  }

  // Return the rendered page as an image buffer
  async getWebpageImage(url: string) {
    const page = await this.browser.newPage();
    try {
      await this.runBrowser(url, page);
      // fullPage: true captures the whole page instead of only the viewport
      return await page.screenshot({ type: 'png', fullPage: true });
    } finally {
      await page.close();
    }
  }

  // Load the page and scroll through it so lazy-loaded content is fetched
  async runBrowser(url: string, page: Page) {
    await page.goto(url, { waitUntil: 'networkidle0' });

    // Measure the full page height (needed for lazy-loaded pages)
    const bodyHandle = await page.$('body');
    const box = bodyHandle ? await bodyHandle.boundingBox() : null;
    await bodyHandle?.dispose();
    const height = box?.height ?? 0;

    // Scroll one viewport at a time, pausing to let content load
    const viewportHeight = page.viewport()?.height ?? 600; // 600 is Puppeteer's default viewport height
    let viewportIncr = 0;
    while (viewportIncr + viewportHeight < height) {
      await page.evaluate(_viewportHeight => {
        window.scrollBy(0, _viewportHeight);
      }, viewportHeight);
      await this.wait(20); // pause in milliseconds between scroll steps
      viewportIncr = viewportIncr + viewportHeight;
    }

    // Scroll back to top
    await page.evaluate(() => {
      window.scrollTo(0, 0);
    });
    return page;
  }

  // Resolve after the given number of milliseconds
  wait(ms: number) {
    return new Promise<void>(resolve => setTimeout(() => resolve(), ms));
  }
}
```
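To get a feel for how much of the load time comes from the scroll pauses in `runBrowser`, the loop can be modelled as a pure function. This is only a rough sketch: `countScrollSteps` and `estimateScrollPauseMs` are hypothetical helpers that are not part of the service above, and they only count the pauses, not the time spent in `page.goto` waiting for `networkidle0`.

```typescript
// Hypothetical helper (not part of the original service): counts how many
// scroll steps the while loop in runBrowser performs for a given page height.
function countScrollSteps(pageHeight: number, viewportHeight: number): number {
  if (viewportHeight <= 0) {
    throw new RangeError('viewportHeight must be positive');
  }
  let steps = 0;
  let scrolled = 0;
  // Same condition as the while loop in runBrowser
  while (scrolled + viewportHeight < pageHeight) {
    scrolled += viewportHeight;
    steps += 1;
  }
  return steps;
}

// Hypothetical helper: total time spent pausing between scroll steps, in ms.
function estimateScrollPauseMs(
  pageHeight: number,
  viewportHeight: number,
  pauseMs: number,
): number {
  return countScrollSteps(pageHeight, viewportHeight) * pauseMs;
}
```

For example, a 6000 px tall page with a 600 px viewport needs 9 scroll steps, so a 20 ms pause adds only 180 ms in total. Raising the pause gives lazy-loaded images more time to arrive, at the cost of a proportionally longer wait.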
To use Puppeteer, import the package's `PuppeteerModule` in the module's `imports`:
==src\app.module.ts==
```typescript
...
import { PuppeteerModule } from 'nestjs-puppeteer';
@Module({
  imports: [
    PuppeteerModule.forRoot({ headless: true }), // headless: run the browser without a visible window
  ],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule { }
```
## Create the controller and test
```typescript!
import { Controller, Get, HttpStatus, Query, Res } from '@nestjs/common';
import { AppService } from './app.service';
import { createReadStream } from 'node:fs';
import { Response } from 'express';

@Controller()
export class AppController {
  constructor(private readonly appService: AppService) { }

  // Save the PDF to disk first, then stream the file back as a download
  @Get('pdf')
  async getPdf(@Query('url') url: string, @Res() res: Response) {
    const outputFilename = 'output.pdf';
    const pdfPath = await this.appService.getWebpage(url, outputFilename);
    const fileStream = createReadStream(pdfPath);
    res.set({
      'Content-Type': 'application/pdf',
      'Content-Disposition': `attachment; filename=${outputFilename}`,
    });
    fileStream.pipe(res);
  }

  // Return the PDF straight from memory, without writing a file
  @Get('pdfs')
  async getPdfBuffer(@Query('url') url: string, @Res() res: Response) {
    const outputFilename = 'output.pdf';
    const pdfBuffer = await this.appService.getWebpagePdfButter(url);
    res.set({
      'Content-Type': 'application/pdf',
      'Content-Disposition': `attachment; filename=${outputFilename}`,
    });
    res.status(HttpStatus.OK).end(pdfBuffer);
  }

  // Return a full-page PNG screenshot
  @Get('image')
  async getImage(@Query('url') url: string, @Res() res: Response) {
    const imageBuffer = await this.appService.getWebpageImage(url);
    res.setHeader('Content-Type', 'image/png');
    res.status(HttpStatus.OK).end(imageBuffer);
  }
}
```
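The controller above passes the `url` query parameter to the service unchanged, so a missing or malformed value only fails once Puppeteer tries to navigate. One possible guard, shown here as a minimal sketch, is to parse the value first and accept only `http` and `https`. `parseHttpUrl` is a hypothetical helper that does not appear in the original code.

```typescript
// Hypothetical helper (not in the original controller): accept only
// absolute http(s) URLs and reject everything else with an Error.
function parseHttpUrl(raw: string | undefined): URL {
  if (!raw) {
    throw new Error('Missing "url" query parameter');
  }
  let parsed: URL;
  try {
    parsed = new URL(raw); // throws a TypeError for relative or malformed input
  } catch {
    throw new Error(`Not a valid absolute URL: ${raw}`);
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}
```

In a controller method you could call `parseHttpUrl(url).href` before invoking the service and translate a thrown error into a `BadRequestException` from `@nestjs/common`, so the client gets a 400 response instead of a server error.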
Once everything above is in place, you can test the endpoints with Postman by passing a URL to generate the PDF or the image.
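When testing, the target address has to be URL-encoded inside the `url` query parameter, otherwise any `?` or `&` in it is read as part of the request's own query string. A small sketch of building such a request address follows. `buildRequestUrl` is a hypothetical helper, and `http://localhost:3000` in the usage note is an assumption about where the app is listening.

```typescript
// Hypothetical helper: builds the request address for one of the three
// endpoints defined in the controller, with the target URL safely encoded.
function buildRequestUrl(
  baseUrl: string,
  endpoint: 'pdf' | 'pdfs' | 'image',
  targetUrl: string,
): string {
  // baseUrl is expected to be just the origin, e.g. scheme + host + port
  const requestUrl = new URL(endpoint, baseUrl);
  // searchParams.set percent-encodes the value
  requestUrl.searchParams.set('url', targetUrl);
  return requestUrl.toString();
}
```

For instance, `buildRequestUrl('http://localhost:3000', 'image', 'https://example.com/a?b=1&c=2')` yields an address you can paste into Postman or a browser, and the controller receives the original target URL back through `@Query('url')`.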

#### References
> https://www.npmjs.com/package/nest-puppeteer