Deploying a Next.js Puppeteer Web Scraper to Vercel

I am working on a web scraper with Next.js and Puppeteer. Everything works on localhost, but the version deployed to Vercel returns a 500 Internal Server Error whenever I try to use Puppeteer. I've looked at guides on deploying a serverless Puppeteer function to Vercel, and some suggested Playwright instead, but that still fails once deployed. Below is the current code, which uses playwright-core together with chrome-aws-lambda, and here is the GitHub repo: https://github.com/hellolol2016/EquilibriNews
import chromium from "chrome-aws-lambda";
import playwright from "playwright-core";
// FUNCTION TO RUN SEPARATE SCRAPE FUNCTIONS
async function scrapeInfiniteScrollItems(page, getNews, src) {
  let items = {};
  try {
    items = await page.evaluate(getNews);
  } catch (e) {
    console.log(e);
    console.log("bad source", src);
  }
  return items;
}
// FUNCTION TO SET UP BROWSER AND RETURN
export default async function handler(req, res) {
  const browser = await playwright.chromium.launch({
    args: chromium.args,
    executablePath:
      process.env.NODE_ENV !== "development"
        ? await chromium.executablePath
        : "/usr/bin/chromium",
    headless: process.env.NODE_ENV !== "development" ? chromium.headless : true,
  });
  const page = await browser.newPage();
  page.setJavaScriptEnabled(false);
  page.setViewport({ width: 1280, height: 3000 });
  await page.goto("https://www.foxnews.com/politics");
  let items = await scrapeInfiniteScrollItems(page, extractFox, "fox");
  // NOTE: I didn't include the extractFox function because it didn't use any puppeteer functions
  allArticles.fox = items;
  await browser.close();
  res.status(200).json(allArticles);
}
I've tried some other articles about this like https://puppeteer-screenshot-demo.vercel.app/?page=https://whitep4nth3r.com (This one uses a deprecated version of Node) and https://ndo.dev/posts/link-screenshot (this is what I'm trying right now).
I'm guessing the solution is to install a different library that works in a similar way as playwright / puppeteer / chrome-aws-lambda but can still be used when deployed as a serverless function on Vercel.

Related

Cannot find module 'firebase-functions/lib/encoder' from 'node_modules/firebase-functions-test/lib/providers/firestore.js'

Hey I am trying to unit test this cloud function right here:
import { logger, region, https } from "firebase-functions/v1";
import Message from "../types/message";
const helloWorldHandler = region("europe-west1").https.onCall((_, context) => {
  if (context.app == undefined) {
    throw new https.HttpsError("failed-precondition", "The function must be called from an App Check verified app.");
  }
  logger.info("Hello logs!", { structuredData: true });
  const message: Message = {
    text: "Hello from Firebase!",
    code: 200,
  };
  return message;
});
export default helloWorldHandler;
with the following test:
import * as functions from "firebase-functions-test";
import * as path from "path";
const projectConfig = {
  projectId: "myproject-id",
};
const testEnv = functions(projectConfig, path.resolve("./flowus-app-dev-fb-admin-sdk-key"));
// has to be after initializing functions
import helloWorldHandler from "../src/functions/helloworld";
import Message from "../src/types/message";
describe('Testing "helloWorld"', () => {
  const helloWorld = testEnv.wrap(helloWorldHandler);
  it("helloWorld does work", async () => {
    const data = {};
    const success: Message = await helloWorld(data);
    expect(success.code).toBe(200);
  });
});
When I run it with yarn test I receive the following error
Cannot find module 'firebase-functions/lib/encoder' from 'node_modules/firebase-functions-test/lib/providers/firestore.js'
Even though my function does not even use firestore in the first place?
Any ideas ?
I was facing a similar issue while trying to set up a unit testing environment for firebase cloud functions.
Mainly, after following all the steps on Firebase's docs, and running npm test
I would get the following error
Error [ERR_PACKAGE_PATH_NOT_EXPORTED] Package subpath './lib/encoder' is not defined by "exports"
After stumbling on Farid's suggestion for this problem, I realized that, for some reason, npm i firebase-functions-test does not install the latest version of the module.
Try npm i firebase-functions-test@latest.

Getting 500 | Internal Server Error when using getServerSideProps in NextJS after deploying in Vercel

I'm using server-rendered pages in Next.js with getServerSideProps, in index.js (the root page).
When I build locally, the website works fine. But when I host the site on Vercel, it shows 500 | Internal Server Error.
export async function getServerSideProps(context) {
  let params = context.query;
  const job_col = await collection(db, "job_list");
  const job_snap = await getDocs(job_col);
  let jobData = job_snap.docs.map((doc) => ({
    ...doc.data(),
    id: doc.id,
  }));
  return {
    props: {
      jobs: jobData,
      params
    },
  };
}
This is most likely caused by the size of the payload returned from getServerSideProps. Move the code that fetches the large payload to the client side (for example, into a useEffect hook) and the error should go away.
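One way to test the large-payload theory before moving anything is to measure how big the serialized props actually are. The sketch below is not part of the original answer: propsSizeInBytes and exceedsLimit are hypothetical helper names, the sample data is made up, and the limit is left as a parameter that you would fill in from Vercel's current documentation.

```javascript
// Hypothetical helper: size in bytes of the JSON that would be sent for these props.
function propsSizeInBytes(props) {
  return Buffer.byteLength(JSON.stringify(props), "utf8");
}

// Hypothetical helper: true when the serialized props are larger than limitBytes.
// limitBytes is an assumption supplied by the caller, not a value from the source.
function exceedsLimit(props, limitBytes) {
  return propsSizeInBytes(props) > limitBytes;
}

// Made-up data standing in for the Firestore documents in the question.
const sampleProps = {
  jobs: [{ id: "a1", title: "Engineer" }],
  params: {},
};
console.log(propsSizeInBytes(sampleProps), "bytes");
```

Logging this number from getServerSideProps on a local production build would show whether the job list is large enough to be a plausible cause.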
I had the same issue, but it worked once I changed getServerSideProps to getStaticProps.

Error 504 when making an API request in NEXTJS deploy at vercel

I have a service handler that handles CV uploads to Firebase Storage in a Next.js project. In development, everything works correctly when the client makes a request to the API. But after deploying to Vercel, the API request sometimes fails with a 504 error. Here is my handler code:
async add(userId, { file }) {
  const userService = new UserService();
  const { url: urlImage, pathName } = await this.#firebaseStorageService.uploadFile({
    file,
    folder: this.#firebaseStorageService.folder.assets,
    path: 'cv',
  });
  const user = await userService.getById(userId);
  // if the user has already uploaded a cv
  if (user.cv) {
    // remove old cv if exist
    const oldCv = await this.#getData({ field: '_id', value: user.cv.id });
    // delete old cv on storage
    await this.#firebaseStorageService.deleteFile(oldCv.path);
    // rewrite old cv on db
    oldCv.url = urlImage;
    oldCv.path = pathName;
    await oldCv.save();
    return oldCv._id;
  }
  const newCv = new CV({
    url: urlImage,
    path: pathName,
  });
  const cv = await newCv.save();
  await userService.addCV(userId, cv._id);
  return cv._id;
}
In Vercel, open the Usage tab and look at the Serverless Functions section. It gives a clear picture of the timeouts occurring during execution.
Cold start time for the serverless function could be one reason this happens. It may also depend on the region where the function runs and how long it takes to resolve and respond.
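To see which awaited step eats the time, it can help to log the duration of each one. This is only a sketch: timed is a hypothetical helper, and fakeUpload stands in for a slow call such as the storage upload in the question.

```javascript
// Hypothetical helper: runs an async step and logs how long it took,
// whether the step resolves or throws.
async function timed(label, step) {
  const start = Date.now();
  try {
    return await step();
  } finally {
    console.log(`${label} took ${Date.now() - start} ms`);
  }
}

// Stand-in for a slow call such as uploadFile in the question's handler.
function fakeUpload() {
  return new Promise((resolve) => setTimeout(() => resolve("uploaded"), 50));
}

async function main() {
  return timed("uploadFile", fakeUpload);
}

const mainResult = main();
```

Wrapping each await in the handler this way (upload, user lookup, delete, save) would make the slow step visible in the function logs next to the timeouts shown in the Usage tab.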

How come when I use getStaticPaths in production on my local server the pages load instantly but deployed on Vercel they don't?

Hi, I'm using Next.js and have an issue: when my app runs on my local server, the pages pre-rendered with getStaticProps load in a few ms, but when hosted on Vercel they take 300 ms to load.
Does anyone have any suggestions on how I can get my pages to load quicker on Vercel?
My app is currently hosted at https://next-movie-delta.vercel.app/
and my github repo is https://github.com/lewisRotchell/next-movie
My getStaticPaths code looks like:
export async function getStaticPaths() {
  const movies = await getMovies();
  const bigMovieArray = [
    ...movies[0].results,
    ...movies[1].results,
    ...movies[2].results,
  ];
  const paths = bigMovieArray.map((movie) => ({
    params: { movieId: movie.id.toString() },
  }));
  return {
    paths,
    fallback: "blocking",
  };
}
and the getMovies code looks like :
export async function getMovies() {
  const urls = [newReleasesUrl, popularMoviesUrl, topRatedMoviesUrl];
  try {
    let data = await Promise.all(
      urls.map((url) => fetch(url).then((response) => response.json()))
    ).catch((error) => console.log(error));
    return data;
  } catch (error) {
    console.log(error);
  }
}
Thanks :)
I've managed to fix the problem! I changed the linking from withRouter to a tag from next/link and it has fixed my issue!

Nuxt SSR firebase functions returns 504 timeout

I'm trying to implement Nuxt with SSR on Firebase Hosting (using Firebase Functions), but after my function is triggered I keep getting a '504 timed out waiting for function to respond'.
My Firebase function:
const functions = require("firebase-functions");
const { Nuxt } = require("nuxt");
const express = require("express");
const app = express();
const config = {
  dev: false,
  buidlDir: 'src',
  build: {
    publicPath: '/'
  }
};
const nuxt = new Nuxt(config);
function handleRequest(req, res){
  console.log('handling request');
  //res.set('Cache-Control', 'public, max-age=600, s-maxage=1200')
  nuxt.renderRoute('/')
    .then(result => {
      console.log('result: ' + result.html);
      res.send(result.html);
    })
    .catch(e => {
      res.send(e);
      console.log(e);
    })
}
app.get('*', handleRequest);
exports.ssrApp = functions.https.onRequest(app);
I also tried with:
function handleRequest(req, res) {
  console.log("log3");
  res.set("Cache-Control", "public, max-age=300, s-maxage=600");
  return new Promise((resolve, reject) => {
    nuxt.render(req, res, promise => {
      promise.then(resolve).catch(reject);
    });
  });
}
I also pinned my functions to Node 8, because I read that the Node version could cause problems:
"engines": {
  "node": "8"
},
The result is the same: my function is triggered but always times out. This happens both when serving locally and when deployed to Firebase.
Let me know if you need more information or code to help work out what the problem could be.
First, to find out what is causing the problem, enable the debug option.
Second, if you hit the timeout error, check that your paths are valid.
If the Nuxt build succeeded and nuxt.render runs, Nuxt handles the error itself and shows its own error page.
In other words, if you don't see the Nuxt error page, the cause may not be related to Nuxt at all.
I was also stuck for 4 hours on a timeout error, and I finally found that the cause was the contents of publicPath.
Please check these 2 things:
1. Is buildDir valid? (Note that the config in the question spells the key buidlDir.) Check that the .nuxt folder was deployed to your Cloud Functions successfully.
2. Were the publicPath contents uploaded successfully? The built files in .nuxt/dist must be uploaded to Firebase Hosting. Check this manually by typing a file's URL into the address bar, e.g. https://test.firebaseapp.com/path/to/file.js
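The first check can be scripted. The sketch below assumes the build directory is read from the buildDir key and falls back to .nuxt when that key is absent; checkBuildDir is a hypothetical helper, not part of Nuxt, and it only looks at the local file system.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: resolves the build directory from config.buildDir
// (falling back to ".nuxt", an assumption of this sketch) and reports
// whether that directory exists under rootDir.
function checkBuildDir(config, rootDir) {
  const buildDir = path.resolve(rootDir, config.buildDir || ".nuxt");
  return { buildDir, exists: fs.existsSync(buildDir) };
}

// With the misspelled key from the question, buildDir is never read,
// so the check falls back to ".nuxt" instead of "src".
const report = checkBuildDir({ dev: false, buidlDir: "src" }, process.cwd());
console.log(report);
```

Running this from the functions folder before deploying shows which directory would actually be used and whether it was copied there.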
Finally, here is my sample project using Nuxt and Firebase: Nuxt SSR with Firebase Integration
I was stuck like you and it made me really tired, so I'd be happy if this answer helps someone in the same situation.
PS: If you build Nuxt inside the functions folder, nuxt start fails, so be careful. In my project I build in the root and copy the output over when deploying.
I got the same problem because Nuxt was not ready yet (promise is undefined).
So try adding nuxt.ready() after new Nuxt().
Example:
const functions = require('firebase-functions');
const express = require('express');
const { Nuxt } = require('nuxt');
const config = {
  dev: false
  // Your config
};
const nuxt = new Nuxt(config);
const app = express();
nuxt.ready(); // <---------- Add this!
async function handleRequest(req, res) {
  res.set('Cache-Control', 'public, max-age=1, s-maxage=1');
  await nuxt.render(req, res);
}
app.get('*', handleRequest);
app.use(handleRequest);
exports.ssrApp = functions.https.onRequest(app);
Ref: https://github.com/nuxt/nuxt.js#using-nuxtjs-programmatically
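A variation on the same idea is to keep the promise returned by the setup call and await it in every request, so no request can render before setup has finished. The sketch below uses a stub class, FakeRenderer, which is not the Nuxt API; it only illustrates the "start once, await everywhere" pattern under the assumption that the setup call returns a promise.

```javascript
// Stand-in for a framework instance that needs asynchronous setup
// before it can render. This is NOT the Nuxt API, only a stub.
class FakeRenderer {
  constructor() {
    this.isReady = false;
  }
  async ready() {
    await new Promise((resolve) => setTimeout(resolve, 20));
    this.isReady = true;
  }
  render(url) {
    if (!this.isReady) throw new Error("renderer not ready");
    return `<html>${url}</html>`;
  }
}

const renderer = new FakeRenderer();
// Start initialization once, at module load, and keep the promise.
const readyPromise = renderer.ready();

// Every request waits on the same promise, so none renders too early.
async function handleRequest(url) {
  await readyPromise;
  return renderer.render(url);
}

handleRequest("/").then((html) => console.log(html));
```

Because the promise is created once at module load, warm invocations pay no extra cost: awaiting an already-resolved promise returns almost immediately.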
