Prerender proposal

mojombo · July 1, 2020, 5:26pm

I’ve been thinking about what the developer experience should be for enabling route-based prerender in a Redwood app and here’s a proposal:

Let’s say your home page is a marketing page that doesn’t take any parameters and should just be static. All you’ll have to do is add prerender to the route:

// web/src/Routes.js
<Route path="/" page={HomePage} name="home" prerender />

And that’s it! It will be rendered out at build-time and delivered as a static file. Now, if you have route parameters, it gets more interesting. You’ll still specify prerender in the route:

// web/src/Routes.js
<Route path="/todo/{id}" page={TodoPage} name="todo" prerender />

But now you need to get all possible values of id at build time. There are two ways you might want to do this. Perhaps you have a static list of IDs that don’t change very often (you would be happy changing a code file to update it). In that case, you could create a file on the web side at src/prerender.js:

// web/src/prerender.js
export const todo = [ {id: 1}, {id: 2}, {id: 3} ]

The idea here is that you can export an array-of-objects with the same name as the route you’ve requested to be prerendered and Redwood will grab the possible route params from there. You could also use this method to read them off disk from a JSON file or the like.

This is cool, but often you’ll want to get a dynamic list of route params from your database at CI build-time. In this case, it would be handy for the code to live on the api side so you can have access to whatever data access machinery normally lives there. But we don’t want to make you have to do complicated data fetching, so we could allow you to instead create a src/build/web/prerender.js file on the api side:

// api/src/build/web/prerender.js
import { todos } from 'src/services/todos'

export const todo = todos

This example is in the context of the example-todo app, and in it we can import the todos function from the todos service (which is exactly why services are written like this!). It will simply return an array of Todo objects, each of which contains an id parameter. Redwood will then orchestrate how that data is consumed by the prerender build process (maybe via a build-time-only GraphQL service, or maybe something else).

In either case, with only a few lines, you can enable prerender on a per-route basis and provide the necessary route params so Redwood can iterate over the possible pages and get them all rendered for you at build time.
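To make the "iterate over the possible pages" step concrete, here is a minimal sketch of how a route path and an exported params array could be turned into a list of paths to render. The helper name expandRoutePaths is hypothetical and is not part of Redwood; it only assumes the {id} (or {id:Int}) placeholder syntax shown in the routes above.

```javascript
// Hypothetical helper, not part of Redwood: expands a route path such as
// "/todo/{id}" into concrete paths using an array of route-param objects.
function expandRoutePaths(routePath, paramsList) {
  return paramsList.map((params) =>
    // Matches {name} and typed placeholders like {name:Int}.
    routePath.replace(/\{(\w+)(?::\w+)?\}/g, (_match, name) => {
      if (!(name in params)) {
        throw new Error(`Missing route param "${name}" for ${routePath}`)
      }
      return encodeURIComponent(String(params[name]))
    })
  )
}

// Same shape as the web/src/prerender.js export in the proposal.
const todo = [{ id: 1 }, { id: 2 }, { id: 3 }]
const paths = expandRoutePaths('/todo/{id}', todo)
// paths is ['/todo/1', '/todo/2', '/todo/3']
```

A route without parameters, like the home page, would simply be rendered once at its own path.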

You might wonder why we don’t simplify this and JUST have the api side file. The answer is two-fold:

If you don’t have an api side at all (which is fine), then you’d have no way to prerender!
If you don’t have an api side prerender file, then we can simplify the build process and not have to spin up any database access there, which will make the build happen faster and easier, so it ends up being an optimization to use the web side prerender file.

Let me know what you think!

Tobbe · July 1, 2020, 7:55pm

This is very exciting Tom!

For the app I’m building the third option, with the api-side prerender.js, is going to be very useful!

chris.johnson · July 1, 2020, 8:13pm

Sounds really great, I think the third option would be most useful for my use cases.

Just a quick question, appreciate this might be out of the scope at this time, but does the pre-render proposal allow for any form of ‘Incremental Builds’?

I am sure you guys are aware of the same concept in Next.js, https://nextjs.org/blog/next-9-4#incremental-static-regeneration-beta.

Obviously it isn’t ideal if you just, for example, make a minor text change and this requires a whole rebuild.

peterp · July 2, 2020, 6:15am

Oooh, I love the way that you’re tying the router and the “prerender.js” file together

dthyresson · July 8, 2020, 1:17pm

Hi.

I thought I’d share my developer experience implementing an app that did many page prerenders (30,000+ pages) during build and deploy that was, shall we say … sub-optimal.

Hope that my experience can inform some considerations when designing and implementing prerender in RedwoodJS – which will be a great and – I think – oft used feature.

Page Rendering circa 2017

First, we travel back to Fall 2016, Winter 2017. I jumped aboard the JAMStack train.

The app I built used Middleman, a static-site generator, since I came from the Ruby world and we had already been using Contentful to store “research data” (company & people profiles, blog posts, “market maps”). So, thinking ahead, we added Algolia for search, Auth0 to authenticate (and authorize via roles), and Netlify to build and deploy … and enforce auth.

We were an early user of Netlify’s role-based redirects where cookies stored the JWT and said if a user has access to the market map area, etc.

Contentful didn’t yet have GraphQL support and its Ruby SDK was still in its early stages, but its Delivery API could access the data I needed to generate pages. Also, Netlify did not yet have plugins to help with prebuild tasks or the build plugin cache to help store file data between builds. I also wasn’t going to check 30,000 pages into GitHub and keep them in sync. Plus, what would check them in? A local build? Netlify?

First Approach

My first approach was basically (a little simplified):

During Netlify build
Use a rake task in the build command
to fetch all data needed from Contentful
store in yaml files per type (companies, posts, people, etc). Maybe 3-4k entries.
fetch article data from a private microservice (25k+)
build app
Wait for Netlify to send up lots and lots of possible changes to CDN

At some point, build times got to be over 3 hours (sometimes hitting 6 hours if, let’s say, a layout changed and all 40k+ pages changed with it), memory exploded, and pushing to the CDN could fail on timeouts due to the number of files … so we got our own build instance and only built in the morning and evening.

Something had to change.

Second Approach later in 2017

Where can I optimize? Yaml generation.

With

some optimization (gzipping and archiving yaml to S3 and
only fetching data from Contentful and articles from a “last” date) and
optimizing the page rendering for Middleman

I got the builds down to 60-90 mins and no memory explosion.

This involved lots of pre-processing the pages w/ front matter instead of loading the massive datasets.

Again as a rake task, but now could be done w/ Netlify plugins.

Third Approach early 2018

Can we scale to more articles?

We were adding articles at a rate of 1-2k per week, so w/in a few months 30k would become 60k. Would become 90k. FYI - there are ~ 300k now.

No longer prerender the article pages
Embedded a Vue.js app to fetch from the Article API itself (validated user’s JWT and access etc etc)
So now fewer pages and API calls

and build went down to < 30 mins.

Fourth approach early 2019

Scrap it all and make React app.

One time full data load
Contentful webhooks send changes (CRUD) to a microservice that sends to GetStream collections
Microservice sends Articles to GetStream
Other data (charts, structured JSON data) stored on S3 and Netlify lambda functions fetch
App builds only on feature changes and takes ~2-3mins

2017 Problems

Here are some of my problems with larger scale prerendering (again on not so optimized systems but alas the concepts hold true for design consideration).

Hope some of this can inform prerendering with RW.

Pagination

Whatever fetches the data to be rendered as pages may either have to render in batches (100-1000 at a time) or be able to fetch the entire dataset.

I haven’t tried a RW graphql example with pagination to see what that might look like.
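As a rough illustration of the batch option, here is a sketch of paging through a dataset with an async generator so only one batch is in memory at a time. Everything here is an assumption for illustration: fetchPage stands in for whatever service or API call returns a slice of records, and the skip/take option names are borrowed from Prisma-style pagination.

```javascript
// Hypothetical sketch: walk a large dataset in fixed-size batches so the
// prerender step never holds every record in memory at once.
// `fetchPage` is an assumed function: ({ skip, take }) => Promise<Array>.
async function* inBatches(fetchPage, batchSize = 500) {
  let skip = 0
  while (true) {
    const batch = await fetchPage({ skip, take: batchSize })
    if (batch.length === 0) return
    yield batch
    // A short batch means we have reached the end of the dataset.
    if (batch.length < batchSize) return
    skip += batch.length
  }
}

// Usage with an in-memory stand-in for a service or API call.
const allTodos = Array.from({ length: 1200 }, (_, i) => ({ id: i + 1 }))
const fakeFetchPage = async ({ skip, take }) => allTodos.slice(skip, skip + take)

async function renderAll() {
  let rendered = 0
  for await (const batch of inBatches(fakeFetchPage, 500)) {
    rendered += batch.length // render each page in the batch here
  }
  return rendered
}
```

With 1,200 records and a batch size of 500, this yields batches of 500, 500 and 200.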

Memory

If the entire dataset is returned, is this a memory concern?

When Middleman had to load all the yaml files, it was taking several GB of RAM and as I said, I had to get a dedicated Netlify build box. We were on Enterprise, so everyone there was happy to help. But, no.

Multiple Models in page

How would multiple models in one page be handled – assuming the 2nd model is not related to the first?

For example,


// api/src/build/web/prerender.js

import { todos } from 'src/services/todos'

export const todo = todos

I want to show todos and maybe also show a map of each todo’s lat/lon on the page (guessing here). And let’s say that means a MapCell that calls a 3rd-party API with the lat/lon to fetch (city, state, zip, country).

Would a MapCell component still render if passed the lat, lon? Even if the Cell makes an API or GraphQL call?

In 2017 this meant I had to have all the data for all models in memory so the page could access whatever it needed.

Timeouts / Connection Resets

During pagination of the Contentful API, I relatively frequently hit connection resets or timeouts due to network issues.

May need to implement an exponential backoff and retry in pagination calls.
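A backoff-and-retry wrapper for those calls could look roughly like this. It is only a sketch: withRetry, retries, baseDelayMs and sleep are made-up names, and the wrapped function is whatever performs one page fetch.

```javascript
// Hypothetical sketch: retry an async call with exponential backoff.
// `sleep` is injectable so the delay can be stubbed out in tests.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function withRetry(
  fn,
  { retries = 5, baseDelayMs = 250, sleep = defaultSleep } = {}
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt)
    } catch (error) {
      // Out of retries: rethrow so the build fails instead of continuing.
      if (attempt >= retries) throw error
      // Wait 250ms, 500ms, 1s, ... before the next attempt.
      await sleep(baseDelayMs * 2 ** attempt)
    }
  }
}
```

Rethrowing after the last attempt matters here, because it turns a flaky fetch into a failed build rather than a partial dataset.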

Prerender data fetch fails = Incomplete Site?

Until I handled timeouts or retries more gracefully, build/deploy would not necessarily fail … even worse I’d only have a subset of data or old data and the site would reflect that.
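One way to avoid shipping a partial site is to compare what was fetched against the total the API reports and abort the build on a mismatch. A small sketch, assuming the data source can report an expected total (assertCompleteFetch is a hypothetical name):

```javascript
// Hypothetical guard: abort prerendering when the fetched data is incomplete,
// rather than silently shipping a partial or stale site.
function assertCompleteFetch(items, expectedTotal, label = 'records') {
  if (items.length !== expectedTotal) {
    throw new Error(
      `Prerender aborted: fetched ${items.length} of ${expectedTotal} ${label}`
    )
  }
  return items
}
```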

Number/Cost/Limits of API Calls

If the prerender data isn’t cached (say as a Netlify build plugin or elsewhere) then every build will make api calls. If that is to the Prisma-backed database, maybe not a problem.

But, if a third-party/external API is being used, there are rate limits and calls-per-day caps to consider. This can also have monetary considerations.
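To cut down on repeat calls, responses could be cached and reused while they are still fresh. The sketch below keeps the cache in any Map-like store. How that store is persisted between builds (a build cache directory, S3, etc.) is left out, and createCachedFetcher, maxAgeMs and now are hypothetical names.

```javascript
// Hypothetical sketch: skip API calls whose cached result is still fresh.
// `store` is any Map-like object; a real build would need to persist it
// between builds, which is not shown here.
function createCachedFetcher(fetchFn, store, { maxAgeMs, now = Date.now }) {
  return async function cachedFetch(key) {
    const hit = store.get(key)
    if (hit && now() - hit.fetchedAt < maxAgeMs) return hit.value
    const value = await fetchFn(key)
    store.set(key, { value, fetchedAt: now() })
    return value
  }
}
```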

Other Thoughts

The term “prerender” has some name recognition with a “prerendering” service for SEO. Will this be confusing? See: https://docs.netlify.com/site-deploys/post-processing/prerendering/
There comes a point where prerendering isn’t suitable and the pages should be dynamic. Maybe it is 100 pages or 1000, or it depends on how the prerender data is fetched (and from where). The developer needs to be sensible about whether to use it or not. But that’s the nature of things.
Would Auth work the same way? Would it be possible to also use Netlify auth/role based redirects to enforce? Not sure one would want to, but just thinking.

TL;DR

So, that’s my experience of “when prerendering goes bad”.

Pagination
Number/Cost of API calls
Build time
Memory
Connection reset/timeouts
Fails, incomplete sites

Confident it won’t go that way in RW.

dthyresson · July 9, 2020, 7:10pm

Read Pre-rendering with react-snap & Redwood and that raised another consideration that I forgot:

How to handle routes, pages nested in <Private>

Assumption might be that such auth-backed pages cannot be prerendered and that the prerender is not allowed on a Route if surrounded by <Private>.
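If that assumption holds, the build could reject the combination up front. A sketch with made-up route descriptors (the name, path, prerender and isPrivate fields are illustrative only, and Redwood's real route objects may look different):

```javascript
// Hypothetical route descriptors; Redwood's internal route objects may differ.
function prerenderableRoutes(routes) {
  const offenders = routes.filter((route) => route.prerender && route.isPrivate)
  if (offenders.length > 0) {
    const names = offenders.map((route) => route.name).join(', ')
    throw new Error(`prerender is not allowed inside <Private>: ${names}`)
  }
  return routes.filter((route) => route.prerender)
}

const exampleRoutes = [
  { name: 'home', path: '/', prerender: true, isPrivate: false },
  { name: 'admin', path: '/admin', prerender: false, isPrivate: true },
]
prerenderableRoutes(exampleRoutes) // returns only the "home" route
```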

nickg · September 14, 2020, 5:46pm

Great proposal! Some thoughts:

Build-time prerendering will use the data for the correct environment I presume? Production would need production data, a staging environment its own staging data, local dev builds local data.
Sometimes you’d like to update the prerendered pages without deploying / building. For example, for a highly dynamic site you might want to update the prerendering every hour. Could there be a remote rake-like task for this purpose? Or does this break the JAMstack concept?

thedavid · September 15, 2020, 1:15am

Hi @nickg!

Yes, correct. Whatever DB connection you are using specific to that environment will be the data used. Same for integrations to other services — most likely environment variable controlled (pros/cons).
Absolutely possible. You could set up anything from a GH Action to the new Repeater.dev.

nickg · September 28, 2020, 5:32pm

Quoting chris.johnson: “Just a quick question, appreciate this might be out of the scope at this time, but does the pre-render proposal allow for any form of ‘Incremental Builds’?”

I think you should be able to build only certain routes with a build parameter (with an array or glob). This way, for example if you have a CMS (in your Redwood app itself of course) you could only build the page whose data is recently saved / published as a trigger.
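A sketch of what selecting routes by glob might look like. No such build parameter exists in the proposal, so the function names and the glob rules (a single * matches within one path segment, ** matches across segments) are assumptions for illustration.

```javascript
// Hypothetical sketch: pick which prerendered paths to rebuild from a list
// of simple glob patterns. "*" matches within one path segment and "**"
// matches across segments.
function globToRegExp(glob) {
  // Escape regex metacharacters, leaving "*" for the glob handling below.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
  const pattern = escaped
    .replace(/\*\*/g, '\u0000') // temporary marker for "**"
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*')
  return new RegExp(`^${pattern}$`)
}

function selectPaths(allPaths, globs) {
  const matchers = globs.map(globToRegExp)
  return allPaths.filter((path) => matchers.some((re) => re.test(path)))
}

const sitePaths = ['/posts/1', '/posts/2', '/about', '/docs/a/b']
selectPaths(sitePaths, ['/posts/*']) // ['/posts/1', '/posts/2']
```

A CMS save hook could then pass just the path of the page that changed, for example selectPaths(sitePaths, ['/posts/2']).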

Next.js’s incremental static regeneration is also a good idea, but expects highly dynamic data. Still, I highly welcome it.

nickg · January 25, 2021, 12:36pm

I take that back. Having now used incremental static regeneration (ISR) in Next.js 9.5+, I truly think it’s a game-changer.

You have the speed of prerendered (SSG) with the up-to-date content of server rendering (SSR). The only difference is a slight chance of outdated data in the page. With few visitors to a page, this can be minutes or hours. However, with many visitors to a page, this can be as low as 1 second (the default revalidate period). So it works out for important pages that usually have lots of visitors.
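For anyone who wants to see the idea in code, here is a minimal in-memory sketch of that stale-while-revalidate behaviour. It is not how Next.js implements it. The names createSwrCache, render and revalidateMs are hypothetical, and render is assumed to return a promise of the page HTML.

```javascript
// Minimal in-memory sketch of stale-while-revalidate. This is not Next.js's
// implementation; `render` and `revalidateMs` are hypothetical names.
function createSwrCache(render, { revalidateMs = 1000, now = Date.now } = {}) {
  const cache = new Map() // path -> { html, renderedAt, refreshing }

  return async function getPage(path) {
    const entry = cache.get(path)
    if (!entry) {
      // First visit: nothing cached yet, so render and wait for it.
      const html = await render(path)
      cache.set(path, { html, renderedAt: now(), refreshing: null })
      return html
    }
    const isStale = now() - entry.renderedAt >= revalidateMs
    if (isStale && !entry.refreshing) {
      // Re-render in the background; this visitor still gets the stale copy.
      entry.refreshing = Promise.resolve()
        .then(() => render(path))
        .then((html) => {
          cache.set(path, { html, renderedAt: now(), refreshing: null })
        })
        .catch(() => {
          entry.refreshing = null // keep serving the old page on failure
        })
    }
    return entry.html
  }
}
```

A page with many visitors gets re-rendered soon after each revalidate window passes, while a rarely visited page can stay stale until its next visitor arrives, which matches the trade-off described above.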

A big client was so impressed they implemented their own solution based on this idea, since they cannot use Next.js at the moment.

thedavid · January 26, 2021, 12:39am

@nickg There have been some excellent internal discussions prompted in part by the Next iSSG thing (which is effectively Stale while Revalidate cache control) and the React Server Components announcement. The result is a longer-term vision for what we’re referring to as Redwood as Scale — what are the performance requirements to scale applications and how could we implement novel approaches to achieve them?

The first step is Redwood-style prerendering (build-time generation). Depending on your page, this might mean a “static page” or better handling of data-loading for marketing or product pages. Here’s the PR for step 1 of this initiative. Do take a look and let us (especially @danny) know what you think:

ajcwebdev · February 3, 2021, 10:21am

We recently had Danny on the FSJam podcast to talk about the latest prerender proposal, and I wanted to write up some notes from the episode for anyone who is interested in this topic but doesn’t want to listen to the episode.

I’ve had a very hard time wrapping my mind around all these terms, what they mean, and how they apply to Redwood, and I found these explanations very useful. Here are Danny’s comments, lightly edited for clarity:

Origin of the term server-side rendering

I’m going to describe my understanding of prerendering. Before the amazing stuff that Next.js did with SSG and SWR (there’s all of these terms that Next.js introduced) there was really one way to do it with React. You run it through an Express server and then you get some HTML back.

This was my first experience with SSR, or server-side rendering with React. This doesn’t necessarily mean whether it happens at runtime or build time. All it means is using a NodeJS environment to render React code. That’s what SSR originally meant.

Shifting definitions of SSR

If you look at old documentation, that’s what they’ll refer to. You’ll see if you look at outdated libraries “SSR now supported.” Those meant I’m doing an extra check for window before trying to do something with the library. As Next introduced all of these new terms, SSR became an overloaded term.
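The "extra check for window" typically looks something like the sketch below. The helper getViewportWidth and its fallback value are made up for illustration; the typeof check itself is the usual pattern.

```javascript
// Guard browser-only APIs so the same module can also be evaluated in Node,
// where `window` is undefined.
const isBrowser = typeof window !== 'undefined'

function getViewportWidth(fallback = 1024) {
  // During a server or build-time render there is no viewport to measure.
  return isBrowser ? window.innerWidth : fallback
}
```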

SSR now means, “I’m going to render it at runtime.” SSG now means, “I’m going to run server rendering at build time,” but it’s actually doing the same thing it’s just where or when it’s doing it. And of course there are some differences.

The difference between SSR and static generation

When you say “static,” I feel like there’s more confusion that can get generated. When you say “static generation,” I think of tools like Gatsby, or Jekyll, or Hugo. These are tools, whatever you write in, it’s not even important.

At the end of it, it generates static HTML files. Once you deploy it that’s it, there’s nothing dynamic about it. It’s static generation for that reason. That’s how I’ve used “static.”

Prerendering in Redwood

The way I view prerendering in Redwood (the official term still to be decided by the way) is much closer to the original SSR. Which is to say, we’re going to take a component, and we are going to render it in a NodeJS environment.

Currently with Redwood, we’re going to try to do this at build time. At build time it renders your landing page so that you get all the benefits of SEO. Or maybe you want to create the skeleton of your page before your full application loads up.

But the difference from static is that once the JavaScript loads up, it will hydrate your page. It starts with a static HTML file and then as soon as your React application or your Redwood application has loaded it will become dynamic.

Defer Loading of JavaScript (and possibly CSS)

Beyond the SEO benefits (if there are any, I’m not an expert on this topic), you see the experience benefits. You can start to do some fun stuff like defer the loading of the JavaScript file altogether. Maybe even defer loading of CSS files if you’re brave, that’s something I’m in the process of looking at. The idea behind it is you make it feel as fast as possible while also making sure that bots can parse it.

Stale-while-revalidate

Part of Tape, the share page is built with Next. In the Next context, SSG (static stuff) also hydrates. That’s something you should let me first explain. iSSG means, whenever you go onto a page, in the background it will run that render process again. When the next person visits that page the next person will see the latest one.

Even though the first person got a stale page, the page is rendered and cached. This is also known as SWR (stale-while-revalidate). Like I said, there’s so many terms describing what it is. An easy way to think about it is to always think about whether it’s happening at build time or runtime. That clarifies it, I think.

adriatic · February 19, 2021, 11:03pm

I found this gem, hidden in the “prerender” discussion and nearly missed it. If my ability to see the future still works as I am used to, there will be a big increase in the number of people that love RedwoodJS and would like to learn and use it.

Being one of these novices, I can tell you that I did not always have an easy time discovering what is important and what is very important. That brings me to the idea of writing the Redwood Hitchhiker Guide through JamStack.

Tobbe · May 26, 2022, 3:01pm

Two years since the initial proposal this is finally happening!

If you’re interested you can follow along with my work here: WIP: Cell prerendering by Tobbe · Pull Request #5600 · redwoodjs/redwood · GitHub
The biggest difference compared to Tom’s initial idea is that instead of two separate prerender.js files there is now just one, and it lives in /scripts/prerender.js (next to /scripts/seed.js). And while the OP’s text here focuses on how to provide values for route parameters, I’m also taking the natural next step and prerendering the cells on the pages too. Because in like 90% (or more) of the cases that’s what those route parameters are used for.

No details are set yet. We still have to discuss my solution in the core team.