Bad performance of Redwood app on render.com

gkk · October 10, 2022, 8:41am

Hi,

I’m hitting a performance snag with deploying a Redwood app on render.com, and I’m wondering if anyone else has run into it. Specifically, GraphQL calls to .redwood/functions/graphql are slow. I pinned it down to Render’s rewrite rules, which add over 1000ms of latency (confirmed with render.com’s team). E.g. https://app-web.onrender.com/.redwood/functions/graphql takes over 1000ms longer to respond than https://app-api.onrender.com/graphql.

This seems like a show-stopper for deploying redwood apps on render.com and I’m a little surprised to be the first one to discover it. Has anyone run into it and found a work-around?

shansmith01 · October 11, 2022, 8:46pm

I would be really interested in a workaround here

You have just caused me to review one of my render apps that’s not far off production and I can confirm responses are taking an age.

It’s a shame. I tried out Render because I believed in its serverful implementation of RW, having had perf nightmares on a Netlify install (GraphQL response times of 1.5-3 sec) where I believed the bulk of the issue was to do with serverless.

dthyresson · October 11, 2022, 9:47pm

Can you rule out your database query or n+1 issues by looking at the query time and/or resolver execution?

Also, can you rule out latency between where your app is located and where your database is, i.e. are they in different regions?

While I am not discounting Render I do know many Redwood startups that use it so just want to rule out other possible causes.

gkk · October 13, 2022, 11:21am

Yes, I looked at both the database and multi-region as possible sources of latency. However, this test rules out the database as a factor. Compare:

curl -w "@curl-format.txt" -X POST -H "Content-Type: application/json" \
-d '{ "operationName":"RedwoodQuery", "variables":{}, "query":"query RedwoodQuery { redwood { version} }" }' \
https://promptcraft-api.onrender.com/graphql

{"data":{"redwood":{"version":"3.0.3"}}}
     time_namelookup:  0.005659s
        time_connect:  0.054200s
     time_appconnect:  0.134051s
    time_pretransfer:  0.134285s
       time_redirect:  0.000000s
  time_starttransfer:  0.134291s
                     ----------
          time_total:  0.263442s

to:

curl -w "@curl-format.txt" -X POST -H "Content-Type: application/json" \
-d '{ "operationName":"RedwoodQuery", "variables":{}, "query":"query RedwoodQuery { redwood { version} }" }' \
https://promptcraft-web.onrender.com/.redwood/functions/graphql

{"data":{"redwood":{"version":"3.0.3"}}}
     time_namelookup:  0.005490s
        time_connect:  0.072247s
     time_appconnect:  0.134791s
    time_pretransfer:  0.135018s
       time_redirect:  0.000000s
  time_starttransfer:  0.135026s
                     ----------
          time_total:  1.361592s
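
For anyone repeating this comparison, here is a minimal TypeScript sketch that reads time_total out of two `curl -w` outputs and subtracts them. The function names are made up for illustration, and it assumes the output contains a `time_total:` line in the format shown above.

```typescript
// Pull the time_total value (in seconds) out of `curl -w` output like the runs above.
function parseTimeTotal(curlOutput: string): number {
  const match = curlOutput.match(/time_total:\s*([0-9.]+)s/);
  if (!match) {
    throw new Error("no time_total line found");
  }
  return Number(match[1]);
}

// Extra seconds the rewritten request took compared with the direct one.
function overheadSeconds(directOutput: string, rewrittenOutput: string): number {
  return parseTimeTotal(rewrittenOutput) - parseTimeTotal(directOutput);
}

// time_total lines copied from the two runs above.
const directRun = "          time_total:  0.263442s";
const rewriteRun = "          time_total:  1.361592s";
console.log(overheadSeconds(directRun, rewriteRun).toFixed(3)); // prints "1.098"
```

On the numbers posted here, the rewrite path costs roughly 1.1 seconds more than the direct call.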

The first HTTP call hits Redwood directly; the second one goes through a rewrite rule implemented by Render.com. The API is deployed in the Frankfurt region, and -web is multi-region. Let’s see if traceroutes agree on both:

❯ traceroute promptcraft-web.onrender.com
traceroute: Warning: promptcraft-web.onrender.com has multiple addresses; using 216.24.57.3
traceroute to gcp-us-west1-1.origin.onrender.com.cdn.cloudflare.net (216.24.57.3), 64 hops max, 52 byte packets
 1  192.168.68.1 (192.168.68.1)  2.915 ms  4.170 ms  3.763 ms
 2  10.90.166.205 (10.90.166.205)  4.416 ms  3.902 ms  5.417 ms
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  89.108.200.2 (89.108.200.2)  47.077 ms  32.136 ms  34.177 ms
 9  89.108.200.83 (89.108.200.83)  62.902 ms  46.451 ms  47.785 ms
10  cloudflare.tpix.pl (195.149.232.128)  33.082 ms  52.916 ms  50.354 ms
11  216.24.57.3 (216.24.57.3)  33.545 ms  33.478 ms  35.637 ms

~/tmp via ☕ took 1m16s
❯ traceroute promptcraft-api.onrender.com
traceroute: Warning: promptcraft-api.onrender.com has multiple addresses; using 216.24.57.253
traceroute to promptcraft-api.onrender.com.cdn.cloudflare.net (216.24.57.253), 64 hops max, 52 byte packets
 1  192.168.68.1 (192.168.68.1)  4.977 ms  3.939 ms  3.581 ms
 2  10.90.166.205 (10.90.166.205)  5.449 ms  5.521 ms  5.694 ms
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  89.108.200.2 (89.108.200.2)  36.632 ms  31.351 ms  45.671 ms
 9  89.108.200.83 (89.108.200.83)  33.286 ms  32.269 ms  34.516 ms
10  cloudflare.tpix.pl (195.149.232.128)  47.246 ms  42.703 ms  60.341 ms
11  216.24.57.253 (216.24.57.253)  29.397 ms  30.903 ms  30.224 ms

I’m surprised others deploying on Render haven’t noticed this problem, but it seems to be pervasive. The Render team advised me to configure my app to hit https://promptcraft-api.onrender.com/graphql directly as a work-around, but that opens a can of worms with CORS, and everything I tried broke the dev environment.
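
To illustrate what the "hit the API directly" work-around involves, here is a rough TypeScript sketch. It is not Redwood's API: the function names and the use of a NODE_ENV-style string are assumptions. It also assumes the relative path keeps working in local development, so only production switches to the direct host.

```typescript
// Hypothetical sketch: names and env handling are assumptions, not Redwood APIs.
const DIRECT_API_URL = "https://promptcraft-api.onrender.com/graphql"; // URL from the thread
const PROXIED_API_PATH = "/.redwood/functions/graphql"; // relative path, same origin as the web side

// Use the direct API host only in production; keep the relative path elsewhere.
function graphqlUrlFor(nodeEnv: string | undefined): string {
  return nodeEnv === "production" ? DIRECT_API_URL : PROXIED_API_PATH;
}

// Once the browser calls a different host, the API has to allow the web origin explicitly.
function isAllowedOrigin(origin: string, allowed: readonly string[]): boolean {
  return allowed.includes(origin);
}

const allowedOrigins = ["https://promptcraft-web.onrender.com"];
console.log(graphqlUrlFor("production"));
console.log(graphqlUrlFor("development"));
console.log(isAllowedOrigin("https://promptcraft-web.onrender.com", allowedOrigins));
```

The point of the split is that the CORS allowlist only matters in production, where the web and API hosts differ.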

dthyresson · October 13, 2022, 12:33pm

Thanks for the detailed diagnosis @gkk … I’ve asked a number of Redwood startups who I know use Render if they have seen similar behavior as well as pointed our Render partner connections to this issue.

I’ll let you know when I hear back … or they may reply here as well.

We’ll sort this out!

zygopleural · October 13, 2022, 3:04pm

+1 we see exactly the same

https://api.<>/graphql
     time_namelookup:  0.042742s
        time_connect:  0.053369s
     time_appconnect:  0.077880s
    time_pretransfer:  0.077939s
       time_redirect:  0.000000s
  time_starttransfer:  0.077941s
                     ----------
          time_total:  0.204974s

vs

https://app.<>/api/graphql
     time_namelookup:  0.005338s
        time_connect:  0.028663s
     time_appconnect:  0.052751s
    time_pretransfer:  0.052831s
       time_redirect:  0.000000s
  time_starttransfer:  0.052833s
                     ----------
          time_total:  1.118274s

gkk · October 16, 2022, 9:46pm

Thanks for the assistance @dthyresson. I’m very keen to learn what you heard back.

In the meantime, now that we have the problem confirmed, do any ideas for a work-around come to mind? Hitting the API service directly would be a solution, but it seems to me it would be difficult to convince Redwood to do so.

gkk · October 19, 2022, 9:56pm

Hi,

Just a small update on this from my end. Render folks are looking into this, but it seems to be an unfortunate interaction between Render and Cloudflare (they integrate Cloudflare’s CDN by default). Out of frustration, I gave fly.io a spin, as they have Redwood support. The GraphQL queries went from ~1300ms to ~50ms(!!), using exactly the same data in the db. This gave a really snappy feel to the app I’m working on.

Sadly, I found DX of fly.io to be rather rough[0] so I’d prefer to stick with Render. Still shopping for work-arounds.

[0] pushing 1.3GB container images from my laptop is not something I’m keen on

dbm · October 20, 2022, 12:57am

I’m an infrastructure engineer at Render. Thanks for these reports, which @dthyresson was kind enough to surface to us. Currently all static site rewrites are routed through one of our Oregon clusters. For API services located in Frankfurt, Singapore, Ohio, or even a different Oregon cluster, this introduces unnecessary (and noticeable) latency. We’re planning to rearchitect our routing layer so we can cut out this extra hop. We’ll keep you posted on our progress.

gkk wrote: “Still shopping for work-arounds.”

@gkk If you’re willing to use a separate CDN, you could point it to a single web service on Render that serves both static assets and dynamic API requests. I haven’t thought through all the details, but am happy to go deeper with you.

gkk · October 20, 2022, 10:50am

dbm wrote: “@gkk If you’re willing to use a separate CDN, you could point it to a single web service on Render that serves both static assets and dynamic API requests. I haven’t thought through all the details, but am happy to go deeper with you.”

The -api/-web split was done for me by Render when I imported my Redwood.js project. Are you suggesting hosting everything as one service, essentially what Redwood describes as baremetal deployment (Introduction to Baremetal | RedwoodJS Docs)?

dbm · October 21, 2022, 6:23am

Yes, the api/web split is a more natural way to do things (notwithstanding the performance issue). As a workaround, you could host everything as one Render service. If you want to improve performance for serving static assets you could put a CDN in front. The CDN would use your Render service as its origin server and should be configured to cache only static assets, not API responses.
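
As a rough sketch of the "cache only static assets, not API responses" rule, a single service could choose its Cache-Control header per path along these lines. The path prefixes, the extension list, and the function name are assumptions for illustration. The long-lived policy is only safe if asset filenames are fingerprinted.

```typescript
// Hypothetical path conventions: adjust the prefixes and extensions to the actual service.
const API_PREFIXES = ["/graphql", "/.redwood/functions/"];
const STATIC_EXTENSIONS = [".js", ".css", ".png", ".svg", ".woff2"];

function cacheControlFor(pathname: string): string {
  // API responses must never be stored by the CDN.
  if (API_PREFIXES.some((prefix) => pathname.startsWith(prefix))) {
    return "no-store";
  }
  // Static assets can be cached for a long time (assumes fingerprinted filenames).
  if (STATIC_EXTENSIONS.some((ext) => pathname.endsWith(ext))) {
    return "public, max-age=31536000, immutable";
  }
  // Everything else (e.g. HTML) may be stored but must be revalidated before reuse.
  return "no-cache";
}

console.log(cacheControlFor("/.redwood/functions/graphql"));
console.log(cacheControlFor("/static/js/app.3f2a1c.js"));
console.log(cacheControlFor("/index.html"));
```

The API check comes first so that an API path ending in a static-looking extension is still never cached.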

gkk · November 16, 2022, 3:40pm

Hi,

I wanted to close this thread by saying that I didn’t find a good work-around for the original issue. Moreover, I ran into more perf problems on Render, this time related to very slow queries to Postgres from Redwood/Prisma (exactly the same queries executed via psql ran fast).

Fueled by another bout of frustration, I migrated to Netlify + Supabase, and it’s been mostly a smooth voyage.

shansmith01 · November 16, 2022, 5:06pm

Really hoping we see some more action from Render on this.

As an aside, I have tried several hosting/DB combos out there and I am yet to find something that really fires. I don’t think this is necessarily a Redwood issue; I think it’s a combo of:

I am serving customers in Australia and New Zealand and most hosting solutions don’t offer server/db options close to us.
I think every hot host/paas’s marketing department works as quickly as possible to stick the logo of every hot framework on the front of their website so they look good and then walk away. It is evidenced by a lack of maintenance on support docs and attention to perf. I’m a marketer at heart so I get the strategy, but the sour taste in a user’s mouth is not worth it in my opinion.

Now I have that rant off my chest…

If there were interest from the community I would consider some sort of benchmarking project across different hosting providers.

And please sound off if you have a hosting solution you are super happy with.

Justin · May 24, 2023, 6:27pm

Hey everyone! I’m starting to investigate slowness of our redwood app deployed on render (mostly due to our own inefficient code probably), and I noticed this thread. Does anyone know if this is still an issue with render.com rewrites before I look into it? There hasn’t been activity in a while, so I’m hoping it’s resolved or perhaps someone has found a lower effort workaround. Thanks!

thedavid · May 24, 2023, 7:40pm

I don’t believe there’s currently an issue. Question: when did you first start deploying to Render? (Asking because it’s possible the config changed over time when you run yarn rw deploy setup render.)

Where’s your DB hosted? Network latency between Server<>DB is the number one reason for slow performance in my experience (even worse than AWS Lambda cold starts).
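
A back-of-the-envelope sketch of why Server<>DB latency adds up: if a request issues its queries one after another, each one pays a full network round trip. The function name and the input numbers are purely illustrative, not measurements from this thread.

```typescript
// Rough lower bound on time spent waiting on the network when queries run sequentially.
function networkWaitMs(sequentialQueries: number, roundTripMs: number): number {
  return sequentialQueries * roundTripMs;
}

// Illustrative inputs only: 10 sequential queries, same-region vs. cross-region round trips.
console.log(networkWaitMs(10, 1)); // prints 10
console.log(networkWaitMs(10, 80)); // prints 800
```

Query execution time comes on top of this, so the estimate is a floor rather than a prediction.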

Recommendations for diagnosing:

try Studio locally to inspect: Redwood Studio [Experimental]
try OpenTelemetry in deployment: OpenTelemetry Support [Experimental]

Justin · May 24, 2023, 8:10pm

Hey, thanks for the quick reply! Both API and DB are hosted on Render in Oregon (US West). We first deployed our app to render in late March 2023. Thanks for sharing those recommendations, I’ll give them a try

thedavid · May 24, 2023, 8:43pm

I also should have asked which Render plan you’re using? The free and lower priced plans are significantly throttled (if I recall correctly).

clarkbw · May 27, 2023, 3:39pm

Wanted to add that I’m running on render and not seeing performance issues. I run the starter plan and while build times are slow (3 - 5 mins) on that plan the performance of the service running is fast. I have my db hosted elsewhere (another AWS backed provider) and I’m not seeing that as an issue yet.

sara · April 29, 2024, 10:02pm

Hey - I’m an infrastructure engineer at Render. I wanted to share an update here - we recently changed our architecture so that static sites are now stored and served from the region selected for a user’s web service. This means that rewrites will be evaluated in that region, no longer requiring the round-trip through Oregon. Link to our changelog.

Sites created starting March 14th 2024 are opted into this behavior automatically. Any existing sites that have been re-deployed since April 10th will also adopt the new behavior.

One quirk is that if a static site was created before any web services, it will have defaulted to the Oregon region. Re-creating the static site will cause it to inherit the region of your web services.