Handling long-running APIs

One of my GraphQL endpoints needs to make a call to an API, and that API often takes about a minute to respond (GPT-4 is slow af). The issue is that the serverless function is then stuck waiting around, which is both a waste of resources and exceeds the Vercel runtime limits.

Does anyone have advice for handling situations like this?

Thanks!

You def need to put that request in a background job.

You're not going to be able to run a request like that on serverless architecture. Not sure whether the request is a query or a mutation, so I'm not sure exactly how to handle it, but it should definitely be moved to the background.

Take a look at inngest.com

We also have an experimental setup command, which I worked on, to get you started:

yarn dlx rw-setup-inngest

1 Like

I think you may in fact still be able to do this on Vercel and Netlify serverless, as their background jobs can run asynchronously for 10 to 15 minutes, but I would orchestrate this with Inngest as Kris said, because there is no callback from those async jobs.

  1. Send an event to run an OpenAI task step function, which kicks off the work and then awaits a job-completed event.
  2. Receive the event and trigger the background job via some HTTP request (i.e., treat it as a signed, verifiable webhook).
  3. When that long-running task completes, it saves the result and then sends the job-completed event with a reference to the info.
  4. The step function resumes, fetches the data for that reference, and maybe adds a record for a notification.
  5. A notification auto-polling request sees the new result and presents it to the user.
  6. User views the result.

It sounds complicated but it’s super powerful and can be used to do any long task
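A rough sketch of that step function with Inngest's TypeScript SDK (v3-style). The event names, the worker URL, and the Notification model are placeholders I made up, not anything Redwood or Inngest ships:

```ts
// api/src/jobs/openaiTask.ts (illustrative path)
import { Inngest } from 'inngest'
import { db } from 'src/lib/db' // Redwood's Prisma client

// Client id/name depends on your SDK version
export const inngest = new Inngest({ id: 'redwood-app' })

export const openaiTask = inngest.createFunction(
  { id: 'openai-task' },
  { event: 'app/openai.requested' },
  async ({ event, step }) => {
    // Steps 1-2: trigger the long-running worker over HTTP
    // (e.g. a background function); URL is a hypothetical env var
    await step.run('start-openai-job', async () => {
      await fetch(process.env.OPENAI_WORKER_URL as string, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ jobId: event.data.jobId, prompt: event.data.prompt }),
      })
    })

    // Step 3: suspend until the worker sends the job-completed event,
    // matched on the same jobId
    const completed = await step.waitForEvent('wait-for-completion', {
      event: 'app/openai.completed',
      timeout: '15m',
      match: 'data.jobId',
    })

    // Step 4: record a notification the web side can poll for
    // (Notification is an assumed Prisma model)
    await step.run('save-notification', async () => {
      await db.notification.create({
        data: {
          userId: event.user?.id,
          resultRef: completed?.data.resultRef,
        },
      })
    })
  }
)
```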

1 Like

Thanks @KrisCoulson and @dthyresson for your replies!! Yeah, I’ve been trying to ideate on a way to handle any long running task because I agree - having infra to do this would be super powerful.

I’ve been trying to play around with Vercel’s edge functions but I can’t get it to work with a function in a Redwood project. Been talking to support and they’re just as confused.

Will try to get the flow y'all suggested working with Inngest! It seems relatively straightforward, in terms of triggering an Inngest job from a GraphQL service, and then using a webhook to tell another serverless function when it's completed.
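Roughly, I'm imagining something like this on the service side; the event name, payload shape, and the `src/lib/inngest` client location are just placeholders to make the idea concrete:

```ts
// api/src/services/completions/completions.ts (illustrative)
import { randomUUID } from 'node:crypto'

import { inngest } from 'src/lib/inngest' // wherever the Inngest client lives

export const requestCompletion = async ({ input }: { input: { prompt: string } }) => {
  const jobId = randomUUID()

  // Fire-and-forget: the mutation returns immediately while Inngest
  // runs the long task in the background
  await inngest.send({
    name: 'app/openai.requested',
    data: { jobId, prompt: input.prompt },
    // `context` is Redwood's per-request global in services
    user: { id: context.currentUser?.id },
  })

  return { jobId }
}
```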

What I get stuck on is this - how do I then notify the user that the job has been completed?

You may also want to checkout the Inngest Discord topic here: https://discord.com/channels/842170679536517141/1089456173863944192

Vercel currently imposes a 60-second execution time limit on its serverless functions. If I have a job that needs more than 60 seconds, how can I run it? Are function steps what I'm looking for?

1 Like

In my pseudo-code example above, that was steps 4-6:

  • When the task is complete, write a Notification record for that result belonging to the user who requested it. Note: Inngest provides a way to send the userId in the user context part of the payload. It would be available in the step function later.
  • Then in your App, have a GraphQL query that polls every 30-60 seconds or so for new notifications, as shown below. See: https://www.apollographql.com/docs/react/data/queries/#polling
  • The user gets the notification link and then uses it to view the results that were persisted.
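For example, the polling piece might look something like this on the web side. The query name and fields are just placeholders, and plain @apollo/client useQuery works the same way:

```tsx
// web/src/components/NotificationsPoller/NotificationsPoller.tsx (illustrative)
import { useQuery } from '@redwoodjs/web'

// `gql` is available globally in Redwood web code
const NOTIFICATIONS_QUERY = gql`
  query NotificationsQuery {
    notifications {
      id
      message
      resultUrl
    }
  }
`

const NotificationsPoller = () => {
  // pollInterval re-runs the query every 30 seconds
  const { data } = useQuery(NOTIFICATIONS_QUERY, { pollInterval: 30_000 })

  return (
    <ul>
      {data?.notifications?.map((n) => (
        <li key={n.id}>
          <a href={n.resultUrl}>{n.message}</a>
        </li>
      ))}
    </ul>
  )
}

export default NotificationsPoller
```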

ahhhhh good ol’ polling - I was hoping for something better :slight_smile: hopefully we’ll get gql subscriptions in rw soon :stuck_out_tongue_winking_eye:

thanks!! I’ll post back here with results.

Inngest looks like a great option, thanks for the recommendation! I’ve been doing some cheesy workarounds for long running tasks and this should simplify it.

1 Like

@dthyresson this actually wouldn't work - Inngest seems to just allow you to break your code into steps, so if you have one step (for example, an API call) that takes longer than the execution limit, you're stuck.

See the short discussion on the Discord topic you linked to earlier.

Additionally, RedwoodJS doesn’t support Vercel edge functions, which makes this effectively impossible. I opened a bug report.

Vercel’s edge functions are built on Cloudflare Workers, which have a short runtime limit but can then schedule a cron trigger: Limits · Cloudflare Workers docs

@arimendelow Have you set up any experiments to actually confirm this will work?

You might want to check out Netlify background jobs: Background Functions overview | Netlify Docs

Your background job code doesn’t have to be in Redwood. It seems like a very simple function that you could build stand-alone and deploy wherever fits the requirement.
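For example, a stand-alone Netlify background function is just a regular function whose filename ends in -background; Netlify returns a 202 immediately and lets it run for up to 15 minutes. A rough sketch, where the callback URL and job payload are placeholders:

```ts
// netlify/functions/openai-task-background.ts
// The "-background" suffix tells Netlify to run this as a background function.
import type { Handler } from '@netlify/functions'

// The long-running work, e.g. a chat completion call
const callOpenAI = async (prompt: string) => {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'gpt-4', messages: [{ role: 'user', content: prompt }] }),
  })
  return res.json()
}

export const handler: Handler = async (event) => {
  const { jobId, prompt } = JSON.parse(event.body ?? '{}')

  const result = await callOpenAI(prompt)

  // Report back to the app, e.g. by sending an Inngest event or
  // hitting a webhook endpoint (hypothetical URL)
  await fetch(process.env.JOB_COMPLETED_WEBHOOK_URL as string, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ jobId, result }),
  })

  // The original caller already got a 202; this return just satisfies the types
  return { statusCode: 200 }
}
```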

I’ve seen several Redwood + ChatGPT products that are working well. I’ll try to learn more about their infra. My guess is that people are not using Serverless for this very reason.

Haven’t set up any experiments yet :slight_smile: Still very much in the investigation phase, figuring out my options for making this sort of thing work before I start to build any infra. (there are other reasons to want Edge support in Redwood, see below)

I want to come up with a Redwood-esque (do we have a Redwoodish name that’s similar to Pythonic?) solution for anything that needs to run longer than 10/60 seconds.

Of course I could always move this sort of thing outside of Redwood, but it becomes annoying to maintain - for example, I do have a separate project for generating OG images using Vercel’s Edge functions, and I keep putting off working on it because it’s in a separate repo etc.

If we could get Edge support in Redwood, that would be my first step to testing that out - moving that OG image generator into my Redwood monorepo.

And of course I could move away from Serverless, but that feels like a weird regression for what should be a solvable problem.

Yes, we want this, too. It turns out this already exists in the form of traditional server architecture/infra :wink:

We would love to see this better supported in Serverless. TBH, we’re disappointed with the promise-meets-reality trendline of Serverless. What’s more, as soon as you add a DB you have all kinds of performance issues from network latency (cold starts are not, in our experience, the primary cause of latency). So then you have to look into DB options like Neon and PlanetScale, or crazy caching, or… For all the promise of simplicity, stitching together infra and services becomes overly complex.

Turns out you get super snappy performance from an EC2 with an in-region DB combined with basic caching you’d get from something like Cloudflare. :man_shrugging:

Anyway, that’s a long way around to saying that Serverless != Simplicity, and it turns out that sometimes the solutions at hand are both good enough and much more maintainable in the long term.

We’re not giving up. We’re just not holding our breath.

1 Like

People are effectively doing this today:

  • create your code for feature X that runs on the edge
  • deploy it
  • connect it to your Redwood App
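For example, a stand-alone Vercel Edge Function is basically just a handler that opts into the edge runtime and returns a Response. This sketch only echoes a title instead of doing real OG-image work (for that you'd reach for something like @vercel/og):

```ts
// api/og.ts in a stand-alone Vercel project (illustrative)
export const config = { runtime: 'edge' }

export default async function handler(req: Request) {
  const { searchParams } = new URL(req.url)
  const title = searchParams.get('title') ?? 'Hello from the edge'

  // Placeholder response; a real OG endpoint would return an image
  return new Response(JSON.stringify({ title }), {
    headers: { 'Content-Type': 'application/json' },
  })
}
```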

You could then use Inngest if/as needed for event queue, talking back and forth between your Redwood App and the Edge function.

I do think it would be interesting to make this whole experience seamlessly integrated into Redwood. I think the first step would be to create a repo with a functional example and then explore possibilities.

Also please see Long-running background functions on Vercel - Inngest Blog

1 Like

Since I’m already familiar with AWS and the AWS CDK, http://sst.dev is a tool I always use for background jobs. It offers a similar local-dev-but-live workflow to Inngest.

Since Lambdas can run for up to 15 minutes, that’s all I need. My main API can either invoke the Lambda directly or communicate by passing a message to SQS, with a Lambda subscribed to the queue.
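For example, enqueueing the job from the main API with the AWS SDK v3 looks roughly like this; the queue URL env var and message shape are mine:

```ts
// Somewhere in the main API: enqueue the job for the subscribed Lambda
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs'

const sqs = new SQSClient({ region: process.env.AWS_REGION })

export const enqueueOpenAIJob = async (jobId: string, prompt: string) => {
  // The Lambda subscribed to this queue picks the message up and can
  // run for up to 15 minutes
  await sqs.send(
    new SendMessageCommand({
      QueueUrl: process.env.OPENAI_JOBS_QUEUE_URL, // hypothetical queue
      MessageBody: JSON.stringify({ jobId, prompt }),
    })
  )
}
```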

1 Like

I have a RW + OpenAI example here: GitHub - Tobbe/rw-openai: rw-openai
I never deployed it anywhere, but I totally could if I wanted to, just not to any of the serverless providers. It needs to be serverful because lambda functions don’t support SSE, which is what OpenAI uses to stream the response back to the client.
I’d go with a baremetal deploy if I were to deploy it

2 Likes

Hi @arimendelow, I also need to call OpenAI’s API and face the same issue. Suggestions:

  1. Use GraphQL subscriptions (Redwood Realtime) to stream OpenAI completions to the client, which avoids the Vercel serverless function time limit.
  2. Reduce the max_tokens returned from the API.

However, RedwoodJS v5 doesn’t support GraphQL subscriptions; they’re coming in v6. See: https://github.com/redwoodjs/redwood/pull/8397

Until then, another option is https://redwoodjs.com/docs/how-to/self-hosting-redwood
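For example, the OpenAI side of that could look roughly like this with the openai Node SDK (v4-style streaming); the model choice and max_tokens value are only examples:

```ts
import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

export const streamCompletion = async (prompt: string) => {
  // stream: true returns chunks as they arrive (SSE under the hood),
  // and a smaller max_tokens keeps the total response time down
  const stream = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 256,
    stream: true,
  })

  let text = ''
  for await (const chunk of stream) {
    // Each chunk carries an incremental delta; these are what you'd push
    // to the client over a GraphQL subscription once Redwood supports it
    text += chunk.choices[0]?.delta?.content ?? ''
  }
  return text
}
```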

A better solution is to just not do serverless :man_shrugging: Between slow cold starts, high prices, and arbitrary limits like this, I understand why the industry is giving up on serverless.

I tried to make it work but was ultimately convinced when AWS published a post about how they reduced costs on a service by 90% by moving it off of serverless.