Tutorial: What secures that Heroku Postgres instance?

Built the tutorial, and it seems as if the Heroku Postgres instance is secured only through the obscurity of its hopelessly complex URL.

I am a complete noob to RedwoodJS, and Jamstack in general, but is there any more security to offer than that?

What is to keep the wily h4x0r from finding out the URL and making off with all my blog posts?

How would this really be handled in production?

I’m not qualified to answer your question (so I won’t try to :sweat_smile:), but the discussion in this thread is similar, and security comes up a few times:

Environment variables are squarely where security comes in though. The all-caps comment at the top of a Redwood app’s .env file illustrates this pretty well, and you can see DATABASE_URL there too:

# THIS FILE SHOULD NOT BE CHECKED INTO YOUR VERSION CONTROL SYSTEM
#
# ...
#
# DATABASE_URL=postgres://user:pass@postgreshost.com:5432/database_name
# BINARY_TARGET=rhel-openssl-1.0.x

But as far as Jamstack goes, databases aren’t very Jamstacky. And that’s part of the reason we’re here: to bring fullstack to the Jamstack. So the solutions you see aren’t final, and security’s something we have in mind, but don’t have in progress per se. And there are some areas where I’m sure we can’t actually do anything (as in, some aspects of it will be up to the technologies we’re using).

But just like we provided auth out of the box, we’re not going to make you figure out those need-to-haves yourself. :evergreen_tree:

I don’t even have the db url in my .env files, I only have it configured on Netlify. And since it’s only injected in the lambda functions, and never in the front-end code, it’s perfectly safe. No one is going to be able to see that URL.
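As a rough sketch of what "only injected in the lambda functions" means in practice: the function reads the connection string from its environment at runtime, so it never ends up in the bundled front-end code. The helper name `requireDatabaseUrl` and the `Env` type are made up for illustration; in a real function you would pass `process.env`.

```typescript
// Hypothetical helper (not part of Redwood or Netlify): takes an environment
// map so the lookup can be exercised without touching the real process.env.
type Env = Record<string, string | undefined>;

function requireDatabaseUrl(env: Env): string {
  const url = env.DATABASE_URL;
  // Fail loudly at startup rather than falling back to a hard-coded URL,
  // which would put the secret back into source control.
  if (url === undefined || url.trim() === "") {
    throw new Error("DATABASE_URL is not set for this function");
  }
  return url;
}
```

Inside a function this would be called as `requireDatabaseUrl(process.env)`, assuming the variable is configured in the host's settings rather than in a committed file.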

A bigger concern might be someone accessing your db through your lambda functions. So do make sure those are properly secured!

Can you tell me more about that?

Apart from sprinkling “isAuthenticated” logic into each function of mine, how do I secure the lambda functions?

Sorry, haven’t gotten that far myself yet, so I don’t know :frowning:

To see what I mean though, just go here: https://redwood.netlify.app/.netlify/functions/graphql. That’s the kind of access anyone can get just by guessing at a pretty easy-to-guess URL.

By their nature, lambdas need to be open to the world because they are directly accessed by a web browser from potentially anywhere. If you’re developing an internal app, only to be used within your company, that could be a reason to lock down the lambdas to only certain IP ranges. Unfortunately this isn’t possible when using services like Netlify. If this is a requirement of your app you’ll need to look into other providers, or roll your own.

You could add a tiny bit of security by adding something like a unique token to each request to one of your lambdas. If that token isn’t present then don’t process the request. Unfortunately your client-side code would need to know that token as well, so someone poking through the source code could find it.
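The token idea above could be sketched roughly like this. Everything here is an assumption for illustration: the `x-app-token` header name, the `guard` wrapper, and the minimal request/response shapes (they are not Netlify's or AWS's actual types). As noted, the token still has to ship with the client, so this only filters out casual requests.

```typescript
// Minimal shapes for illustration only; real handler types have more fields.
interface RequestLike {
  headers: Record<string, string | undefined>;
}
interface ResponseLike {
  statusCode: number;
  body: string;
}

// Compare two strings without stopping at the first differing character.
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Hypothetical wrapper: runs `next` only when the request carries the token.
function guard(
  event: RequestLike,
  expectedToken: string,
  next: () => ResponseLike
): ResponseLike {
  const supplied = event.headers["x-app-token"]; // assumed header name
  if (supplied === undefined || !safeEqual(supplied, expectedToken)) {
    return { statusCode: 401, body: "Unauthorized" };
  }
  return next();
}
```

A real setup would more likely check a per-user session or JWT than a single shared token, since a per-user credential is not sitting in the public bundle.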

By default Heroku will make your database open to the world and rely on username/password for security. Now, the database isn’t going to be accessed by the client, only by the lambdas, so you don’t really need it to be available to the entire internet. If you stick with Netlify then the lambdas could be deployed anywhere in us-east-1, so in theory you could limit access to the database to only the AWS IP ranges in us-east-1. However, that still leaves a HUGE number of IP addresses in the world able to access your database, since half the internet seems to run on AWS these days. Here’s the full list of AWS region IP addresses. There are 338 entries just for us-east-1, and each range sometimes includes hundreds of thousands of IP addresses! :open_mouth: And then you have the headache of trying to keep that list up to date as AWS makes changes…
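To get a feel for how big that allowlist would be, here is a small sketch that filters an already-downloaded copy of the AWS IP ranges file by region. The `ip_prefix`, `region`, and `service` fields match the published JSON as far as I know; the function names and the sample data in the usage note are made up, and IPv6 prefixes are ignored.

```typescript
// Subset of the fields in AWS's published ip-ranges.json (IPv4 entries only).
interface AwsPrefix {
  ip_prefix: string;
  region: string;
  service: string;
}
interface AwsIpRanges {
  prefixes: AwsPrefix[];
}

// Collect the distinct CIDR blocks for one region. The same block can be
// listed under several services, so duplicates are dropped.
function cidrsForRegion(ranges: AwsIpRanges, region: string): string[] {
  return ranges.prefixes
    .filter((p) => p.region === region)
    .map((p) => p.ip_prefix)
    .filter((cidr, index, all) => all.indexOf(cidr) === index)
    .sort();
}

// Number of addresses covered by an IPv4 CIDR block, e.g. a /16 or /24.
function addressCount(cidr: string): number {
  const bits = Number(cidr.split("/")[1]);
  return 2 ** (32 - bits);
}
```

With a made-up file containing `10.0.0.0/16` twice for us-east-1 (once per service) and `10.1.0.0/24` for eu-west-1, `cidrsForRegion(file, "us-east-1")` returns just the one /16, and `addressCount` on it gives 65536 addresses. You would have to re-run this whenever AWS updates the list.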

So yeah, this is definitely something we’re thinking about, but right now it seems like a long, complicated password on your DB is the way to go. We’re looking into deployments on other providers like Google Cloud and Vercel, but I’m not too involved in those so I’m not sure what their security model looks like.

I’m notoriously bad when it comes to security, so please help me out here. What is the attack vector for the Heroku DB? Why does it matter that it’s open to the entire Internet?

How could anyone ever find it? Theoretically you wouldn’t even need a username/password for the DB, right? Because, again, no one will ever find it.

I’m notoriously bad when it comes to security,

I’m pretty bad too, just sort of maybe understand the basics and appreciate the wisdom of those who know more.

Well, they find it because they hacked me, or hacked Heroku, or paid off someone at Heroku, or they have found some bug such that they can predict Heroku database URLs.

Security through obscurity is still mostly frowned on, right?

Interesting and thanks for the long and open response.

I’m coming from a classic client/server, then classic backend/frontend experience, so I am mostly trying to understand the solution space that JAMstack and RedwoodJS offer, and answer questions I’ve had to “fight” in prior applications.

One more thought to add --> in my two decades of experience, the common method of connection between API and DB has always been a “connection string” most often including 1) a unique IP 2) DB name 3) authentication.
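For anyone newer to this, the three parts of a connection string can be pulled apart like so. This is a simplified sketch that assumes the `postgres://user:pass@host:port/database` shape from the .env comment earlier in the thread; it does not handle percent-encoded passwords or query parameters, and the function names are made up. The `describeForLogs` helper is there because the password should never be printed.

```typescript
interface ConnectionParts {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Simplified parser for postgres://user:pass@host:port/database strings.
function parseConnectionString(url: string): ConnectionParts {
  const pattern = /^postgres(?:ql)?:\/\/([^:@/]+):([^@/]+)@([^:/]+):(\d+)\/([^?]+)/;
  const match = pattern.exec(url);
  if (match === null) {
    throw new Error("Unrecognized connection string format");
  }
  const [, user, password, host, port, database] = match;
  return { user, password, host, port: Number(port), database };
}

// Summary that is safe to log: everything except the password.
function describeForLogs(parts: ConnectionParts): string {
  return `${parts.user}@${parts.host}:${parts.port}/${parts.database}`;
}
```

Logging `describeForLogs(parseConnectionString(url))` instead of the raw URL keeps the secret out of log output while still showing which database a function is talking to.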

When possible, I’ve always added an extra layer of security via an IP allowlist. This is possible when deploying directly to AWS Lambdas using AWS API Gateway.

But the most vulnerable point is the connection string itself, and the best practice is to keep it a tightly locked-away secret. The most common source of hacks is the connection string being committed to public repositories or stolen via phishing attacks.

All that said, security is hard.

Hope this helps a bit.

I’m not a security expert either, so take all of this with a grain of salt! But in your traditional client/server application (let’s say Ruby on Rails) you’d open the database to only your app servers so that literally no one and no thing can access it other than those servers. Of course if someone were to get access to your app servers, you’d be in trouble.

What I would do on AWS was to create a security group for the database, and one for the app servers, and let them talk to each other. Then I’d open port 3306 (MySQL) on the database security group to my house’s IP address, and open up port 22 on the app servers security group to that same IP. There was a load balancer in front of the app servers, and it had port 80 and 443 open to the world. The load balancer’s security group was then allowed to talk to the app servers security group.

Then, IN THEORY, I (really my house) was the only person on earth capable of getting direct access to those servers, everyone else could only talk to the load balancers, and everything else could talk to each other, but behind AWS’s firewalls.
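The setup above can be written down as data to make the "who can reach what" logic explicit. This is only a toy model, not the AWS API: the group names, the `isAllowed` function, and the home address (a documentation placeholder) are invented, and the port the load balancer uses to reach the app servers is assumed to be 80 since it isn't stated.

```typescript
// A rule source is either another security group or an IPv4 CIDR block.
type Source = { group: string } | { cidr: string };
interface Rule {
  target: string;
  port: number;
  source: Source;
}

const ANYWHERE = "0.0.0.0/0";
const HOME = "203.0.113.7/32"; // placeholder address, not a real home IP

const rules: Rule[] = [
  { target: "db", port: 3306, source: { group: "app" } }, // app servers -> MySQL
  { target: "db", port: 3306, source: { cidr: HOME } },   // home -> MySQL
  { target: "app", port: 22, source: { cidr: HOME } },    // home -> SSH
  { target: "app", port: 80, source: { group: "lb" } },   // assumed port
  { target: "lb", port: 80, source: { cidr: ANYWHERE } },
  { target: "lb", port: 443, source: { cidr: ANYWHERE } },
];

// Toy check: CIDR rules match only an identical block or the "anywhere" range.
function isAllowed(all: Rule[], caller: Source, target: string, port: number): boolean {
  return all.some((rule) => {
    if (rule.target !== target || rule.port !== port) return false;
    if ("group" in rule.source) {
      return "group" in caller && caller.group === rule.source.group;
    }
    if (rule.source.cidr === ANYWHERE) return true;
    return "cidr" in caller && caller.cidr === rule.source.cidr;
  });
}
```

In this model a random address gets `true` for the load balancer on 443 and `false` for the database on 3306, which is exactly the property that is hard to reproduce when the functions run on infrastructure you don't control.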

But that all goes out the window in the Jamstack! You kind of need everything available to everyone, all the time. :frowning: Again, you could lock down the database a little more, but the effort makes it almost a non-starter. We’re hoping that someone like Netlify comes out with a database solution where all of this security is taken care of for us, but we have no idea when something like that may be available. :frowning:

I guess the main problem is that Netlify only hosts the Functions, and not the DB as well.

From my GCP experience, I’m pretty sure I could restrict access to a Cloud SQL database instance to requests coming from a specific VPC network, which would be the one my Cloud Functions are run from.
Very akin to the security groups you’re mentioning on AWS, I guess.

AFAICT, the same kind of configuration could be achieved between AWS Lambdas and whatever DB service you’d use on AWS.

So the problem isn’t really JAMStack itself. It definitely doesn’t require your DB to be available to everyone, just the functions!
The fact that Netlify is not a “full-fledged” Cloud provider is the issue; it provides an abstraction layer above serverless functions from a third-party provider, and does not have a DB offering to work with it, making the “DB open to the world” situation the only available solution.

With AWS you can use a VPC so that only Lambda functions can access VPC resources (like a DB in RDS), not the whole world. The only catch is that if the same Lambda (which is attached to the VPC) also needs to reach Auth0 for authentication, then you will need a NAT Gateway, which is billed per hour.