SQLite backups - here's how I'm doing it

I’ve been rebuilding my news aggregator, and one of the ways I’m planning to simplify it is to just stick with SQLite.

I’ve also wanted to code against a prod-like copy of the production database, and since it’s just me, that’s fine… The problem is, and it’s happened a few times, that on a db migrate the database gets blown away. So yeah: backups.

Maybe there’s a better way to do this. I’m running a daily job, so the most I’ll lose is a day’s worth of data.
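Daily backups also accumulate over time, so at some point a retention policy helps. A minimal sketch, assuming date-stamped backup names that sort chronologically (the `backupsToPrune` helper is hypothetical, not something in my setup yet):

```javascript
// Given a list of date-stamped backup names (e.g. 2024-01-10.db), return
// the ones that fall outside the retention window and can be deleted.
// Assumes the names sort chronologically, which ISO dates do.
function backupsToPrune(names, keep = 30) {
  const sorted = [...names].sort()
  return sorted.slice(0, Math.max(0, sorted.length - keep))
}
```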

How have you done backups?

Here’s a quick way to set this up.

  1. Create an S3-compatible object storage bucket somewhere (Cloudflare, Vultr, AWS, etc.)
  2. Set up the environment variables:
S3_HOSTNAME=sjc1.vultrobjects.com
S3_BUCKET=news-jace-pro
AWS_ACCESS_KEY_ID=1NOTREALKEY123
AWS_SECRET_ACCESS_KEY=NOTrealKeyHereTooGetYourOwn123
AWS_DEFAULT_REGION=sjc1
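Before running anything, it can help to fail fast if one of these is missing. A tiny sketch (the `checkEnv` helper is my own illustration, not part of the backup script):

```javascript
// Throw a descriptive error if any required variable is missing or empty.
// Takes an env object (e.g. process.env) so the check is easy to test.
const REQUIRED_VARS = [
  'S3_HOSTNAME',
  'S3_BUCKET',
  'AWS_ACCESS_KEY_ID',
  'AWS_SECRET_ACCESS_KEY',
]

function checkEnv(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
}
```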

Add this script (I run it with `yarn rw exec backup`):

// To access your database
// Append api/* to import from api and web/* to import from web
//import { db } from 'api/src/lib/db' //prisma
import { exec } from 'child_process'
import fs from 'fs'
import path from 'path'

import {
  S3,
  PutObjectCommand,
  ListBucketsCommand,
  HeadObjectCommand,
  CreateBucketCommand,
} from '@aws-sdk/client-s3'

const execCommand = (command) => {
  return new Promise((resolve, reject) => {
    exec(command, (error, stdout, stderr) => {
      if (error) {
        reject(stderr || error.message)
      } else {
        resolve(stdout)
      }
    })
  })
}
// Resolve DATABASE_URL (e.g. file:./dev.db) to an absolute filesystem path
let dbPath = process.env.DATABASE_URL
if (!dbPath) {
  throw new Error('DATABASE_URL is not set')
}
if (dbPath.startsWith('file:./')) {
  dbPath = path.join(process.cwd(), 'api', 'db', dbPath.slice(7))
}
if (dbPath.startsWith('file:')) {
  dbPath = dbPath.slice(5)
}
const backupDir = path.dirname(dbPath)
const backupFileName = `${new Date().toISOString().split('T')[0]}.db` // Format YYYY-MM-DD.db
let backupFilePath = path.join(backupDir, backupFileName)

const backupDatabase = async () => {
  try {
    // if backup file already exists, exit
    if (fs.existsSync(backupFilePath)) {
      console.log(`Backup file already exists: ${backupFilePath}`)
      return backupFilePath
    }
    const version = await execCommand('sqlite3 --version').catch((error) => {
      console.error(`Error checking sqlite3 version: ${error}`)
      throw new Error('Error checking sqlite3 version')
    })
    console.log(`sqlite3 version: ${version}`)
    // Quote the db path in case it contains spaces
    const out = await execCommand(
      `sqlite3 "${dbPath}" ".backup '${backupFilePath}'"`
    ).catch((error) => {
      console.error(`Error backing up database: ${error}`)
      throw new Error('Error backing up database')
    })
    console.log(`${out}\nDatabase backed up to: ${backupFilePath}`)

    return backupFilePath
  } catch (e) {
    // Handle errors
    console.error(`Error backing up database in try/catch: ${e}`)
    return null
  }
}

// Configure the S3 client against the S3-compatible endpoint
const s3 = new S3({
  region: process.env.S3_HOSTNAME.split('.')[0],
  endpoint: `https://${process.env.S3_HOSTNAME}`,
})

const bucket = process.env.S3_BUCKET

const createBucket = async ({ bucket }) => {
  const params = {
    Bucket: bucket,
  }
  try {
    await s3.send(new CreateBucketCommand(params))
    console.log(`Bucket ${bucket} created`)
    return true
  } catch (err) {
    console.error('Error creating bucket:', err)
    return false
  }
}

const isBucketCreated = async ({ bucket }) => {
  try {
    // ListBucketsCommand succeeding doesn't tell us about *this* bucket,
    // so check the returned list for its name
    const { Buckets = [] } = await s3.send(new ListBucketsCommand({}))
    const exists = Buckets.some((b) => b.Name === bucket)
    if (exists) {
      console.log(`Bucket ${bucket} already exists`)
    }
    return exists
  } catch (err) {
    console.error('Error checking bucket:', err)
    return false
  }
}

const uploadToS3 = async (backupedDB) => {
  console.log(`Uploading backup to ${bucket}/${backupFileName}...`)
  const file = fs.readFileSync(backupedDB)
  const params = {
    Bucket: bucket,
    Key: backupFileName,
    Body: file,
  }
  try {
    const data = await s3.send(new PutObjectCommand(params))
    console.log(`Backup uploaded to ${bucket}/${backupFileName}`)
    return data
  } catch (err) {
    console.error('Error uploading file:', err)
    return null
  }
}

export default async ({ args }) => {
  // Your script here...
  console.log(':: Executing script with args ::')
  console.log(args)
  // Backup the database
  const backupedDB = await backupDatabase()
  if (backupedDB) {
    console.log(`:: Backup saved to ${backupedDB} ::`)
    // now we'll upload the backup to the cloud
    //await uploadToS3(backupedDB)
    // first lets check if there is a bucket
    try {
      if (!(await isBucketCreated({ bucket }))) {
        await createBucket({ bucket })
      }
      //if (!(await isAlreadyUploaded({ bucket, backupFileName }))) {
      //  await uploadToS3(backupedDB)
      //}
      // we want to overwrite the backup if its already there
      await uploadToS3(backupedDB)
      // cleanup
      fs.unlinkSync(backupedDB)
    } catch (e) {
      console.error('Error checking bucket:', e)
    }
  }
}
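For restores, you’ll eventually want the most recent object in the bucket. Since the keys are ISO dates, a plain string sort finds it; a small sketch (the `latestBackupKey` helper is my own illustration, not part of the script above):

```javascript
// Pick the newest YYYY-MM-DD.db key from a bucket listing.
// ISO dates sort correctly as strings, so a lexicographic sort is enough.
// Feed it the Key values from a ListObjectsV2Command response, then pass
// the result to GetObjectCommand to download the file.
function latestBackupKey(keys) {
  const backups = keys.filter((k) => /^\d{4}-\d{2}-\d{2}\.db$/.test(k))
  backups.sort()
  return backups.length > 0 ? backups[backups.length - 1] : null
}
```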

Does this happen because in your dev environment you’ve set the database URL to point to a production database?

Or, when you deploy, does a migration (one that you or the deploy script OKs) reset the database?

Since it’s SQLite, there’s no URL per se that might change - it’s in the filesystem, so I’m curious how this has happened.

Could you explain your deploy process? Perhaps you could remove any Prisma commands and only run migrations against production manually as needed.

Note that `migrate dev` is for development only; see Development and production | Prisma Documentation.

Instead you want `migrate deploy` in production; see the same page.

That’s what the deploy CLI commands should do, and what baremetal does as well.

I’m going to have to see if I can reproduce the issue.

I’m not building with a test server other than what I have locally, so generally I’m just doing:

# change some files locally, generally not the schema
# test locally
# yay
git add .
git commit -m "Feat. works!"
git push
yarn rw deploy baremetal production
# wait for deploy
# notice that I'm getting errors
# ssh onto the server
# cd into /var/www/app (this is where I'm keeping my SQLite db)
# ls -la
# huh... why is my prod.db empty, with a file size of 0 or <1k? It was ~300MB

My deploy.toml - maybe I’m misconfigured:

[[production.servers]]
host = "${DEPLOY_HOST}"
username = "${DEPLOY_USERNAME}"
password = "${DEPLOY_PASSWORD}"
agentForward = false
sides = ["api"]
packageManagerCommand = "yarn"
monitorCommand = "pm2"
path = "/var/www/app"
processNames = ["api"]
repo = "REDACTED"
branch = "main"
keepReleases = 2

[before]
#update = "yarn rw exec backup"

[after]
#install = []
#migrate = "yarn rw exec seed"
build = ["yarn rw build web"]
restart = ["pm2 restart job"]
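One thing I might try is re-enabling the [before] hook so a backup runs right before every deploy:

```toml
[before]
update = "yarn rw exec backup"
```

That way even a destructive migrate would only cost the changes made since the previous deploy.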

I’ll reply if I am able to consistently reproduce it.