TAP production-readiness questions

This is a cross-post from https://bsky.app/profile/ngerakines.me/post/3mathu3myj22n

I’m getting ready to run TAP in production for Smoke Signal and have some questions about what it’s like to run a production-ready instance. Looking for feedback from the community and experts before I light things up.

Full Network Backfill Configuration

I want to do a full-network backfill specifically for lexicon community calendar events and RSVPs, plus Smoke Signal profiles and RSVP acceptances. I only need these records from across the network for indexing and search purposes.

Can someone check my assumption that this is the correct config:

TAP_FULL_NETWORK=true
TAP_COLLECTION_FILTERS=community.lexicon.calendar.*,events.smokesignal.*

For the TAP_COLLECTION_FILTERS config, will events.smokesignal.* match both events.smokesignal.profile and events.smokesignal.calendar.acceptance?
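My assumption is that the trailing .* acts as a prefix wildcard over the NSID at any depth, i.e. behavior roughly like this sketch in Go (this is my mental model, not TAP’s actual matching code):

    package main

    import (
        "fmt"
        "strings"
    )

    // matchesFilter: my assumed semantics, where "events.smokesignal.*"
    // matches any NSID under that prefix, however deep.
    func matchesFilter(filter, nsid string) bool {
        if strings.HasSuffix(filter, ".*") {
            return strings.HasPrefix(nsid, strings.TrimSuffix(filter, "*"))
        }
        return filter == nsid
    }

    func main() {
        fmt.Println(matchesFilter("events.smokesignal.*", "events.smokesignal.profile"))             // true
        fmt.Println(matchesFilter("events.smokesignal.*", "events.smokesignal.calendar.acceptance")) // true
    }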

Database Choice

I’m planning on running one instance of TAP with SQLite on a persistent volume. If I’m doing a full network backfill for specific collections, should I go with Postgres instead?

Resource Requirements

Any suggestions on the amount of CPU and RAM necessary for this setup?

Clustering and Redundancy

From the docs, I don’t think I can run multiple instances of TAP in a single cluster. Is that correct? Do I need to think about data loss or redundancy when restarting to update the container version?


If I set a collection filter to A, B, and C, is the process for adding D to update the config and restart TAP, or to spin up a new TAP cluster, let it backfill, and then cut over?

So you can do TAP_FULL_NETWORK=true, but that’s a full-network brute-force backfill: it goes through every PDS, and every repo on that PDS, which is fine if that’s what you want, but it will be a ton of bandwidth and time. For reference, on a 1-gig connection with an M4 Max and 64 GB of RAM, writing to SQLite, backfilling the Flashes lexicon across 19k repos took 14 hours and used around 200 GB of bandwidth.
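(For rough scale, taking those numbers at face value: 200 GB over 14 hours averages out to about 4 MB/s, roughly 32 Mbit/s sustained, so the 1-gig link itself was nowhere near the bottleneck.)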

I’ve been backfilling with TAP_SIGNAL_COLLECTION instead. It’s kind of annoying in that a repo has to contain that collection before it starts being tracked, but once a repo is tracked, anything already in the repo, or anything that goes by on the firehose for that user, is picked up if it’s inside TAP_COLLECTION_FILTERS. You can also use it to get historic repos by changing it around. For instance, with place.stream I found most repos via place.stream.chat.profile, but that was a recent lexicon, so once I had backfilled on it I restarted with place.stream.chat.message to catch any repos it may have missed. It doesn’t resend any records previously found. Then you switch back to whichever TAP_SIGNAL_COLLECTION you’d like to use to signal that a new repo should be picked up and tracked.
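To make the phases concrete, using the place.stream example above (the TAP_COLLECTION_FILTERS value here is my guess at a sensible filter, not taken from a real deployment):

    # Phase 1: discover/backfill repos via the common profile collection
    TAP_SIGNAL_COLLECTION=place.stream.chat.profile
    TAP_COLLECTION_FILTERS=place.stream.*

    # Phase 2: restart with a different signal to catch repos phase 1 missed;
    # previously sent records are not resent
    TAP_SIGNAL_COLLECTION=place.stream.chat.message
    TAP_COLLECTION_FILTERS=place.stream.*

    # Phase 3: switch back to the signal you want for newly appearing repos
    TAP_SIGNAL_COLLECTION=place.stream.chat.profile
    TAP_COLLECTION_FILTERS=place.stream.*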

That’s my current production strategy for TAP. Then on user sign-in I also check whether they’re already being tracked in TAP, and if not I go ahead and add them, since there’s a high chance they’ll create a record in a TAP_SIGNAL_COLLECTION lexicon that I’ll most likely want.
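A sketch of that sign-in hook, in Go. Big caveat: the /repos endpoints and port below are hypothetical placeholders for however your TAP deployment exposes “is this repo tracked” and “start tracking this repo”, not a documented TAP API:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    const tapBase = "http://localhost:2480" // placeholder address for your TAP instance

    // ensureTracked: on user sign-in, check whether the DID is already
    // tracked and, if not, add it. Both endpoints are hypothetical; adapt
    // them to whatever your TAP instance actually exposes.
    func ensureTracked(did string) error {
        resp, err := http.Get(fmt.Sprintf("%s/repos/%s", tapBase, did))
        if err != nil {
            return err
        }
        resp.Body.Close()
        if resp.StatusCode == http.StatusOK {
            return nil // already tracked
        }
        payload, _ := json.Marshal(map[string]string{"did": did})
        resp, err = http.Post(tapBase+"/repos", "application/json", bytes.NewReader(payload))
        if err != nil {
            return err
        }
        resp.Body.Close()
        if resp.StatusCode >= 300 {
            return fmt.Errorf("tap add repo failed: %s", resp.Status)
        }
        return nil
    }

    func main() {
        // Call this from your sign-in handler with the signed-in user's DID.
        if err := ensureTracked("did:plc:exampleuser123"); err != nil {
            fmt.Println("tracking check failed:", err)
        }
    }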

I hope that helps! Let me know if you have any questions or want me to break down any of it.

Whoops, sorry, I missed these.

  1. I think SQLite should be fine. It mostly just holds which repos are being tracked, the CIDs of records already sent, and an “outbox” with the actual record contents. The outbox is temporary: once TAP sends an event via the webhook or websocket and gets an ack, the event is cleared from the outbox (there’s a minimal receiver sketch after this list). The Flashes db ended up around 74 MB, I think, after 120k records and 19k repos. Postgres is nice, though, if you want to poke around while it backfills.

  2. I think on Railway it never went over 4 GB of RAM, and I’m not even sure it hit 2 vCPU on their graph (not sure what that translates to in real computer usage, sorry, but I think a small VPS will be fine).

  3. I don’t think it’s designed to have multiple instances running. All backfill state is held in the DB: cursors, repos, etc. So if you’re doing SQLite with Docker, you need to make sure the SQLite file location is persistent (see the volume-mount example below); with Postgres you don’t have to worry about any of it.
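If it helps picture the outbox flow from point 1: a receiver just has to respond with a success status, which (as described above) is the ack that lets TAP clear the event from its outbox. A minimal sketch in Go, assuming a webhook-style POST per event; the path and port are placeholders:

    package main

    import (
        "io"
        "log"
        "net/http"
    )

    func main() {
        // TAP POSTs each record event here; a 2xx response is the ack
        // that lets it drop the event from its outbox (per the behavior
        // described above -- verify against the TAP docs).
        http.HandleFunc("/tap-events", func(w http.ResponseWriter, r *http.Request) {
            body, err := io.ReadAll(r.Body)
            if err != nil {
                http.Error(w, "read error", http.StatusBadRequest)
                return
            }
            log.Printf("event: %s", body) // index/persist before acking in real use
            w.WriteHeader(http.StatusOK)  // the ack
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

And for point 3, persisting the SQLite file with Docker is just a volume mount, along these lines (the image name, mount path, and TAP_DB_PATH variable are placeholders; check how your build actually configures the database location):

    docker run -d \
      --name tap \
      -v tap-data:/data \
      -e TAP_DB_PATH=/data/tap.db \
      your-tap-image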