Supabase Logs: open source logging server
Today, we're releasing Supabase Logs for both self-hosted users and CLI development.
Watch the video announcement
Logflare Introduction
Since Logflare joined Supabase over a year ago, it has been quietly handling over 1 billion log events every day. These events come from various tools in the Supabase infrastructure: the API gateway, Postgres databases, Storage, Edge Functions, Auth, and Realtime.
Logflare is a multi-node, highly available Elixir cluster that ingests log events and stores them in BigQuery for Supabase and Logflare's customers. On average, the cluster runs 6 nodes, handling every spike our customers throw at it.
To display log data to customers, we use Logflare Endpoints. This provides an HTTP integration into Supabase Studio, powering the log query UIs and most time-series charts. These charts live throughout the Studio, such as on the project home page and the new API reports.
Self-hosting Logflare
Logflare was available under a BSL license prior to joining Supabase. We've since changed the license to Apache 2.0, aligning it with our open source philosophy.
In the past few months we've made Logflare more developer-friendly for local development and self-hosting. While you're building a project, you can view and query your logs from any Supabase service, just as you would on our cloud platform.
It currently supports a BigQuery backend, and we're actively working on supporting more.
The Ingestion Pipeline
Logflare receives Supabase log events through multiple methods. Services like Postgres use Vector to clean and forward log events to the Logflare ingest API. Other services, such as Realtime and Storage, use native Logflare integrations to send the log events directly. These then get processed and streamed into BigQuery.
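As a minimal sketch, sending a single structured event to the ingest API looks roughly like this (the source UUID and API key are placeholders, and the payload fields follow the event_message/metadata shape shown later in this post):

curl -X POST "https://api.logflare.app/logs?source=SOURCE_UUID" \
  -H 'Content-Type: application/json' \
  -H 'X-API-KEY: YOUR_INGEST_API_KEY' \
  -d '{"event_message": "connection established", "metadata": {"service": "postgres"}}'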
The Querying Pipeline
The hard part comes after ingesting the logs: searching, aggregating, and analyzing them at scale. Crunching many terabytes of data on every query is expensive, and exposing the ingested data to Supabase customers in a naive way would cause our costs to skyrocket.
To solve this, we built and refined Logflare Endpoints, the query engine that powers many of Supabase's features, such as the logs views, the Logs Explorer, and the usage charts.
With Endpoints, you can create HTTP API endpoints from a SQL query, including parameterized queries. Endpoints are like PostgREST views but with some benefits:
- Query parameters
  - You can provide string parameters to the SQL query through the HTTP endpoint.
- Read-through caching
  - Results from the query are cached in memory for fast response times.
  - A read-through cache fetches fresh results when no cached results exist.
- Active cache warming
  - Query results are proactively warmed at a configurable interval, for a combination of fast response times and as-realtime-as-needed data.
- Query sandboxing
  - If an Endpoint query contains a CTE and the sandbox option is selected, the Endpoint injects the query string of the "sql" query parameter into the Endpoint SQL, replacing the default query (the part of the SQL query after the CTE); see the sketch after this list.
  - Endpoints parse the SQL and allow only "select" queries. No DML or DDL statements are permitted to run through Logflare Endpoints.
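To make sandboxing concrete, here is a hedged sketch (the table and columns are made up for illustration). The CTE is fixed, while the final select is the default query that a caller may replace:

with errors as (
  select timestamp, event_message
  from `my_project.analytics.logs`
  where level = 'error'
)
select event_message from errors limit 10

A caller could then pass something like sql=select count(*) from errors as a query parameter to run their own aggregation, scoped to the sandboxed CTE; anything other than a select statement would be rejected.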
With this feature set, Supabase has been able to build any view we've needed on top of billions of daily log events.
Logflare Endpoint Example
Using webhooks, we can send all GitHub events in the Supabase organization to Logflare. The webhook sends structured events, and Logflare transforms the payload into metadata:
{
  "event_message": "supabase/supabase | JohannesBauer97 | created",
  "id": "0d48b71d-91c5-4356-82c7-fdb299b625d0",
  "metadata": {
    "sender": {
      "id": 15695124,
      "login": "JohannesBauer97",
      "node_id": "MDQ6VXNlcjE1Njk1MTI0",
      "site_admin": false,
      "type": "User",
      "url": "https://api.github.com/users/JohannesBauer97"
    },
    "starred_at": "2023-03-30T20:33:55Z"
    //...
  },
  "timestamp": 1680208436849642
}
We're interested in the top contributors, which can be extracted with SQL (in BigQuery dialect):
select
  count(t.timestamp) as count,
  s.login as gh_user
from
  `github.supabase.webhooks` as t
  cross join unnest(metadata) as m
  cross join unnest(m.sender) as s
where
  timestamp::date > current_date() - @day::int
group by
  gh_user
order by
  count desc
limit
  25
With this view in place, we can use Endpoints to provide an API that we can hit from our application:
curl "https://logflare.app/endpoints/question/69425db0-1cfb-48b4-84c7-2a872b6f0a61"
-H 'Content material-Sort: utility/json; charset=utf-8'
-G -d "day=30"
This returns a JSON response with the top org-wide contributors for the last 30 days!
{
  "result": [
    { "count": 23404, "gh_user": "vercel[bot]" },
    { "count": 10005, "gh_user": "joshenlim" },
    { "count": 7026, "gh_user": "MildTomato" },
    { "count": 6405, "gh_user": "fsansalvadore" },
    { "count": 5195, "gh_user": "saltcod" },
    { "count": 3454, "gh_user": "alaister" },
    { "count": 2691, "gh_user": "kevcodez" },
    { "count": 2117, "gh_user": "gregnr" },
    { "count": 1769, "gh_user": "Ziinc" },
    { "count": 1749, "gh_user": "chasers" },
    { "count": 1430, "gh_user": "Isaiah-Hamilton" }
    //...
  ]
}
We can configure this Endpoint to cache results for an interval of 10 minutes after the first API request, and proactively update the cached results every 2 minutes, so at most 5 queries run within the 10 minute interval. Even if we hit the Endpoint thousands of times, we only incur the cost of those 5 queries.
The initial request is fast because Logflare also sets up our BigQuery tables appropriately (partitioning them, for example). Subsequent requests are extremely fast, as results are served from the in-memory cache.
The best part is that all of these knobs can be tweaked for your use case. If we have a real-time requirement, we can disable caching completely, or shorten the proactive cache warming to a per-second interval.
The Self-hosted Challenge
To change the license, we needed to remove all closed-source dependencies. Previously, Logflare relied on the closed source General SQL Parser under a business license, which is incompatible with the Apache License.
We switched to an open source alternative, the Rust-based sqlparser-rs library, contributing a few updates for the BigQuery dialect.
Along with the parser, we invested a lot of effort into reworking the multi-tenant architecture into something self-hosting friendly and easily configurable. We moved towards environment variable based configuration instead of compile-time configuration, exposing the Endpoint configuration necessary for Supabase Logs.
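As an illustration, a single-tenant self-hosted instance backed by BigQuery might be configured along these lines (treat the variable names, mount path, and values below as assumptions; the self-hosting docs have the authoritative list):

# Illustrative sketch only: names may differ from the current docs.
docker run -d -p 4000:4000 \
  -e LOGFLARE_SINGLE_TENANT=true \
  -e GOOGLE_PROJECT_ID=my-gcp-project \
  -e GOOGLE_PROJECT_NUMBER=123456789 \
  -v $(pwd)/gcloud.json:/opt/app/rel/logflare/bin/gcloud.json \
  supabase/logflare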
What's Next?
To further integrate Logflare into the Supabase platform, we're building out two main areas: the Management API and multiple backends.
Management API
The Management API allows users to interact programmatically with Logflare to manage their account and resources. This feature will be available for both Logflare customers and self-hosted users.
You can check out a preview of our OpenAPI spec here: https://logflare.app/swaggerui
Not only that, we intend to expose user account provisioning to select partners. Soon, you'll be able to become a Logflare Partner and provision Logflare accounts through the Partner API. Perfect if you want to resell a log analytics service from your own platform.
Contact us at growth@supabase.com to get in early on that waitlist.
Multiple Backends
Logflare currently supports a BigQuery backend. We plan to add support for other analytics-optimized databases, such as Clickhouse. We will also support pushing data to other web services, making Logflare a great fit for any data pipeline.
This will benefit the Supabase CLI: once Postgres support is available, Logflare will be able to integrate with it seamlessly, without the BigQuery requirement.
Wrapping Up
Logflare has given Supabase the flexibility to quickly ship features powered by an underlying structured event stream. Materializing metrics from an event stream is a powerful framework for delivering real-time views over analytics data.
Logflare is the hub of analytics streams for Supabase. We look forward to giving Supabase customers the same superpower.