Collecting metrics for Flare using event sourcing and laravel-stats
Like most SaaS companies, the Flare back-office features a neat dashboard showing some of our key metrics. Using Laravel Nova, it's pretty easy to aggregate data per model like the total number of errors or the active subscriptions per week. Behind the scenes, these dashboard tiles and graphs use simple database queries. For example, the number of active subscriptions is a simple SELECT COUNT(*)
query on the subscriptions
table. But how do we query new subscriptions per week?
Counting new subscriptions per week
Let's take a look at a possible solution for the "new subscriptions per week" metric. Using YEARWEEK()
we can easily count the active subscriptions grouped by week of the year. The underlying Eloquent query now looks like this:
$numberOfActiveSubs = Subscriptions::query()
->selectRaw('COUNT(*) as number')
->where('status', 'active')
->groupByRaw('YEARWEEK(created_at)')
->get();
This works pretty well. With the correct indexes it's pretty fast too. However, there's one fatal flaw: once a subscription ends, it's no longer counted as a new subscription in the past weeks because it's no longer active
. You can probably think of a couple more examples where aggregating metrics from a traditional active record database goes wrong. Here are a couple examples of our own:
-
the number of active subscriptions per month
-
the number of deleted users
-
the total amount of errors received by a team, including errors outside of the retention period
All of these metrics are based of a certain state in time that's no longer active or data that was only in the database temporarily. If we want to query these metrics at any time, we'll need to keep all relevant data and states forever. In practice, this is less daunting than it sounds. Let's look at three possible solutions (including our very own laravel-stats package) to solve this issue.
Solution 1: snapshotting metrics
As explained above, the query for the current number of active subscriptions works pretty well because it relies on the current state of the data. Using a CRON job we can query the current number of active subscriptions every day and save that in a separate active_subscriptions_per_day
tabel. This way we're keeping a record of daily active subscriptions by snapshotting the data every day.
However, this method doesn't scale well if you need metrics from different time periods. If you're only storing new sign-ups per day, there's simply no way to query new sign-ups per hour. In other cases you might also miss data that happens in between two snapshots. For example an error that's created an deleted on the same day will never count as a "reported error" for that day. We cannot snapshot data if it's already deleted or modified at the end of the day.
Solution 2: event sourcing everything
If we simply store everything that happens in our application, we'll never run into a situation where there's not enough data to calculate a metric. The event sourcing pattern does exactly that: every change in the applications state is stored and processed as an event. Using these events we can even rebuild the application's state at any point in time and thus calculate metrics at any point in time.
Event sourcing is a pretty powerfull pattern but it can't easily be implemented in existing applications without some overhead. That's why we've opted for a hybrid solution: event sourced metrics without event sourcing.
Solution 3: event sourced statistics using laravel-stats
Using the laravel-stats package we can track changes to certain metrics based of the events that cause these changes. For the previous subscriptions example, we can easily created a SubscriptionStats
class that can be used to track changes to subscriptions:
use Spatie\Stats\BaseStats;
class SubscriptionStats extends BaseStats {}
When a new subscription is activated, we can call SubscriptionStats::increase()
to increase the number of active subscriptions. Likewise, when a subscription expires or gets cancelled, we can call the SubscriptionStats::decrease()
method to update the number of active subscriptions. Behind the scenes, the package isn't just storing a fixed number of active subscription. Instead, it's storing the increase
and decrease
events that cause this number to go up or down. You can see this happen in the stats_events
:
As you can see, we're also storing timestamps for these events. Using these timestamps we can aggregate events and metrics for every timeframe you can think of. For the subscriptions per week metric, this looks a little something like this:
use Spatie\Stats\StatsQuery;
$stats = StatsQuery::for(SubscriptionStats::class)
->start(now()->subWeeks(2))
->end(now()->subSecond())
->groupByWeek()
->get();
We're also specifying a start
and end
date. These boundaries are especially usefull when creating graphs. Finally, the returned data looks like this:
[
[
'start' => '2020-01-01',
'end' => '2020-01-08',
'value' => 102,
'increments' => 32,
'decrements' => 20,
'difference' => 12,
],
[
'start' => '2020-01-08',
'end' => '2020-01-15',
'value' => 114,
'increments' => 63,
'decrements' => 30,
'difference' => 33,
],
]
The returned data doesn't only include the summed increments and decrements but it also keeps track of the total number of subscriptions and the difference between each group! This way you can metrics and graphs for, e.g. new teams, deleted teams and the total number of teams, all from one statistics class.
Intrigued? Take a look at the package's readme on GitHub or read more about it on Freek's blog.