79731801

Date: 2025-08-11 08:12:17
Score: 0.5

Prophet is a great time-series forecasting library, but it is known to struggle with count data, especially when the counts are close to zero. I've encountered this issue frequently in my work, which ultimately led me to develop a new Prophet-inspired library: Gloria.

Gloria addresses this problem by introducing a number of new distributions beyond the normal distribution. For instance, count data can be handled using Poisson, Binomial, Negative Binomial, or Beta-Binomial distributions. The following code block showcases how I would try to treat your data, which is similar to what is shown in the Saturation tutorial:

import pandas as pd
from gloria import Gloria, cast_series_to_kind, CalendricData

# Load the data
data = pd.read_csv("headcounts.csv")

# Save the column names for later use
timestamp_name = "Date"
metric_name = "Headcount"

# Convert timestamp to datetime
data[timestamp_name] = pd.to_datetime(data[timestamp_name])

# Ensure metric is an unsigned integer
data[metric_name] = cast_series_to_kind(data[metric_name], "u")

# Set up the Gloria model
m = Gloria(
    model="binomial",
    timestamp_name=timestamp_name,
    metric_name=metric_name,
    sampling_period="15min",
    n_changepoints=0,
)

# Create protocol for calendric data
calendric_protocol = CalendricData(country="US")

# Add the protocol
m.add_protocol(calendric_protocol)

# Fit the model to the data
m.fit(data, capacity=180)

# Predict
forecast = m.predict(periods=24 * 60 * 4)

# Plot the results
m.plot(forecast, show_capacity=True)
m.plot_components(forecast)
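
As a quick sanity check on the forecast horizon: assuming `periods` counts steps of the sampling period (as it does in Prophet), the arithmetic behind `24 * 60 * 4` can be sketched with the standard library alone. The variable names below are only for illustration.

```python
from datetime import timedelta

sampling_period = timedelta(minutes=15)  # mirrors sampling_period="15min"
periods = 24 * 60 * 4                    # value passed to m.predict above

# Number of 15-minute steps in one day
steps_per_day = timedelta(days=1) // sampling_period
print(steps_per_day)  # 96

# Total horizon, if each period is one sampling step (assumption)
horizon = sampling_period * periods
print(horizon)  # 60 days, 0:00:00
```

Under that assumption, the call above requests about 60 days of 15-minute steps.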

Some remarks: the drop in your data can be modeled explicitly as an event, for instance with a Gaussian profile:

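# Assumption: Gaussian is exported at the top level of the gloria package
from gloria import Gloria, Gaussian

m = Gloria(...)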

# Define the event profile
profile = Gaussian(width="30d")       # one month Gaussian drop

# Add event to model
m.add_event(
    name="drop",
    regressor_type="SingleEvent",     # choose the regressor class
    profile=profile,                  # attach the profile
    t_anchor="2024-01-01"             # anchor time
)

m.fit(...)

Fitting your data in this way should give you a number of advantages:

  1. The prediction no longer drops below zero.

  2. All predictions and confidence bands are integer values, respecting the lower and upper bounds of your data.

  3. The prediction should match your observed weekly peaks better. Prophet pulls the prediction down towards zero because of the zero-inflated data, so the average prediction fits the data at the expense of the extreme values.

  4. Fitting the drop with an event clearly separates this vacation effect from other patterns.
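
To illustrate points 1 and 2, here is a minimal sketch that compares a normal-approximation band with binomial quantiles for a low count. It does not use Gloria or Prophet: the occupancy rate `p` is a hypothetical value chosen for illustration, and `binomial_quantile` is a helper written for this example, not part of either library.

```python
from math import comb, sqrt


def binomial_quantile(q, n, p):
    """Smallest integer k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= q:
            return k
    return n


capacity = 180  # same capacity as in the fit call above
p = 0.01        # hypothetical low occupancy rate, for illustration only

mean = capacity * p
sd = sqrt(capacity * p * (1 - p))

# 95% band from a normal approximation: the lower edge falls below zero
normal_band = (mean - 1.96 * sd, mean + 1.96 * sd)

# 95% band from binomial quantiles: integers within [0, capacity]
binomial_band = (
    binomial_quantile(0.025, capacity, p),
    binomial_quantile(0.975, capacity, p),
)

print(normal_band)
print(binomial_band)
```

The normal band extends to negative, non-integer values, while the binomial band stays on integers between zero and the capacity, which is the behaviour described in the list above.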

Regards,

Benjamin

Reasons:
  • Blacklisted phrase (1): Regards
  • Long answer (-1):
  • Has code block (-0.5):
  • Low reputation (1):
Posted by: Benjamin Kambs