In this blog post, we’ll use Facebook’s Prophet library to analyze and forecast Switzerland’s electricity demand.
Since Prophet was released to the public in 2017, I’ve wanted to test it on a dataset where it can show its full power: multiple seasonalities at different time-scales, trend, and business day logic. I recently stumbled upon freely available electricity data from the Swiss national electricity grid company SwissGrid and decided to have a quick go with Prophet.
Getting the data
The data is freely available on SwissGrid’s website here, and contains various data points regarding electricity production, consumption, imports and exports since 2009. In this example we’ll look at the “Total energy consumed by end users in the Swiss controlblock” at 15-minute intervals. The data set is split over several Excel files, one per calendar year, so we’ll first use Python’s requests library to download all files, then use pandas to extract the relevant time-series and write the combined data to a single csv file. The full code is available on my GitLab here.
import pandas as pd
import requests

BASE_URL = 'https://www.swissgrid.ch/dam/dataimport/energy-statistic/'
AVAILABLE_YEARS = range(2009, 2020)
SHEET_NAME = "Zeitreihen0h15"
COLUMN_NAME = "Summe endverbrauchte Energie Regelblock Schweiz\nTotal energy consumed by end users in the Swiss controlblock"

# Download all available files
for year in AVAILABLE_YEARS:
    filename = f'EnergieUebersichtCH-{year}.xls'
    url = f'{BASE_URL}{filename}'  # BASE_URL already ends with a slash
    web_file = requests.get(url)
    web_file.raise_for_status()  # fail loudly instead of saving an error page
    with open(filename, 'wb') as local_file:
        local_file.write(web_file.content)

# Extract the relevant column from each Excel sheet and write it to a single csv file
series = []
for year in AVAILABLE_YEARS:
    print(f'Processing {year}...')
    filename = f'EnergieUebersichtCH-{year}.xls'
    # index_col=0 assumes the first column of the sheet holds the timestamps
    df_single_year = pd.read_excel(filename, sheet_name=SHEET_NAME, header=0, index_col=0)
    df_single_year = df_single_year[COLUMN_NAME][1:]  # skip first row following the column names
    series.append(df_single_year)
series = pd.concat(series)
series = series.rename('demand')
series.to_csv('demand.csv', index=True, header=True)
Plotting the data
Once we have the data in a nice format, we can start our exploratory analysis. Using pandas’ describe() method, we see that our series consists of roughly 380’000 rows, where each row corresponds to the energy consumption during a 15-minute interval.
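A minimal sketch of this first look, assuming the first column of demand.csv holds the timestamps and the second column is named demand. The helper name load_demand is made up for this example:

```python
import pandas as pd


def load_demand(path="demand.csv"):
    """Load the combined series, using the first CSV column as a DatetimeIndex."""
    # Assumption: the first column contains parseable timestamps.
    df = pd.read_csv(path, index_col=0, parse_dates=True)
    # Sort chronologically so that later resampling and plotting behave as expected.
    return df.sort_index()
```

With the file in place, df = load_demand() followed by df.describe() gives the summary statistics, and df.plot() draws the raw 15-minute series.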

The data shows a yearly pattern, but it’s hard to tell from the plot above because there is simply too much data! Let’s re-sample the series to a daily frequency and re-plot:
df_daily = df.resample('D', closed='right').sum()

There is clearly a seasonal variation: Energy consumption peaks during winter and is lower during summer. The overall trend seems to be relatively stable. Let’s see what Prophet can do with this data.
Going forward, we’ll use the daily series and set the 15-minute interval data aside – working with 380’000 rows was too much for my home computer. The daily series contains only about 4000 data points.
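The effect of closed='right' in the resampling step is easiest to see on a toy series. The sketch below assumes that each reading is stamped with the end of its 15-minute interval (an assumption about the source data, not something verified here) and uses made-up values of 1.0:

```python
import pandas as pd

# Two days of synthetic 15-minute readings, each stamped with the END of its
# interval (assumption), all with value 1.0.
idx = pd.date_range("2019-01-01 00:15", periods=2 * 96, freq="15min")
toy = pd.Series(1.0, index=idx, name="demand")

# closed='right': the reading stamped 00:00 counts towards the previous day.
daily_right = toy.resample("D", closed="right").sum()

# Default (closed='left'): the midnight reading spills into the next day.
daily_left = toy.resample("D").sum()

print(daily_right)
print(daily_left)
```

With closed='right' every day receives its full 96 intervals, whereas the default binning shifts each midnight reading into the following day.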
Forecasting using Prophet
Prophet requires the data to be in a dataframe consisting of one column for the timestamps called ds, and one column for the data called y. Prophet promises to make time-series analysis easy – let’s see what it can do on our daily-sampled time-series:
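Getting from a daily series with a DatetimeIndex to that two-column layout takes one small reshaping step. A minimal sketch, assuming the daily frame has a single column named demand (the values below are made up, and df_prophet is just an illustrative name):

```python
import pandas as pd

# Stand-in for the daily series: a DatetimeIndex plus one 'demand' column.
# The values are invented for illustration.
df_daily = pd.DataFrame(
    {"demand": [150.0, 160.0, 155.0]},
    index=pd.date_range("2019-01-01", periods=3, freq="D"),
)

# Move the timestamps into a 'ds' column and rename the data column to 'y'.
df_prophet = (
    df_daily.rename_axis("ds")
    .reset_index()
    .rename(columns={"demand": "y"})
)
print(df_prophet)
```

A frame shaped like df_prophet is what gets passed to the model's fit method.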
from prophet import Prophet  # the package was named 'fbprophet' before release 1.0

m = Prophet()
m.fit(df_daily)
future = m.make_future_dataframe(periods=365)
forecast = m.predict(future)
Four lines of code to fit the model and forecast the next year! By default, Prophet uses an additive model, meaning that it models the time series as the sum of several components: a trend, one or more seasonalities, holiday effects, and an error term.
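Written out in the notation of the Prophet paper by Taylor and Letham, the additive decomposition is:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Here $g(t)$ is the trend, $s(t)$ collects the periodic (seasonal) effects, $h(t)$ captures holidays, and $\varepsilon_t$ is the error term.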
Let’s first look at the time-series components that Prophet detected and visualize them. Prophet makes this effortless:
m.plot_components(forecast)

The first subplot shows the (piecewise linear) trend that Prophet detected. The second and third charts show the impact of seasonalities – Prophet fit both a yearly seasonality, with lower consumption in summer than in winter, and a weekly seasonality, where Saturday and Sunday have a lower demand than business days.
Let’s see what our forecast looks like. Prophet provides an easy way to visualize it:
m.plot(forecast)

While the results look quite satisfactory visually, it’s always better to compute error metrics to properly quantify the accuracy. Prophet has built-in support for time-series cross-validation. Here we’ll start with an initial training period of 5 years and make a one-year forecast every 90 days thereafter.
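from prophet.diagnostics import cross_validation, performance_metrics  # 'fbprophet.diagnostics' before release 1.0

df_cv = cross_validation(m, initial=f'{5*365} days', period='90 days', horizon='365 days')
df_metrics = performance_metrics(df_cv)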
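To make the cross-validation schedule concrete, here is a rough sketch of how such forecast origins (cutoffs) can be laid out, working backwards from the end of the data. It uses only the standard library; the function name rolling_cutoffs and the start and end dates are assumptions for illustration, and Prophet’s own implementation may differ in details:

```python
from datetime import date, timedelta


def rolling_cutoffs(start, end, initial, period, horizon):
    """Return forecast origins, oldest first, for a rolling-origin evaluation."""
    cutoffs = []
    cutoff = end - horizon  # latest origin that still has a full horizon of data
    while cutoff >= start + initial:  # keep at least `initial` of training data
        cutoffs.append(cutoff)
        cutoff -= period  # step back by one period
    return cutoffs[::-1]


# Hypothetical date range; the real series may start or end on other days.
cutoffs = rolling_cutoffs(
    start=date(2009, 1, 1),
    end=date(2019, 12, 31),
    initial=timedelta(days=5 * 365),
    period=timedelta(days=90),
    horizon=timedelta(days=365),
)
print(len(cutoffs), cutoffs[0], cutoffs[-1])
```

Each cutoff marks one simulated forecast: the model is trained on everything up to that date and evaluated on the 365 days that follow.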
The chart below shows the mean absolute percentage error (MAPE) of our model for forecasts up to one year ahead:

While the overall performance of the model is very good and the mean prediction error is below 5%, there are a few outliers where our forecast is off by more than 20%. Can we do better than this without too much effort? Yes! Given the strong dependence of the energy consumption on weekends and weekdays, it’s a reasonable assumption that at least some of the days where the model performed poorly were public holidays that should be treated similarly to weekends but weren’t treated as such. Luckily it’s easy to let Prophet incorporate the relevant holidays into the model:
m.add_country_holidays(country_name='Switzerland')
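Note that Prophet expects holidays to be registered before fitting, so the call above belongs to a fresh model rather than the one that was already fit. A minimal sketch of the refit, assuming df_daily already has the ds and y columns (the function name fit_with_holidays is only for illustration):

```python
def fit_with_holidays(df_daily, country="Switzerland"):
    """Fit a fresh Prophet model that includes public holidays."""
    # Imported lazily so the sketch can be read without Prophet installed;
    # releases before 1.0 ship the package as 'fbprophet'.
    from prophet import Prophet

    m = Prophet()
    m.add_country_holidays(country_name=country)  # must come before fit()
    m.fit(df_daily)
    return m
```

After m = fit_with_holidays(df_daily), the forecast and component plot are produced with the same make_future_dataframe, predict and plot_components calls as before.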
The new component plot now includes the impact of holidays on the energy consumption:

The second subplot shows the impact of holidays on our model: as expected, they lead to lower energy consumption. Incorporating holidays should have made our model more accurate. We can check this by re-running the cross-validation and computing the prediction errors:

It worked! The frequency of errors above 20% has decreased quite significantly. The seasonality in the MAPE suggests that there is another effect that our model does not capture, but for now we’ll leave it at this and show the prediction of our improved model:

Conclusion
We’ve successfully used Prophet to forecast a complex time-series and improved the model by adding public holidays. Prophet is very user-friendly, its default parameters seemed reasonable (at least in this specific example), and it makes time-series analysis accessible to most developers. I’ll definitely include it in my tool set going forward.