1. The problem
Bordeaux is famous for its wine — and Château Cheval Blanc, ranked among the four Premier Grand Cru Classé A producers of Saint-Émilion, is one of its crown jewels. A young vintage bottle costs €500–1,200, so protecting the vines is a serious economic concern.
The biggest threat? Powdery mildew — a fungal disease that thrives at temperatures of 17–28°C with humidity of 40–80%. Once symptoms appear, sulfur treatment is too late. The challenge: how do you know to spray sulfur the day before mildew develops?
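As a rough illustration of how the conditions quoted above could become a spray trigger, here is a minimal sketch. The function name `mildew_risk` and the choice of inclusive bounds are assumptions made for this example; the thresholds themselves are the ones stated in the text.

```python
def mildew_risk(temp_c, humidity_pct):
    """Return True when both readings fall inside the favourable window.

    Thresholds come from the text (17-28 °C, 40-80 % humidity).
    Treating the bounds as inclusive is an assumption of this sketch.
    """
    return 17.0 <= temp_c <= 28.0 and 40.0 <= humidity_pct <= 80.0


# Hypothetical forecasts for tomorrow: (mean temperature in °C, relative humidity in %).
forecasts = [(22.5, 65.0), (12.0, 70.0), (25.0, 90.0)]
flags = [mildew_risk(t, h) for t, h in forecasts]
print(flags)  # prints [True, False, False]
```

Only the first forecast satisfies both conditions, so only that day would call for spraying the day before.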
2. The approach
I used multiple linear regression with the mean temperature in Bordeaux–Mérignac as the response variable, and the temperatures of four surrounding cities as predictors:
Figure: Geographic setup
Three of the predictor cities (Nantes, Limoges, Toulouse) sit roughly 230–355 km from Bordeaux, to the northwest, northeast, and southeast respectively. Pamplona was added as a southern proxy because no closer French station was available in the ECAD database.
- Training data: daily mean temperatures, 1 Jan 2017 – 31 Dec 2019 (1,095 days)
- Holdout data: 1 Sept 2020 – 31 Aug 2021 (used to validate the model)
- Source: ECAD (European Climate Assessment & Dataset)
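The setup above can be sketched as an ordinary least squares fit. This is a minimal illustration, not the original analysis code: it assumes NumPy is available, the helper name `fit_ols` is hypothetical, and the temperature series are synthetic stand-ins for the ECAD data.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; return [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic stand-in for the real series (made-up values, not the ECAD data).
rng = np.random.default_rng(0)
n_days = 1095  # length of the training window
neighbours = rng.normal(150.0, 60.0, size=(n_days, 4))  # four predictor cities
true_coef = np.array([10.0, 0.3, 0.3, 0.4, 0.0])  # intercept + four slopes
noise = rng.normal(0.0, 5.0, size=n_days)
bordeaux = true_coef[0] + neighbours @ true_coef[1:] + noise

coef = fit_ols(neighbours, bordeaux)
print(np.round(coef, 3))
```

With real data, `neighbours` would hold one column per predictor city and `bordeaux` the Bordeaux–Mérignac series for the same dates. A statistics package would also be needed to obtain p-values, which `np.linalg.lstsq` does not report.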
3. The model
The fitted regression coefficients on the training set:
| Variable | Coefficient | p-value |
|---|---|---|
| Intercept | 13.86 | 0.000 |
| Nantes (NW) | 0.278 | 0.000 |
| Limoges (NE) | 0.299 | 0.000 |
| Toulouse (SE) | 0.398 | 0.000 |
| Pamplona (S) | 0.0004 | ~1.0 |
The three French cities each had p < 0.001, which indicates strong statistical significance. Pamplona's coefficient was effectively zero with p ≈ 1, consistent with what intuition would suggest: once the three closer French neighbors are in the model, a Spanish city across the Pyrenees adds almost no predictive value.
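To make the fitted equation concrete, the sketch below plugs one day of neighbour readings into the coefficients from the table. It assumes all temperatures, and therefore the intercept, are expressed in tenths of a degree Celsius; the function name `predict_bordeaux` and the input values are hypothetical.

```python
# Coefficients copied from the table; units assumed to be tenths of a degree Celsius.
COEFFICIENTS = {
    "intercept": 13.86,
    "nantes": 0.278,
    "limoges": 0.299,
    "toulouse": 0.398,
    "pamplona": 0.0004,
}


def predict_bordeaux(nantes, limoges, toulouse, pamplona):
    """Return the fitted Bordeaux mean temperature for one day of neighbour readings."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["nantes"] * nantes
        + COEFFICIENTS["limoges"] * limoges
        + COEFFICIENTS["toulouse"] * toulouse
        + COEFFICIENTS["pamplona"] * pamplona
    )


# Hypothetical day: all four neighbours at 20.0 °C, i.e. 200 tenths of a degree.
prediction = predict_bordeaux(200, 200, 200, 200)
print(f"{prediction / 10:.1f} °C")  # prints 20.9 °C
```

In this example Pamplona contributes only 0.08 tenths of a degree to the total, which shows how little weight the model gives it.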
4. Validation
To check whether the model generalizes, I ran the same analysis on the held-out 2020–21 data and plotted predicted vs actual temperatures (Pamplona on the x-axis as the example variable):
Figure: Bordeaux, predicted vs actual temperature
The red and blue dots track each other closely, hugging the regression line throughout. The standard error stayed below about 10 tenths of a degree (roughly 1 °C), and adjusted R² remained at 0.97.
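For reference, the two summary figures quoted above can be computed from paired actual and predicted values as in the sketch below. It uses the standard textbook definitions of adjusted R² and residual standard error. The function name `validation_metrics` and the sample values are hypothetical, and the original analysis may have used a statistics package instead.

```python
import math


def validation_metrics(actual, predicted, n_predictors):
    """Return (adjusted R^2, residual standard error) for paired observations."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # Degrees of freedom left after estimating the slopes and the intercept.
    dof = n - n_predictors - 1
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / dof
    std_error = math.sqrt(ss_res / dof)
    return adj_r2, std_error


# Made-up values in tenths of a degree, only to exercise the function.
actual = [100, 120, 140, 160, 180, 200, 220]
predicted = [102, 118, 141, 161, 178, 203, 219]
adj_r2, std_error = validation_metrics(actual, predicted, n_predictors=4)
print(round(adj_r2, 4), round(std_error, 2))
```

Because the standard error is in the same units as the data, a value below 10 on a tenths-of-a-degree scale corresponds to a typical miss of under 1 °C.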