Solo Python NetworkX PageRank Centrality

Goal

Map the wine world through network analysis. Using WineEnthusiast reviews, build graphs that link countries, regions, varieties, and tasters, then use centrality measures (PageRank, betweenness) to find the hubs, the versatile varieties, and the best value-for-money wines.

Dataset

Wine reviews from WineEnthusiast (via Kaggle, scraped by Zack Thoutt in 2017): 130k rows and 13 columns, including country, province, region, variety, taster, points, and price. After dropping rows with missing values, I kept the first 1,000 European rows to keep the networks readable and avoid overrepresenting any single region.
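The filtering step can be sketched roughly as below. This is a minimal illustration with pandas, not the project's actual code: the column names `region_1` and `taster_name` follow the Kaggle CSV as I understand it, the `EUROPEAN_COUNTRIES` set is a hypothetical subset (the write-up does not list the countries it kept), and the toy rows are made up.

```python
import pandas as pd

# Hypothetical subset of European countries; the write-up does not list the ones it used.
EUROPEAN_COUNTRIES = {"Italy", "France", "Portugal", "Austria", "Germany", "Spain", "Romania"}

# Column names assumed from the Kaggle CSV (region_1, taster_name).
COLUMNS = ["country", "province", "region_1", "variety", "taster_name", "points", "price"]


def prepare_sample(reviews: pd.DataFrame, n_rows: int = 1000) -> pd.DataFrame:
    """Drop incomplete rows, keep European countries, then take the first n_rows."""
    complete = reviews.dropna(subset=COLUMNS)
    european = complete[complete["country"].isin(EUROPEAN_COUNTRIES)]
    return european.head(n_rows).reset_index(drop=True)


# Tiny stand-in for the real file (made-up rows, for illustration only).
toy = pd.DataFrame(
    {
        "country": ["Italy", "US", "France", "Portugal", "Germany"],
        "province": ["Sicily & Sardinia", "California", "Burgundy", "Douro", "Mosel"],
        "region_1": ["Etna", "Napa", "Chablis", "Douro", "Mosel"],
        "variety": ["Red Blend", "Chardonnay", "Chardonnay", "Portuguese Red", "Riesling"],
        "taster_name": ["Taster A", "Taster B", "Taster A", "Taster C", "Taster B"],
        "points": [87, 90, 88, 86, 89],
        "price": [20.0, 30.0, None, 10.0, 15.0],  # the French row has no price
    }
)

sample = prepare_sample(toy, n_rows=2)
print(sample[["country", "variety", "points", "price"]])
```

With the real file, `toy` would be replaced by something like `pd.read_csv(...)` on the downloaded CSV and `n_rows` left at 1,000.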

Three Networks

Country–Variety Headlines

Italy and France lead PageRank by a wide margin, confirming the obvious. Portugal ranks 3rd, which is less obvious. On the variety side, Red Blend, Riesling, and White Blend show up everywhere, making them the most country-agnostic varieties in the dataset.

Top Countries (PageRank): Italy 0.165, France 0.154, Portugal 0.059, Austria 0.036, and Germany 0.034
Top Varieties (PageRank): Red Blend 0.052, Riesling 0.049, White Blend 0.048, Nebbiolo 0.041, and Chardonnay 0.038
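For readers who want to see how scores like these come about, here is a minimal sketch of PageRank on a country–variety graph with NetworkX. The edge list and counts are made up, and the modeling choices (an undirected graph, edges weighted by number of reviews, the default damping factor of 0.85) are assumptions; the write-up does not say how its graph was weighted.

```python
import networkx as nx

# Made-up (country, variety, review_count) triples, for illustration only.
edges = [
    ("Italy", "Red Blend", 5),
    ("Italy", "Nebbiolo", 4),
    ("Italy", "White Blend", 2),
    ("France", "Chardonnay", 4),
    ("France", "Red Blend", 2),
    ("Portugal", "Red Blend", 3),
    ("Germany", "Riesling", 3),
    ("Austria", "Riesling", 2),
]

G = nx.Graph()
for country, variety, count in edges:
    # Tag each node so countries and varieties can be ranked separately later.
    G.add_node(country, kind="country")
    G.add_node(variety, kind="variety")
    G.add_edge(country, variety, weight=count)

# PageRank over the whole graph; scores across all nodes sum to 1.
scores = nx.pagerank(G, alpha=0.85, weight="weight")


def top_nodes(kind: str) -> list[str]:
    """Return nodes of one kind, highest PageRank first."""
    nodes = [n for n, data in G.nodes(data=True) if data["kind"] == kind]
    return sorted(nodes, key=scores.get, reverse=True)


for name in top_nodes("country"):
    print(f"{name}: {scores[name]:.3f}")
```

Because countries and varieties share one graph, their scores come from the same distribution, which is why the country and variety lists above can be read side by side.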

The Region–Variety Network

Zooming in from countries to regions makes the structure visible. Sicily & Sardinia emerges as the most central region (highly connected to many varieties), while Veneto takes the highest betweenness, meaning it bridges different parts of the network and links varieties that otherwise wouldn't connect.
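The difference between "most connected" and "best bridge" is easy to see on a toy graph. The sketch below uses placeholder node names and made-up edges, and it assumes degree centrality as the measure of connectedness, which the write-up does not specify. The hub region touches the most varieties, yet the bridge region scores highest on betweenness because every shortest path between the three clusters runs through it.

```python
import networkx as nx

# Placeholder region -> varieties mapping (made up for illustration).
region_varieties = {
    "HubRegion": ["s1", "s2", "s3", "s4", "x"],  # many varieties, mostly exclusive
    "BridgeRegion": ["x", "y", "z"],             # few varieties, each shared with another region
    "RegionP": ["y", "p1", "p2"],
    "RegionQ": ["z", "q1", "q2"],
}

G = nx.Graph()
for region, varieties in region_varieties.items():
    for variety in varieties:
        G.add_edge(region, variety)

degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)

regions = list(region_varieties)
most_connected = max(regions, key=degree.get)
best_bridge = max(regions, key=betweenness.get)

print("most connected:", most_connected)
print("best bridge:", best_bridge)
```

Removing the bridge region from this toy graph would split it into three disconnected pieces, which is the intuition behind a high betweenness score.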

Region-Variety Network visualization
Node size = variety diversity (regions) or popularity (varieties). Edge width = strength of association.

The Value Wine

Layering quality-to-price ratio onto the taster–region–variety network surfaced one clear winner.

🥇 Cramele Recaș 2009 Chardonnay (Romania), priced around $14, with a quality-to-price ratio of 12.43, the highest in the entire dataset. Two of the top three value picks come from Portugal, echoing its surprise 3rd-place country ranking.

The likely explanation: producers in lower-cost regions can replicate the style of expensive Chardonnays (which usually trace back to Burgundy) through careful sourcing and modern winemaking, hitting similar flavor at a fraction of the price.
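A value ranking of this kind can be sketched in a few lines. The write-up does not give its formula, so the definition below (points divided by price) is an assumption, and the wines and numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    title: str
    points: int
    price: float


def value_ratio(review: Review) -> float:
    """Assumed definition: review points per dollar of price."""
    if review.price <= 0:
        raise ValueError("price must be positive")
    return review.points / review.price


# Invented reviews, for illustration only.
reviews = [
    Review("Wine A", 88, 8.0),
    Review("Wine B", 92, 40.0),
    Review("Wine C", 85, 10.0),
]

ranked = sorted(reviews, key=value_ratio, reverse=True)
for review in ranked:
    print(f"{review.title}: {value_ratio(review):.2f}")
```

A plain points-per-dollar ratio favors cheap bottles heavily, since points span a narrow range while prices vary by orders of magnitude; that is worth keeping in mind when reading any single "best value" result.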

The Taster Bias Problem

The biggest limitation of this analysis is also the most human one: tasters disagree, and they don't even disagree consistently. Some critics run high, others run low, and the same wine gets very different scores depending on who reviews it.

Average Rating by Wine Taster bar chart
Average rating by taster, with sample size (n). Jeff Jenssen averages 92.0 across only 3 reviews; Lauren Buzzeo averages 86.2 across 4. Roger Voss (n=140) is the most prolific.

The Cramele Recaș was reviewed by Roger Voss and Anna Lee C. Iijima, both seasoned tasters with 20+ reviews, but Anna in particular skews high. So "best value wine in Europe" carries an asterisk: it's the best under these reviewers' palates, on this dataset, in this window.
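One common way to soften this kind of bias, which this analysis did not apply, is to standardize each score against its taster's own mean and spread before comparing wines. The sketch below uses invented tasters and scores, and the `min_reviews` threshold is an arbitrary illustrative choice.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Invented (taster, wine, points) rows, for illustration only.
reviews = [
    ("Taster A", "Wine 1", 92), ("Taster A", "Wine 2", 94), ("Taster A", "Wine 3", 90),
    ("Taster B", "Wine 1", 86), ("Taster B", "Wine 4", 88), ("Taster B", "Wine 5", 84),
]


def standardize_by_taster(rows, min_reviews=3):
    """Convert raw points to z-scores relative to each taster's own history."""
    by_taster = defaultdict(list)
    for taster, _, points in rows:
        by_taster[taster].append(points)

    adjusted = []
    for taster, wine, points in rows:
        scores = by_taster[taster]
        spread = pstdev(scores)
        # Skip tasters with too few reviews or no variation: no usable baseline.
        if len(scores) < min_reviews or spread == 0:
            continue
        adjusted.append((taster, wine, (points - mean(scores)) / spread))
    return adjusted


for taster, wine, z in standardize_by_taster(reviews):
    print(f"{taster} on {wine}: {z:+.2f}")
```

In this toy data, "Wine 1" gets 92 from one taster and 86 from the other, but both scores sit exactly at each taster's average, so the standardized values agree. Tasters with only a handful of reviews are dropped rather than adjusted, since their averages are too noisy to trust.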

Takeaway

Network analysis surfaces structure that traditional wine classifications miss, such as versatile varieties, bridging regions, and hidden value bottles. But numbers can only take you so far. Wine is still an experience, and the right wine for you is the one you actually want to drink again.