Exploring the WiNDC Data
September 12, 2025
Today I want to share a few visualizations I’ve created using the basic WiNDC national data. The goal is to provide some insights into the data, see exactly what the economy is doing and how WiNDC reflects that. My code is available on my GitHub.
For this post, assume summary is the WiNDC national
summary data for the years 1997-2023. These are the most recent years
available as of the writing of this post.
Gross Domestic Product (GDP)
GDP is the sum of all value added categories. It is the most common measure of economic activity. Computing this using WiNDC is straightforward:
table(summary, :Value_Added, :Output_Tax, :Sector_Subsidy, normalize=:Use) |>
    x -> groupby(x, :year) |>
    x -> combine(x, :value => sum => :gdp)
Our values are slightly lower than reported. For example, in 2023 the reported GDP was $27.72 trillion while our estimate is $26.73 trillion. I’m not certain why the numbers differ, but it could be due to differences in data sources or methodologies. I do know the difference is not due to calibration, since I checked both the calibrated and non-calibrated data. The trend, however, is very similar.
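To put the gap in perspective, here is a quick back-of-the-envelope calculation using only the two 2023 figures quoted above. The variable names are just for illustration.

```julia
# Figures quoted in the text, in trillions of US dollars.
reported_gdp = 27.72   # reported 2023 GDP
windc_gdp    = 26.73   # estimate from summing value added in WiNDC

gap     = reported_gdp - windc_gdp     # absolute shortfall
gap_pct = 100 * gap / reported_gdp     # shortfall as a percent of the reported value

println("gap: ", round(gap, digits=2), " trillion (", round(gap_pct, digits=1), "%)")
```

So the WiNDC estimate comes in roughly 3.6% below the reported figure for 2023.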
Personal Consumption Expenditures (PCE)
Personal consumption is how much individuals spend on goods and services. We’ll start with total PCE per year:
pce_plot = table(summary, :Personal_Consumption, normalize=:Use) |>
    x -> groupby(x, :year) |>
    x -> combine(x, :value => sum => :pce)
This looks very similar to GDP, which is expected since PCE is the
largest component of GDP. In our GDP calculation, we summed
value added. Alternatively, you could sum
final demand, excluding exports. Computing GDP
this way gives a value slightly larger than expected, which is
interesting.
Top Ten PCE Categories
Let’s look at the top ten PCE categories by total spending. First, we find the top ten categories:
top_ten = table(summary, :Personal_Consumption, normalize=:Use) |>
    x -> groupby(x, [:row]) |>
    x -> combine(x, :value => sum => :pce) |>
    x -> leftjoin(
        x,
        elements(summary, :commodity) |> x -> select(x, Not(:set)),
        on = :row => :name
    ) |>
    x -> sort(x, :pce, rev=true) |>
    x -> first(x, 10)
We perform an inner join to filter our dataframe to only include these top ten categories, then sort by year:
table(summary, :Personal_Consumption, normalize=:Use) |>
    x -> innerjoin(
        x,
        top_ten |> x -> select(x, Not(:pce)),
        on = :row => :row,
    ) |>
    x -> sort(x, :year)
This graph shows the top ten categories of personal consumption
expenditures over time. The top two categories are Housing
and Food and beverage and tobacco products, which seems
fitting: people spend money on shelter and food first. The next two are
Ambulatory health care services and Hospitals.
Recall that this is the United States; I would be curious to see how this
compares to countries with universal healthcare.
Housing seems to be growing much faster than the other categories; in fact, the tail looks almost exponential. But these are raw values, so let’s look at each category as a fraction of the yearly total.
This is more revealing. Americans are spending, on average, around 15-16% of their total consumption spending on housing. This is lower than I would have expected; I thought it would be closer to 25-35%. I believe this is strictly money spent on housing and does not include insurance or interest. In that case, renters are pushing this number higher while homeowners are pushing it lower.
It is surprising that Americans spend about 8% of their spending on each of food, ambulatory health care, and hospitals. Covid is an interesting anomaly: food got a boost and restaurants took a hit, as expected given the lockdowns, while healthcare was relatively unchanged.
The final category I’ll highlight is
Petroleum and coal products. Its share has been shrinking for
some time, likely due to fuel efficiency requirements and the move toward
renewables. Between 2017 and 2023 it seems to have stabilized at around
2.5%.
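The yearly fractions discussed above can be computed along these lines. This is a minimal sketch rather than the code behind the plot: the `pce` dataframe is made-up toy data with the same `row`, `year`, and `value` columns as the tables above, and the commodity names are shortened.

```julia
using DataFrames

# Hypothetical toy data shaped like the PCE table:
# one row per (commodity, year) with a dollar value. The numbers are made up.
pce = DataFrame(
    row   = ["Housing", "Food", "Other", "Housing", "Food", "Other"],
    year  = [2022, 2022, 2022, 2023, 2023, 2023],
    value = [20.0, 10.0, 70.0, 30.0, 15.0, 105.0],
)

# Within each year, divide every value by that year's total.
shares = transform(
    groupby(pce, :year),
    :value => (v -> v ./ sum(v)) => :share,
)
```

With the real data, the totals would presumably need to be taken over all PCE categories before filtering down to `top_ten`; otherwise the ten shares would sum to one by construction.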
Supply and Demand
Finally, let’s look at how goods are used (demanded) and created (supplied). For these calculations we are going to fix a commodity and sum across all sectors of the intermediate portions of the table.
table(
    summary,
    :Intermediate_Demand,
    :Intermediate_Supply,
    normalize=:Use
) |>
    x -> groupby(x, [:row, :year, :parameter]) |>
    x -> combine(x, :value => sum => :total)
Using this data I created a plot that lets the user choose which commodity to display:
In the above, the red line shows total value supplied and the blue line shows total value demanded. If we had included final demands, exports, imports and taxes, the two lines would be identical. This is the market clearing condition: all goods created must be used.
The
Data processing, internet publishing, and other information services
category is interesting. It clearly shows the expansion of the internet;
not many commodities have grown 10x in the last 20 years. You can also
see the dot-com bubble burst around 2000, after which growth stagnated for
about 5 years.
The last question is which sectors are using and creating these goods.
For this graph I’ve chosen to focus solely on the Farms
category, primarily due to the limitations of the plotting
library I’m using.
This data is the ratio of each sector’s value to the total value for each year,
excluding the Farms sector itself. Most commodities are produced mainly by
their own sector, so including it would dominate the graph.
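One way to read the calculation described above is: drop the commodity’s own sector, then take each remaining sector’s share of what is left in that year. The sketch below follows that reading and is not the code behind the graph. The `demand` dataframe is made-up toy data, the sector names are shortened, and `col` is an assumed name for the column holding the sector.

```julia
using DataFrames

# Hypothetical toy data: which sectors use the "Farms" commodity in one year.
# `col` (the using sector) is an assumed column name; the values are made up.
demand = DataFrame(
    row   = fill("Farms", 4),
    col   = ["Farms", "Food", "Restaurants", "Other"],
    year  = fill(2023, 4),
    value = [50.0, 40.0, 6.0, 4.0],
)

# Drop the commodity's own sector first, then take each remaining
# sector's share of the remaining yearly total.
others = subset(demand, [:row, :col] => ByRow((r, c) -> r != c))
shares = transform(
    groupby(others, [:row, :year]),
    :value => (v -> v ./ sum(v)) => :share,
)
```

On a table holding both supply and demand, `:parameter` would likely need to be added to the grouping keys so the two sides are normalized separately.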
On the supply side, Farms goods are primarily created by
Farms themselves; less than 1% comes from other sectors.
Of those, the two largest contributors are federal and state governments.
On the demand side, Farms goods are primarily used by
Food and beverage and tobacco products: around 80% of all
farm goods go to this sector. Keep in mind this does not include
personal consumption.
Conclusion
There is a lot of data in WiNDC, and a lot of interesting insights to be found. I hope this post has given you some ideas on how to explore the data yourself.