WiNDC - WiNDC.jl Clarifying Sets and Parameters

Mitch Phillipson August 08, 2025

In this post I would like to clarify the concepts of sets, parameters, and elements in WiNDC.jl. These terms are borrowed from GAMS and have a broader mathematical meaning, but it’s not clear what they mean in the context of WiNDC.jl.

This post builds on last week’s post. Since then I have removed the parameters table and folded it into the sets table. This post goes into detail on the logic of this transition and discusses other regularity conditions.

Sets and Elements

Mathematically speaking, a set is a collection of elements and is completely determined by its elements. WiNDC.jl uses sets in the same way: a set is a name for a collection of elements. For example, we have a set, commodity, whose elements are the commodities, like agricultural goods, mining, etc. We have a dataframe containing the set names; it looks something like:

name	description	domain
commodity	Commodities	row
sector	Sectors	col

The domain column refers to the columns in the data table. In this case, we will have columns row, col, parameter and value. The last two are implied by the data table being a WiNDCtable.

The sets table doesn’t identify the elements. To do this we have the elements dataframe.

name	description	set
c1	commodity 1	commodity
c2	commodity 2	commodity
s1	sector 1	sector
s2	sector 2	sector

You should see how these two dataframes are linked: the set column in the elements table links to the name column in the sets table. I’ll discuss this link in more detail in a later section.

This structure allows for the extraction of different subsets of a WiNDCtable. For example, suppose we have a WiNDCtable X and we want only the portion of the data where the row is a commodity. We have the syntax

table(X, :commodity)

that will extract just the commodities. On the back end, this function reads the sets table, finds the correct domain (or column), and performs an innerjoin on that column with a filtered elements table leaving just the commodities in that column of X.
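
The lookup-then-join logic can be sketched in plain Julia. This is only an illustration under stated assumptions, not the WiNDC.jl implementation: vectors of named tuples stand in for the dataframes, the description column is omitted, `extract` is a hypothetical helper playing the role of `table(X, :commodity)`, and the value_added rows are borrowed from the example later in this post so that there is something to filter out.

```julia
# Hypothetical stand-ins for the sets, elements and data tables
# (vectors of named tuples rather than DataFrames).
sets = [
    (name = :commodity,   domain = :row),
    (name = :value_added, domain = :row),
    (name = :sector,      domain = :col),
]

elements = [
    (name = :c1, set = :commodity),
    (name = :c2, set = :commodity),
    (name = :L,  set = :value_added),
    (name = :C,  set = :value_added),
    (name = :s1, set = :sector),
    (name = :s2, set = :sector),
]

data = [
    (row = :c1, col = :s1, parameter = :intermediate_demand, value = 1),
    (row = :c2, col = :s1, parameter = :intermediate_demand, value = 2),
    (row = :L,  col = :s1, parameter = :labor_demand,        value = 3),
    (row = :L,  col = :s2, parameter = :labor_demand,        value = 4),
    (row = :C,  col = :s2, parameter = :capital_demand,      value = 5),
]

# Hypothetical helper mirroring the steps described in the text.
function extract(data, sets, elements, set_name::Symbol)
    # 1. Read the sets table to find which data column this set lives in.
    idx = findfirst(s -> s.name == set_name, sets)
    idx === nothing && error("unknown set: $set_name")
    column = sets[idx].domain

    # 2. Filter the elements table down to the members of the set.
    members = Set(e.name for e in elements if e.set == set_name)

    # 3. Keep only the data rows whose entry in that column is a member,
    #    which is the effect of an inner join on that column.
    return [r for r in data if getproperty(r, column) in members]
end

# Keeps the two rows whose row entry is c1 or c2.
commodities = extract(data, sets, elements, :commodity)
```

Because the column is looked up from the sets table, the caller never has to say which column a set lives in; asking for `:sector` would filter on col instead of row.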

Parameters

Parameters, in the context of WiNDC, are just sets. Let’s take a look at a slightly more complicated sets table with parameters included:

name	description	domain
commodity	Commodities	row
value_added	Value Added	row
sector	Sectors	col
IntermediateDemand		parameter
LaborDemand		parameter
CapitalDemand		parameter
ValueAdded		parameter

and the elements table:

name	description	set
c1	commodity 1	commodity
c2	commodity 2	commodity
L	Labor Demand	value_added
C	Capital Demand	value_added
s1	sector 1	sector
s2	sector 2	sector
intermediate_demand		IntermediateDemand
labor_demand		LaborDemand
capital_demand		CapitalDemand
labor_demand		ValueAdded
capital_demand		ValueAdded

There are a few things to observe in these two tables. First, by convention parameter sets are in CamelCase, while non-parameter sets and elements are in snake_case. This avoids naming conflicts, as set names must be unique. Second, the ValueAdded parameter refers to two parameter elements. I think of this as a composite parameter: it’s built from smaller parameters. This allows you to have specific parameters without losing the general structure.

For reference, a very simple data table looks like:

row	col	parameter	value
c1	s1	intermediate_demand	1
c2	s1	intermediate_demand	2
L	s1	labor_demand	3
L	s2	labor_demand	4
C	s2	capital_demand	5

This structure means extracting parameters is exactly the same as extracting sets. You need to think of a parameter as being a collection of objects, just like a set is a collection of elements; ValueAdded is both labor_demand and capital_demand.
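
To make this concrete, here is a minimal sketch using the tables above. It is not the package's code: plain vectors of named tuples stand in for the dataframes, the description column is dropped, and `extract` is a hypothetical helper that does the same lookup and filter for any set, parameter or not.

```julia
# Hypothetical stand-ins for the sets, elements and data tables shown above.
sets = [
    (name = :commodity,          domain = :row),
    (name = :value_added,        domain = :row),
    (name = :sector,             domain = :col),
    (name = :IntermediateDemand, domain = :parameter),
    (name = :LaborDemand,        domain = :parameter),
    (name = :CapitalDemand,      domain = :parameter),
    (name = :ValueAdded,         domain = :parameter),
]

elements = [
    (name = :c1, set = :commodity),
    (name = :c2, set = :commodity),
    (name = :L,  set = :value_added),
    (name = :C,  set = :value_added),
    (name = :s1, set = :sector),
    (name = :s2, set = :sector),
    (name = :intermediate_demand, set = :IntermediateDemand),
    (name = :labor_demand,        set = :LaborDemand),
    (name = :capital_demand,      set = :CapitalDemand),
    (name = :labor_demand,        set = :ValueAdded),
    (name = :capital_demand,      set = :ValueAdded),
]

data = [
    (row = :c1, col = :s1, parameter = :intermediate_demand, value = 1),
    (row = :c2, col = :s1, parameter = :intermediate_demand, value = 2),
    (row = :L,  col = :s1, parameter = :labor_demand,        value = 3),
    (row = :L,  col = :s2, parameter = :labor_demand,        value = 4),
    (row = :C,  col = :s2, parameter = :capital_demand,      value = 5),
]

# Hypothetical helper: the same code path serves sets and parameters,
# because a parameter is just a set whose domain is the parameter column.
function extract(data, sets, elements, set_name::Symbol)
    column  = only(s.domain for s in sets if s.name == set_name)
    members = Set(e.name for e in elements if e.set == set_name)
    return filter(r -> r[column] in members, data)
end

labor       = extract(data, sets, elements, :LaborDemand)  # the two labor_demand rows
value_added = extract(data, sets, elements, :ValueAdded)   # labor_demand and capital_demand rows
total_value_added = sum(r.value for r in value_added)      # 3 + 4 + 5
```

The composite ValueAdded needs no special handling here; it simply has two members in the elements table instead of one.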

Regularity

Much like an SQL database, there are relationships and restrictions present in this data. What follows is a list of conditions that the tables must satisfy. The conditions will change in the future, but will be documented in the docstring for the regularity function.

Column names of the three dataframes must match the expected names in the WiNDC specification. All the examples in this post use the correct names.
Set names must be unique.
The values in the domain column in the sets table must match the column names in the data table.
The values in the set column in the elements table must all be present in the name column of the sets table.
The values in each column of the data table must correspond to an element in a set corresponding to that column’s domain. For example, in the example above an error would be raised if s1 appeared in the row column, as it corresponds to a sector, which has the domain col.
All elements for every set in a fixed domain must be unique. For example, the sets commodity and value_added can’t both have an element named c.
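
Several of these conditions can be sketched as plain checks. The function below is a hypothetical `regularity_problems`, not the package's regularity function, and it works on vectors of named tuples rather than dataframes. It covers unique set names, valid domains, known sets, and valid data entries. The column-name check is skipped because the sketch takes the column names from the data itself, and the final uniqueness condition is left out because the sketch would have to guess how it applies to composite parameters such as ValueAdded.

```julia
# Hypothetical stand-ins for the three tables (a trimmed version of the example above).
sets = [
    (name = :commodity,   domain = :row),
    (name = :value_added, domain = :row),
    (name = :sector,      domain = :col),
    (name = :LaborDemand, domain = :parameter),
]

elements = [
    (name = :c1, set = :commodity),
    (name = :L,  set = :value_added),
    (name = :s1, set = :sector),
    (name = :labor_demand, set = :LaborDemand),
]

data = [
    (row = :L, col = :s1, parameter = :labor_demand, value = 3),
]

# Hypothetical checker: returns a list of messages, empty when all checks pass.
function regularity_problems(data, sets, elements)
    problems  = String[]
    columns   = keys(first(data))           # column names of the data table
    set_names = [s.name for s in sets]

    # Set names must be unique.
    allunique(set_names) || push!(problems, "duplicate set names")

    # Every domain must be a column of the data table.
    for s in sets
        s.domain in columns ||
            push!(problems, "set $(s.name) has unknown domain $(s.domain)")
    end

    # Every element must belong to a set listed in the sets table.
    for e in elements
        e.set in set_names ||
            push!(problems, "element $(e.name) refers to unknown set $(e.set)")
    end

    # Every entry in a domain column must be an element of a set with that domain.
    domain_of = Dict(s.name => s.domain for s in sets)
    for column in unique(s.domain for s in sets if s.domain in columns)
        members = Set(e.name for e in elements
                      if get(domain_of, e.set, nothing) == column)
        for r in data
            r[column] in members ||
                push!(problems, "$(r[column]) in column $column is not an element of a set with that domain")
        end
    end
    return problems
end

ok_report = regularity_problems(data, sets, elements)

# A sector appearing in the row column, as in the example in the text.
bad_data   = vcat(data, [(row = :s1, col = :s1, parameter = :labor_demand, value = 6)])
bad_report = regularity_problems(bad_data, sets, elements)
```

Collecting messages instead of raising on the first failure is a design choice of this sketch; the text above describes an error being raised.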