Working with GDX files
Information paper
20 March 2014
Market Performance
Version control
Version   Date amended   Comments
0.0.1     19/3/2014      Initial draft
0.0.2     20/3/2014      Finalised
20 March 2014 1.56 p.m.
Contents
1 Introduction
   GAMS and GDX files
   GAMS license arrangements
2 Using GDX files
   Write some GAMS code
   Use GDXdump to create CSV or GMS files
   The py-gdx (Python) utilities
   Interfacing GAMS with R
   Writing to Excel from the GAMS IDE
   Passing data between GAMS programs using merged GDX files
   Worked examples
   A more sophisticated approach
Glossary of abbreviations and terms
1 Introduction
1.1 Several of the models that the Electricity Authority (Authority) makes publicly available are formulated using the GAMS software[1] and therefore make extensive use of GDX files. We are often asked for advice on how to work with GDX files; specifically, how to:
(a) extract data from a GDX file
(b) modify data in a GDX file
(c) translate from the GDX file format to some other file format.
1.2 The purpose of this paper is to provide a few useful tips on working with GDX files.
GAMS and GDX files
1.3 The General Algebraic Modeling System (GAMS) is a high-level modelling system for mathematical programming and optimization. The GAMS software consists of an integrated development environment (IDE), a language compiler, and a stable of integrated high-performance solvers. GAMS is particularly well-suited to formulating and solving complex, large-scale modelling applications such as vSPD, HSS and GEM.
1.4 GDX stands for GAMS data exchange. GDX is a binary file format for use with GAMS. GDX files are a convenient way to:
(a) store the input data for a GAMS-based model and easily read it into GAMS
(b) pass data from one GAMS-based program to another
(c) collect the output from a GAMS-based model for further processing, either with GAMS or some other software.
1.5 While users of EMI tools need no experience with GAMS, some familiarity with GAMS would be useful. For users wishing to learn the basics of GAMS, the guided tour and the GAMS tutorial in chapter 2 of the GAMS Users Guide are sensible places to start. Both of these resources can be found under the GAMS IDE help menu.
GAMS license arrangements
1.6 A GAMS license is required to solve the Authority's GAMS-based models. Contact GAMS directly to purchase a GAMS license. A minimal requirement is a runtime license for the GAMS base module and a GAMS open source solver. Users who already own a license for a solver that is suitable for solving the Authority's models may be able to purchase a GAMS solver link at less cost than a new GAMS solver of the same type. Again, consult GAMS directly for details of all available options.
1.7 Feel free to email us at emi@ea.govt.nz if you wish to learn of our experience with the various solvers required to solve the Authority's models.
[1] See www.gams.com.
2 Using GDX files
2.1 There are many tools for working with GDX files; some of them are listed under the Contributed software section on the GAMS website. GAMS must be installed on your computer in order to use the following tips, but a GAMS license is not required. Six topics on working with GDX files are now discussed:
(a) Write some GAMS code
(b) Use GDXdump to create CSV or GMS files
(c) The py-gdx (Python) utilities
(d) Interfacing GAMS with R
(e) Writing to Excel from the GAMS IDE
(f) Passing data between GAMS programs using merged GDX files
Write some GAMS code
2.2 An obvious way to interact with a GDX file is to write a little GAMS code. For example, the snippet of GAMS code shown below is taken from GEM and accomplishes the following:
(a) Five sets called y, f, r, t, and lb are declared.
(b) Two parameters called i_fuelPrices and i_NrgDemand are then declared. Note that i_fuelPrices has as its domain the sets f and y, while i_NrgDemand is indexed on sets r, y, t and lb.
(c) The GDX file called GEMinputData.gdx is then designated as the GDX file to be read from, and the $load statements tell GAMS to reach into the GDX file and extract the nominated data.
2.3 In GAMS parlance, the $load statements are said to initialise the previously declared symbols using the data read from the GDX file.
* Declare the sets and parameters that will receive data from the GDX file
Sets
  y  'Modelled calendar years'
  f  'Fuels'
  r  'Regions'
  t  'Time periods (within a year)'
  lb 'Load blocks' ;

Parameters
  i_fuelPrices(f,y)     'Fuel prices by fuel type and year, $/GJ'
  i_NrgDemand(r,y,t,lb) 'Load by r, y, t and lb, GWh' ;

* Open the GDX file, then load the sets followed by the parameters
$gdxin "GEMinputData.gdx"
$load y f r t lb
$load i_fuelPrices i_NrgDemand
2.4 In a realistic setting, the program would then continue with statements that make use of the loaded data. A similar syntax is used to write data from a GAMS program to a GDX file. For example, the following statement would write the symbols a, b and c to a GDX file called test.gdx.
execute_unload "test.gdx" a b c ;
2.5 As an aside, it is worth highlighting that in the examples given above, the $load statement is effected during the compile phase whereas the execute_unload statement is effected during the execution phase. Running a GAMS job first compiles the code and then executes it. Both actions, reading from and writing to a GDX file, can be implemented during either phase.
2.6 Additional information on using GAMS to interact with GDX files can be found in the document entitled GAMS GDX facilities and tools, which is available from the help menu of the GAMS IDE (see Help > docs > tools > GDXutils.pdf). Excel users in particular may find the tools GDXXRW, XLSDump and XLSTalk helpful.
Use GDXdump to create CSV or GMS files
2.7 An easy-to-use tool, which is described in GAMS GDX facilities and tools, is GDXdump. GDXdump can be executed from within a GAMS program or, more conveniently, directly from the command line. Among other things, the GDXdump utility can be used to extract a symbol from a GDX file and write it to a CSV file.
2.8 By way of example, consider the parameter noted above called i_fuelPrices from the GDX file called GEMinputData.gdx. The following instruction entered at the command prompt will extract i_fuelPrices and write it to a CSV file called fuelPrices.csv:
gdxdump geminputdata.gdx symb=i_fuelPrices format=csv cdim=y > fuelPrices.csv
2.9 The symb argument denotes which symbol is to be extracted, the format argument specifies that the output be written in CSV format, and the cdim argument (set equal to y for yes) says to use the last dimension in the domain of i_fuelPrices as the columns. The default option, cdim=n, would write i_fuelPrices to the CSV file as a list rather than a table.
2.10 The first few rows and columns of the resulting CSV file look like this:
"Dim1","2012","2013","2014","2015","2016"
"Coal",4.14,4.14,4.13,4.13,4.13
"Lig",1.05,1.07,1.1,1.12,1.14
"Gas",4.95,4.68,4.98,7.28,7.89
2.11 Another useful function of the GDXdump utility is the ability to inspect the contents of a GDX file by dumping out a list of all symbol names along with their dimensions, type and any explanatory text. Executing the command:
gdxdump geminputdata.gdx symbols
2.12 yields something like this on the screen (the complete symbol list has been truncated for presentation purposes):
    Symbol      Dim  Type  Explanatory text
  1 Benmore       1  Set   Benmore substation
  2 coal          1  Set   Coal fuel
  3 cogen         1  Set   Cogeneration technologies
  4 demandGen     1  Set   Demand side technologies as generation
  5 diesel        1  Set   Diesel fuel
  6 e             1  Set   Zones
2.13 Similarly, the following command would direct the output to a file called symbolList.txt rather than the console:
gdxdump geminputdata.gdx symbols > symbolList.txt
2.14 Of course, while the GDXdump tool is useful for getting data out of a GDX file, it can't be used to put data back into a GDX file. But there is a simple enough way to do just that. If GDXdump is executed with no arguments other than the name of the GDX file, it will write the contents of the GDX file (i.e. sets, scalars, parameters, etc.) to standard output formatted as a GAMS program with data statements. A GAMS program by convention has the .gms suffix. The GDX file called GEMinputData.gdx can be written to a file called, say, GEMinputData.gms by issuing the following instruction at the command prompt:
gdxdump geminputdata.gdx > geminputdata.gms
2.15 The resulting file is a legitimate GAMS program that can be edited using the GAMS IDE. Code can then be appended to the end of geminputdata.gms to manipulate the data, and an execute_unload statement can be used to write the result back to a GDX file. For example, if geminputdata.gms was created as just illustrated, the following two lines were added to the end of it, and the file was then submitted as a GAMS job, a GDX file called highFuelPrices.gdx would be created.
i_fuelPrices(f,y) = 5.0 * i_fuelPrices(f,y) ;
execute_unload "highFuelPrices.gdx" i_fuelPrices
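For readers who would rather make this kind of adjustment outside GAMS, the same scaling can be applied to a table-style CSV extract such as the fuelPrices.csv file created in paragraph 2.8. The following is only a minimal Python sketch under stated assumptions: the function name scale_table_csv is hypothetical, the input is assumed to have the layout shown in paragraph 2.10 (a header row followed by one labelled row per fuel), and a short in-memory sample stands in for the file on disk. Whether a particular tool will accept this layout when loading the result back into a GDX file is not something the sketch establishes.

```python
import csv
import io


def scale_table_csv(text, factor):
    """Multiply every numeric cell of a table-style CSV extract by factor.

    Assumes the first row is a header and the first cell of each
    remaining row is a text label (hypothetical helper, not a GAMS tool).
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    out = io.StringIO()
    # QUOTE_NONNUMERIC quotes the labels but leaves the numbers bare.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(header)
    for label, *values in body:
        # Rounding removes floating-point noise introduced by the multiplication.
        scaled = [round(float(value) * factor, 6) for value in values]
        writer.writerow([label] + scaled)
    return out.getvalue()


# A two-year slice of the extract shown in paragraph 2.10.
SAMPLE = (
    '"Dim1","2012","2013"\n'
    '"Coal",4.14,4.14\n'
    '"Gas",4.95,4.68\n'
)

high_prices = scale_table_csv(SAMPLE, 5.0)
print(high_prices)
```

In practice the text would be read from and written back to a file rather than held in a string; the string is used here only to keep the sketch self-contained.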
2.16 Yet another option is to use GDXdump to create geminputdata.gms and append the line multiplying fuel prices by five, as shown above. Then, rather than use the execute_unload statement to write selected (or all) data symbols to a new GDX file, run the appended geminputdata.gms file as a GAMS program with the gdx command line option to create a new GDX file containing all symbols. For example, issuing the following command:
gams geminputdata.gms gdx=newGEMinputData
will yield a GDX file called newGEMinputData.gdx.
The py-gdx (Python) utilities
2.17 A collection of Python utilities called py-gdx is available for manipulating GDX files. Python[2] is an open-source, interpreted, high-level programming language. The py-gdx utilities are available for download from GitHub.[3]
2.18 The py-gdx utilities are designed to be executed from a command line. While the py-gdx package contains several utilities, the two main ones are:
(a) gdx_insert_csv.py – inserts or replaces symbols in the GDX file with data from a CSV file. If required, it can be used to create a new GDX file from scratch.
(b) gdx_extract_csv.py – produces a CSV file of symbol (parameter or variable) values in one or more GDX files. If parameter domains are missing from the GDX file, it guesses them. It can act as if several GDX files were one, or it can compare the same parameters in several GDX files. It can export a selected set of parameters, it can export all parameters that are defined over a selected set of domains, or it can export all parameters and variables from a GDX file. The CSV files produced by gdx_extract_csv.py are readable by gdx_insert_csv.py. Hence, it is quite straightforward to export an entire GDX file to CSV, edit it, and rebuild a revised copy of the GDX file.
2.19 Help on all of the utilities can be seen by typing the name of the utility followed by --help. For example, type the following, where <example> is replaced with the relevant part of the name of the utility of interest:
python gdx_<example>.py --help
[2] https://www.python.org/.
[3] https://github.com/geoffleyland/py-gdx/.
Interfacing GAMS with R
2.20 Users of R,[4] an open-source statistics application, are able to use the GDXRRW package to transfer data back and forth between R and GDX files. A simple one-line statement in R will quickly transfer all or selected symbols from a GDX file to an R data frame, and vice versa. This approach will suit users who prefer to use R as their primary tool for data work.
Writing to Excel from the GAMS IDE
2.21 It is a straightforward matter to write, or export, from a GDX file to an Excel file from within the GAMS IDE. The steps are:
(a) open a GDX file in the GAMS IDE
(b) select or highlight the symbol of interest
(c) right-click the symbol and choose Write followed by Write symbol to Excel file.
2.22 If the option to Write ALL symbols to Excel file is chosen, then the resulting Excel file will contain many worksheets, one per symbol from the GDX file. Data from a GDX file can similarly be written to HTML files or copied to the clipboard.
2.23 While this method is useful for getting data out of a GDX file and into Excel, whereupon it can be edited, it doesn't provide a means of getting the modified data back into the GDX file. However, the GDXXRW tool noted earlier in paragraph 2.6 can be used to do that.
[4] http://www.r-project.org/.
Passing data between GAMS programs using merged GDX files
2.24 It is very common to run a model many times, once for each of many scenarios, and write the results from each model run to a GDX file. The GDX files can then be merged into a single GDX file that can be used as the input into a GAMS-based report writing program. Such a report writer would then generate reports on all scenarios in a single step.
2.25 However, this seemingly simple process can be complicated if the set membership changes with each scenario. For example, consider running vSPD for 30 days in a row. It is easy to imagine that in each trading period in each of the 30 days, the set of nodes or branches or offered units might be different. For report writing purposes, a ‘super set’ is desirable; that is, a set containing the union of all set elements defined over all trading periods and days, with each element listed once. The super set can then be used as the domain for parameters read from the merged GDX file into GAMS (into a report writer, say), ensuring that all data for all scenarios is loaded.
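The idea of a super set can be illustrated outside GAMS. The Python sketch below is only an illustration, not the method used in the accompanying GAMS programs: the function name super_set and the element labels n1 to n4 are hypothetical, chosen to show two runs whose set memberships overlap only partly.

```python
def super_set(*member_lists):
    """Return the union of all elements, each listed once, in first-seen order."""
    seen = {}
    for members in member_lists:
        for element in members:
            # A dict keeps insertion order, so duplicates are dropped
            # without disturbing the order in which elements first appear.
            seen.setdefault(element, None)
    return list(seen)


# Hypothetical node lists from two model runs with overlapping membership.
day1_nodes = ["n1", "n2", "n3"]
day2_nodes = ["n2", "n3", "n4"]

print(super_set(day1_nodes, day2_nodes))  # ['n1', 'n2', 'n3', 'n4']
```

A parameter declared over the combined list has a slot for every element that appears in any run, which is why no data is dropped when the merged results are loaded.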
2.26 A collection of small GAMS programs that demonstrate the worked examples to follow should accompany this report; see combiningSetsFromMultipleGDXFiles.zip.
Worked examples
2.27 The following GAMS programs should be executed in the order in which they are described below to demonstrate how to prepare reports based on results from multiple runs of a GAMS-based model. Comments explaining what is being done at each step are included in each of the programs.
Example1.gms
2.28 This program creates two sets, i and j, and three parameters, a, b and c. The domain of a is set i, the domain of b is set j, and the domain of c is both sets i and j. All five symbols (sets and parameters) are written to a GDX file called ex1_everything.gdx and a subset of symbols is written to a GDX file called ex1.gdx.
2.29 In practical modelling applications, it is helpful to collect all output from a model run in a single GDX file for archive purposes, e.g. ex1_everything.gdx. At some future date, data can be extracted from this file without the need to re-run the model. At a minimum, the archive GDX file should contain all sets and parameters used in the model, all variable levels and all equation marginal values. Note that many symbols in a GAMS program are used only as an intermediate step to create the parameters used in the model; these symbols probably don’t need to be archived.
2.30 One or more additional GDX files should be created to collect the information used to generate reports, e.g. ex1.gdx in the present case. The advantage of this approach is that the report writing process makes use of much smaller files. Similarly, if the files need to be distributed by email or published on the web, it is convenient to be working only with those symbols actually needed. It is nearly always the case in any realistic modelling application that report writing requires only a tiny fraction of all the symbols created to formulate, parameterize and solve the model.
Example2.gms
2.31 This program is very similar to example1.gms; the key difference is that the membership of sets i and j is different, although there is some overlap in the set membership across the two programs. As with example1.gms, example2.gms creates two GDX files: ex2_everything.gdx and ex2.gdx.
combineSets.gms
2.32 This program reads in sets i and j from both ex1.gdx and ex2.gdx. In the first instance, it reassigns the sets to new symbols, ii and jj. It combines the elements from each case to form a new ‘super set’ and then writes these combined super sets to a GDX file called combinedSets.gdx. In the process of writing the GDX file, it changes the set names back to i and j.
mergeGDXfiles.gms
2.33 This program creates and executes a batch file that copies ex1.gdx and ex2.gdx to sc1.gdx and sc2.gdx, respectively, and then merges sc1.gdx with sc2.gdx to create yet another GDX file called mergedResults.gdx.
2.34 This step may appear trivial. However, it is a crucial step in being able to combine results from all scenarios into a single GDX file containing the same number of symbols as each individual scenario GDX file.
2.35 While in the case of just two scenarios this may seem insignificant, it is very powerful when there are many scenarios. Furthermore, every assignment statement in a GAMS report writing program can now be applied to all scenarios at once. In other words, there is no need to execute the report writing code once per scenario and then somehow (Excel vlookup, perhaps, if you’re very patient?) join all the results together into a single table.
2.36 The key to this process is the creation of the scenario set, sc, in mergeGDXfiles.gms. The choice of element labels in this set is deliberate: the first element is sc1 and the second is sc2. These labels are in fact the names given to the GDX files to be merged. The reason for this is that all symbols in the merged GDX file acquire a new domain index, and the elements of that index are taken from the names of the files being merged.
2.37 For example, ex1.gdx (copied to sc1.gdx) and ex2.gdx (copied to sc2.gdx) each contained a symbol called c, which was defined on sets i and j. The merged file, mergedResults.gdx, also contains the symbol called c. But note how it is defined not only on sets i and j, but also on sc.
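The effect of the merge on a symbol such as c can be pictured with a small Python sketch. It does not read or write GDX files; it only mimics the change in shape, using dictionaries keyed by index tuples. The function name merge_scenarios, the element labels and the values are all hypothetical, and placing the scenario label first in the key is an assumption made for the illustration.

```python
def merge_scenarios(scenarios):
    """Combine per-scenario records into one table with an extra scenario index.

    scenarios maps a file stem (e.g. 'sc1') to {(i, j): value}.
    The result maps (stem, i, j) to value.
    """
    merged = {}
    for stem, records in scenarios.items():
        for key, value in records.items():
            # Prepend the file stem so every record carries its scenario label.
            merged[(stem,) + key] = value
    return merged


# Hypothetical contents of symbol c in sc1.gdx and sc2.gdx.
c_sc1 = {("i1", "j1"): 1.0, ("i2", "j1"): 2.0}
c_sc2 = {("i2", "j1"): 3.0, ("i3", "j2"): 4.0}

merged_c = merge_scenarios({"sc1": c_sc1, "sc2": c_sc2})
for key in sorted(merged_c):
    print(key, merged_c[key])
```

Note that the record for (i2, j1) appears once per scenario in the merged table, which is what allows a single report-writing statement to cover all scenarios.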
Example3.gms
2.38 This program reads in the relevant data from combinedSets.gdx and mergedResults.gdx. It can be used as the beginning of a report writing program.
A more sophisticated approach
2.39 All of the above can be made more sophisticated if the model is solved inside a loop on the scenario set. The GAMS put_utility statement can be used to write the GDX files each time around the loop, and the GDX files would take their names from the text labels (.tl) of the elements assigned to set sc. The syntax would be something like this:
Set sc 'Scenarios' / sc1, sc2 / ;
* Add further scenario labels (sc3, sc4, etc.) to set sc as required
file dummy ;
loop(sc,
* ... code to assign parameter values for this solve of the model
  Solve vSPD minimising TOTALCOSTS using lp ;
* ... more code (maybe) to do post-solve computations on model output
* Direct the next execute_unload to a GDX file named after the current scenario
  put dummy ;
  put_utility 'gdxout' / sc.tl ;
* Unload i, j and c, or whatever needs to be merged in the GDX files
  execute_unload i j c ;
) ;
* Merge the scenario GDX files into a single GDX file
execute 'gdxmerge sc*.gdx output=mergedResults.gdx' ;
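Where the scenario runs are driven from a script rather than from within GAMS, the merge command can be assembled from the same scenario labels. The Python sketch below only builds and prints the command; the function name gdxmerge_command is hypothetical, and explicit file names are listed in place of the sc*.gdx wildcard used above. Running the command would additionally require GAMS to be installed and the scenario GDX files to exist.

```python
def gdxmerge_command(scenario_labels, output="mergedResults.gdx"):
    """Build the gdxmerge argument list for one GDX file per scenario label."""
    # Each scenario's results are assumed to sit in a file named <label>.gdx.
    files = [f"{label}.gdx" for label in scenario_labels]
    return ["gdxmerge", *files, f"output={output}"]


command = gdxmerge_command(["sc1", "sc2"])
print(" ".join(command))  # gdxmerge sc1.gdx sc2.gdx output=mergedResults.gdx

# To run it, the list could be passed to subprocess.run(command, check=True).
```

Keeping the labels in one list means the file names, and hence the elements of the scenario index in the merged file, stay in step with the set of scenarios actually run.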
Glossary of abbreviations and terms
Authority   Electricity Authority
CSV         Comma-separated values; a file type that is nothing more than a comma-delimited text file with a .csv suffix
GAMS        General Algebraic Modeling System
GDX         GAMS data exchange - a binary file format with a .gdx suffix
GEM         Generation expansion model
GMS         A file containing GAMS code - a text file with a .gms suffix
HSS         Hydro supply security test
IDE         Integrated development environment
SPD         Scheduling, Pricing and Dispatch
vSPD        Vectorised Scheduling, Pricing and Dispatch