Python’s template class

Many analyses require the same model to be run many times, but with different inputs.  For instance a Sobol sensitivity analysis requires thousands (or millions) of model runs corresponding to some strategic sampling of the factor space.  Depending on how complicated your model is, facilitating hundreds or thousands of runs may or may not be straightforward.  Some models require a unique configuration file, so performing a Sobol analysis is not as simple as changing a vector of numbers passed to an executable.

A very simple solution suggested by Jon Herman in an earlier post is to use Python string templates.  It is such a handy tool, I thought it deserved its own post.  We’ll use Python’s string module’s Template class.  The Template class has two methods: substitute and safe_substitute.  The difference between substitute and safe_substitute is that substitute will throw an exception if there is some problem filling the template, where as safe_substitute will not.  These two methods work essentially as standard $-based substitutions in Python, but rather than altering a single string, we can alter an entire document, and can then save it with a unique name.

Let’s consider a simple example first where we modify a single string:

from string import Template
s = Template('$who is from $where')

d = {}
d['who'] = 'Bill'
d['where'] = 'Boston'

p = s.substitute(d)

Which returns the string Bill is from Boston…lucky Bill.  Now we can get a bit fancier with lists of people and places:

from string import Template
s = Template('$who is from $where')

people = ['Bill','Jim','Jack']
places = ['Boston','London','LA']

p = {}
cnt = int(0)
for person in people:
 for place in places:
  d = {}
  d['who'] = person
  d['where'] = place
  p[cnt] = s.substitute(d)
  cnt = cnt+1

Which returns a p as a dictionary of every combination of people and places:

Bill is from Boston
Bill is from London
Bill is from LA
Jim is from Boston
Jim is from London
Jim is from LA
Jack is from Boston
Jack is from London
Jack is from LA

Of course this is a silly example, but this sort of exercise proved really useful for some recent factorial experiments where we wanted to test a model performance for every combination of input factors (specified by filename strings).

Getting a bit more complex, let’s consider a long configuration file needed to run your model.  For example GCAM, an integrated assessment model I’ve previously discussed, uses a configuration xml file that’s about 100 lines long. We’ll consider a pared down version:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
   <Files>
      <Value name="xmlInputFileName">../input/gcam-data-system/xml/modeltime-xml/modeltime.xml</Value>
      <Value name="BatchFileName">batch_ag.xml</Value>
      <Value name="policy-target-file">../input/policy/forcing_target_4p5.xml</Value>
      <Value name="xmldb-location">../output/database_basexdb</Value>
   </Files>
   <ScenarioComponents>
      <Value name = "climate">../input/climate/magicc.xml</Value>
   </ScenarioComponents>
</Configuration>

Now, suppose we want to vary the cost of solar power inside the model over a number of levels, and we want each model run to print to a unique output directory.  Our first step is to make a template xml file with a $-place holder where we want to vary the configuration file:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
   <Files>
      <Value name="xmlInputFileName">../input/gcam-data-system/xml/modeltime-xml/modeltime.xml</Value>
      <Value name="BatchFileName">batch_ag.xml</Value>
      <Value name="policy-target-file">../input/policy/forcing_target_4p5.xml</Value>
      <Value name="xmldb-location">../output/database_basexdb_$RN&</Value>
   </Files>
   <ScenarioComponents>
      <Value name = "climate">../input/climate/magicc.xml</Value>
      <!-- SOLAR -->
      $SOLAR
   </ScenarioComponents>
</Configuration>

We can utilize the template xml file using Python’s template class as follows:

with open(template_name,'r') as T:
 template = Template(T.read())
SOLAR_1 = ['<Value name="solar">../input/gcam-data-system/xml/energy-xml/solar_low.xml</Value>']
SOLAR_2 = ['']
SOLAR_3 = ['<Value name="solar">../input/gcam-data-system/xml/energy-xml/solar_adv.xml</Value>']
SOLAR = [SOLAR_1,SOLAR_2,SOLAR_3]
for i in range(3):
   d = {}
   d['SOLAR']=SOLAR[i]
   d['RN']=str(i)
   S1 = template.safe_substitute(d)
   with open('./configuration_' + str(i) + '.xml','w') as f1:
   f1.write(S1)

Here we are looping over experimental particles, defined by a unique setting of the solar power level in our experimental design.  For each particle a GCAM, the solar level and run number are substituted in (see S1), and S1 is written to a unique XML file.  If we open configuration_0.xml we get see that the substitution has worked!

<?xml version="1.0" encoding="UTF-8">
<Configuration>
   <Files>
      <Value name="xmlInputFileName">../input/gcam-data-system/xml/modeltime-xml/modeltime.xml</Value>
      <Value name="BatchFileName">batch_ag.xml</Value>
      <Value name="policy-target-file">../input/policy/forcing_target_4p5.xml</Value>
      <Value name="xmldb-location">../output/database_basexdb_0</Value>
   </Files>
   <ScenarioComponents>
      <Value name = "climate">../input/climate/magicc.xml</Value>
<!-- SOLAR -->
<Value name="solar">../input/gcam-data-system/xml/energy-xml/solar_low.xml</Value>
</ScenarioComponents>
</Configuration>

Of course this is a very simple example, but it has proven incredibly useful in our ongoing work.

That’s all for now!

Advertisements

3 thoughts on “Python’s template class

  1. Nice post. I’ve been (slowly) working on a Python project to create a task-based model execution library that supports this functionality: https://github.com/Project-Platypus/Executioner. For example, in the following it creates a temp folder, copies the files to the temp folder, performs keyword substitution, executes the program, and reads a value from an XML file (using XPath expressions):

    with Executioner() as executioner:
    executioner.add(CreateTempDir())
    executioner.add(Copy(from=”~/ModelTemplate”))
    executioner.add(Substitute())
    executioner.add(Execute(“model.exe -i config.xml”))
    executioner.add(ParseXML(“output.xml”).get(“.//value[@name=’y1′]/text()”, “y1”, float))
    executioner.add(DeleteTempDir())

    executioner.execute({ “field1” : 1.0, “field2” : 3.14 })

  2. Pingback: Water Programming Blog Guide (Part I) – Water Programming: A Collaborative Research Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s