The Water Programming blog continues to expand collaboratively through contributors’ learning experiences and their willingness to share their knowledge here. It now covers a wide variety of topics, ranging from quick programming tips to comprehensive literature reviews pertinent to water resources research and multi-objective optimization. This post intends to provide guidance to new, and probably also current, users by bringing to light what’s available in the blog and by developing a categorization of topics.
This first post will cover:
1. Programming languages and IDEs
2. Frameworks of optimization, sensitivity analysis and decision support
3. The Borg MOEA
Part II will focus on version control, spatial data and maps, conceptual posts and literature reviews. Finally, Part III will cover visualization and figure editing, LaTeX and literature management, tutorials and miscellaneous research and programming tricks.
*Note to my fellow bloggers: Feel free to disagree and make suggestions on the categorization of your posts. Your thoughts on making the blog easier to navigate are also very welcome. For current contributors, it’s always suggested to tag and categorize your post; you can also follow the guidelines in Some WordPress Tips to keep our blog better organized. Also, if you see a 💡, it means that a blog post idea has been identified.
If you are new to the group and would like to know what software you need to get started with research, Joe summed it up pretty well in his New Windows install? Here’s how to get everything set up post, where he points out all the installations that you should have on your desktop. Additionally, you can find some guidance in Software to Install on Personal Computers and Software to Install on Personal Computers – For Mac Users!. You may also want to check out What do you need to learn? if you are entering the group. These posts are subject to update 💡 but they are a good starting point.
1. Programming languages and Integrated Development Environments (IDEs)
Dave Hadka’s Programming Language Overview provides a summary of key differences between C, C++, Python and Java. The programming tips found in the blog cover Python, C, C++, R and Matlab. There are also some specific instances where Java is used, which will be discussed in section 2. I’ll give some guidance on how to get started with each of these programming languages and point out some useful resources in the blog.
Python is a very popular programming language in our group, so there is plenty of guidance and resources available in the blog. Download is available here. Some online tutorials that I really recommend to get you up to speed with Python are Learn Python the Hard Way, Python for Everybody and Codecademy. Additionally, Stack Overflow is a useful resource for specific tasks. The Python resources available in our blog are outlined as follows:
Data analysis and organization
Using Python IDEs
Using an integrated development environment (IDE) can ease code development and make the debugging process easier. Tom has done a good amount of development in PyCharm, so he has written a sequence of posts that provide guidance on how to take better advantage of PyCharm:
The main plotting library for Python is matplotlib. Some of the examples found in the blog provide guidance on importing and using the library. Matt put together a GitHub repository with several Matlab and Matplotlib Plotting Examples. You can also find guidance on generating more specialized plots:
Miscellaneous Python tips and tricks
Other Python applications that my fellow bloggers have found useful include machine learning (Basic Machine Learning in Python with Scikit-learn), root finding (Solving systems of equations: Root finding in MATLAB, R, Python and C++) and using Python’s template class.
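As a quick illustration of the template class mentioned above, here is a minimal sketch using `string.Template` from the Python standard library. It is not taken from the linked post. The script contents, the `./run_model` executable and the field names (`job_name`, `nodes`, `seed`) are hypothetical, chosen only to show one plausible use: generating several job scripts from a single template.

```python
from string import Template

# Hypothetical job-script template. The placeholder names ($job_name,
# $nodes, $seed) and the ./run_model executable are made up for this sketch.
script_template = Template(
    "#!/bin/bash\n"
    "#PBS -N $job_name\n"
    "#PBS -l nodes=$nodes\n"
    "./run_model --seed $seed\n"
)

# Fill in the placeholders once per random seed.
scripts = [
    script_template.substitute(job_name=f"run_{seed}", nodes=1, seed=seed)
    for seed in range(1, 4)
]

print(scripts[0])
```

`substitute` raises a `KeyError` if a placeholder is left unfilled, which helps catch typos early. `safe_substitute` leaves unknown placeholders untouched instead.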
Matlab, with its powerful toolkit, easy-to-use IDE and high-level language, can be used for quick development as long as you are not concerned about speed. A major disadvantage of this software is that it is not free... fortunately I have a generous boss paying for it. Here are examples of Matlab applications available in the blog:
I have heard C++ receive extremely unflattering nicknames lately. It is a low-level language, which means that you need to worry about everything, even memory allocation. The fact is, though, that it is extremely fast and powerful, and it is widely used in the group for modeling, simulation and optimization tasks that would take forever in other languages.
If you are getting started with C++, there are some online tutorials, and you may want to check out the following material available in the blog:
Here is some training material that Joe put together:
If you are developing code in C++, it is probably a good idea to install an IDE. I recently started using CLion, following Bernardo’s and Dave’s recommendation, and I am not displeased with it. Here are other posts available within this topic:
If you are looking for sample code of commonly used processes in C++, such as defining vectors and arrays, generating output files and timing functions, here are some examples:
R is another free, open source environment widely used for statistics. Joe recommends a reading in his Programming language R is gaining prominence in the scientific community post. Downloads are available here. If you happen to use an R package for your research, here’s some guidance on How to cite packages in R. R also supports a very nice graphics package, and the following posts provide plotting examples:
1.5. Command line/ Linux:
Getting familiar with the command line and the Linux environment is essential for running many of the examples and tutorials available in the blog. Please check out Terminal basics for the truly newbies if you want an introduction to the terminal and its requirements. Also take a look at Using gdb and the notes from the book “Beginning Linux Programming”, and check out some useful commands:
2. Frameworks for optimization, sensitivity analysis, and decision support
We use a variety of free, open source libraries to perform commonly used analyses in our research. Most of the libraries that I outline here were developed by our very own contributors.
I have personally used the MOEAFramework for most of my research. It has great functionality and speed. It is an open source Java library that supports several multi-objective evolutionary algorithms (MOEAs) and provides tools to statistically test their performance. It also has other powerful capabilities for sensitivity and data analysis. Download and documentation material are available here. In addition to the documentation and examples provided on the MOEAFramework site, other useful resources and guidance can be found in the following posts:
2.2. Project Platypus
This is the newest Python framework developed by Dave Hadka. It supports a collection of libraries for optimization, sensitivity analysis, data analysis and decision making. It’s free to download from the Project Platypus GitHub repository. The repository comes with its own documentation and examples. We are barely beginning to document our experiences with this platform 💡, but it is very intuitive and user friendly. Here is the documentation available in the blog so far:
This is an open source library in R for many-objective robust decision making (MORDM). For more details and documentation on both MORDM and the use of the library, check out the following post:
SALib is a Python library developed by Jon Herman that supports commonly used methods for sensitivity analysis. It is available here. Aside from the documentation available in the GitHub repository, you can also find guidance on some of the available methods in the following posts:
2.5. Pareto sorting function in python (pareto.py)
This is a non-dominated sorting function for multi-objective problems, written in Python and available in Matt’s GitHub repository. You can find more information about it in the following posts:
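To make the idea of non-dominated sorting concrete, here is a minimal sketch in plain Python. It is not the code from pareto.py and makes no claim about how that script is implemented. It assumes all objectives are minimized and uses a simple pairwise comparison, which is fine for small sets but slow for large ones. The function names and the sample points are made up for illustration.

```python
def dominates(a, b):
    """Return True if solution a dominates solution b (minimization).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(solutions):
    """Return the solutions that no other solution dominates."""
    front = []
    for candidate in solutions:
        if any(dominates(other, candidate) for other in solutions):
            continue  # something else is better or equal everywhere
        front.append(candidate)
    return front


# Hypothetical two-objective values (e.g. cost and deficit, both minimized).
points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
pareto_front = nondominated(points)
print(pareto_front)
```

Here (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives, and (5.0, 5.0) is dropped because (1.0, 5.0) matches it in the second objective and beats it in the first.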
3. Borg MOEA
The Borg Multi-objective Evolutionary Algorithm (MOEA), developed by Dave Hadka and Pat Reed, is widely used in our group due to its ability to tackle complex many-objective problems. We have plenty of documentation and references in our blog so you can get familiar with it.
3.1. Basic Implementation
You can find a brief introduction and basic use in Basic Borg MOEA use for the truly newbies (Part 1/2) and (Part 2/2). If you want to link your own simulation model to the optimization algorithm, you may want to check: Compiling, running, and linking a simulation model to Borg: LRGV Example. Here are other Borg-related posts in the blog:
3.2. Borg MOEA Wrappers
There are Borg MOEA wrappers available for a number of languages. Currently the Python, Matlab and Perl wrappers are documented in the blog. I believe an updated version of the Borg Matlab wrapper for OSX documentation is required at the moment 💡.
4. High performance computing (HPC)
With HPC we can handle and analyse massive amounts of data at high speed. Tasks that would normally take months can be done in days or even minutes, which helps us tackle very complex problems. In addition, here are some Thoughts on using models with a long evaluation time within a Parallel MOEA framework from Joe.
In the group we have a healthy availability of HPC resources; however, there are some logistics involved when working with computing clusters. Luckily, most of our contributors have experience using HPC and have documented it in the blog. Also, I am currently using the MobaXterm interface to facilitate file transfer between my local and remote directories; it also makes it easy to navigate and edit files in your remote directory. It is used by our collaborators at Politecnico di Milano, who recommended it to Julie, who then recommended it to me. Moving on, here are some practical resources for working with remote clusters:
4.1. Getting started with clusters and key commands
4.2. Submission scripts in Bash
4.3. Making bash sessions more enjoyable
4.4. Portable Batch System (PBS)
4.5. Python parallelization and speedup
4.7. File transfer