Enhance your (Windows) remote terminal experience with MobaXterm

Jazmin and Julie recently introduced me to a helpful program for Windows called “MobaXterm” that has significantly sped up my workflow when running remotely on the Cube (our cluster here at Cornell). MobaXterm bills itself as an “all in one” toolbox for remote computing. The program’s interface includes a terminal window as well as a graphical SFTP browser. You can link the terminal to the SFTP browser so that as you move through folders on the terminal the browser follows you. The SFTP browser allows you to view and edit files using your text editor of choice on your Windows desktop, a feature that I find quite helpful for making quick edits to shell scripts or pieces of code as I go.

A screenshot of the MobaXterm interface. The graphical SFTP browser is on the left, while the terminal is on the right (note the checked box in the center of the left panel that links the browser to the terminal window).

 

You can set up a remote Cube session using MobaXterm with the following steps:

  1. Download MobaXterm using this link
  2.  Follow the installation instructions
  3. Open MobaXterm and select the “Session” icon in the upper left corner.
  4. In the session popup window, select a new SSH session in the upper left, enter “thecube.cac.cornell.edu” as the name of the remote host, and enter your username.
  5. When the session opens, check the box below the SFTP browser on the left to link the browser to your terminal
  6. Run your stuff!

Note that for a Linux system, you can simply link your file browser window to your terminal window and get the same functionality as MobaXterm. MobaXterm is not available for Mac, but Cyberduck and FileZilla are decent alternatives. An alternative graphical SFTP browser for Windows is WinSCP, though I prefer MobaXterm because of its linked terminal/SFTP interface.

For those new to remote computing, ssh or UNIX commands in general, I’d recommend checking out the following posts to get familiar with running on a remote cluster:

 

 

 

Globus Connect for Transferring Files Between Clusters and Your Computer

I recently learned about a service called Globus Online that allows you to easily transfer files to the cluster. It’s similar to WinSCP or SSH, but the transfers can happen in the background and are resumed if they are interrupted. It is supported by the University of Colorado: https://www.rc.colorado.edu/filetransfer and the NSF XSEDE machines: https://www.xsede.org/globus-online. Also, courtesy of Jon Herman, a note about Blue Waters: there is an endpoint called ncsa#NearLine where you can push your data for long-term storage (to avoid scratch purges). However, NearLine has a per-user quota of 5TB, so if you find yourself mysteriously unable to transfer any more files, you’ll know why.

To get started, first create a Globus Online account. Then, you’ll need to create “endpoints” on your account. The obvious endpoint is, say, the cluster. The University of Colorado, for example, has instructions on how to add their cluster to your account. Then, you need to make your own computer an endpoint! To do this, click Manage Endpoints, then click “Add Globus Connect.” Give your computer a name, and it will generate a unique key that you can then enter in the desktop application for the service. Download the program for Mac, Unix, or Windows. The cool thing is you can do this on all your computers. For example, I have a computer called MacBookAir, running OS X, and another one called MyWindows8 or something like that, running Windows.

File transfers are then initiated as usual, only you’re using a web interface instead of a standalone program.

As usual, feel free to leave questions in the comments below.

Connecting to an iPython HTML Notebook on the Cluster Using an SSH Tunnel

Magic

I didn’t have the time or inclination to try to set up the iPython HTML notebook on the conference room computer for yesterday’s demo, but I really wanted to use the HTML notebook. What to do?

Magic.

Magic in this case means running the iPython HTML notebook on the cluster, forwarding the HTTP port that the HTML notebook uses, and displaying the session in a web browser running locally. In the rest of this post, I’ll explain each of the moving parts.

iPython HTML Notebook on the Cluster

The ipython that comes for free on the cluster doesn’t support the HTML notebook because the python/2.7 module doesn’t have tornado or pyzmq. On the plus side, you do have easy_install, so setting up these dependencies isn’t too hard.

  1. Make a directory for your personal Python packages:
    mkdir /gpfs/home/asdf1234/local
  2. In your .bashrc, add
    export PYTHONPATH=$HOME/local/lib/python2.7/site-packages
  3. Install the two missing packages into that directory:
    python -m easy_install --prefix /gpfs/home/asdf1234/local tornado
    python -m easy_install --prefix /gpfs/home/asdf1234/local pyzmq

If you have a local X server, you can check to see if this works:

ssh -Y asdf1234@cluster 
ipython notebook --pylab=inline

Firefox should pop up with the HTML notebook. It’s perfectly usable like this, but I also didn’t want to set up an X server on the conference room computer. This leads us to…

Forwarding Port 8888

By default, the HTML notebook serves HTTP on port 8888. If you’re sitting in front of the computer, you get to port 8888 by using the loopback address 127.0.0.1:8888. 127.0.0.1 is only available locally. But using SSH port forwarding, we can connect to 127.0.0.1:8888 from a remote machine.

Here’s how you do that with a command-line ssh client:

ssh -L8888:127.0.0.1:8888 asdf1234@cluster
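The three fields after -L are easy to mix up. As a minimal sketch (the function name tunnel_cmd and its argument order are made up for illustration and are not part of ssh), the following only prints the forwarding command so that each piece is labeled; it does not open a connection.

```shell
# Hypothetical helper: prints (does not run) an ssh port-forwarding command.
#   $1 = port to open on the machine you are sitting at
#   $2 = port the notebook listens on, as seen from the cluster
#   $3 = user@host for the cluster login
tunnel_cmd() {
    # General form:  -L local_port:destination_host:destination_port
    # The 127.0.0.1 in the middle is resolved on the cluster side of the
    # connection, so it means "the cluster itself".
    echo "ssh -L $1:127.0.0.1:$2 $3"
}

# Same command as above (asdf1234@cluster is the placeholder login).
tunnel_cmd 8888 8888 asdf1234@cluster

# If port 8888 is already in use on the local machine, a different local
# port can be chosen; the browser would then point at http://127.0.0.1:9999
tunnel_cmd 9999 8888 asdf1234@cluster
```

The second call is the only variation on what the post does: the local port and the remote port don't have to match.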

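Here’s how you do it with PuTTY. In the PuTTY configuration window, go to Connection > SSH > Tunnels, then enter 8888 as the source port and 127.0.0.1:8888 as the destination:

[Screenshot: PuTTY tunnel settings]

Click “Add.” You should see the new tunnel listed under the forwarded ports:

[Screenshot: PuTTY forwarded ports list after clicking “Add”]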

Now open your connection and log in to the remote machine. Once there, cd to the directory where your data is and type

ipython notebook --pylab=inline

If you’re not using X forwarding, this will open up the elinks text browser, which is woefully incapable of handling the HTML notebook. Fortunately that doesn’t sink the demo. You’ll see something like this:

[Screenshot: the elinks text browser in the terminal]

This means that the iPython HTML notebook is up and running. If you actually want to use it, however, you need a better browser. Fortunately, we opened up SSH with a tunnel…

Open the Notebook in Your Browser

This was the one part of the demo that wasn’t under my control. You need a modern web browser, and I just had to hope that someone was keeping the conference room computer more or less up to date. My fallback plan was to use a text-mode ipython over ssh, but the notebook is much more fun! Fortunately for me, the computer had Firefox 14.

In your URL bar, type in

http://127.0.0.1:8888

If everything works, you’ll see this:

[Screenshot: the iPython notebook dashboard]

And you’re off to the races!

What Just Happened?

I said earlier that 127.0.0.1 is a special IP address that’s only reachable locally, i.e. on the machine you’re sitting in front of. Port 8888 on 127.0.0.1 is where ipython serves its HTML notebook, so you’d think the idea of using the HTML notebook over the network isn’t going to fly.

When you log in through ssh, however, it’s as if you are actually sitting in front of the computer you’re connected to. Every program you run, runs on that computer. Port forwarding takes this a step further: traffic sent to port 8888 on your local machine travels through the SSH connection and is delivered to the remote computer’s port 8888 as if it had originated on the remote computer itself.

The Cluster and Basic UNIX Commands

In this tutorial, you will log onto a computing cluster and get comfortable with some basic UNIX commands.  This post is about 2 years old at this point! It was originally written by Jon Herman and edited by Joe Kasprzyk, most recently on 9/27/2013.

One comment before we get started. At first, this post was written to help people get started on the Penn State cluster. Now, folks in our research groups may be using computers at Cornell, the University of Colorado, through the NSF XSEDE system, or in other places! But generally the steps are about the same.

What is a cluster anyway?

When you use Excel or Matlab on your own laptop, all the calculations are being done right on your computer’s processor. On the internet, though, we’re used to having calculations done remotely “in the cloud” on a server somewhere. For example when you upload a video to YouTube, the conversion from your video format to Flash isn’t done on your laptop, it’s done somewhere in Iowa.

Using a computing cluster is the same idea. It may be fine to run a single MOEA run on your own laptop, but what happens when you want to run 50 random trials? Or the function evaluation time is really long? Plus, your laptop may not be that powerful and you may want to turn it off and go home, or someone might spill something on it, etc.

So using a computing cluster takes all the calculations and performs them somewhere else — on the cluster! So the idea is that you upload your files to a server, and then you can actually interact with the computer remotely, submit the computing jobs, and then download the results. For example, you can compile your code on the cluster (on the initial computer that you connect to, called the login node), and then submit a remote job that gets performed on the compute nodes.

You’ll need to interact with the cluster in two ways.

  1. Enter commands on the command line.  Use this to submit jobs, run programs, process files, etc. There are several software packages available to do this.  If you’re on a Mac or Linux machine, you should just be able to use the terminal.  On Windows there are several options.  The first is SSH Secure Shell, which can be downloaded from the Penn State ITS center (if you’re at Penn State).  The second, which a lot of members in the group use, is Cygwin, which installs many UNIX-like programs integrated within the Windows environment.  The third is one of several standalone terminal programs such as PuTTY. On a Mac, I’ve seen people use a program called Fugu. The workflow is similar across most of these programs.

Each of these options uses SSH to connect to the cluster. SSH stands for “secure shell” and provides remote access to the command line interface on the clusters. You will first need to define a connection—if you use the “Quick Connect” option, you will need to re-enter the connection information every time. To simplify future access, use the Profiles -> Add Profile option, then use Profiles -> Edit Profile to define the profile you just created.

A remote connection requires a host name, a user name, and a password. The host name will depend on which cluster you want to access.
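If you connect from a terminal with OpenSSH rather than a graphical client, the rough equivalent of a saved profile is a Host entry in ~/.ssh/config. The sketch below writes a sample entry to a scratch file so nothing in your real configuration is touched. The host name is the Cyberstar address listed just below, the user name asdf1234 is a placeholder, and the nickname "cyberstar" is an arbitrary choice.

```shell
# Write a sample profile to a scratch file, not the real ~/.ssh/config.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
# "cyberstar" is a nickname of your choosing; HostName is the real address.
Host cyberstar
    HostName cyberstar.psu.edu
    User asdf1234
EOF

# Print the entry so it can be reviewed before copying it anywhere.
cat "$config_file"

# Once the same lines are in ~/.ssh/config, connecting would just be:
#   ssh cyberstar
```

The password is still requested at login; the entry only saves retyping the host name and user name.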

Penn State: Right now, the largest and most powerful cluster at Penn State is Cyberstar (hostname: cyberstar.psu.edu). You can also access smaller clusters which may be less crowded, such as Lion-XO (lionxo.aset.psu.edu). A detailed list of available systems and their specifications is available here.

University of Colorado: We have access to a computer called Janus. For more information, click here. Researchers in Joe Kasprzyk’s group have access to several computing allocations; for more information, email joseph.kasprzyk “at” colorado.edu.

Cornell: The Reed group cluster, “TheCube”, is currently coming online. There may be an additional future post about this once it’s operational. In the meantime, contact Jon Herman at jdh366 “at” cornell.edu for more information.

When you connect, you will be prompted for your password (this is the same as your university logon). If you have already been approved for access to your chosen system, your login should be almost immediate.

Congratulations, you are now on the cluster! You should see a prompt like [username@hostname ~]$ with a blinking cursor to the right. The ~ symbol means that you are currently in your home directory. This is the UNIX command line interface—Windows also has a command line, but we rarely use it because its graphical interface is so convenient.

Let’s try out some basic commands. Commands allow you to move around your file system and move, copy, edit, and run your files. Some good ones to know starting out:

  • ls (List contents of current directory)
  • pwd (Print working directory)
  • cd newFolder (Change directory to newFolder)
  • cp filename newLocation (Copy a file to a new location)
  • mv filename newLocation (Move a file to a new location)
  • rm filename (Delete a file. This is permanent, use with care.)
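  • tar -cvf zippedFolder.tar oldFolder (Bundle a directory into a single archive file. Tar is the UNIX counterpart of zip, though plain tar does not compress; add the z flag, as in tar -czvf zippedFolder.tar.gz oldFolder, to gzip the archive as well.)
  • tar -xvf zippedFolder.tar (Unpack a tar archive. The contents are restored under their original name, here oldFolder/.)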

When moving around, remember that UNIX uses the forward slash (‘/’) to denote directories rather than the Windows backslash (‘\’). The current directory is denoted as a dot (‘.’) and the parent directory is denoted by two dots (‘..’).
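To see how these commands fit together, here is a short practice session you could paste into a terminal. It is only a sketch: it works inside a throwaway directory created with mktemp, and the file and folder names (notes.txt, renamed.txt, oldFolder, newFolder) are made up for the example.

```shell
# Work in a throwaway directory so nothing important is touched.
practice_dir=$(mktemp -d)
cd "$practice_dir"
pwd                                  # print the working directory

mkdir oldFolder newFolder            # make two directories
echo "hello cluster" > oldFolder/notes.txt

ls oldFolder                         # list a directory's contents
cp oldFolder/notes.txt newFolder/    # copy: the original stays put
mv newFolder/notes.txt newFolder/renamed.txt   # mv also renames

cd newFolder                         # step into newFolder
ls ..                                # '..' is the parent directory
cp ../oldFolder/notes.txt .          # '.' is the current directory
cd ..                                # back up one level

tar -cvf zippedFolder.tar oldFolder  # bundle oldFolder into one file
rm -r oldFolder                      # -r is needed to delete a directory
tar -xvf zippedFolder.tar            # unpacking recreates oldFolder/
ls                                   # newFolder, oldFolder, zippedFolder.tar
```

Deleting oldFolder and then restoring it from the archive is a safe way to convince yourself that the tar file really holds everything.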

From your home directory (~), the two main directories are your work and scratch folders. These are both associated with your username. Your work folder is where you will store and run most of your programs. The scratch folder offers unlimited file storage and is sometimes useful for holding large result files.

  2. Transferring files between your computer and the cluster.  The first choice is the sftp command, covered in our post about using the cluster on a Mac.  The second choice is using a program called WinSCP, which provides a graphical drag-and-drop interface for transferring files.  The instructions to do so are below.
  • Open WinSCP and connect to any of the clusters, similar to how you did with the SSH client. Note that your home directory is accessible from any of the clusters, so it doesn’t really matter which one you use with WinSCP.
  • The transfer protocol on the first screen can remain at the default, “sftp”, with the default port being 22.  Then simply type your user id and the remote host like you did with the SSH client.
  • If you are prompted to choose an interface style, use the “Commander” interface. This shows you both the local and remote directories at the same time.
  • The right-hand window will show your file system on the cluster. Use WinSCP to drag and drop files between local folders and your cluster folders. You can also drag and drop to/from your regular Windows folders.
  • WinSCP also has a simple (but useful) text editor. If you have a text file on the cluster, right-click it in WinSCP and select “Edit”. A window will open that allows you to edit the file directly on the remote machine.

This should get you started with using the cluster. You will use the SSH client for compiling and running programs, and WinSCP to transfer files with your local machine.

Software to Install on Personal Computers – For Mac Users!

It turns out there are some pros and cons to using a Mac for these activities.  Here are some updates on how to work efficiently on a Mac:

  • You don’t need Cygwin at all since X11/XWindow is included in the operating system already!
  • You don’t need something like WinSCP, since you can use SFTP to transfer files from a local computer to the remote computer.  Here’s how:
    1. Open a terminal window on your local computer.
    2. Use cd and ls to get to the local directory on your hard drive where you have files you want to send to the remote computer.
    3. Type sftp user@host to connect to the remote computer.  Enter your password.
    4. Use cd and ls to get to the directory on the remote system where you want to put the files from your local system.
    5. Type put filename to transfer a file to the remote system. You can also use mput *abc to transfer multiple files (in this example, everything ending in abc). The asterisk is a wildcard; it matches any character, any number of times.
  • If you want to transfer in the other direction, i.e. from the remote machine to the local machine, use the get and mget commands, which work just like put and mput.
  • Summary of useful sftp commands:
    • get filename Copy a file from the remote computer to the local computer.
    • mget filenames Copy several files from the remote computer to the local computer. Can use wildcards.
    • put filename Copy a file from the local computer to the remote computer.  Use -r if you want to upload a whole directory.  Note that the command can’t create a remote directory that doesn’t already exist, so first use mkdir on the remote computer to make a directory with the same name as the one you want to copy.
    • mput filenames Copy several files from the local computer to the remote computer. Can use wildcards.
    • cd path Change directories on the remote computer.
    • ls List the files in the current directory on the remote computer.
    • pwd Display the path of the current directory on the remote computer.
    • lpwd Display the path of the current directory on the local computer.
    • lcd Change directories on the local computer.
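If you find yourself typing the same sequence of sftp commands repeatedly, they can be saved in a file and passed to sftp with its -b (batch) option. The sketch below only writes and prints such a file; it does not connect anywhere. The directory and file names (results, work, summary.txt, run.log) are made up for the example, and user@host is a placeholder. One assumption to flag: batch mode cannot prompt for a password, so it relies on a non-interactive login such as SSH keys, which this post does not cover.

```shell
# Write the sftp commands to a scratch file, one command per line.
batch_file=$(mktemp)
cat > "$batch_file" <<'EOF'
lcd results
cd work
put summary.txt
mput *.csv
get run.log
EOF

# Show what would be sent; nothing is transferred by this script.
cat "$batch_file"

# To actually run the batch (user@host is a placeholder), one would type:
#   sftp -b "$batch_file" user@host
```

Each line in the file is exactly what you would have typed at the sftp prompt, in the same order.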