PyCharm as a Python IDE for Generating UML Diagrams

This blog post discusses some interesting features of PyCharm, a Python Integrated Development Environment (IDE) that I have recently found useful in developing a pure-Python, object-oriented simulation model. Specifically, I will focus on PyCharm’s ability to generate diagrams that help visualize the design of an object-oriented model. (Please also see my other post on PyCharm debugging features if you’re interested in that.)

Note that there are numerous Python IDEs, all of which have advantages and disadvantages. This post is not intended to be a comparison of Python IDEs, as others have done such comparisons (see this interesting write-up by Dr. Pedro Kroger). To accomplish the tasks I will describe below, I found PyCharm to be the easiest IDE/approach to use.

To give you some background, I am building an object-oriented river basin simulation model called PySedSim. The model will have many different classes (e.g., river channels and reservoirs), many of which are related to other classes in different ways (e.g., they may inherit behavior, or methods, from other classes). Naturally, I wish to diagram all of these classes, methods and relationships before coding them, to structure my code development process. In such a circumstance it can be helpful to create a Unified Modeling Language (UML) diagram to visualize the design of the object-oriented model (e.g., inheritance/class relationships, and the methods/attributes present in each class). Other diagram types (e.g., a tree diagram or directed graph diagram) can also be helpful, depending on what kind of program you are building.

To generate a UML diagram, I started out using tools like Gliffy, which comes with a 30-day trial, and also tried simply drawing diagrams in MS Word. I grew frustrated with these approaches because, after drawing a small UML diagram, I would decide to change the layout of the model (e.g., rename and rearrange classes) and then have to draw a completely new diagram! To work around this problem, I decided a more efficient approach would be to prepare a skeleton of a Python model, complete with class and method (and even attribute) definitions, and find a Python IDE that could automatically generate a UML diagram from that code. This way, when I inevitably decide I don’t like the way the program is designed, I can rewrite the skeleton code and the diagram will adjust itself automatically. This takes full advantage of the fact that Python can be written and run so quickly that it is basically executable pseudo-code.
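As a rough illustration, a skeleton of this kind might look like the sketch below. The class, method, and attribute names here are hypothetical and chosen only for illustration (the actual files I used appear at the end of this post). The point is just that empty placeholder methods are enough to define the structure a class diagram needs.

```python
# Hypothetical skeleton: all names below are illustrative assumptions,
# not the real PySedSim classes.

class StorageElement(object):
    """Base class for anything in the basin that stores water."""

    def __init__(self, name):
        self.name = name
        self.storage = 0.0  # placeholder attribute

    def mass_balance(self):
        pass  # body to be written later


class Reservoir(StorageElement):
    """A storage element with a fixed capacity."""

    def __init__(self, name, capacity):
        # Call the parent constructor first, then add reservoir-specific state.
        StorageElement.__init__(self, name)
        self.capacity = capacity

    def release(self):
        pass  # body to be written later


class Channel(StorageElement):
    """A storage element that routes flow downstream."""

    def route_flow(self):
        pass  # body to be written later
```

Even with every method body empty, the inheritance relationships, method names, and attributes are all already present in the code.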

As you may have guessed, I discovered that PyCharm is a good tool for doing exactly what I have described above (generating a class diagram from Python code). Note that only the professional version of PyCharm (available for free to students) will do this, as far as I can tell. While I will focus on PyCharm in this post, it is not the only way to generate a UML diagram from Python code. Some other options are reviewed here. Among these options, I found PyCharm to be the most efficient means of generating a nicely formatted UML diagram.

Here is some more information (or references to places where you can learn more) about PyCharm:

  • You can learn a bit more about PyCharm here.
  • Some of PyCharm’s key features are described here.
  • As I mentioned, if you are a student (or instructor) with a valid university email address, you can obtain a one-year (renewable) student/teacher license to use the professional version of PyCharm for free. Read more here.
  • A comparison of the free versus professional version is available here.
  • PyCharm has a blog here with useful tips and tricks.
  • Some additional things I have liked about PyCharm so far: really nice code completion and debugging features, support for Git version control, and the ability to rearrange the order of different script files in the user interface, which, for whatever reason, I have been unable to accomplish in Spyder (another Python IDE I have used).

Below I explain and show how I created a class diagram using PyCharm. The code I show is written to be interpreted in Python 2, and I have only tested this process in Windows 7.

Step 1. Create some script files that define different classes, and store them in a folder together. Open PyCharm, go to File -> New Project, and select the directory where your files are stored. In this example, I am using three script files.

Step 2. Right-click on one of your files in the project window. Click on Diagrams -> Show Diagram -> Python class diagram. A new .uml file will be created with a small rectangle in it representing the class you have added.

Figure 1

Step 3. Add other classes to this diagram by pressing the space bar while the diagram is active (or by right-clicking anywhere on the canvas and clicking “add class to diagram”), then searching for and selecting the other classes in the window that appears, as shown below.

Figure 2

After adding the channel and reservoir classes, the diagram appears as below. Note that the Channel class has two subclasses, which I also added. You can rearrange the classes in any way that suits you.

Figure 3

Step 4. Right-click on the canvas and select “Show categories”. The options there let you reveal additional information about each class. The buttons that appear in the inner menu on the upper left (the “m”, “i” and “f” buttons) let you achieve the same thing.

Figure 4

For example, selecting “Methods” and “Fields” will show which methods and attributes are present in each class.

Figure 5

If you make changes to your code (e.g., rename a method in your reservoir.py file), those changes will be reflected automatically in your class diagram. Note, however, that you can also rename methods and attributes directly from within the UML diagram (an operation PyCharm calls ‘refactoring’). For example, right-click on a method name in your UML diagram, click “Refactor”, and then click “Rename”. This lets you quickly change the method’s name in all classes that inherit or customize the method from the current class, and even in the classes from which the method is inherited. No changes will be made to identically named methods in completely separate classes. PyCharm will let you preview where the changes will be made before it makes them.
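To make the scope of such a rename more concrete, here is a small sketch of which classes would be affected. It is only my approximation of the behavior described above, not PyCharm's actual algorithm, and the classes and the helper function `classes_defining` are hypothetical names invented for this example.

```python
# Hypothetical classes used only to illustrate rename scope.
class Base(object):
    def mass_balance(self):
        return "base"


class Child(Base):
    def mass_balance(self):  # customizes (overrides) Base.mass_balance
        return "child"


class GrandChild(Child):
    pass  # inherits mass_balance without redefining it


class Unrelated(object):
    def mass_balance(self):  # same name, but a completely separate class
        return "unrelated"


def classes_defining(method_name, cls):
    """Names of classes in cls's hierarchy (ancestors and descendants)
    whose own body defines method_name."""
    found = set()
    # Ancestors, including cls itself.
    for klass in cls.__mro__:
        if method_name in vars(klass):
            found.add(klass.__name__)
    # Descendants, walked iteratively.
    stack = list(cls.__subclasses__())
    while stack:
        sub = stack.pop()
        if method_name in vars(sub):
            found.add(sub.__name__)
        stack.extend(sub.__subclasses__())
    return sorted(found)


print(classes_defining("mass_balance", Child))  # ['Base', 'Child']
```

Starting from Child, the definitions in Base and Child belong to the same hierarchy, while the identically named method in Unrelated is left out.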

Figure 6

Note that to do all of this you didn’t even need to fully develop or even run the model. You will see in my files below that most of my methods (or functions) don’t even have anything defined in them yet. I simply have kept them there as placeholders for when I start to code those methods.
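The reason unfinished code is enough is that class names, base classes, and method names can all be read from the source text without executing it. The sketch below shows the idea using Python's standard `ast` module. I do not know how PyCharm does this internally, so treat it as an assumption-laden illustration. The `SOURCE` string is a trimmed-down stand-in for my files, and `summarize_classes` is a hypothetical helper.

```python
import ast

# A trimmed-down stand-in for skeleton files like the ones in this post.
SOURCE = '''
class Storage_Element:
    def __init__(self, name):
        self.name = name
    def Element_inflows(self):
        return

class Channel(Storage_Element):
    def Flow_Routing(self):
        return
    def Sediment_Routing(self):
        return
'''


def summarize_classes(source):
    """Map each top-level class name to its base-class names and method
    names by parsing the source, without running any of it."""
    summary = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            bases = [b.id for b in node.bases if isinstance(b, ast.Name)]
            methods = [n.name for n in node.body
                       if isinstance(n, ast.FunctionDef)]
            summary[node.name] = {"bases": bases, "methods": methods}
    return summary


for name, info in summarize_classes(SOURCE).items():
    print(name, info["bases"], info["methods"])
```

Because the source is only parsed, placeholder methods that simply return work just as well as finished ones.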

If you have thoughts on other IDEs that you have found to be helpful for the purposes I describe, I would love to hear your comments.

If you want to use some existing code to try this, below is the code for the three class files I used to generate the UML diagram in the images I showed.

From storage_element.py:

# import relevant libraries
import numpy as np
from ExcelDataImport import *

# Description: This file defines storage element class and methods

class Storage_Element:
    def __init__(self, name, T, Input_Data_File):
        self.name = name
        # Initialize as arrays critical reservoir attributes.
        Storage_Element.Array_Initialization(self, name, T)
        Storage_Element.Import_Data(self, name, T, Input_Data_File)
    def __repr__(self):                                        # Added method that prints object information
        return '[Model Element Info.: %s, %s, %s]' % (self.name, self.removed_load, self.BS_W)      # String to print
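    def Array_Initialization(self, name, T):
        self.Q_in = np.zeros(T)
        self.Q_out = np.zeros(T)
        # Placeholder initial values (an assumption) so that __repr__ and
        # mass_balance can reference these attributes before they are set.
        self.removed_load = 0.0
        self.BS_W = np.zeros(T + 1)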
    def Import_Data(self, name, T, Input_Data_File):
        if 'Incremental Flows' in Input_Data_File.sheetnames:
            self.Incremental_Flow_Junction = Excel_Data_Import(self.name, Input_Data_File, 'Incremental Flows', 0, T, max_distinct_data_types = None, data_name_offset = None)
        if 'Incremental Sediment Loads' in Input_Data_File.sheetnames:
            self.Incremental_Sed_Load_Junction = Excel_Data_Import(self.name, Input_Data_File, 'Incremental Sediment Loads', 0, T, max_distinct_data_types = None, data_name_offset = None)
    def Element_inflows(self):
        return

From channel.py:

import numpy as np
from storage_element import Storage_Element
from ExcelDataImport import *

# Description: This file defines the channel class and methods
class Channel(Storage_Element):
    def __init__(self, name, T, Input_Data_File):
        if hasattr(Storage_Element, '__init__'):
            Storage_Element.__init__(self, name, T, Input_Data_File) # If parent class has constructor method, then call that first.
        # Channel.Import_Data(self, T, Input_Data_File)
    def Import_Data(self, T, Input_Data_File):
        if 'Reach Specifications' in Input_Data_File.sheetnames:
            # Placed here because reservoirs are considered to be reaches in the unregulated simulation.
            [self.Routing_Coefficient, self.Routing_Exponent, self.Pool_Volume, self.Initial_Storage, self.alpha_2_3, self.beta_2_3, self.Initial_Sediment_Mass] = Excel_Data_Import(self.name, Input_Data_File, 'Reach Specifications', 1, 7, max_distinct_data_types = None, data_name_offset = None)
    def mass_balance(self, constant, t):  # t is required: it is used as an array index below
        self.removed_load = np.power(constant,2) + self.new_func(2)
        self.BS_W[t+1] = self.BS_W[t] + 3
        return self.removed_load
    def new_func(self,new_constant):
        trapped_load = 20 + new_constant
        return trapped_load
    def Flow_Routing(self):
        return
    def Sediment_Routing(self):
        return
class Diversion_Channel(Channel):
    # A diversion is a channel or pipe that is a regulated (at inflow) conveyance channel, unregulated at outflow.
    def __init__(self, name, T, Input_Data_File):
        if hasattr(Channel, '__init__'):
            Channel.__init__(self, name, T, Input_Data_File) # If parent class has constructor method, then call that first.
    def Element_inflows(self):
        return
    def Flow_Routing(self):
        return
    def Sediment_Routing(self):
        return
class Bypass_Channel(Channel):
    # A bypass is a channel or pipe that is a regulated (at inflow) conveyance channel, unregulated at outflow, but is at the upstream end of a reservoir.
    def __init__(self, name, T, Input_Data_File):
        if hasattr(Channel, '__init__'):
            Channel.__init__(self, name, T, Input_Data_File) # If parent class has constructor method, then call that first.
    def Element_inflows(self):
        return
    def Flow_Routing(self):
        return
    def Sediment_Routing(self):
        return

From reservoir.py:

import numpy as np
from storage_element import Storage_Element
from outlet import Outlet
from ExcelDataImport import *

# This file defines the reservoir class and methods
class Reservoir(Storage_Element):
    def __init__(self, name, T, Input_Data_File):
        if hasattr(Storage_Element, '__init__'):
            Storage_Element.__init__(self, name, T, Input_Data_File) #If parent class has constructor method, then call that first.
        Reservoir.Array_Initialization(self,T) # Initialize arrays reservoirs will have.
        Reservoir.Import_Data(self, T, Input_Data_File)
        # Reservoir.Import_Reservoir_Data(self,....)
        # Every reservoir must have outlets (orifices) of some sort
        self.Orifices = Outlet(self.name, T, Input_Data_File)
    def Array_Initialization(self, T):
        # Initialize Arrays
        self.Q_downstream = np.zeros(T)
    def Import_Data(self, T, Input_Data_File):
        # Worksheet_Names_Preferences = {} # Dictionary stores key of worksheet names, and a list of the following:
        if 'Evaporation Data' in Input_Data_File.sheetnames:
            self.Monthly_Evap_Data = Excel_Data_Import(self.name, Input_Data_File, 'Evaporation Data', 1, 12, max_distinct_data_types = None, data_name_offset = None)
    def mass_balance(self, constant, t):
        self.removed_load = np.power(constant, 2) + self.new_func(2)
        self.BS_W[t+1] = self.BS_W[t] + 3 #- self.Orifice[3].Q_overflow[t]
        return self.removed_load
    def new_func(self,new_constant):
        trapped_load = 20 + new_constant
        return trapped_load

10 thoughts on “PyCharm as a Python IDE for Generating UML Diagrams”

  7. Very interesting topic. I used PyCharm in the past but haven’t seen this diagramming feature. I am using StarUML at the moment to design a W2P system, and for sure a good reverse engineering tool would be very useful in integrating some Python libraries into the project.
    The problem with reverse engineering Python modules into UML is that Python is mainly an imperative and not a declarative language. That means it does not enforce the definition of rich feature interfaces, parameter types, function return types, etc. That makes it hard to devise a model from a Python implementation.
