Tips for writing better Python code
I was a very experienced programmer when I started using Python. So I wrote some very bad Python code.
Each language has its own way of doing things. The more you learn the language, the more you adopt its approach and philosophy and the more elegant your code becomes. You learn to do more with less code, whilst making your code more readable and easier to maintain.
Let me help you with your journey into Python, by telling you a bit about mine. We’ll take some code and improve it.
(This post is based on a thread in a LinkedIn Python group, where someone posted their solution to the exercise. I responded with some suggestions and many other suggestions followed. Most of the content and code is my own. The ‘alternatives’ listed towards the end are by fellow Python developers. I have offered credit to anyone whose code or ideas I used, and credited those who requested it. With thanks to everyone who contributed to the thread.)
First version – Basic but it works
I’ll use a simple exercise problem:
“Create a function which takes a string of any length and convert every odd character to lowercase and every even character to uppercase”
Here is my starting code. This is how I would have written it twenty years ago:
def myfunc(x):
    result = ''
    for i in range(len(x)):
        if i % 2 == 0:
            result = result + x[i].lower()
        else:
            result = result + x[i].upper()
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
Python iteration
In languages like C, to process all items in a list or string you create an integer variable with a value of zero:
int i = 0;
.. and as long as i is less than the length of the list, increase it by one. And do something with the element at position i.
for (int i = 0; i < LENGTH; i++) {
    // do something useful here with myList[i]
}
(It’s been a very long time since I wrote any C code. I have not tested any of it, so it may contain some errors.)
In Python you can iterate (loop) over a list or string directly:
for character in 'Hello':
    print(character * 10)
HHHHHHHHHH
eeeeeeeeee
llllllllll
llllllllll
oooooooooo
We could use this to double each character in a string:
def myfunc(x):
    result = ''
    for character in x:
        result = result + character + character
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz
For our problem we need to keep track of odd/even characters. Here are a few options:
def myfunc(x):
    result = ''
    even = True
    for character in x:
        if even:
            result = result + character.lower()
        else:
            result = result + character.upper()
        even = not even
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
Or:
def myfunc(x):
    result = ''
    index = 0
    for character in x:
        if index % 2 == 0:
            result = result + character.lower()
        else:
            result = result + character.upper()
        index = index + 1
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
Unlike many languages, Python does not have a dedicated ‘increase by 1’ operator (such as ++ in C). We can do the following instead:
index += 1
This increases the index variable by 1.
Applying this to our code, for both the index and the result variable, we get:
def myfunc(x):
    result = ''
    index = 0
    for character in x:
        if index % 2 == 0:
            result += character.lower()
        else:
            result += character.upper()
        index += 1
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
Testing ‘empty’ values
Python has a very popular ‘shorthand’ for testing ‘empty’ values.
In tests, blank (zero-length) strings, empty collections (lists, tuples, dictionaries, sets) and zero values all evaluate to false.
For instance:
for value in 0, 1, [], [1, 2], '', 'Hello':
    if value:
        print('Non empty value', value, 'of type', type(value))
    else:
        print('Empty value', value, 'of type', type(value))
Empty value 0 of type <class 'int'>
Non empty value 1 of type <class 'int'>
Empty value [] of type <class 'list'>
Non empty value [1, 2] of type <class 'list'>
Empty value  of type <class 'str'>
Non empty value Hello of type <class 'str'>
This can be used like this:
def print_list(some_list):
    if some_list:
        for item in some_list:
            print(item)
    else:
        print('(Empty list)')
    print()
print('First list - non-empty')
print_list([1, 23, 4])
print('Second list - empty')
print_list([])
First list - non-empty
1
23
4
Second list - empty
(Empty list)
We can use this in the original code. When index % 2 (the remainder after division by 2) is non-zero, the index must be odd. Note how we need to swap the upper/lower lines around, compared to the original version:
# Original version, few lines
if index % 2 == 0:
    result += character.lower()
# New version, same lines - .lower() is now .upper()
if index % 2:
    result += character.upper()
The complete version is:
def myfunc(x):
    result = ''
    index = 0
    for character in x:
        if index % 2:
            result += character.upper()
        else:
            result += character.lower()
        index += 1
    return result
print(myfunc('abcdefghijklmnopqrstuvwxyz'))
The output is still the same:
aBcDeFgHiJkLmNoPqRsTuVwXyZ
A note about readability
How easy it is to read and understand code is very important. As programmers we spend more time reading code – our own and others’ – than writing it. Every time you make a change you need to re-read the code. Before you can change a function you may need to read the functions which call it, to make sure you don’t break anything.
Whether the code change I made above, from ‘if index % 2 == 0:’ to ‘if index % 2:’ is more or less readable is debatable.
An experienced Python developer probably finds the second version more readable. She’ll know that it means ‘if index % 2 is not zero’. There is less ‘visual noise’ (stuff which doesn’t add anything to the meaning but just takes up screen space).
A less experienced developer may not understand the code. Having a little less visual noise may not be worth it in that case.
Readability – don’t play ‘code golf’
Trying to solve a problem using as few characters as possible is a fun game. It is called ‘code golf’. The top solutions are very impressive but completely unreadable.
My first computer had 256 bytes (yes, individual bytes) of memory. Every byte mattered, the fewer the better. My current computer has 32 GB or roughly 32,000,000,000 bytes. Modern computers have plenty of memory. We no longer need to compress our code just to make it fit.
Do play code golf for fun. Don’t play ‘code golf’ when solving real problems. Use as much space and as much code as you need. Aim for great readability.
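To make the difference concrete, here is a rough sketch of a ‘golfed’ answer to our exercise next to a spelled-out one. Both are hypothetical illustrations (the names ‘f’ and ‘alternate_case’ are made up for this comparison and do not come from the thread). They give the same result, but only one of them is pleasant to maintain.

```python
# Golfed: short, but hard to read (hypothetical example).
f=lambda s:''.join((c.lower(),c.upper())[i%2]for i,c in enumerate(s))


# Spelled out: longer, but each step is obvious.
def alternate_case(text):
    result = ''
    index = 0
    for character in text:
        if index % 2:
            result += character.upper()
        else:
            result += character.lower()
        index += 1
    return result


print(f('abcdef'))
print(alternate_case('abcdef'))
```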
Python naming standards
Consistency in your code really helps readability. Imagine a book which mixes upper and lower case characters at random, with 1, 2, 3 or more spaces between words and random indentation at the start of some lines. It would slow down your reading and be very frustrating.
In Python most developers use the PEP8 coding standard. This defines spacing, line breaks, variable naming and more. Our example code follows most of PEP8, apart from variable naming.
PEP8 says this about variable names:
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
‘myfunc’ consists of multiple words and should have an underscore, i.e. ‘my_func’.
Meaningful names
The second chapter of the classic book ‘Clean Code’ by Robert Martin is all about using meaningful names. According to Robert:
It is easy to say that names should reveal intent. What we want to impress upon you is that we are serious about this. Choosing good names takes time but saves more than it takes.
The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used.
I fully agree with this. The book gives another 14 pages of detail, but let’s leave it at this: names should be meaningful (full-of-meaning).
‘my_func’ doesn’t say anything about the function, nor does ‘x’.
Let’s give them some new names:
def convert_to_mixed_case(some_string):
    converted_string = ''
    index = 0
    for character in some_string:
        if index % 2:
            converted_string += character.upper()
        else:
            converted_string += character.lower()
        index += 1
    return converted_string
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
‘Throw away’ variables which are used briefly and locally can have shorter and less meaningful names. I commonly use ‘i’ as an index in a loop. A function can be used anywhere in your code and should have a meaningful name.
‘some_string’ is quite a vague name because this is a generic function, and I’m not quite happy with it. I could have used ‘string’ but that is too close to ‘str’, one of the key Python object types. Even so, it is more meaningful than ‘x’. Maybe one day I’ll think of the perfect name.
Enumeration
So far we’re keeping track of the index separately whilst we iterate over the string. Python has a very useful, and not very well known, function called ‘enumerate’ which lets us simplify this.
For instance, if we had some names and we wanted to enumerate them (show them as a numbered list), we could do this:
names = 'Fred', 'Sue', 'Alex', 'Ahmed'
for index, name in enumerate(names):
    print(index, name)
0 Fred
1 Sue
2 Alex
3 Ahmed
The enumerate function gives us two things on each pass through the ‘for’ loop: the index and the actual value. It lets us rewrite the code to this:
def convert_to_mixed_case(some_string):
    converted_string = ''
    for index, character in enumerate(some_string):
        if index % 2:
            converted_string += character.upper()
        else:
            converted_string += character.lower()
    return converted_string
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Notice how we no longer have to create the index variable (index = 0), nor do we need to increase it manually (index += 1).
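One detail worth knowing: enumerate accepts an optional ‘start’ argument, so the count does not have to begin at 0. A small sketch, reusing the names from the example above (the variable ‘numbered’ is only there for illustration):

```python
names = 'Fred', 'Sue', 'Alex', 'Ahmed'

# start=1 makes the numbering begin at 1 instead of 0.
for position, name in enumerate(names, start=1):
    print(position, name)

# enumerate produces (number, value) pairs; list() shows them all at once.
numbered = list(enumerate(names, start=1))
print(numbered)
```

If you count from 1 like this, ‘odd’ and ‘even’ swap relative to the zero-based index used in our function, so the upper/lower branches would need to swap as well.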
Ternary operators
There is a lot of duplication in this code:
if index % 2:
    converted_string += character.upper()
else:
    converted_string += character.lower()
Can we simplify this?
Python has what is often called a ‘ternary operator’; the official name is ‘conditional expression’ (don’t worry about the name, I can never remember it myself). It looks like this:
for number in range(10):
    odd_even = 'odd' if number % 2 else 'even'
    print(number, odd_even)
0 even
1 odd
2 even
3 odd
4 even
5 odd
6 even
7 odd
8 even
9 odd
The ternary expression here is: 'odd' if number % 2 else 'even'. It evaluates to ‘odd’ if number % 2 is not zero, that is, if the number is odd. Otherwise it evaluates to ‘even’.
Ternary operators always look a bit strange to me. The order seems wrong. However, it reads like English: return odd if the number is odd, else return even.
Our code now looks like:
def convert_to_mixed_case(some_string):
    converted_string = ''
    for index, character in enumerate(some_string):
        converted_string += \
            character.upper() if index % 2 \
            else character.lower()
    return converted_string
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Whether this version is more readable is debatable. I have broken up the line with the ternary operator to make it easier to understand.
The main reason for using a ternary operator is that it will let us use a generator expression, which we will see soon.
The join method
Strings have a join method. It takes a list (or other iterable, such as a tuple) of strings and joins them all together, using the original string as a separator. For instance:
fruits = 'apple', 'banana', 'kiwi', 'grapefruit', 'dragon fruit'
print('Original list:', fruits)
print('Joined together:', ', '.join(fruits))
Original list: ('apple', 'banana', 'kiwi', 'grapefruit', 'dragon fruit')
Joined together: apple, banana, kiwi, grapefruit, dragon fruit
The join method is very useful and I recommend you try it out. Here is how we could use it in our code:
def convert_to_mixed_case(some_string):
    characters = []
    for index, character in enumerate(some_string):
        characters.append(
            character.upper() if index % 2 else character.lower()
        )
    return ''.join(characters)
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
We start with an empty list, then add each character to it after converting it to upper- or lowercase depending on its position.
Finally we join all the characters back into a single string, using an empty separator (empty string).
Again, whether this is an improvement is debatable and will depend on your experience level and on who may be using it in the future.
Brief side note: The opposite of the join method is the split method. Make sure to check it out if you don’t already know it. I use it all the time.
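As a quick sketch of split (the example strings are made up): it breaks a string into a list of parts, and with the same separator it undoes what join did.

```python
sentence = 'apple, banana, kiwi'

# split breaks a string into a list, using the separator given.
fruits = sentence.split(', ')
print(fruits)

# With no argument, split breaks on any run of whitespace.
words = 'the  quick brown   fox'.split()
print(words)

# join and split undo each other when the separator is the same.
round_trip = ', '.join(fruits)
print(round_trip == sentence)
```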
List comprehension and generator expressions
Say you’ve got a list of words and you want a new list with their uppercase version. You might do it like this:
words = ['hello', 'good', 'morning', 'afternoon', 'evening']
uppercase_words = []
for word in words:
    uppercase_words.append(
        word.upper()
    )
print(uppercase_words)
['HELLO', 'GOOD', 'MORNING', 'AFTERNOON', 'EVENING']
Or you could use something called ‘list comprehension’ to get the same results. Like this:
words = ['hello', 'good', 'morning', 'afternoon', 'evening']
uppercase_words = [
    word.upper()
    for word in words
]
print(uppercase_words)
The middle 4 lines could be combined into a single line. I’ve split them up to show how it is built up. I will often add line breaks to make my code more readable. You can also write it like this:
uppercase_words = [word.upper() for word in words]
Compare this with the previous example. Much of the original code is still there, just in a different order.
Instead of telling Python how to create our list (create a blank list, then add uppercase versions one word at a time) we tell Python what we want:
- A list (square bracket)
- of the uppercase version ( word.upper() )
- of each word from the original list ( for word in words )
This also works with sets and dictionaries. (Round brackets give a generator expression, covered below, rather than a tuple.)
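A quick sketch of those other forms, with made-up variable names. Note that a tuple needs an explicit tuple(...) call, because round brackets on their own create a generator expression.

```python
words = ['hello', 'good', 'morning', 'good']

# Set comprehension: curly brackets, duplicates disappear.
unique_uppercase = {word.upper() for word in words}

# Dictionary comprehension: curly brackets with key: value pairs.
word_lengths = {word: len(word) for word in words}

# There is no tuple comprehension; wrap a generator expression in tuple().
uppercase_tuple = tuple(word.upper() for word in words)

print(unique_uppercase)
print(word_lengths)
print(uppercase_tuple)
```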
If we apply this to our code we get:
def convert_to_mixed_case(some_string):
    characters = [
        character.upper() if index % 2 else character.lower()
        for index, character in enumerate(some_string)
    ]
    return ''.join(characters)
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
If you compare this to the previous version you’ll see that I’ve mostly just moved some code around.
We can rearrange this a bit more to get the following:
def convert_to_mixed_case(some_string):
    return ''.join(
        [
            character.upper() if index % 2 else character.lower()
            for index, character in enumerate(some_string)
        ]
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
We no longer create the temporary ‘characters’ variable. Instead we feed the result of the list comprehension straight into the join method.
Generator expressions versus list comprehension
An iterable is something which can give us one element at a time, typically in a ‘for’ loop. For instance, a string can give us one character at a time. Common iterables are lists, tuples, dictionary keys and strings.
List comprehension takes an iterable and uses it to create a new list. In the example above each word was converted to upper case.
If we have a very long list, say 100,000,000 words, then converting them all to uppercase would take a long time. If we only use, say, the first 10,000 words, then we’ll have converted the other 99,990,000 words for nothing and wasted a lot of processing time.
We could just use the original list and convert each word only when we need to.
But, just so I can explain this, let’s say that we want to create something called ‘uppercase_words’ first. That way we don’t have to think about how we created them anymore. Maybe, instead of calling .upper() we used something more complicated. Or maybe we use our ‘uppercase_words’ as a step in a longer process or in multiple places.
A ‘generator expression’ lets us create an iterable which only does the work (in this case, conversion to uppercase) one element at a time, only when we ask for the next one.
It is an example of ‘lazily’ performing an operation. We can create a whole chain of operations without any of them actually being done. Each element is processed as late as possible, only when we really need it.
A generator expression looks exactly the same as a list comprehension, apart from using round brackets instead of square brackets.
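A small sketch can make the laziness visible. The helper ‘shout’ below is hypothetical; it records each call so we can see how much work was actually done. itertools.islice (from the standard library) takes just the first few items from an iterable.

```python
from itertools import islice

calls = []


def shout(word):
    # Hypothetical helper: remembers each word it was asked to convert.
    calls.append(word)
    return word.upper()


words = ['hello', 'good', 'morning', 'afternoon', 'evening']

# List comprehension: converts every word immediately.
eager = [shout(word) for word in words]
eager_calls = len(calls)

calls.clear()

# Generator expression: nothing is converted yet.
lazy = (shout(word) for word in words)
calls_before_use = len(calls)

# Ask for the first two results only.
first_two = list(islice(lazy, 2))
calls_after_use = len(calls)

print(eager_calls, calls_before_use, calls_after_use)
print(first_two)
```

The list comprehension processes all five words up front. The generator expression processes none until we ask, and then only the two we actually use.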
In our code this would look like this:
def convert_to_mixed_case(some_string):
    return ''.join(
        (
            character.upper() if index % 2 else character.lower()
            for index, character in enumerate(some_string)
        )
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Notice how we now have two sets of round brackets after ‘join’. When a generator expression is the only argument to a function call, we can safely remove one set of brackets:
def convert_to_mixed_case(some_string):
    return ''.join(
        character.upper() if index % 2 else character.lower()
        for index, character in enumerate(some_string)
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Done
The last version, above, is how I write this function now. This is the end point of our journey.
Some alternatives
We may have reached our destination, but let’s explore a few alternatives.
This article was prompted by a LinkedIn post. Someone posted their solution to this problem. I suggested a few changes and then many others jumped in and offered their suggestions. It was fascinating to see so many different approaches. Here are a few more.
Alternative 1: Functional Programming
Functional Programming (FP) is a style of programming which aims to avoid unexpected side effects. Python is not a pure FP language but does allow you to write FP-style code. I have very little experience in FP and find the following code hard to read:
from functools import reduce
def convert_to_mixed_case(some_string):
    return reduce(
        lambda acc, next_part:
            acc + (next_part.upper() if len(acc) % 2 else next_part),
        some_string.lower(),
        '')  # start from an empty string so an empty input also works
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Alternative 2: The ‘map’ function
(Suggested by Igor Farias)
Python’s ‘map’ function is the predecessor to list comprehension and still popular amongst many programmers. Personally I prefer using list comprehension and find it easier to read. Using ‘map’ the code looks like this:
def convert_to_mixed_case(some_string):
    return ''.join(
        map(
            lambda character, index:
                character.upper() if index % 2
                else character.lower(),
            some_string,
            range(len(some_string))
        )
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Alternative 3: Using slicing
Using slicing you can replace every other element in a list. For instance:
>>> some_numbers = list(range(20))
>>> print(some_numbers)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> some_numbers[::2] = [0] * 10
>>> some_numbers
[0, 1, 0, 3, 0, 5, 0, 7, 0, 9, 0, 11, 0, 13, 0, 15, 0, 17, 0, 19]
Notice how every second number has been replaced with 0. We can use this in our function, like this:
def convert_to_mixed_case(some_string):
    result = list(some_string.upper())
    result[::2] = some_string[::2].lower()
    return ''.join(result)
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Slice assignment doesn’t work on strings directly because strings can’t be changed (they are ‘immutable’). So the string is converted to a list at the start and joined back into a string at the end.
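A quick sketch of what happens if you try the slice assignment on a string itself (the variable names are just for illustration). Python raises a TypeError, whereas the same assignment on a list is fine:

```python
text = 'abcdef'

try:
    text[::2] = 'XXX'  # strings do not support item or slice assignment
    outcome = 'changed'
except TypeError as error:
    outcome = 'TypeError'
    print('Not allowed:', error)

# Converting to a list first gives us something we can change.
letters = list(text)
letters[::2] = 'XXX'
print(''.join(letters))
```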
To me this version is less readable. It is supposedly much faster. However, as Donald Knuth said:
Premature optimization is the root of all evil (or at least most of it) in programming.
When programming, first create something that works. Then make it work well (cleaner code, test code, etc). Then, if you really need the extra speed, start improving its performance. Often when you do this your code gets more complex and less readable. You may save a fraction of a second in CPU time but at the cost of minutes or hours of your own time – writing the initial code and maintaining the more complex version.
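If you do want to check a speed claim rather than take it on trust, the standard library’s timeit module is one way to measure. The sketch below compares the generator version with the slicing version. The function names, the test string and the repeat count are arbitrary choices of mine for this illustration, and the timings will differ from machine to machine.

```python
import timeit


def with_generator(some_string):
    return ''.join(
        character.upper() if index % 2 else character.lower()
        for index, character in enumerate(some_string)
    )


def with_slicing(some_string):
    result = list(some_string.upper())
    result[::2] = some_string[::2].lower()
    return ''.join(result)


sample = 'abcdefghijklmnopqrstuvwxyz' * 100

# Make sure we are comparing like with like before timing anything.
same_result = with_generator(sample) == with_slicing(sample)

# Run each version 200 times and record the total time in seconds.
generator_seconds = timeit.timeit(lambda: with_generator(sample), number=200)
slicing_seconds = timeit.timeit(lambda: with_slicing(sample), number=200)

print('Same result:', same_result)
print('Generator:', generator_seconds)
print('Slicing:  ', slicing_seconds)
```

Only if the measured difference matters for your real workload is the less readable version worth considering.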
Alternative 4: Split and zip
I couldn’t resist using some of the ideas in the previous version to try something else.
We create two separate lists – one with the odd characters and another with the even characters. And convert them to upper- or lowercase. Finally we ‘zip’ the two lists back together. Unfortunately the syntax for doing this is not very obvious. It looks like this:
import itertools
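def convert_to_mixed_case(some_string):
    evens = some_string[::2].lower()
    odds = some_string[1::2].upper()
    # zip_longest keeps the final character of odd-length strings,
    # which plain zip would silently drop.
    pairs = itertools.zip_longest(evens, odds, fillvalue='')
    return ''.join(itertools.chain.from_iterable(pairs))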
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
Alternative 5: Separate function to process single characters
We can move the ternary into a separate function. This would look like this:
def even_to_upper_odd_to_lower(index, character):
    return character.upper() if index % 2 else character.lower()

def convert_to_mixed_case(some_string):
    return ''.join(
        even_to_upper_odd_to_lower(index, character)
        for index, character in enumerate(some_string)
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
This simplifies the main function and gives each function its own level of abstraction. One function deals with the whole string whilst the other handles individual characters.
Now that we’ve got a character-level function I would drop the ternary and do this (as suggested by a fellow Python trainer, Darko Stankovski):
def even_to_upper_odd_to_lower(index, character):
    if index % 2:
        return character.upper()
    else:
        return character.lower()

def convert_to_mixed_case(some_string):
    return ''.join(
        even_to_upper_odd_to_lower(index, character)
        for index, character in enumerate(some_string)
    )
print(convert_to_mixed_case('abcdefghijklmnopqrstuvwxyz'))
This is probably my new preferred approach, especially because I’ve just been reading about levels of abstraction. For an experienced Python developer this code is very easy to read.
Conclusion
This was an interesting exploration of Python and an example of how to make your code more ‘pythonic’.
For me it was interesting to look back at where I started with Python and how far I’ve come. As a Python trainer I’m fortunate that I’m constantly learning as I teach and there’s always more to learn. In a year or so I’ll look back at today’s code and think “Who created this? It can be so much better”.
According to the Zen of Python (in the REPL type ‘import this’):
There should be one — and preferably only one — obvious way to do it.
As you can see, there are actually many different ways to do something in Python.
The next line is:
Although that way may not be obvious at first unless you’re Dutch.
Even though I am Dutch myself, which one is the ‘obvious’ way here seems debatable to me.
I hope you found this useful. Have a go, try out the code. Try out any new syntax, read the documentation and make it your own.
(Note: All the Python code in this article works, at least in Python 3.9 but probably in all versions of Python 3. All versions of the convert_to_mixed_case function give the same results, even if I haven’t always shown the output.)
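One way to check a claim like this for yourself is to run two versions side by side on a few inputs, including edge cases such as an empty string and an odd-length string. A minimal sketch, with hypothetical function names and two of the versions adapted from the article:

```python
def loop_version(some_string):
    # Index-based loop, in the style of the first version.
    result = ''
    for index in range(len(some_string)):
        if index % 2:
            result += some_string[index].upper()
        else:
            result += some_string[index].lower()
    return result


def generator_version(some_string):
    # Generator expression fed straight into join.
    return ''.join(
        character.upper() if index % 2 else character.lower()
        for index, character in enumerate(some_string)
    )


samples = ['', 'a', 'abc', 'Hello World', 'abcdefghijklmnopqrstuvwxyz']

all_match = all(loop_version(s) == generator_version(s) for s in samples)

for sample in samples:
    print(repr(sample), '->', repr(generator_version(sample)))
print('All match:', all_match)
```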