One of the (few!) frustrating things I find when reading Python code, whether my own or someone else’s, is coming across a function call like this (1):
forest = RandomForestClassifier(10, 2, n_estimators, 1)
Arrrgh! Unless you are already familiar with the function, it’s annoying to have to pull up the documentation to understand what the arguments 10, 2 and 1 actually refer to. At least the variable n_estimators has a more useful name. This way of calling a function uses positional arguments, i.e. the position of each argument in the function call determines which of the function’s parameters its value is assigned to. That was a huge mouthful of words, so let’s use a (simplified) example to explain what arguments and parameters mean in Python:
forest = RandomForestClassifier(10, 2, n_estimators, 1)  # call the function

def RandomForestClassifier(max_depth, min_samples_split, n_estimators, random_state):
    ...  # function definition here
So 10, 2, n_estimators and 1 are the arguments passed to the function RandomForestClassifier, and in the function definition, max_depth, min_samples_split, n_estimators and random_state are the parameters, each of which is assigned the value of the corresponding argument. The arguments must therefore be passed in the same order as the parameters are defined.
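To make the positional mapping concrete, here is a minimal sketch. Note that make_forest is a hypothetical stand-in that mirrors the simplified signature above (it is not scikit-learn's real class), and the value 100 for n_estimators is assumed purely for illustration. The function just hands back its parameter bindings so you can see which argument landed where:

```python
# Hypothetical stand-in for the simplified signature above;
# this is NOT scikit-learn's real RandomForestClassifier.
def make_forest(max_depth, min_samples_split, n_estimators, random_state):
    # Return the parameter bindings so we can see which argument went where.
    return {
        "max_depth": max_depth,
        "min_samples_split": min_samples_split,
        "n_estimators": n_estimators,
        "random_state": random_state,
    }

n_estimators = 100  # assumed value, for illustration only

# Arguments are bound to parameters purely by position.
correct = make_forest(10, 2, n_estimators, 1)

# Swapping the first two arguments by mistake raises no error;
# the values are silently bound to the wrong parameters.
swapped = make_forest(2, 10, n_estimators, 1)

print(correct)
print(swapped)
```

The swapped call runs without complaint, which is exactly why bare positional arguments are so easy to get wrong and so hard to read.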
The other possibility in Python is to use keyword arguments, in which you explicitly state the parameter you wish to pass the argument to, using a keyword (basically the parameter name), like this (2):
forest = RandomForestClassifier(max_depth=10, min_samples_split=2, n_estimators=n_estimators, random_state=1)
Arguments, parameters, keywords, whatever they are supposed to be called, example (2) here is way easier to read than (1) above. You can see more of what is happening, and you have no need to read the docs to see what the arguments being passed to the function are actually used for. I’m planning on writing all my code this way from now on, which also fits the Zen of Python: explicit is better than implicit.
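A side benefit of keyword arguments is that their order no longer matters, and they can be mixed with leading positional arguments. Here is a small sketch, again using a hypothetical make_forest stand-in rather than the real scikit-learn class, with 100 as an assumed value for n_estimators:

```python
# Hypothetical stand-in with the simplified signature used in this post.
def make_forest(max_depth, min_samples_split, n_estimators, random_state):
    return (max_depth, min_samples_split, n_estimators, random_state)

# Keyword arguments are matched by name, so their order is irrelevant.
in_order = make_forest(max_depth=10, min_samples_split=2,
                       n_estimators=100, random_state=1)
shuffled = make_forest(random_state=1, n_estimators=100,
                       min_samples_split=2, max_depth=10)

# Positional arguments may come first, followed by keyword arguments.
mixed = make_forest(10, 2, n_estimators=100, random_state=1)

print(in_order == shuffled == mixed)
```

All three calls bind the same values to the same parameters, so the comparison prints True.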
I recently discovered that Python 3 lets you force callers to use keyword arguments as above, by putting a bare * in the parameter list of your function definition. Every parameter after the * is then keyword-only:
def RandomForestClassifier(*, max_depth, min_samples_split, n_estimators, random_state):
    ...  # function definition here
Now calling this function as in example (1) raises a TypeError, but example (2) works. I think this is a really helpful feature of Python 3 which I’m sure will improve the readability of my own code.
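Here is a short sketch of that behaviour which you can run yourself. As before, make_forest is a hypothetical stand-in for the simplified signature, not the real scikit-learn class, and the argument values are illustrative:

```python
# Hypothetical stand-in; the bare * makes every parameter keyword-only.
def make_forest(*, max_depth, min_samples_split, n_estimators, random_state):
    return {
        "max_depth": max_depth,
        "min_samples_split": min_samples_split,
        "n_estimators": n_estimators,
        "random_state": random_state,
    }

positional_error = None
try:
    # Style (1): positional arguments are rejected outright.
    make_forest(10, 2, 100, 1)
except TypeError as err:
    positional_error = err
    print("Positional call rejected:", err)

# Style (2): keyword arguments work as before.
forest = make_forest(max_depth=10, min_samples_split=2,
                     n_estimators=100, random_state=1)
print(forest)
```

The positional call fails as soon as it is made, before the function body runs, so a caller cannot accidentally pass the values in the wrong order.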