Piyush Bansal

*args, **kwargs and function parameters in Python.

I come from a C/C++ background, so when I first started programming in Python and saw code that looked somewhat like this -

def func(first_param, *args):
  # Do something with the arguments here.
  pass

I was thinking to myself - Ah! pointers. And when I saw something like this

def func(first_param, *args, **kwargs):
  # Do something with the arguments here.
  pass

I was again like - *Ah! Double Pointers*. I was obviously wrong. After a few moments I realised that Python has no pointer syntax, and that it passes arguments neither by reference nor by value: it uses a different evaluation strategy, which is sometimes called pass by object. So, these can’t be pointers.

But let us stop here for a moment (and digress? No, not really!) and try to see what Pass by object really means. Look at this piece of code and predict what the output should be -

>>> def make_sandwich(bread):
...   bread.add('eggs')
...   bread = set(['lettuce', 'mayo', 'chicken', 'mayo'])
...   # Damn it set, I wanted extra mayo :|
...

>>> loaf = set(['wheat', 'soda'])
>>> make_sandwich(loaf)

If Python passed everything by value, loaf would not have changed, and it would still be equal to set(['wheat', 'soda']). If instead Python passed everything by reference, loaf would have become set(['lettuce', 'mayo', 'chicken']). But guess what, the value of loaf is

>>> loaf
set(['eggs', 'wheat', 'soda'])

What happens is this: bread becomes a new name for the same value, set(['wheat', 'soda']), that loaf is a name for. When the function mutates that value through bread, loaf sees the change. When the function rebinds bread to a different value, loaf still names the original value. So, loaf ends up as set(['eggs', 'wheat', 'soda']). You can read more about it in this wonderfully written article by David Goodger.
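The difference between mutating and rebinding can be checked directly. Below is a minimal sketch, a variation on make_sandwich with a return statement added purely for illustration (the names original_id and new_bread are not from the example above). It uses print() as a function so that it is not tied to the Python 2 print statement.

```python
def make_sandwich(bread):
    bread.add('eggs')                             # mutates the object the caller also names
    bread = set(['lettuce', 'mayo', 'chicken'])   # rebinds the local name only
    return bread                                  # added for illustration


loaf = set(['wheat', 'soda'])
original_id = id(loaf)
new_bread = make_sandwich(loaf)

print(sorted(loaf))              # ['eggs', 'soda', 'wheat']
print(id(loaf) == original_id)   # True: loaf still names the original object
print(new_bread is loaf)         # False: the rebinding produced a separate object
```

The mutation is visible through loaf because both names referred to one object; the rebinding is not, because it only changed what the local name bread points to.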

Okay, coming back to our original course of discussion -

What do *args and **kwargs do?

Variable Length Argument Lists

How would you write a function in Python that accepts an arbitrary number of arguments? This is exactly where *args and **kwargs come into the picture. *args is used to pass a variable-length list of non-keyword (positional) arguments; inside the function, args is a tuple holding them. The name args is only a convention (for lack of a better variable name, programmers usually go with it); *anything works just as well. Let us look at an example -

>>> def hello_args(formal_arg, *args):
...   print 'Positional arg: %s'%formal_arg
...   for arg in args:
...     print 'Additional arg: %s'%arg
...

>>> hello_args('arg1', 'argument', 'another argument', 'yet another argument',
...     'as many arguments as you want')

Here is the output:

Positional arg: arg1
Additional arg: argument
Additional arg: another argument
Additional arg: yet another argument
Additional arg: as many arguments as you want
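To see what the function actually receives, here is a small sketch (describe_args is a made-up name for illustration) that reports the type and length of args:

```python
def describe_args(formal_arg, *args):
    # args is always a tuple, even when no extra arguments are passed.
    return formal_arg, type(args).__name__, len(args)


print(describe_args('arg1'))                  # ('arg1', 'tuple', 0)
print(describe_args('arg1', 'a', 'b', 'c'))   # ('arg1', 'tuple', 3)
```

Since args is an ordinary tuple, you can index it, take its length or loop over it like any other tuple.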

Similarly, **kwargs (or **anything) is used to pass a variable-length list of keyword arguments. By keyword arguments I mean arguments passed as name=value pairs; inside the function, kwargs is a dictionary that maps each name to its value. Here is a simple example to explain **kwargs usage -

>>> def hello_kwargs(formal_arg, **kwargs):
...   print 'Positional arg: %s'%formal_arg
...   for arg in kwargs:
...     print 'Another key-value pair: %s: %s'%(arg, kwargs[arg])
...

>>> hello_kwargs('arg1', argument='value', another_argument='another value',
...     yet_another_argument='yet another value')

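Here is the output (in Python 2 a dictionary has no guaranteed order, so the key-value pairs may appear in a different order):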

Positional arg: arg1
Another key-value pair: argument: value
Another key-value pair: another_argument: another value
Another key-value pair: yet_another_argument: yet another value
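A quick way to confirm that kwargs is just a dictionary is to inspect it. This is a sketch with a made-up function name, describe_kwargs; the items are sorted so the result does not depend on dictionary ordering.

```python
def describe_kwargs(formal_arg, **kwargs):
    # kwargs is a plain dict mapping argument names (strings) to values.
    return type(kwargs).__name__, sorted(kwargs.items())


print(describe_kwargs('arg1'))
# ('dict', [])
print(describe_kwargs('arg1', argument='value', another_argument='another value'))
# ('dict', [('another_argument', 'another value'), ('argument', 'value')])
```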

Unpacking Argument Lists

To use *args and **kwargs when calling functions, you need to understand a simple yet important concept - unpacking argument lists. This piece of code should make unpacking clear.

>>> range(1, 5)  # Normal call with separate arguments.
[1, 2, 3, 4]
>>> args = [1, 5]
>>> range(*args)    # Call with arguments unpacked from list.
[1, 2, 3, 4]
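Unpacking works with any function, not only built-ins like range. The sketch below uses a hypothetical volume function to show that * spreads a list or tuple over positional parameters, that it can be mixed with ordinary arguments, and that the item count still has to match.

```python
def volume(length, width, height):
    return length * width * height


dimensions = (2, 3, 4)
print(volume(*dimensions))   # 24, same as volume(2, 3, 4)

# Unpacking can be mixed with ordinary positional arguments.
print(volume(2, *[3, 4]))    # 24

# The number of unpacked items must match what the function accepts.
try:
    volume(*[2, 3])
except TypeError as error:
    print('TypeError: %s' % error)
```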

Similarly, one can unpack the arguments from a dictionary - this time using ** rather than * as in the previous example.

>>> def make_sandwich(bread='brown', cheese='raw', lettuce='fresh', meat='extra'):
...  print 'I am making a super tasty sandwich for you !'
...  print 'You asked for %s bread, there you go !'%bread
...  print 'You shall have %s cheese, %s lettuce and %s meat.'%(cheese, lettuce,
...      meat)
...
>>> order = {'bread': 'parmesan-oregano', 'cheese': 'toasted',
...     'lettuce': 'fresh', 'meat': 'extra'}
>>> make_sandwich(**order)    # We're passing an unpacked dictionary here.

Here is your sandwich, erm… output, I mean.

I am making a super tasty sandwich for you !
You asked for parmesan-oregano bread, there you go !
You shall have toasted cheese, fresh lettuce and extra meat.

Note that Python allows *args and **kwargs in a function definition only in a specific order: ordinary positional parameters come first, then *args, then **kwargs.
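The ordering rule can be tried out in a few lines. This is a sketch with invented names (order_summary, extras, settings); the compile() call at the end only asks Python to parse a definition that has the order reversed.

```python
def order_summary(customer, *toppings, **options):
    # customer: ordinary positional parameter
    # toppings: tuple of any extra positional arguments
    # options:  dict of any extra keyword arguments
    return customer, toppings, sorted(options.items())


extras = ['lettuce', 'mayo']
settings = {'bread': 'brown', 'toasted': True}

print(order_summary('alice', *extras, **settings))
# ('alice', ('lettuce', 'mayo'), [('bread', 'brown'), ('toasted', True)])

# Putting **kwargs before *args is rejected when the code is parsed.
try:
    compile('def bad(**kwargs, *args): pass', '<example>', 'exec')
    wrong_order_rejected = False
except SyntaxError:
    wrong_order_rejected = True
print(wrong_order_rejected)   # True
```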

A complex example using *args and **kwargs

This example requires that you know what decorators are, and how they’re used in Python. Since decorators are an interesting piece of syntactic sugar, I will write a post about them as well, soon. Now, let us write a function that accepts all arguments - such a function is called a catch-all function. We’re then going to use it as a memoize function that implements caching. The benefit of a catch-all function is that it doesn’t depend on the underlying function’s definition (its arguments, I mean). It has one job - to cache - and it will do that for any function, whatever number and type of arguments that function accepts!

import functools

def memoize(f):
  """Memoize function f."""
  cache = f.cache = {}

  @functools.wraps(f)
  def memoizer(*args, **kwargs):
    # Build the cache key from the string form of the arguments.
    key = str(args) + str(kwargs)
    if key not in cache:
      cache[key] = f(*args, **kwargs)
    return cache[key]
  return memoizer

And that’s it. Notice the relative position of *args and **kwargs in the above function. Let us try and write a fibonacci function that returns the nth Fibonacci number.

@memoize
def fibonacci(n):
  return 1 if n==0 or n==1 else fibonacci(n-1) + fibonacci(n-2)

>>> fibonacci(35)
14930352

And obviously it’s much faster than the one I wrote in the previous blog post about generators, since the one written with generators didn’t cache the previously computed values. Say you wanted a list of the first n Fibonacci numbers: the function written with generators in the previous blog post would recompute the lower Fibonacci values each time, rather than reusing already computed values from a cache. Reusing computed results like this is the soul of a set of solutions referred to as Dynamic Programming. This caching function makes many Dynamic Programming solutions easy to implement in Python; you can do it just by decorating the intended function. Now that IS some syntactic sugar!
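One way to see the effect of the cache is to count how often the function body runs. The sketch below repeats the memoize decorator so that it is self-contained, and adds a hypothetical calls dictionary as a counter. It is only an illustration, not a benchmark.

```python
import functools


def memoize(f):
    """Memoize function f (same idea as the memoize decorator above)."""
    cache = f.cache = {}

    @functools.wraps(f)
    def memoizer(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in cache:
            cache[key] = f(*args, **kwargs)
        return cache[key]
    return memoizer


calls = {'plain': 0, 'cached': 0}   # hypothetical counters, for illustration only


def plain_fibonacci(n):
    calls['plain'] += 1
    return 1 if n in (0, 1) else plain_fibonacci(n - 1) + plain_fibonacci(n - 2)


@memoize
def cached_fibonacci(n):
    calls['cached'] += 1
    return 1 if n in (0, 1) else cached_fibonacci(n - 1) + cached_fibonacci(n - 2)


print('%d computed with %d calls' % (plain_fibonacci(20), calls['plain']))
# 10946 computed with 21891 calls
print('%d computed with %d calls' % (cached_fibonacci(20), calls['cached']))
# 10946 computed with 21 calls
```

With the cache, each value from 0 to 20 is computed exactly once; every later request for it is answered from the dictionary.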

Thanks for reading!
