Interactive use of MDF in IPython¶

MDF has a number of ‘magic’ functions that make it easier to use interactively in the IPython environment. IPython ‘magic’ functions are ones that are invoked with a % before the name and are not available outside of IPython as they intended only for interactive work.

To access the magic functions you first need to import everything from the mdf.lab module:

from mdf.lab import *

This will print some brief text telling you that to get more help you need to use the %mdf_help magic function:

%mdf_help

Once mdf.lab is imported an ‘ambient’ context is created, so you can evaluate nodes without specifying any particular context - as you would inside a node function. For example:

from random import random

# define a new node
@evalnode
def rand():
    while True:
        yield random()

# call it and it will be evaluated in the ambient context
rand() # returns a random number

Values of nodes in the current ambient context can be accessed in the normal way by calling the nodes, and can be set using the value property

In [1]: x = varnode(default=1)

In [2]: x()
Out[5]: 1

In[3]: x.value = 2

In [4]: x()
Out[4]: 2

As time-dependent nodes are important in mdf it is easy to set and advance the current date (the mdf.now node) in IPython using the magic functions %mdf_now and %mdf_advance

# get the current time set in the ambient context (this is just the same as calling 'now()')
In [5]: %mdf_now
Out[5]: datetime.datetime(2012, 5, 2, 0, 0)

# set the current date
In [6]: %mdf_now 2005-01-01
Out[6]: datetime.datetime(2005, 1, 1, 0, 0)

# advance the date
In [7]: %mdf_advance

In [8]: %mdf_now
Out[9]: datetime.datetime(2005, 1, 3, 0, 0)

Notice that the magic functions understand dates as literals so there’s no need to construct datetime objects.

%mdf_advance optionally takes some nodes and returns the values of those nodes after the timestep. This can be useful for tracking values as you step through a few timesteps

In [9]: %mdf_advance rand
Out[9]: 0.67507466071023625

In [10]: %mdf_advance rand now
Out[10]: [0.67751015258066294, datetime.datetime(2005, 1, 5, 0, 0)]

Working with Timeseries¶

Nodes can be evaluated over a time range to produce time series of results, which can be plotted, stored as pandas DataFrames or exported to Excel.

The main functions used to construct timeseries of results are:

%mdf_df for creating dataframes

%mdf_plot for plotting using matplotlib

%mdf_xl for exporting to Excel

All these functions take two dates (start and end) followed by a list of nodes (note the use of T as a shortcut for today)

In [11]: %mdf_plot 2005-01-01 T rand

Nodes can be defined interactively either by writing new functions or simply using the nodetype method syntax (nodetype_method_syntax) to build up series of operation quickly

In [12]: %mdf_plot 2005-01-01 T rand.cumprodnode() rand.nansumnode()

This could also be written as

In [13]: a = rand.cumprodnode()

In [14]: b = rand.nansumnode()

In [15]: %mdf_plot 2005-01-01 T a b

In addition to these functions there is also %mdf_dfs which returns a list of dataframes (one for each node) and %mdf_wp which returns a widepanel constructed from dataframes for each node. These can be useful when evaluating multiple nodes at the same time but when you don’t want the results to get merged into a single dataframe.

Working with Data¶

Most often data is loaded as a pandas DataFrame, Series or WidePanel. To use those effectively in mdf we normally define a node that is the ‘current’ row or item from that dataset. As time advances that node updates to reveal the data from the underlying structure.

The datanode() function can be used to construct such a node from any DataFrame, WidePanel or series that is indexed by date.

The following example shows how to access data in a pandas DataFrame:

    from mdf import datanode
    import pandas as pa

# load some data
df = pa.DataFrame.from_csv("data_file.csv")

# create a node whose value is the row from the dataframe for 'now'
df_node = datanode("x", df)

This df_node node is like any other mdf node

In [18]: %mdf_plot 2000-01-01 T df_node

Applying Functions¶

Usually when writing code using mdf new nodes are written whenever a value derived from other nodes is required

@evalnode
def df_node_sq():
    return math.pow(df_node(), 2)

For trivial functions such as the one above it can be inconvenient to have to write these for each desired node.

The applynode() function can be used to create new nodes that apply a function to other nodes. The above example can be re-written as follows:

In [19]: df_node_sq = applynode(math.pow, df_node, 2)

In [20]: %mdf_plot 2000-01-01 T df_node_sq

Accessing the Context and Shifting¶

When you want to evaluate a node with another node set to a specific value or overriden you use a shifted context (see Shifted Contexts).

You can get and set the current ambient context using the %mdf_ctx magic function. This allows you to get the current context, create a shifted context and then set that shifted context as the current context.

In [21]: ctx = %mdf_ctx

In [22]: x = varnode(default=1)

In [23]: shifted_ctx = ctx.shift({x : 2})

In [24]: %mdf_ctx shifted_ctx
Out[24]: <ctx 1: 2012-05-03 [x=2] at 123373200>

In [25]: x()
Out[25]: 2

The functions %mdf_df, %mdf_plot and %mdf_xl also take an optional set of shifts. This makes it easy to get results from a shift without having the get, shift and set the current context.

In [26]: df_node_pow = applynode(math.pow, df_node, x)

In [27]: %mdf_plot 2000-01-01 T df_node_pow [x=0.5]

Using the MDF Viewer¶

The mdf viewer can be used explore the dependencies between nodes and plot or export values over time.

The mdf viewer can be opened from ipython with the magic command %mdf_show.

In [28]: %mdf_show df_node_pow

More nodes can be added to the open viewer using the same command.

In [29]: %mdf_show rand

To plot or export nodes select the nodes you want (use Shift or Ctrl to select multiple nodes) and then right click and select plot from the context menu. The same context menu may be used to export values to Excel or to render a graphical representation of the graph (requires Graphviz to be installed).

Once the viewer is open you can select one or more nodes and then use the magic command %mdf_selected to get your current selection in your IPython session

In [29]: %mdf_selected
Out[29]:
[(<ctx 0: 2005-02-11 at 115387248>,
  <<type 'mdf.nodes.MDFEvalNode'> [name=rand] at 0x30fee48>)]

This returns a list of contexts and node objects that correspond to what is selected in the viewer.

The magic functions %mdf_plot, %mdf_df and %mdf_xl can also be used to plot or get results for the currently selected nodes. To use the currently selected nodes don’t specify any nodes at all on the command line.

%mdf_plot 2005-01-01 T