Draw DAGs with TikZ

I’m working my way through the Statistical Rethinking 2022 course by Richard McElreath and it contains a lot of Directed Acyclic Graphs (DAGs).

It got me wondering: how do I draw DAGs like those in the book?

The answer is TikZ, according to a tweet from McElreath where he calls it the “graphical language of defeat”. It really is not an easy language to pick up — the TikZ user manual is 1,321 pages long!

Thankfully, we don’t need to know much about TikZ to draw DAGs. I walk through the bare minimum you need to know in this post.

What is TikZ?

TikZ (pronounced “ticks”) is a LaTeX package for drawing graphics.

The name TikZ is a recursive acronym for “TikZ ist kein Zeichenprogramm,” which, translated from German, means “TikZ is no drawing program.”

Don’t let it fool you — TikZ very much is a drawing program. Just look at all these drawings made with TikZ!

For a gentle introduction to TikZ for drawing network diagrams (like DAGs) I recommend Crash Course to TikZ – Basics and Crash Course to Tikz – Positioning by Rachel Menghua Wu. The Wikibooks entry to LaTeX/PGF/TikZ is also helpful to understand the style options available in TikZ.

If you are making TikZ drawings for the web (as I have done in this post), you should configure your system to output TikZ drawings to Scalable Vector Graphics (SVGs). To do so, follow the directions in How to automatically convert TikZ images to SVG (with fonts!) from knitr by Andrew Heiss.¹

Baby’s First DAG

Let’s start with a very simple DAG: X affects Y.

Within \tikz, we specify three components. First, a node created with \node, which we name x, place at the origin (0,0), and label $X$. Second, another node created with \node, which we name y, place to the right of x at (1,0), and label $Y$. Third, a path created with \path that connects them with an edge from x to y, where -> puts an arrowhead at the y end.

\tikz{
    \node (x) at (0,0) {$X$};
    \node (y) at (1,0) {$Y$};
    \path[->] (x) edge (y);
}

Graph in which a variable X affects a variable Y. Black foreground text, weird arrowhead.

First impressions? Yeah, we can do better.

An SVG produced by TikZ has black foreground text and a transparent background by default.

Let’s lighten the foreground text. Define a new color called offwhite with \definecolor, using the HTML color model (so we can provide the color as a hexadecimal triplet) and the hexadecimal triplet F2EDED (the color I use for text throughout this website).

Now, add offwhite to each \node and \path to color the variables and arrow, respectively.

\definecolor{offwhite}{HTML}{F2EDED}
\tikz{
    \node[offwhite] (x) at (0,0) {$X$};
    \node[offwhite] (y) at (1,0) {$Y$};
    \path[->, offwhite] (x) edge (y);
}

Graph in which a variable X affects a variable Y. Off-white foreground text, weird arrowhead.

Ok, a little better.

But that default arrowhead is… kinda ugly. We can change it by setting > to stealth within \tikzset.²

\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{> = stealth}
\tikz{
    \node[offwhite] (x) at (0,0) {$X$};
    \node[offwhite] (y) at (1,0) {$Y$};
    \path[->, offwhite] (x) edge (y);
}

Graph in which a variable X affects a variable Y. Off-white foreground text, better-looking arrowhead.

Not bad.
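
If you want to try these snippets outside of knitr, they need to sit inside a complete LaTeX document. Here is a minimal sketch, assuming you compile it directly with a tool like pdflatex. The standalone document class is my assumption for cropping the page to the drawing, not something this post’s setup requires, and the offwhite color is left out so the drawing stays visible on a white page.

% Assumed wrapper: standalone crops the output to the drawing
\documentclass{standalone}
% Loads TikZ, which also makes \definecolor available
\usepackage{tikz}
% Use the stealth arrowhead for every arrow tip
\tikzset{> = stealth}
\begin{document}
\tikz{
    \node (x) at (0,0) {$X$};
    \node (y) at (1,0) {$Y$};
    \path[->] (x) edge (y);
}
\end{document}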

Three’s Company

How about a DAG with three nodes?

Suppose X affects Y and Z is a confounder.

\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{> = stealth}
\tikz{
    \node[offwhite] (x) at (0,0) {$X$};
    \node[offwhite] (y) at (2,0) {$Y$};
    \node[offwhite] (z) at (1,1) {$Z$};
    \path[->, offwhite] (x) edge (y);
    \path[->, offwhite] (z) edge (x);
    \path[->, offwhite] (z) edge (y);
}

Graph in which a variable X affects a variable Y, and a third variable Z is a confounder that affects both X and Y

Nice!

Rather than set the style individually on each node and path, we can instead define styles within \tikzset that apply to every node and every path with /.append style.

For every node, we set text to offwhite. For every path, we set arrows to ->, draw to offwhite (which colors the arrow “body”) and fill to offwhite (which colors the arrowhead).

\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    every node/.append style = {
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    }
}
\tikz{
    \node (x) at (0,0) {$X$};
    \node (y) at (2,0) {$Y$};
    \node (z) at (1,1) {$Z$};
    \path (x) edge (y);
    \path (z) edge (x);
    \path (z) edge (y);
}

Graph in which a variable X affects a variable Y, and a third variable Z is a confounder that affects both X and Y

Looking good.
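
As an optional shorthand, several edge operations can be chained on a single \path, and each one starts from the same first node. Here is a sketch of the same DAG, assuming the \definecolor and \tikzset definitions from the previous snippet are still in effect.

\tikz{
    \node (x) at (0,0) {$X$};
    \node (y) at (2,0) {$Y$};
    \node (z) at (1,1) {$Z$};
    % Both edges start at z: one goes to x, the other to y
    \path (z) edge (x) edge (y);
    \path (x) edge (y);
}

Whether this reads better than one \path per arrow is a matter of taste. One arrow per line makes it easy to comment out a single relationship.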

Positioning

You may consider using the positioning library, which you can load with \usetikzlibrary.

Instead of specifying the exact position of nodes, you position each node in relation to another node with right, left, above right, above left, below right, and below left.

For example, in the previous DAG, we can define x, position z above and to the right of x with above right = of x, and position y below and to the right of z with below right = of z.

\usetikzlibrary{positioning}
\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    every node/.append style = {
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    }
}
\tikz{
    \node (x) {$X$};
    \node (z) [above right = of x] {$Z$};
    \node (y) [below right = of z] {$Y$};
    \path (x) edge (y);
    \path (z) edge (x);
    \path (z) edge (y);
}

Graph in which a variable X affects a variable Y, and a third variable Z is a confounder that affects both X and Y

Structurally, this is exactly the same as the previous DAG. However, the text is smaller and the arrows are longer, almost as if we have “zoomed out” on the previous DAG.³
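
One thing that may shorten the arrows, offered as a sketch rather than a verified fix: the positioning library spaces nodes according to the node distance key, so a smaller value should pull the nodes closer together. The value 5mm and 5mm below is an arbitrary assumption to experiment with. It does not change the font size, which depends on how the output is scaled.

\usetikzlibrary{positioning}
\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    % Assumed value: vertical and horizontal gap used by "= of" placements
    node distance = 5mm and 5mm,
    every node/.append style = {
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    }
}
\tikz{
    \node (x) {$X$};
    \node (z) [above right = of x] {$Z$};
    \node (y) [below right = of z] {$Y$};
    \path (x) edge (y);
    \path (z) edge (x);
    \path (z) edge (y);
}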

Here’s a more complicated DAG where the positioning library proves useful.

\usetikzlibrary{positioning}
\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    every node/.append style = {
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    }
}
\tikz{
    \node (a) {$A$};
    \node (z) [right = of a] {$Z$};
    \node (b) [right = of z] {$B$};
    \node (x) [below left = of z] {$X$};
    \node (y) [below right = of z] {$Y$};
    \node (c) [below right = of x] {$C$};
    \path (a) edge (x);
    \path (a) edge (z);
    \path (b) edge (y);
    \path (b) edge (z);
    \path (c) edge (x);
    \path (c) edge (y);
    \path (x) edge (y);
    \path (z) edge (x);
    \path (z) edge (y);
}

Graph in which a variable X affects a variable Y, a third variable C affects X and Y, a fourth variable Z also affects X and Y, a fifth variable A affects X and Z, and finally, a sixth variable B affects Y and Z

In practice you will find there are times to favor the positioning library and times where it’s better to stick to the manual approach of positioning nodes directly.

Unobserved Variables

Throughout the Statistical Rethinking book, McElreath uses circles to represent variables that are unobserved.

To draw a circle around a node, we set draw to the color offwhite (the default is none, which means no border), shape to circle (the default is rectangle), and inner sep to 1pt (so the circle radius is not too large).

We don’t want every node to appear this way, so define a style called hidden (you can name it whatever you want) with these settings that we apply only to unobserved variables in the DAG.

\usetikzlibrary{positioning}
\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    every node/.append style = {
        draw = none,
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    },
    hidden/.style = {
        draw = offwhite,
        shape = circle,
        inner sep = 1pt
    }
}
\tikz{
    \node (x) {$X$};
    \node[hidden] (z) [above right = of x] {$Z$};
    \node (y) [below right = of z] {$Y$};
    \path (x) edge (y);
    \path (z) edge (x);
    \path (z) edge (y);
}

Graph in which a variable X affects a variable Y, and a third variable Z is a confounder that affects both X and Y but Z is unobserved

And here’s a more complicated DAG (from Homework 3) with an unobserved variable. Because the positions of nodes don’t easily map to a grid, we don’t use the positioning library and instead specify the exact positions of each node.

\definecolor{offwhite}{HTML}{F2EDED}
\tikzset{
    > = stealth,
    every node/.append style = {
        text = offwhite
    },
    every path/.append style = {
        arrows = ->,
        draw = offwhite,
        fill = offwhite
    },
    hidden/.style = {
        draw = offwhite,
        shape = circle,
        inner sep = 1pt
    }
}
\tikz{
    \node (a) at (0,0) {$A$};
    \node (s) at (0,2) {$S$};
    \node (x) at (1,1) {$X$};
    \node (y) at (2.5,1) {$Y$};
    \node[hidden] (u) at (1.5,2) {$U$};
    \path (a) edge (s);
    \path (a) edge (x);
    \path (a) edge (y);
    \path (s) edge (x);
    \path (s) edge (y);
    \path (x) edge (y);
    \path (u) edge (s);
    \path (u) edge (y);
}

Graph in which a variable X affects a variable Y, a third variable S affects both X and Y, a fourth variable A affects S, X, and Y, and finally, a fifth variable U affects S and Y but U is unobserved

That’s all, folks

You should have everything you need to draw DAGs like those in Statistical Rethinking. If there are other DAG features you’d like to see, let me know on Twitter.

Until then, have a wonderful DAG.

1. I ran into difficulties trying to connect dvisvgm to Ghostscript on macOS. Reinstalling MacTeX with “Ghostscript Dynamic Library” enabled didn’t work for me: dvisvgm wouldn’t recognize the dynamic library at /usr/local/share/ghostscript/9.53.3/lib/libgs.dylib.9.53 even though the LIBGS environment variable pointed to it. However, you can brew install ghostscript and then point dvisvgm to the dynamic library at /usr/local/Cellar/ghostscript/9.55.0/lib/libgs.dylib.9.55, and that works. No clue why. In any case, I never would have figured out all these steps on my own. Thanks Andrew! ↩

2. Why is it called stealth? My best guess is that the resulting arrowhead looks vaguely like a Stealth Bomber. ↩

3. I haven’t found a way to “zoom back in”. If you know how, do tell. ↩

February 9, 2022 @nsgrantham

Neal Grantham
