graph basics

2021-11-19

 | 

~13 min read

 | 

2579 words

Basic Terms

Graphs are comprised of nodes and edges.

Nodes are also called vertices.

Edges connect two nodes. Sometimes nodes are directed and/or weighted.

A directed graph is one in which two nodes are connected by an edge in a specific direction and not (necessarily) the inverse, i.e., you might be able to go from node A to node B, but not from node B to node A.

Neighbor nodes are nodes that are accessible from a vertex.

Adjacency lists

sampleAdjacencyList.js
{
    a: ['b','c']
    b: ['d'],
    c: ['e'],
    d: [],
    e: ['b'],
    f: ['d']
}

Hashmaps make for great adjacency lists because of their constant time lookup for keys. Adjacency lists also make for great ways to represent graphs.

Depth First traversal vs. Breadth first traversal

Implementing Depth-first searches will use a stack while breadth-first uses queues.

DFS example

depthFirstAdjacencyList.js
const adjacencyList = {
  a: ["b", "c"],
  b: ["d"],
  c: ["e"],
  d: ["f"],
  e: [],
  f: [],
}

function dfsPath(graph, root) {
  const path = []
  const stack = [root]
  while (stack.length) {
    const cur = stack.pop()
    path.push(cur)
    graph[cur].forEach((adjacentNode) => stack.push(adjacentNode))
  }
  return path
}

dfsPath(adjacencyList, "a") // [ 'a', 'c', 'e', 'b', 'd', 'f' ]
dfsPathRecursive.js
function dfsRecursive(graph, root) {
  return visitNodes(graph, [root])
}

function visitNodes(graph, stack) {
  const path = []
  const cur = stack.pop()
  path.push(cur)
  graph[cur].forEach((adjacentNode) => stack.push(adjacentNode))
  stack.forEach(() => path.push(...visitNodes(graph, stack)))
  return path
}

dfsRecursive(adjacencyList, "a") // [ 'a', 'c', 'e', 'b', 'd', 'f' ]

I could simplify this further and not use an internal method:

dfsRecursivePathSimpler.js
function dfsRecursivePathSimpler(graph, cur, stack = []) {
  const path = []
  path.push(cur)
  graph[cur].forEach((adjacentNode) => stack.push(adjacentNode))
  stack.forEach(() => {
    const source = stack.pop()
    path.push(...visitNodes(graph, source, stack))
  })
  return path
}

Here I’m taking advantage of a default value for a stack. The public API would remain the same: provide a graph and a root node. From there, the function will take care of the rest.

Note, if all I wanted to do was visit a node and print it, the recusive could be even simpler:

function dfsPrint(graph, node) {
  console.log(node)
  for (let neighbor of graph[node]) {
    dfsPrint(graph, neighbor)
  }
}

BFS Example

bfsPath.js
const adjacencyList = {
  a: ["b", "c"],
  b: ["d"],
  c: ["e"],
  d: ["f"],
  e: [],
  f: [],
}

function bfsPath(graph, root) {
  const path = []
  const queue = [root]

  while (queue.length) {
    const first = queue.shift() // get the current
    path.push(first) // put it in the queue
    graph[first].forEach((neighbor) => queue.push(neighbor))
  }

  return path
}

Breadth first traversals will almost always be implemented iteratively rather than recursively. This makes sense as the recursion itself will create a stack and for breadth first, we want to be working with queues.

Comparison

Look at how similar the BFS and DFS code is for iterative approaches:

function dfsPath(graph, root) {
  const path = []
  const stack = [root]

  while (stack.length) {
    const cur = stack.pop()
    path.push(cur)
    graph[cur].forEach((adjacentNode) => stack.push(adjacentNode))
  }

  return path
}

function bfsPath(graph, root) {
  const path = []
  const queue = [root]

  while (queue.length) {
    const first = queue.shift()
    path.push(first)
    graph[first].forEach((neighbor) => queue.push(neighbor))
  }

  return path
}

The only real difference is what data structure we’re using - queues versus stacks, both of which are approximated here using an array.

Has Path

What about solving for whether or not two nodes are connected within a graph?

For the time being, we’ll assume that the graph is directed and acyclic. It can be describe with the adjacency list:

const adjacencyList = {
  f: ["g", "i"],
  g: ["h"],
  h: [],
  i: ["g", "k"],
  j: ["i"],
  k: [],
}

We want to write a function that takes a graph, a source, and a destination and will return true/false if we are able to travel from the source to the destination.

hasPath.js
function hasPathDFS(graph, source, destination) {
  let stack = [source]

  while (stack.length) {
    const current = stack.pop()
    if (current === destination) return true
    graph[current].forEach((neighbor) => stack.push(neighbor))
  }

  return false
}

function hasPathDFSRecursive(graph, source, destination) {
  if (source === destination) return true

  for (let neighbor of graph[source]) {
    if (hasPathDFSRecursive(graph, neighbor, destination)) {
      return true
    }
  }
  return false
}

function hasPathBFS(graph, source, destination) {
  let queue = [source]

  while (stack.length) {
    const current = stack.shift()
    if (current === destination) return true
    graph[current].forEach((neighbor) => stack.push(neighbor))
  }

  return false
}

Undirected Graphs

Undirected graphs may be represented as a list of edges, e.g.,

edges.js
const edges = [
  ["i", "j"],
  ["k", "i"],
  ["m", "k"],
  ["k", "l"],
  ["o", "n"],
  ["k", "j"],
]

How might we convert that into an adjacency list?

buildGraph.js
function buildGraph(edges) {
  return edges.reduce((acc, cur) => {
    const [left, right] = cur
    if (!(left in acc)) acc[left] = []
    if (!(right in acc)) acc[right] = []

    acc[left].push(right)
    acc[right].push(left)

    return acc
  }, {})
}

Given our edges, this produces an adjacency list that looks like:

const adjacencyList = {
  i: ["j", "k"],
  j: ["i", "k"],
  k: ["i", "m", "l", "j"],
  m: ["k"],
  l: ["k"],
  o: ["n"],
  n: ["o"],
}

If we were to visualize this it might look like:

i ------- j
|       /
|     /
|   /
| /
k ------- l
|
|
m

n ------- o

Note that this graph, being undirected has many cycles. Some are apparent: j -. Others are “trivial”: o -.

In either case, as we’re navigating, we’ll need to guard against this.

One way to accomplish that is to track which nodes we’ve already visited.

Let’s write some code to this effect:

hasPathInPossibleCycle.js
function hasPathInPossibleCycleBFS(graph, source, destination) {
  let queue = [source]
  let visited = new Set()

  while (queue.length) {
    const current = queue.shift()
    if (current === destination) return true
    visited.add(current)
    graph[current].forEach((neighbor) => {
      if (!visited.has(neighbor)) {
        queue.push(neighbor)
      }
    })
  }

  return false
}

function hasPathInPossibleCycleDFS(graph, source, destination) {
  let visited = new Set()
  let pathFound = false
  function visitNode(current) {
    if (current === destination) {
      return (pathFound = true)
    }
    visited.add(current)
    graph[current].map((neighbor) => {
      if (!visited.has(neighbor)) {
        return visitNode(neighbor)
      }
    })
  }
  visitNode(source)

  return pathFound
}

By breaking this down, we can simplify further:

function hasPathInPossibleCycleDFS(edges, source, destination) {
  function hasPath(graph, current, destination, visited) {
    if (current === destination) return true
    if (visited.has(current)) return false
    visited.add(current)

    for (let neighbor of graph[current]) {
      if (hasPath(graph, neighbor, destination, visited)) return true
    }

    return false
  }

  function buildGraph(edges) {
    return edges.reduce((acc, cur) => {
      const [left, right] = cur
      if (!(left in acc)) acc[left] = []
      if (!(right in acc)) acc[right] = []
      acc[left].push(right)
      acc[right].push(left)
      return acc
    }, {})
  }

  let graph = buildGraph(edges)
  return hasPath(graph, source, destination, new Set())
}

The most major change is in hasPath which is cleaned up by using a for of loop instead of the map. This is helpful because of the ability to return out of the loop rather than in a map where the return affects the element being iterated (and therefore forced me to use a closure with the hasPath variable).

Notice also that I’ve allowed the API to be more flexible to allow edges and then build the graph internally.

Connected Components

The purpose of this sort of question is to understand how many distinct graphs are there.

That is, if a node can reach another node by way of an edge, they are part of the same component.

If they cannot, they are separate components.

Since we don’t know a priori which nodes are part of the same component, we’ll need to try navigating the tree using each node as the root for our search.

For this reason, as well as to avoid infinite loops, we’ll want to track which nodes we’ve visited already.

Let’s look at an undirected graph and count components.

One implementation might look like this:

connectedComponents.js
function connectedComponents(graph) {
  let components = 0
  const visited = new Set()

  function visitNode(node, source) {
    if (visited.has(String(node))) return
    visited.add(String(node))
    if (!source) {
      components++
    }
    for (const neighbor of graph[node]) {
      visitNode(neighbor, node)
    }
  }

  for (const node in graph) {
    visitNode(node)
  }
  return components
}

A slight twist on this would allow us to rely less on closure to access the node:

connectedComponentsAlt.js
function connectedComponents(graph) {
  let components = 0
  const visited = new Set()

  for (const node in graph) {
    if (visitNode(graph, node, visited)) {
      components += 1
    }
  }
  return components
}

function visitNode(graph, node, visited) {
  if (visited.has(String(node))) return false
  visited.add(String(node))

  for (const neighbor of graph[node]) {
    visitNode(graph, neighbor, visited)
  }
  return true
}

Whether or not you prefer separating the functions or not, one thing to notice about the alternative approach is the use of a boolean return value to determine whether or not to increment. The first solution increments immediately on the first node of a component. This alternative approach increments only after all of the nodes have been visited (as the for...of loop has completed).

Comparing Components

In the previous section, we counted the number of components, but what if we wanted to compare the components some how? Perhaps find the largest or the smallest component in the graph?

For this problem, my first approach included tracking all of the nodes in a different format. This is not strictly necessary and potentially takes up a lot of additional space - so if the problem doesn’t need it, it may not be worth it:

largestComponent.js
function largestComponent(graph) {
  let components = new Map() // <entry, nodes[]}
  const visited = new Set()

  for (const node in graph) {
    visitNode(graph, node, visited, components, node)
  }

  let size = 0
  components.forEach((nodes) => (size = Math.max(size, nodes.length)))
  return size
}

function visitNode(graph, node, visited, components, entry) {
  const strKey = String(node)
  if (visited.has(strKey)) return
  visited.add(strKey)

  if (components.has(entry)) {
    components.set(entry, [...components.get(entry), strKey])
  } else {
    components.set(entry, [strKey])
  }
  for (const neighbor of graph[node]) {
    visitNode(graph, neighbor, visited, components, entry)
  }
}

While the Map didn’t blow up the space complexity (it’s still O(n)), we can do better since this problem doesn’t actually care about what’s in the largest component, only it’s size.

So, let’s just track the count of the nodes:

largestComponentAlt.js
function largestComponent(graph) {
  const visited = new Set()
  let maxSize = 0
  for (const node in graph) {
    maxSize = Math.max(maxSize, visitNode(graph, node, visited, 1))
  }

  return maxSize
}

function visitNode(graph, node, visited) {
  const strKey = String(node)
  if (visited.has(strKey)) return 0
  visited.add(strKey)
  let size = 1
  for (const neighbor of graph[node]) {
    size += visitNode(graph, neighbor, visited)
  }
  return size
}

If the node’s already been visited, we want to return a 0. If it’s the first time we’ve seen a node say its size is one. Then because we want to group all of the nodes that comprise a component, we’ll add up all of the node visits and return it after the for ... of loop in visitNode.

Shortest Path

Switching gears a bit, let’s think about Path finding algorithms. We can start with trying to find the shortest distance between two nodes in a graph.

Since we have the option to use either breadth first or depth first, it’s worth taking a moment to consider if there’s one that’s more advantageous in this context.

Since we’re trying to find the shortest path to a node, if we use depth-first, we’ll need to find all of the paths to that node and compare them. How else could we be certain that it’s the shortest path?

On the other hand, with a breadth first traversal, the first time we see that node, we know it’s the shortest path because we’ve been looking only at the next neighbors before proceeding!

shortestPath.js
function shortestPath(edges, source, destination) {
  const graph = buildGraph(edges)
  let neighbors = [{ node: source, distance: 0 }]

  const visited = new Set()
  while (neighbors.length) {
    const { node, distance } = neighbors.shift()
    if (node === destination) return distance
    visited.add(node)
    graph[node].forEach(
      (neighbor) =>
        !visited.has(neighbor) &&
        neighbors.push({ node: neighbor, distance: distance + 1 }),
    )
  }
  throw new Error(`No Path between ${source} and ${destination}`)
}

function buildGraph(edges) {
  return edges.reduce((acc, cur) => {
    const [left, right] = cur
    if (!(left in acc)) acc[left] = []
    if (!(right in acc)) acc[right] = []

    acc[left].push(right)
    acc[right].push(left)

    return acc
  }, {})
}

Matrices And Graphs

Matrices are easily represented as a graph. Think about graph paper! They’re dots connected by lines.

This allows us to think of some visual representations quite naturally as graphs.

Imagine a map indicating land (L) and water (W):

W L W W L W
L L W L L L
W L W W W L
W W W W W W
W L W L L W
W W W L W W

This is a 6x6 graph:

  0 1 2 3 4 5
0 W L W W L W
1 L L W L L L
2 W L W W W L
3 W W W W W W
4 W L W L L W
5 W W W L W W

Which means it could be represented by arrays:

const map = [
  ["W", "L", "W", "W", "L", "W"],
  ["L", "L", "W", "L", "L", "L"],
  ["W", "L", "W", "W", "W", "L"],
  ["W", "W", "W", "W", "W", "W"],
  ["W", "L", "W", "L", "L", "W"],
  ["W", "W", "W", "L", "W", "W"],
]

Island Count

So, how might we find the number of islands?

We’ll use tools we’ve already seen!

  1. Depth first traversal of nodes
  2. A way to track the number of islands
  3. Tracking which nodes have been visited

We’ll accomplish all of this by leveraging the position of each plot (i.e., the x-y coordinates).

First, some pseudo-code:

islandCount-pseudo.js
function countIslands(map) {
  // start island count at 0
  // for each plot on the map
  // if it's land _and_ it hasn't been visited
  // visit the node and increment the island count
  // explore all adjacent parcels of land
  // return island count
}

An implementation might look like:

islandCount.js
export function countIslands(map) {
  let islandCount = 0
  let visited = new Set() // the set can be comprised of a string reference ot the element, ex. "0-0" is the first element in the map; "1-3" is in the second row and 4th column
  const maxRow = map.length - 1
  if (!map[0]) return islandCount // a little defensiveness for empty maps
  const maxColumn = map[0].length - 1

  function exploreIsland(row, column) {
    if (visited.has(`${row}-${column}`)) return

    visited.add(`${row}-${column}`)

    // navigate to all adjacent _land_ plots that are on the map
    if (row - 1 >= 0 && map[row - 1][column] === "L")
      exploreIsland(row - 1, column) // left
    if (row + 1 <= maxRow && map[row + 1][column] === "L")
      exploreIsland(row + 1, column) // right
    if (column + 1 <= maxColumn && map[row][column + 1] === "L")
      exploreIsland(row, column + 1) // up
    if (column - 1 >= 0 && map[row][column - 1] === "L")
      exploreIsland(row, column - 1) // down
  }

  // traverse the map
  for (let row = 0; row <= maxRow; row += 1) {
    for (let column = 0; column <= maxColumn; column += 1) {
      if (map[row][column] === "L") {
        if (!visited.has(`${row}-${column}`)) islandCount += 1 // only increment on the _first_ parcel of land seen for the island
        exploreIsland(row, column)
      }
    }
  }
  return islandCount
}

Wrap Up

I hope this guide proves to be a friendly introduction to graphs!



Hi there and thanks for reading! My name's Stephen. I live in Chicago with my wife, Kate, and dog, Finn. Want more? See about and get in touch!