Core hierarchy algorithm¶

 bpt_canonical(graph[, edge_weights, …]) Computes the canonical binary partition tree, also called binary partition tree by altitude ordering or connectivity constrained single min/linkage clustering of the given graph. saliency(tree, altitudes, leaf_graph[, …]) The saliency map of the input hierarchy $$(tree, altitudes)$$ for the leaf graph $$g$$ is an array of edge weights $$sm$$ for $$g$$ such that for each pair of adjacent vertices $$(i,j)$$ in $$g$$, $$sm(i,j)$$ is equal to the ultra-metric distance between $$i$$ and $$j$$ corresponding to the hierarchy. quasi_flat_zone_hierarchy(graph, edge_weights) Computes the quasi flat zone hierarchy of the given weighted graph. simplify_tree(tree, deleted_vertices[, …]) Creates a copy of the given tree and deletes the vertices $$i$$ of the tree such that $$deletedVertices[i]$$ is True. canonize_hierarchy(tree, altitudes[, …]) Removes consecutive tree nodes with equal altitudes. Transforms a tree into a binary tree.
bpt_canonical(graph, edge_weights=None, sorted_edge_indices=None, return_altitudes=True, compute_mst=True)[source]

Computes the canonical binary partition tree, also called binary partition tree by altitude ordering or connectivity constrained single min/linkage clustering of the given graph.

Definition:

The following definition is adapted from:

Cousty, Jean, Laurent Najman, Yukiko Kenmochi, and Silvio Guimarães. “Hierarchical segmentations with graphs: quasi-flat zones, minimum spanning trees, and saliency maps.” Journal of Mathematical Imaging and Vision 60, no. 4 (2018): 479-502.

Let $$G=(V,E)$$ be an undirected graph, let $$\prec$$ be a total order on $$E$$, and let $$e_k$$ be the edge in $$E$$ that has exactly $$k$$ smaller edges according to $$\prec$$: we then say that $$k$$ is the rank of $$e_k$$ (for $$\prec$$). The canonical binary partition hierarchy of $$G$$ for $$\prec$$ is defined as the sequence of nested partitions:

• $$P_0 = \{\{v\}, v\in V\}$$, the finest partion is composed of every singleton of $$V$$; and

• $$P_n = (P_{n-1} \backslash \{P_{n-1}^x, P_{n-1}^y\}) \cup (P_{n-1}^x \cup P_{n-1}^y)$$ where $$e_n=\{x,y\}$$ and $$P_{n-1}^x$$ and $$P_{n-1}^y$$ are the regions of $$P_{n-1}$$ that contain $$x$$ and $$y$$ respectively.

At the step $$n$$, we remove the regions at the two extremities of the $$n$$-th smallest edge and we add their union. Note that we may have $$P_n = P_{n-1}$$ if both extremities of the edge $$e_n$$ were in a same region of $$P_{n-1}$$. Otherwise, $$P_n$$ is obtained by merging two regions of $$P_{n-1}$$.

The canonical binary partition tree is then the tree representing the merging steps in this sequence, it is thus binary. Each merging step, and thus each non leaf node of the tree, is furthermore associated to a specific edge of the graph, called a building edge that led to this merge. It can be shown that the set of all building edges associated to a canonical binary partition tree is a minimum spanning tree of the graph for the given edge ordering $$\prec$$.

The map that associates every non leaf node of the canonical binary partition tree to its building edge is called the mst_edge_map. In practice this map is represented by an array of size $$tree.num\_vertices() - tree.num\_leaves()$$ and, for any internal node $$i$$ of the tree, $$mst\_edge\_map[i - tree.num\_leaves()]$$ is equal to the index of the building edge in $$G$$ associated to $$i$$.

The ordering $$\prec$$ can be specified explicitly by providing the array of indices sorted_edge_indices that sort the edges, or implicitly by providing the array of edge weights edge_weights. In this case, sorted_edge_indices is set equal to hg.arg_sort(edge_weights, stable=True). If edge_weights is an array with more than 1 dimension, a lexicographic ordering is used.

If requested, altitudes associated to the nodes of the canonical binary partition tree are computed as follows:

• if edge_weights are provided, the altitude of a non-leaf node is equal to the edge weight of its building edge; and

• otherwise, the altitude of a non-leaf node is equal to the rank of its building edge.

The altitude of a leaf node is always equal to 0.

Example:

The above figure corresponds to the following code (note that vertex indices start at 0 in the code):

>>> g = hg.UndirectedGraph(5)
>>> g.add_edges((0, 0, 1, 1, 1, 2, 3),
>>>             (1, 2, 2, 3, 4, 4, 4))
>>> edge_weights = np.asarray((4, 6, 3, 7, 11, 8, 5))
>>> tree, altitudes = hg.bpt_canonical(g, edge_weights)
>>> tree.parents()
array([6, 5, 5, 7, 7, 6, 8, 8, 8])
>>> altitudes
array([0, 0, 0, 0, 0, 3, 4, 5, 7])
>>> tree.mst_edge_map
array([2, 0, 6, 3])
>>> tree.mst.edge_list()
(array([1, 0, 3, 1]), array([2, 1, 4, 3]))


An object of type UnidrectedGraph is not necessary:

>>> edge_weights = np.asarray((4, 6, 3, 7, 11, 8, 5))
>>> sources = (0, 0, 1, 1, 1, 2, 3)
>>> targets = (1, 2, 2, 3, 4, 4, 4)
>>> tree, altitudes = hg.bpt_canonical((sources, targets, 5), edge_weights)
>>> tree.parents()
array([6, 5, 5, 7, 7, 6, 8, 8, 8])
>>> altitudes
array([0, 0, 0, 0, 0, 3, 4, 5, 7])
>>> tree.mst_edge_map
array([2, 0, 6, 3])

Complexity:

The algorithm used is based on Kruskal’s minimum spanning tree algorithm and is described in:

Laurent Najman, Jean Cousty, Benjamin Perret. Playing with Kruskal: Algorithms for Morphological Trees in Edge-Weighted Graphs. ISMM 2013: 135-146.

If sorted_edge_indices is provided the algorithm runs in quasi linear $$\mathcal{O}(n \alpha(n))$$, with $$n$$ the number of elements in the graph and with :mathalpha the inverse of the Ackermann function. Otherwise, the computation time is dominated by the sorting of the edge weights which is performed in linearithmic $$\mathcal{O}(n \log(n))$$ time.

Parameters:
• graph – input graph or triplet of two arrays and an integer (sources, targets, num_vertices) defining all the edges of the graph and its number of vertices.

• edge_weights – edge weights of the input graph (may be omitted if sorted_edge_indices is given).

• sorted_edge_indices – array of indices that sort the edges of the input graph by increasing weight (may be omitted if edge_weights is given).

• return_altitudes – if True an array representing the altitudes of the tree vertices is returned. (default: True).

• compute_mst – if True and if the input is a graph object computes an explicit undirected graph representing the minimum spanning tree associated to the hierarchy, accessible through the CptBinaryHierarchy Concept (e.g. with tree.mst). (default: True).

Returns:

a tree (Concept CptBinaryHierarchy if the input is a graph object), and, if return_altitudes is True, its node altitudes

canonize_hierarchy(tree, altitudes, return_node_map=False)[source]

Removes consecutive tree nodes with equal altitudes.

The new tree is composed of the inner nodes $$n$$ of the input tree such that $$altitudes[n] \neq altitudes[tree.parent(n)]$$ or $$n = tree.root(n)$$.

For example, applying this function to the result of bpt_canonical() on an edge weighted graph is the same as computing the quasi_flat_zone_hierarchy() of the same edge weighted graph.

If return_node_map is True, an extra array that maps any vertex index $$i$$ of the new tree, to the index of the corresponding vertex in the original tree is returned.

Parameters:
• tree – input tree

• altitudes – altitudes of the vertices of the tree

• return_node_map – if True, also return the node map.

Returns:

a tree (Concept CptHierarchy if input tree already satisfied this concept) its node altitudes, and, if requested, its node map.

quasi_flat_zone_hierarchy(graph, edge_weights)[source]

Computes the quasi flat zone hierarchy of the given weighted graph. The nodes of the quasi flat zone hierarchy corresponds to the connected components of all the possible thresholds of the edge weights.

Parameters:
• graph – input graph

• edge_weights – edge weights of the input graph

Returns:

a tree (Concept CptHierarchy) and its node altitudes

simplify_tree(tree, deleted_vertices, process_leaves=False)[source]

Creates a copy of the given tree and deletes the vertices $$i$$ of the tree such that $$deletedVertices[i]$$ is True.

The returned node_map is an array that maps any node index $$i$$ of the new tree, to the index of the corresponding node in the original tree.

Parameters:
• tree – input tree

• deleted_vertices – boolean valuation of the input tree nodes

• process_leaves – If False, a leaf vertex $$v$$ will never be removed disregarding the value of $$deletedVertices[v]$$. If True, leaves node may be removed. Note that in this case, a reordering of the nodes may be necessary, which is a more complex and slower operation.

Returns:

a tree (Concept CptHierarchy if input tree already satisfied this concept) and the node map

tree_2_binary_tree(tree)[source]

Transforms a tree into a binary tree.

Each non-leaf node of the input tree must have at least 2 children!

Whenever a non-leaf node $$n$$ with $$k > 2$$ children is found:

• an extra node $$m$$ is created;

• the first 2 children of $$n$$ become children of the new node $$m$$; and

• the new node $$m$$ becomes the first child of $$n$$.

The number of children of $$n$$ is thus reduced by 1. This operation is repeated $$k-2$$ times, i.e. until $$n$$ has only 2 children.

The returned node_map is an array that maps any node index $$i$$ of the new tree, to the index of the corresponding node in the original tree.

Complexity:

This algorithms runs in linear time $$O(tree.num\_vertices())$$.

Examples:

Compute the watershed hierarchy by area of an edge weighted graph and get the corresponding binary hierarchy. The returned node_map enables to recover the altitudes of the new hierarchy from the altitudes of the input hierarchy.

tree, altitudes = watershed_hierarchy_by_area(graph, edge_weights)
new_tree, node_map = tree_2_binary_tree(tree)
new_altitudes = altitudes[node_map]

Parameters:

tree – Input tree

Returns:

a tree (Concept CptHierarchy if input tree already satisfied this concept) and the node map

saliency(tree, altitudes, leaf_graph, handle_rag=True)[source]

The saliency map of the input hierarchy $$(tree, altitudes)$$ for the leaf graph $$g$$ is an array of edge weights $$sm$$ for $$g$$ such that for each pair of adjacent vertices $$(i,j)$$ in $$g$$, $$sm(i,j)$$ is equal to the ultra-metric distance between $$i$$ and $$j$$ corresponding to the hierarchy.

Formally, this is computed using the following property: $$sm(i,j) = altitudes(lowest\_common\_ancestor_{tree}(i,j))$$.

Complexity: $$\mathcal{O}(n\log(n) + m)$$ with $$n$$ the number of vertices in the tree and $$m$$ the number of edges in the graph.

Parameters:
• tree – input tree (Concept CptHierarchy)

• altitudes – altitudes of the vertices of the tree

• leaf_graph – graph whose vertex set is equal to the leaves of the input tree (deduced from CptHierarchy)

• handle_rag – if tree has been constructed on a rag, then saliency values will be propagated to the original graph, hence leading to a saliency on the original graph and not on the rag

Returns:

1d array of edge weights