Clustering Info Container
idendrogram.ClusteringData
This class is a container for the underlying clustering data that callback functions may use when generating the dendrogram. It also ensures that expensive operations are computed only once.
Example
import scipy.cluster.hierarchy
import idendrogram

# your clustering workflow
Z = scipy.cluster.hierarchy.linkage(...)
threshold = 42
# fcluster takes the threshold as `t`; it has no `threshold` keyword
cluster_assignments = scipy.cluster.hierarchy.fcluster(Z, t=threshold, ...)
# dendrogram creation
dd = idendrogram.idendrogram()
cdata = idendrogram.ClusteringData(
    linkage_matrix=Z,
    cluster_assignments=cluster_assignments
)
dd.set_cluster_info(cdata)
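The "computed only once" behaviour can be pictured as lazy caching. The following is a minimal sketch under stated assumptions, not the library's implementation: `LazyClusteringCache` and its `tree_builds` counter are hypothetical names introduced only for illustration. The tree is built with SciPy on the first request and reused afterwards.

```python
import numpy as np
import scipy.cluster.hierarchy as sch


class LazyClusteringCache:
    """Hypothetical illustration of compute-once caching; not idendrogram's code."""

    def __init__(self, linkage_matrix, rootnode=None, nodelist=None):
        self.linkage_matrix = linkage_matrix
        self._rootnode = rootnode  # may be supplied up front to skip the computation
        self._nodelist = nodelist
        self.tree_builds = 0  # counts how often the expensive step actually ran

    def get_tree(self):
        # Build the tree only if it was neither supplied nor computed before.
        if self._rootnode is None or self._nodelist is None:
            self._rootnode, self._nodelist = sch.to_tree(self.linkage_matrix, rd=True)
            self.tree_builds += 1
        return self._rootnode, self._nodelist


points = np.array([[0.0], [0.1], [5.0], [5.1]])
Z = sch.linkage(points, method="single")

cache = LazyClusteringCache(Z)
first = cache.get_tree()
second = cache.get_tree()  # served from the cache, no second build
```

Passing precomputed values to the constructor follows the same idea: anything already available is stored and never recomputed.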
Source code in idendrogram/clustering_data.py
__init__(linkage_matrix, cluster_assignments, leaders=None, rootnode=None, nodelist=None)
Set underlying clustering data that may be used by callback functions in generating the dendrogram. Ensures expensive operations are calculated only once.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
linkage_matrix | np.ndarray | Linkage matrix as produced by scipy.cluster.hierarchy.linkage or equivalent. | required |
cluster_assignments | np.ndarray | A one-dimensional array of length N that contains the flat cluster assignment of each observation. Produced by scipy.cluster.hierarchy.fcluster or equivalent. | required |
leaders | Tuple[np.ndarray, np.ndarray] | Root nodes of the clustering, as produced by scipy.cluster.hierarchy.leaders. | None |
rootnode | sch.ClusterNode | rootnode, as produced by scipy.cluster.hierarchy.to_tree. | None |
nodelist | List[sch.ClusterNode] | nodelist, as produced by scipy.cluster.hierarchy.to_tree. | None |
Example
import scipy.cluster.hierarchy
import idendrogram

# your clustering workflow
Z = scipy.cluster.hierarchy.linkage(...)
threshold = 42
# fcluster takes the threshold as `t`; it has no `threshold` keyword
cluster_assignments = scipy.cluster.hierarchy.fcluster(Z, t=threshold, ...)
# dendrogram creation
dd = idendrogram.idendrogram()
cdata = idendrogram.ClusteringData(
    linkage_matrix=Z,
    cluster_assignments=cluster_assignments,
)
dd.set_cluster_info(cdata)
get_cluster_assignments()
Returns flat cluster assignment array.
Returns:
Name | Type | Description |
---|---|---|
cluster_assignments | np.ndarray | A one-dimensional array of length N that contains the flat cluster assignment of each observation. Produced by scipy.cluster.hierarchy.fcluster or equivalent. |
get_cluster_id(linkage_id)
Returns the flat cluster ID for a given linkage ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
linkage_id | int | Node linkage ID | required |
Returns:
Type | Description |
---|---|
Optional[int] | Cluster ID if the node lies within a single cluster; None otherwise. |
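One way to read the "within one cluster" rule is that every observation below the node carries the same flat label. The sketch below illustrates that reading with plain SciPy; `cluster_id_for_node` is a hypothetical helper, and the library's own lookup may work differently.

```python
import numpy as np
import scipy.cluster.hierarchy as sch


def cluster_id_for_node(nodelist, cluster_assignments, linkage_id):
    """Hypothetical helper: flat cluster ID shared by all leaves under a node, else None."""
    # pre_order() returns the IDs of the original observations below the node.
    leaf_ids = nodelist[linkage_id].pre_order()
    labels = {int(cluster_assignments[i]) for i in leaf_ids}
    return labels.pop() if len(labels) == 1 else None


points = np.array([[0.0], [0.1], [5.0], [5.1]])
Z = sch.linkage(points, method="single")
assignments = sch.fcluster(Z, t=1.0, criterion="distance")  # two flat clusters
_, nodelist = sch.to_tree(Z, rd=True)

leaf_cluster = cluster_id_for_node(nodelist, assignments, 0)  # a single observation
pair_cluster = cluster_id_for_node(nodelist, assignments, 4)  # first merge, one cluster
root_cluster = cluster_id_for_node(nodelist, assignments, 6)  # root spans both clusters
```

Here the root node covers both flat clusters, so the helper returns None for it, while leaves and the two low-level merges each resolve to a single cluster ID.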
get_leaders()
A wrapper for scipy.cluster.hierarchy.leaders. Returns the root nodes in a hierarchical clustering.
Returns:
Type | Description |
---|---|
Tuple[np.ndarray, np.ndarray] | [L, M] (see SciPy's documentation for details) |
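For orientation, this is what the wrapped SciPy call returns when used directly. The sample points and the `leader_of` dictionary are assumptions made for this illustration only.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

points = np.array([[0.0], [0.1], [5.0], [5.1]])
Z = sch.linkage(points, method="single")
T = sch.fcluster(Z, t=1.0, criterion="distance")

# L[i] is the linkage ID of the node that roots flat cluster M[i].
L, M = sch.leaders(Z, T)

# Convenience view for this example: flat cluster ID -> linkage ID of its root node.
leader_of = {int(cluster): int(node) for node, cluster in zip(L, M)}
```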
get_linkage_matrix()
Returns stored linkage matrix.
Returns:
Name | Type | Description |
---|---|---|
linkage_matrix | np.ndarray | Linkage matrix as produced by scipy.cluster.hierarchy.linkage or equivalent. |
get_merge_map()
Returns a dictionary that maps pairs of linkage matrix IDs to the linkage matrix ID they get merged into.
Returns:
Name | Type | Description |
---|---|---|
dict | dict | Dictionary tuple(ID, ID) -> merged_ID |
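Such a mapping can be derived from the linkage matrix alone, because row i of a SciPy linkage matrix merges the nodes in its first two columns into a new node with ID n + i, where n is the number of observations. A minimal sketch, assuming keys keep the column order of the linkage matrix; `build_merge_map` is a hypothetical helper, not the library's function.

```python
import numpy as np
import scipy.cluster.hierarchy as sch


def build_merge_map(Z):
    """Hypothetical helper: map (child_id, child_id) -> ID of the merged node."""
    n = Z.shape[0] + 1  # number of original observations
    return {
        (int(row[0]), int(row[1])): n + i  # row i of Z creates node n + i
        for i, row in enumerate(Z)
    }


points = np.array([[0.0], [0.1], [5.0], [5.1]])
Z = sch.linkage(points, method="single")
merge_map = build_merge_map(Z)

# The last row of Z always produces the root node, here ID 6.
last_pair = (int(Z[-1, 0]), int(Z[-1, 1]))
root_id = merge_map[last_pair]
```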
get_tree()
A wrapper for scipy.cluster.hierarchy.to_tree. Converts a linkage matrix into an easy-to-use tree object.
Returns:
Type | Description |
---|---|
Tuple[sch.ClusterNode, List[sch.ClusterNode]] | [rootnode, nodelist] (see SciPy's documentation for details) |
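The shape of the return value matches a direct call to SciPy's `to_tree` with `rd=True`; that the wrapper passes this flag is an assumption inferred from the (rootnode, nodelist) pair, not something the page states. A short sketch of what the two objects look like:

```python
import numpy as np
import scipy.cluster.hierarchy as sch

points = np.array([[0.0], [0.1], [5.0], [5.1]])
Z = sch.linkage(points, method="single")

# rd=True returns the root plus a list in which nodelist[i].id == i.
rootnode, nodelist = sch.to_tree(Z, rd=True)

leaf_order = rootnode.pre_order()  # IDs of the original observations under the root
root_children = (rootnode.get_left().id, rootnode.get_right().id)
```

The node list makes lookups by linkage ID a plain index operation, which is convenient for callbacks that receive a linkage ID.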