Compute the AMD curve across a range of cluster numbers
Source: R/compute_amd_curve.R
This function computes the Average Membership Deviation (AMD) curve for
fuzzy c-means clustering across a sequence of cluster numbers k.
For each k, multiple random initialisations are performed and the
AMD value is computed as:
Usage
compute_amd_curve(
  data,
  its,
  nin,
  nsp,
  seeds = NULL,
  verbose = TRUE,
  plot_curve = FALSE,
  open_device = TRUE,
  scale_data = FALSE,
  iter_max = 100,
  m = 2,
  preselect_top_sd = NULL
)
Arguments
- data
A numeric matrix or data frame of samples (rows) × features (columns).
- its
Number of random initialisations per value of k.
- nin
Minimum number of clusters to evaluate.
- nsp
Maximum number of clusters to evaluate.
- seeds
Optional numeric vector of seeds for deterministic behaviour. Must have length its * (nsp - nin + 1). If NULL, random seeds are drawn.
- verbose
Logical; print progress messages.
- plot_curve
Logical; if TRUE, plot the AMD curve.
- open_device
Logical; if TRUE, open a new graphics device for the plot.
- scale_data
Logical; if TRUE, standardise features before clustering.
- iter_max
Maximum number of iterations for fuzzy c-means.
- m
Fuzziness parameter for fuzzy c-means (default 2).
- preselect_top_sd
Optional integer; if provided, only that many features with the highest standard deviation are retained before clustering (useful for very high-dimensional data).
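The length requirement on seeds and the idea behind preselect_top_sd can be sketched in base R. This is a minimal illustration, not the package's own code: the values of its, nin and nsp, the simulated data matrix, and the top_n and data_top names are all assumptions made for the example, and the top-SD selection shown is only one plausible reading of what the argument does. The call to compute_amd_curve() is guarded so the block runs even when the package is not loaded.

```r
# Hypothetical settings, chosen only for illustration.
its <- 5   # random initialisations per k
nin <- 2   # smallest k
nsp <- 6   # largest k

# One seed per (k, initialisation) pair: its * (nsp - nin + 1) values.
n_k   <- nsp - nin + 1
seeds <- seq_len(its * n_k)
stopifnot(length(seeds) == its * (nsp - nin + 1))

# Simulated data: 60 samples (rows) x 8 features (columns).
set.seed(1)
data <- matrix(rnorm(60 * 8), nrow = 60, ncol = 8)

# Assumed equivalent of preselect_top_sd = 4: keep the 4 columns
# with the largest standard deviation.
top_n    <- 4
sds      <- apply(data, 2, sd)
keep     <- order(sds, decreasing = TRUE)[seq_len(top_n)]
data_top <- data[, keep, drop = FALSE]

# Only call the function if it is available in the session.
if (exists("compute_amd_curve")) {
  res <- compute_amd_curve(
    data, its = its, nin = nin, nsp = nsp,
    seeds = seeds, verbose = FALSE
  )
}
```

Passing an explicit seeds vector like this makes repeated runs comparable; with seeds = NULL the documentation states that random seeds are drawn instead.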
Value
A list with components:
- k_opt
The optimal number of clusters, i.e. the k with the highest AMD peak value.
- max
Vector of AMD peak values for each k.
- mean
Vector of mean AMD values across repetitions.
- raw
Matrix of AMD values (rows = repetitions, columns = k).
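How the returned components relate to one another can be sketched with a made-up raw matrix. The numbers below are invented, and two points are assumptions rather than documented behaviour: that the columns of raw correspond to k = nin, ..., nsp in order, and that the peak value for each k is the column-wise maximum over repetitions.

```r
# Hypothetical range of k and number of repetitions.
nin <- 2
nsp <- 5
its <- 3

# Invented AMD values: rows = repetitions, columns = k (filled column-wise).
raw <- matrix(
  c(0.10, 0.12, 0.11,   # k = 2
    0.30, 0.28, 0.35,   # k = 3
    0.20, 0.22, 0.21,   # k = 4
    0.15, 0.14, 0.16),  # k = 5
  nrow = its, ncol = nsp - nin + 1
)

amd_max  <- apply(raw, 2, max)  # assumed meaning of the "max" component
amd_mean <- colMeans(raw)       # mean AMD across repetitions

# The k whose peak AMD is largest.
k_values <- nin:nsp
k_opt    <- k_values[which.max(amd_max)]
k_opt
```

Under these assumptions k_opt is 3 for the invented values, since the second column holds the largest peak.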