Supporting Functions for DC-ML
I will be using a few helper functions to support my data mining functions. I am posting them here for your reference.
SelectData
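This function splits a data set by rows into training and validation data. By default, four out of every five rows are selected as training data and the remaining row is used as validation data. The split is computed from worksheet row numbers, MOD(ROW(array), ratioTotal), so array should be a range reference, and exactly which rows land in each set depends on where the range sits on the sheet. Set selectTrain to TRUE to return the training rows or FALSE to return the validation rows. If headers are supplied, they are stacked on top of the result.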
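To make the default 4:1 split concrete, here is a small sketch that applies the same MOD test to ten stand-in row numbers. SEQUENCE(10) is only an assumption standing in for ROW(array) on a range that occupies worksheet rows 1 to 10. It is not part of the function itself.

```excel
=LET(
r, SEQUENCE(10),
m, MOD(r, 5),
HSTACK(r, m, IF(m < 4, "train", "validate"))
)
```

With these stand-in numbers, rows 4 and 9 have a remainder of 4 and are labeled "validate". Every other row is labeled "train". Once the function below is saved under the name dcrML.Help.SelectData, a call such as =dcrML.Help.SelectData(A2:D101, TRUE, A1:D1) should return the header row followed by the training rows, and the same call with FALSE should return the validation rows. The ranges A2:D101 and A1:D1 are hypothetical.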
dcrML.Help.SelectData
=LAMBDA(array, selectTrain, [headers], [ratioTrain], [ratioValidate],
LET(
ratioTrain, IF(ISOMITTED(ratioTrain), 4, ratioTrain),
ratioValidate, IF(ISOMITTED(ratioValidate), 1, ratioValidate),
selectTrain, IF(ISOMITTED(selectTrain), TRUE, selectTrain),
ratioTotal, ratioTrain + ratioValidate,
selected, IF(selectTrain,
FILTER(array, MOD(ROW(array),ratioTotal) < ratioTrain),
FILTER(array, MOD(ROW(array),ratioTotal) >= ratioTrain)
),
IF(ISOMITTED(headers),
selected,
VSTACK(headers, selected)
)
)
)
GetHeaders
This function is overloaded. If dataHeaders is provided, it returns them unchanged. If none is provided, it returns sequential headers, one per column of arrayData: "Feature 1", "Feature 2", ... unless a different headerName prefix is provided.
dcrML.Help.GetHeaders
=LAMBDA(arrayData, [dataHeaders], [headerName],
LET(
headerName, IF(ISOMITTED(headerName), "Feature ", headerName),
numCols, COLUMNS(arrayData),
IF(ISOMITTED(dataHeaders),
headerName & TOROW(SEQUENCE(numCols)),
dataHeaders
)
)
)
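As a quick usage sketch, the three calls below show the two behaviors described above. This assumes the function has been saved in the Name Manager as dcrML.Help.GetHeaders. The ranges A2:C100 and A1:C1 are hypothetical, and each call would be entered in its own cell.

```excel
=dcrML.Help.GetHeaders(A2:C100)
=dcrML.Help.GetHeaders(A2:C100, , "X")
=dcrML.Help.GetHeaders(A2:C100, A1:C1)
```

For a three-column range, the first call should return "Feature 1", "Feature 2", "Feature 3" as a single row. The second call leaves dataHeaders empty and swaps the prefix, giving "X1", "X2", "X3". The third call simply returns whatever is in A1:C1.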
Stay tuned for data mining topics here in DC-DEN!
