Supporting Functions for DC-ML
My data mining functions rely on a few helper functions. I have collected them here for reference.
SelectData
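This function splits a data set by rows into training and validation data. Pass TRUE for selectTrain to get the training rows, or FALSE to get the validation rows. By default, 4 out of every 5 rows are selected as training data and the remaining row is validation data. The split is based on worksheet row numbers, MOD(ROW(array), ratioTotal), so with the defaults the rows whose row number leaves a remainder of 4 when divided by 5 go to validation. If headers are provided, they are stacked on top of the result.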
dcrML.Help.SelectData =LAMBDA(array, selectTrain, [headers], [ratioTrain], [ratioValidate],
    LET(
        ratioTrain, IF(ISOMITTED(ratioTrain), 4, ratioTrain),
        ratioValidate, IF(ISOMITTED(ratioValidate), 1, ratioValidate),
        selectTrain, IF(ISOMITTED(selectTrain), TRUE, selectTrain),
        ratioTotal, ratioTrain + ratioValidate,
        selected, IF(selectTrain,
            FILTER(array, MOD(ROW(array), ratioTotal) < ratioTrain),
            FILTER(array, MOD(ROW(array), ratioTotal) >= ratioTrain)
        ),
        IF(ISOMITTED(headers),
            selected,
            VSTACK(headers, selected)
        )
    )
)
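As a rough usage sketch, assume the LAMBDA above is saved in the Name Manager as dcrML.Help.SelectData, the data sits in A2:D101, and the headers are in A1:D1. These cell ranges are made up for illustration. The three formulas return the training set with headers, the validation set with headers, and a 7-to-3 training set without headers.

```excel
=dcrML.Help.SelectData(A2:D101, TRUE, A1:D1)
=dcrML.Help.SelectData(A2:D101, FALSE, A1:D1)
=dcrML.Help.SelectData(A2:D101, TRUE, , 7, 3)
```

With the default 4-to-1 ratio, the 100 data rows in this hypothetical range split into 80 training rows and 20 validation rows (worksheet rows 4, 9, ..., 99). Since the function calls ROW(array), array should be a worksheet range. As far as I know, an in-memory array such as the output of another formula will not work there.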
GetHeaders
This function behaves in two ways. If dataHeaders are provided, it returns them unchanged. If none are provided, it returns sequential headers, "Feature 1", "Feature 2", ..., one per column of arrayData. A different prefix can be supplied through headerName.
dcrML.Help.GetHeaders =LAMBDA(arrayData, [dataHeaders], [headerName],
    LET(
        headerName, IF(ISOMITTED(headerName), "Feature ", headerName),
        numCols, COLUMNS(arrayData),
        IF(ISOMITTED(dataHeaders),
            headerName & TOROW(SEQUENCE(numCols)),
            dataHeaders
        )
    )
)
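A few example calls, again assuming hypothetical data in A2:D101 with headers in A1:D1 and the LAMBDA saved as dcrML.Help.GetHeaders:

```excel
=dcrML.Help.GetHeaders(A2:D101)
=dcrML.Help.GetHeaders(A2:D101, A1:D1)
=dcrML.Help.GetHeaders(A2:D101, , "X")
```

The first call should spill "Feature 1" through "Feature 4" across one row, since the range has four columns. The second simply returns the contents of A1:D1. The third skips dataHeaders and uses the prefix "X", giving "X1" through "X4". Note that the prefix is joined directly to the number, so include a trailing space in headerName if you want one.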
Stay tuned for more data mining topics here in DC-DEN!