Seasonal Forecasting

One of the most common applications of predictive analytics is forecasting time-based data. This analytic uses R’s ordinary least squares regression algorithm to fit the curve that best captures the general trend and seasonal variability of numeric data, so that the fitted curve can be used to predict future values.

Forecast returns the forecasted value:
RScript<_RScriptFile="SeasonalForecasting.R", _InputNames="Target, Trend, Season", StringParam9="">(Target, Trend, Season)

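To make the trend-plus-season idea concrete, here is a minimal Python sketch of an ordinary least squares fit with an intercept, a linear trend term, and one dummy variable per season. It is not the contents of SeasonalForecasting.R; the helper names, the quarterly data, and the dummy-variable encoding are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def design_row(trend, season, n_seasons):
    """Intercept, trend, and one dummy per season except the first."""
    dummies = [1.0 if season == s else 0.0 for s in range(2, n_seasons + 1)]
    return [1.0, float(trend)] + dummies


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data: two years of quarters, a linear trend plus a fixed quarterly effect.
seasonal_effect = {1: 0.0, 2: 5.0, 3: -3.0, 4: 8.0}
trend = list(range(1, 9))
season = [(t - 1) % 4 + 1 for t in trend]
target = [10.0 + 2.0 * t + seasonal_effect[s] for t, s in zip(trend, season)]

coef = fit_ols([design_row(t, s, 4) for t, s in zip(trend, season)], target)


def forecast(t, s):
    """Predicted value for trend index t in season s."""
    return sum(c * x for c, x in zip(coef, design_row(t, s, 4)))


next_quarter = forecast(9, 1)  # first quarter of the third year
```

Because the sample data is generated without noise, the fitted curve reproduces the generating rule, so forecasts for later periods follow the same trend and quarterly pattern.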
ARIMA

One of the most common applications of predictive analytics is forecasting time-based data. This analytic uses the Auto-Regressive Integrated Moving Average (ARIMA) algorithm to project a sequence of values into the future, based on the assumption that data points taken over time may have an internal structure that can be measured. While ARIMA tends to be esoteric and complex, this analytic uses the “auto.arima” function from R’s “forecast” package to search through a variety of possible models and find the best one. In addition to the expected forecast values, the script returns the lower and upper bounds of two confidence bands, nominally set at the 80% and 95% confidence levels.

Forecast returns the forecasted value:
RScript<_RScriptFile="ARIMA.R", _InputNames="Target", SortBy=(Month), NumericParam1=12, NumericParam2=12, NumericParam3=80, NumericParam4=95, StringParam8="", StringParam9="">(Target)

ForecastLo1 returns the forecasted lower value of the first confidence band:
RScript<_RScriptFile="ARIMA.R", _InputNames="Target", _OutputVar="ForecastLo1", SortBy=(Month), NumericParam1=12, NumericParam2=12, NumericParam3=80, NumericParam4=95, StringParam8="", StringParam9="">(Target)

ForecastHi1 returns the forecasted upper value of the first confidence band:
RScript<_RScriptFile="ARIMA.R", _InputNames="Target", _OutputVar="ForecastHi1", SortBy=(Month), NumericParam1=12, NumericParam2=12, NumericParam3=80, NumericParam4=95, StringParam8="", StringParam9="">(Target)

ForecastLo2 returns the forecasted lower value of the second confidence band:
RScript<_RScriptFile="ARIMA.R", _InputNames="Target", _OutputVar="ForecastLo2", SortBy=(Month), NumericParam1=12, NumericParam2=12, NumericParam3=80, NumericParam4=95, StringParam8="", StringParam9="">(Target)

ForecastHi2 returns the forecasted upper value of the second confidence band:
RScript<_RScriptFile="ARIMA.R", _InputNames="Target", _OutputVar="ForecastHi2", SortBy=(Month), NumericParam1=12, NumericParam2=12, NumericParam3=80, NumericParam4=95, StringParam8="", StringParam9="">(Target)

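The relationship between a forecast and its two confidence bands can be illustrated with a much simpler model than the one auto.arima would select. The Python sketch below fits a fixed AR(1) model, which is the special case ARIMA(1,0,0), by least squares and derives 80% and 95% bands from the growing forecast variance. The function names and the sample series are assumptions; only the output names mirror the metrics above.

```python
from statistics import NormalDist, mean


def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] + e[t]."""
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    c = my - phi * mx
    residuals = [b - (c + phi * a) for a, b in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (len(residuals) - 2)
    return c, phi, sigma2


def forecast_with_bands(series, horizon, levels=(80, 95)):
    """Point forecasts plus symmetric normal bands at the given confidence levels."""
    c, phi, sigma2 = fit_ar1(series)
    z = [NormalDist().inv_cdf(0.5 + level / 200) for level in levels]
    out, last, var = [], series[-1], 0.0
    for h in range(horizon):
        last = c + phi * last              # recursive h-step-ahead forecast
        var += sigma2 * phi ** (2 * h)     # forecast variance accumulates with horizon
        sd = var ** 0.5
        out.append({
            "Forecast": last,
            "ForecastLo1": last - z[0] * sd, "ForecastHi1": last + z[0] * sd,
            "ForecastLo2": last - z[1] * sd, "ForecastHi2": last + z[1] * sd,
        })
    return out


# Made-up monthly values, for illustration only.
history = [20.0, 22.5, 21.0, 24.0, 23.5, 26.0, 24.5, 27.0, 26.5, 29.0, 27.5, 30.0]
bands = forecast_with_bands(history, horizon=3)
```

The 95% band always encloses the 80% band, and both widen as the forecast horizon grows, which is the behavior to expect from the Lo/Hi outputs of any ARIMA-style forecast.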
k-Means Clustering

Using the K-Means algorithm, this analytic clusters records "by their nature" so that records within a cluster have more in common with each other than with records in the other clusters. Each cluster is defined by a central point, its "mean".

Cluster returns the cluster to which the record belongs:
RScript<_RScriptFile="kMeansClustering.R", _InputNames="Vars", NumericParam1=4, NumericParam2=10, StringParam9="">(Vars)

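As an illustration of how records end up grouped around a mean, here is a minimal Python sketch of the classic assign-then-update (Lloyd) iteration. It is not the code in kMeansClustering.R. Starting from the first k records is an assumption chosen to keep the example reproducible, and the sample points are made up.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Return 1-based cluster numbers and the final cluster means."""
    centers = list(points[:k])  # assumption: start from the first k records
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest center.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return [lab + 1 for lab in labels], centers


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
clusters, means = k_means(points, k=2)
```

Note that the result of this iteration depends on the starting centers, so production implementations usually try several random starts and keep the best one.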
k-Medoids Clustering

Using the K-Medoids algorithm, this analytic clusters records "by their nature" so that records within a cluster have more in common with each other than with records in the other clusters. Each cluster is defined by a prototypical record, its "medoid".

Related files: StorePerformance.mstr, Cluster&RegionRevenue.xlsx

Cluster returns the cluster to which the record belongs:
RScript<_RScriptFile="kMedoidsClustering.R", _InputNames="Vars", NumericParam1=4, NumericParam2=10, StringParam9="">(Vars)

Medoids returns the cluster if a record is the medoid of that cluster, 0 otherwise:
RScript<_RScriptFile="kMedoidsClustering.R", _InputNames="Vars", _OutputVar="Medoids", NumericParam1=4, NumericParam2=10, StringParam9="">(Vars)

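The difference from k-means is that each cluster center must be an actual record. The Python sketch below uses a simple alternating scheme (assign records to the nearest medoid, then pick the member with the smallest total distance to the others) and produces two outputs shaped like the Cluster and Medoids metrics above. It is only a sketch: the starting medoids, the sample points, and the function name are assumptions, and the R script may use a different k-medoids procedure.

```python
from math import dist


def k_medoids(points, k, max_iter=100):
    """Return (cluster numbers, medoid flags); assumes the records are distinct."""
    medoids = list(range(k))  # assumption: the first k records start as medoids
    for _ in range(max_iter):
        # Assign every record to its nearest medoid.
        labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
                  for p in points]
        # In each cluster, the new medoid is the member closest to all others.
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            new_medoids.append(
                min(members, key=lambda i: sum(dist(points[i], points[m]) for m in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    cluster = [lab + 1 for lab in labels]
    # Cluster number for a medoid record, 0 for every other record.
    medoid_flag = [cluster[i] if i in medoids else 0 for i in range(len(points))]
    return cluster, medoid_flag


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
cluster, medoid_flag = k_medoids(points, k=2)
```

Exactly one record per cluster carries a non-zero flag, which makes it easy to filter a report down to the prototypical record of each cluster.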
k-Nearest Neighbors

k-Nearest Neighbors (kNN) is a simple classification technique that is unusual in that no model is explicitly trained. In the kNN process, two datasets are read in: the training dataset, in which the dependent variable is already known, and the test dataset, in which the dependent variable is unknown. Classifications for the test set are made by finding the k most similar records in the training dataset (known as neighbors) and returning the majority vote amongst those neighbors.

Class returns the predicted class as a string:
RScript<_RScriptFile="kNN.R", _InputNames="ID, Target, Training, Vars", BooleanParam9=TRUE, NumericParam1=1, StringParam9="kNN">(ID, Target, Training, Vars)

ClassId returns the predicted class as a number:
RScript<_RScriptFile="kNN.R", _InputNames="ID, Target, Training, Vars", _OutputVar="ClassId", BooleanParam9=TRUE, NumericParam1=1, StringParam9="kNN">(ID, Target, Training, Vars)

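The neighbor lookup and majority vote described above can be sketched in a few lines of Python. This is an illustration under assumptions (Euclidean distance, made-up records, a hypothetical function name), not the logic of kNN.R.

```python
from collections import Counter
from math import dist


def knn_classify(train_x, train_y, test_x, k=1):
    """Predict a class for each test record by majority vote of its k nearest neighbors."""
    predictions = []
    for record in test_x:
        # Indices of the k training records closest to this test record.
        nearest = sorted(range(len(train_x)), key=lambda i: dist(record, train_x[i]))[:k]
        votes = Counter(train_y[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


train_x = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["low", "low", "low", "high", "high", "high"]
test_x = [(2, 2), (7, 9)]

predicted = knn_classify(train_x, train_y, test_x, k=3)
```

All of the work happens at prediction time, which is why no separate training step appears in the sketch.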
Naive Bayes

Naïve Bayes is a simple classification technique that makes the "naïve" assumption that the effect of each variable's value is independent of all other variables. For each independent variable, the algorithm calculates the conditional likelihood of that variable's observed value under each potential class, then multiplies those effects together to determine the probability of each class. The class with the highest probability is returned as the predicted class.

Class returns the predicted class as a string:
RScript<_RScriptFile="NaiveBayes.R", _InputNames="Target, Vars", BooleanParam9=TRUE, StringParam9="NaiveBayes", NumericParam1=1>(Target, Vars)

ClassId returns the predicted class as a number:
RScript<_RScriptFile="NaiveBayes.R", _InputNames="Target, Vars", _OutputVar="ClassId", BooleanParam9=TRUE, StringParam9="NaiveBayes", NumericParam1=1>(Target, Vars)

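The multiply-the-effects step can be shown with a small Python sketch for categorical variables. The add-one smoothing, the function names, and the sample data are assumptions added for illustration and are not taken from NaiveBayes.R.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes, values per (variable, class), and distinct values per variable."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (variable index, class) -> value counts
    levels = defaultdict(set)            # variable index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            levels[j].add(value)
    return class_counts, value_counts, levels


def predict_naive_bayes(model, row):
    """Return the most probable class and the normalized class probabilities."""
    class_counts, value_counts, levels = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for j, value in enumerate(row):
            # Likelihood of this value given the class, with add-one smoothing.
            score *= (value_counts[(j, label)][value] + 1) / (count + len(levels[j]))
        scores[label] = score
    norm = sum(scores.values())
    return max(scores, key=scores.get), {c: s / norm for c, s in scores.items()}


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes", "no", "yes"]

model = train_naive_bayes(rows, labels)
predicted_class, probabilities = predict_naive_bayes(model, ("rainy", "cool"))
```

Smoothing keeps a value that was never seen with a class from forcing that class's probability to exactly zero.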
Neural Network

Neural Network is an advanced machine learning classification technique in which a model is built to loosely imitate how the human brain processes information. The model consists of interconnected “neurons”, each of which applies an activation function to the weighted sum of its inputs. Every record is passed through the network from the input neurons to the output neurons, transformed along the way by those weights and activation functions.

Class returns the predicted class as a string:
RScript<_RScriptFile="NeuralNetwork.R", _InputNames="Target, Vars", StringParam9="NeuralNetwork", BooleanParam9=TRUE, NumericParam1=3, NumericParam2=42>(Target, Vars)

ClassId returns the predicted class as a number:
RScript<_RScriptFile="NeuralNetwork.R", _InputNames="Target, Vars", _OutputVar="ClassId", StringParam9="NeuralNetwork", BooleanParam9=TRUE, NumericParam1=3, NumericParam2=42>(Target, Vars)

The same ClassId call also appears with named parameters passed through _Params:
RScript<_RScriptFile="NeuralNetwork.R", _InputNames="Target, Vars", _OutputVar="ClassId", _Params="FileName='NeuralNetwork', TrainMode=TRUE, NumLayer=3, Seed=42">(Target, Vars)

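To show how a record travels from input neurons to output neurons, the Python sketch below runs a forward pass through one hidden layer. The weights are hand-picked, hypothetical values rather than trained ones, the sigmoid and softmax choices are assumptions, and the 1-based class number is only a guess at how a ClassId-style output might be numbered.

```python
from math import exp

# Hypothetical, hand-picked weights; a real model would learn these from training data.
HIDDEN_W = [[4.0, -4.0], [-4.0, 4.0]]
HIDDEN_B = [0.0, 0.0]
OUTPUT_W = [[3.0, -3.0], [-3.0, 3.0]]
OUTPUT_B = [0.0, 0.0]


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def layer(inputs, weights, biases, activation):
    """One fully connected layer: activation(weighted sum + bias) for each neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict(record, classes):
    """Return (class label, 1-based class number, class probabilities)."""
    hidden = layer(record, HIDDEN_W, HIDDEN_B, sigmoid)
    scores = layer(hidden, OUTPUT_W, OUTPUT_B, lambda z: z)
    exps = [exp(s - max(scores)) for s in scores]  # softmax, shifted for stability
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], best + 1, probs


label_a, id_a, probs_a = predict([1.0, 0.0], ["A", "B"])
label_b, id_b, probs_b = predict([0.0, 1.0], ["A", "B"])
```

Training a network means adjusting the weight and bias values until the predicted classes match the known ones; the forward pass itself stays exactly as shown.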
Pairwise Variable Correlation

PairwiseCorr measures the correlation between pairs of numeric variables to show how they behave with respect to each other. The primary outputs of this analytic are a correlation plot and a correlation table that contain the correlations of the variables taken in pairs.

Result returns "Ok" when the correlations were calculated with no errors:
RScript<_RScriptFile="PairwiseCorr.R", _InputNames="Labels, Vars", BooleanParam1=TRUE, NumericParam1=0, StringParam8="PairwiseCorr", StringParam9="PairwiseCorr">(Labels, Vars)

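A correlation table of this kind can be sketched in Python by computing the Pearson coefficient for every pair of columns. The choice of Pearson correlation, the column names, and the values are assumptions for illustration; the R script may offer other correlation measures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def pairwise_corr(columns):
    """Correlation for every ordered pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


columns = {
    "Price": [1, 2, 3, 4, 5],
    "Units": [10, 8, 6, 4, 2],
    "Revenue": [10, 16, 18, 16, 10],
}
table = pairwise_corr(columns)
```

In this made-up data, Price and Units move in exactly opposite directions, while Price and Revenue have no linear relationship even though Revenue clearly depends on Price, a reminder that a correlation near zero does not rule out a non-linear dependence.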
Random Forests

Random Forest is an advanced classification technique in which the training dataset is used to construct many independent decision trees. Every record is then passed into each individual decision tree for classification, and the class predicted by the majority of those trees is returned as the predicted class for that record.

Class returns the predicted class as a string:
RScript<_RScriptFile="RandomForest.R", _InputNames="Target, Vars", BooleanParam9=TRUE, StringParam9="RandomForest", NumericParam1=750, NumericParam2=3, NumericParam3=42>(Target, Vars)

ClassId returns the predicted class as a number:
RScript<_RScriptFile="RandomForest.R", _InputNames="Target, Vars", _OutputVar="ClassId", BooleanParam9=TRUE, StringParam9="RandomForest", NumericParam1=750, NumericParam2=3, NumericParam3=42>(Target, Vars)

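The build-many-trees-then-vote idea can be illustrated with a deliberately reduced Python sketch in which each "tree" is a one-split stump trained on a bootstrap sample and a randomly chosen variable. Real random forests grow much deeper trees; the stump simplification, the function names, and the data are all assumptions.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature):
    """Depth-1 tree: the threshold on one feature with the fewest training errors."""
    values = sorted({row[feature] for row in rows})
    best = (len(rows) + 1, None, majority(labels), majority(labels))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    if threshold is None:  # the sample had a single value, so no split was possible
        return left_label
    return left_label if row[feature] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, seed=42):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))              # random variable per tree
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over the predictions of all trees."""
    return majority([predict_stump(stump, row) for stump in forest])


rows = [(1, 2), (2, 1), (3, 4), (4, 3), (17, 18), (18, 17), (19, 20), (20, 19)]
labels = ["small"] * 4 + ["large"] * 4
forest = train_forest(rows, labels)
predictions = [predict_forest(forest, row) for row in rows]
```

Because every tree sees a different resampling of the data, an occasional poor tree is outvoted by the rest, which is the main reason the ensemble is more robust than a single tree.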
Stepwise Logistic Regression

Stepwise Logistic Regression is a variant of classical Logistic Regression in which variables are included in the model only if they have a significant effect.

Probability returns the predicted probability for each record:
RScript<_RScriptFile="StepwiseLogisticRegression.R", _InputNames="Target, Vars", StringParam9="", BooleanParam1=TRUE>(Target, Vars)

Stepwise Regression

Stepwise Linear Regression is a variant of classical Linear Regression in which variables are included in the model only if they have a significant effect.

Forecast returns the predictions from the model:
RScript<_RScriptFile="StepwiseRegression.R", _InputNames="Target, Vars", StringParam9="StepwiseRegression", BooleanParam1=TRUE>(Target, Vars)

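One way to picture stepwise selection is forward selection: start with no variables and repeatedly add the one that improves the fit most, stopping when the improvement becomes negligible. The Python sketch below does this for linear regression. The stopping rule (a minimum share of the total sum of squares) is a stand-in for a proper significance test, and the helper names and data are assumptions; StepwiseRegression.R may select variables differently.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * x for c, x in zip(coef, r))) ** 2 for r, t in zip(rows, y))


def forward_stepwise(variables, y, min_gain=0.01):
    """Add variables while each one cuts RSS by at least min_gain * total sum of squares."""
    total_ss = rss([], y)
    selected, current = [], total_ss
    while len(selected) < len(variables):
        candidates = [j for j in range(len(variables)) if j not in selected]
        best = min(candidates, key=lambda j: rss([variables[i] for i in selected + [j]], y))
        best_rss = rss([variables[i] for i in selected + [best]], y)
        if current - best_rss < min_gain * total_ss:
            break  # the best remaining variable adds too little
        selected.append(best)
        current = best_rss
    return selected


x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
x3 = [3, 1, 4, 1, 5, 9, 2, 6]  # unrelated to the target
y = [1 + 2 * a + 5 * b for a, b in zip(x1, x2)]

selected = forward_stepwise([x1, x2, x3], y)
```

The target here depends only on the first two variables, so the third is left out of the model.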
Survival Analysis

On a long enough timeline, the survival rate for everything drops to zero, including events such as a component failure or a customer being lost. This analytic uses the Cox Regression algorithm to quantify the effect that each independent variable has on the likelihood that such an event will occur at some point in the future.

Risk returns the risk of an event occurring relative to the average:
RScript<_RScriptFile="Survival.R", _InputNames="Time, Status, Vars", BooleanParam9=FALSE, StringParam9="">(Time, Status, Vars)
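
One common reading of "risk relative to the average" in a Cox model is the exponential of the coefficient-weighted difference between a record and the average record. The Python sketch below computes that quantity. The coefficients are hypothetical values rather than the result of fitting a model, the variables are made up, and Survival.R may define its Risk output differently.

```python
from math import exp

# Hypothetical coefficients of the kind a fitted Cox model produces:
# effect per year of age, and effect per prior failure.
COEFFICIENTS = [0.04, 0.8]

records = [(30, 0), (40, 1), (50, 2)]  # made-up (age, prior failures) records
means = [sum(col) / len(records) for col in zip(*records)]


def relative_risk(record, coefficients, means):
    """exp(sum of beta_j * (x_j - mean_j)); equals 1.0 for the average record."""
    return exp(sum(b * (x - m) for b, x, m in zip(coefficients, record, means)))


risks = [relative_risk(r, COEFFICIENTS, means) for r in records]
```

A value above 1 indicates a record more likely than average to experience the event, and a value below 1 indicates a record less likely than average.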