{smcl}
{* *! version 0.0.1 22feb2024}{...}
{pstd}
The first phase in all cross-validation work is splitting the data into either: 
(1) Training and Test sets; or (2) Training, Validation, and Test sets.  However, 
not all train/test (TT) or train/validation/test (TVT) splits are the same.  In 
the training set, the number of folds created with further affect the number of 
times and the sample size used when fitting the data.  This may seem like an odd 
description, but all data splitting is a form of K-Fold cross-validation.  The 
difference is determined by the number of folds.  In the traditional TT split 
case that we may be familiar with, there is a single K-Fold created for the 
training set.  If you want to use 5- or 10-fold cross-validation, the training 
set would be split into 5 or 10 approximately equal sized groups.  In the other 
extreme, leave-one-out (LOO) cross-validation generates as many folds as the 
number of sampling units in the training set.  So, the data splitting begins 
by first determining whether you want two or three sets of data to work with and 
then moves into determining how many pieces you want the training set to have. 

{pstd} 
You can control whether to use TT or TVT splits based on the number of arguments 
you initially pass to the command.  If you pass a single proportion, a TT split 
will result and if you include two proportions a TVT split will result.  To 
define how many folds the training set will have, use the {opt kfold} option.  
By default, the {opt kfold} option is set to 1.  {bf:CAUTION:} if you want to 
use LOO cross-validation, the number of folds must be equal to to number of 
sampling units in the training set.  Additionally, you need to use the {opt loo} 
option for {help splitit}; however, the {help xvloo} prefix will manage all of 
this for you.  See the note at the end of this section for more information.

{pstd}
Next, the splitting process needs to determine how the units will be allocated 
among these sets: simple random sampling w/o replacement (SRS) or clustered 
random sampling w/o replacement (CRS).  By default, the splitting process will 
use SRS.  If you pass a value to the {opt uid} parameter, CRS will be used.  
The {opt tpoint} option, although documented and available, should likely not be 
used at this time for most use cases.  It will implement CRS for panel data if 
the data are {help xtset} but creates an additional variable based on the time 
point passed to {opt tpoint} to define the records that should be used for 
forecasting.  At this time, forecasting methods are not supported by the 
{help crossvalidate} package, but the option exists for users who wish to handle 
those use cases on their own.

{pstd}
{bf:General Advise:} we suggest users opt for TVT splits over TT splits.  A test 
set should only be used to evaluate the model's performance a single time after 
all "training" and hyperparameter tuning is completed.  Using the evaluation 
results from your test set while still adjusting the model and/or its parameters 
will likely lead to tuning the model to the test set.  In that case, the 
evaluation metrics will be overly optimistic compared to what should be 
reasonably expected for completely new data.

{pstd}
{bf:Leave-one-out cross-valiation} should generally only be used when you have a 
small to moderately sized dataset.  With large datasets, the model is fitted to 
the data n - 1 times, with a sample size of n - 1, where n is the number of 
sampling units in the training set; this is analogous to using {help jackknife}.  
The amount of time it will take to get results can rapidly increase.  
Additionally, we encourage you to use the {opt difficult} option for models that 
use {help ml:maximum likelihood}, as well as specifying the number of iterations 
and/or convergence criterion in your estimation command.  This can mitigate the 
risk of encountering a flat region or saddle point in the likelihood that may 
stall or halt progress in your model fitting otherwise.

{pstd}
The {opt loo} option is required to implement LOO splitting of the training set 
due to the manner in which the folds are created.  For all other instances where 
the number of folds, k, is greater than one, we use {help xtile} to generate 
approximately equal sized folds in the training set.  However, LOO requires 
exactly one sampling unit to be omitted from each fold.  In this case, ties that 
may result from {help xtile} can lead to the incorrect number of sampling units 
being omitted from each fold.  The {opt loo} option is used to implement 
alternative logic that ensures a single sampling unit will be omitted from each 
fold.