/* MAIN TESTING FILE FOR ARTCAT artcat_test.do IRW and EMZ, 1jun2022 */ local path c:\ian\git\artcat // CHANGE TO YOUR FILE LOCATION adopath ++ `path'/package adopath ++ `path'/moreado cd `path'/package cap log close set more off set linesize 79 version 14 foreach type in float double { set type `type' // 1. We compared results with those given by \citet{Whitehead93}. Exact agreement was achieved. log using artcat_compare_with_Whitehead_`type', replace text do artcat_compare_with_Whitehead log close // 2. We compared results for a binary outcome in a superiority trial with those given by \texttt{artbin} and \texttt{power} across a range of probabilities and allocation ratios. Close, but not exact, agreement was achieved, except in a few well understood cases. log using artcat_compare_with_artbin_`type', replace text do artcat_compare_with_artbin, nostop log close log using artcat_compare_with_power_`type', replace text do artcat_compare_with_power, nostop log close // 3. We checked error messages in a number of impossible cases, for example negative odds ratio. log using artcat_check_errormsgs_`type', replace text do artcat_check_errormsgs, nostop log close // 4. We compared results with those given by the R package dani \citep{Quartagno2019b}. This calculates sample sizes for a binary outcome on the odds ratio scale for non-inferiority trials and implicitly uses the AA method. Exact agreement was achieved for the AA method. log using artcat_compare_with_dani_`type', replace text do artcat_compare_with_dani log close } // 5. We re-ran the test script, implementing the above tests, in Stata versions 13 and 16, and with the default variable type (\texttt{set type}) as \texttt{float} and as \texttt{double}. /* 6. We did various tests of internal consistency of the program. We compared different ways of stating the same problem (e.g. interchanging C and E arms, and reversing the order of the categories) and verified the same answer was achieved. We calculated the power $p$ for a sample size $n$, then calculated the sample size for power $p$, and checked that this equalled the original $n$. We changed options that should change the sample size and verified that they did change the sample size. NB this is done outside the type loop as one of its test involves changing type */ log using artcat_test_consistency, replace text do artcat_test_consistency log close // 7. The simulations reported in Section \ref{sec:sim} also test the software. /* NB Why we didn't test against other programs: ssi and niss: these for non-inferiority or equivalence designs, which are not comparable to artcat (do not use same null hypotheses) sampsi: this is no longer a part of Stata (since v13) */ /*** END OF MAIN TESTING PROGRAM FOR ARTCAT ***/